A creative-in-training

Single Cell Phenotyping

Cancer is a widely discussed topic, and research into identifying, classifying, diagnosing and treating it is abundant. The focus of this project is to develop a method and a process for the first two items on that list: identification and classification. Several assumptions were made while building the pipeline, which runs from growing cells to classifying them, and these assumptions play an important role in reaching the final goal. The most important of them is that cellular heterogeneity matters for cancer research.

The aim of this project is to:

1. Develop a workflow for studying heterogeneity

2. Provide analysis tools

3. Develop a method to understand and visualize heterogeneity

This work was carried out under the guidance of Professor Jens Rittscher, working alongside the Nijman Lab in the Target Discovery Institute and the Oxford Institute of Biomedical Engineering.

How is the report structured?

The first section discusses cancer and provides a basic understanding of the problem that we are trying to tackle. It gives an initial insight into the topic of the second section: tumour heterogeneity. Tumour heterogeneity is the essence of this entire project. The ideas of subpopulations and the role the phenotype plays are discussed there, which leads on to the third section: "Why is the Phenotype important?".

The third section covers the first part of the process and model created during this project. We chose one form of analysis from the several options available, taking into account several criteria, the most important of which were reproducibility and upscaling. This section also details the workflow that I followed to obtain the images from which the raw data are extracted. The staining and imaging methods are key not only to this process but also to the final goal of automating it.

The fourth section then discusses the phenotypic landscape, which is the final result of the project. This landscape is not only a visualization technique; it also indicates the need for more data, and for more efficient processing and analysis of that data, so that a clear and continuous landscape can be realised.

The fifth section introduces and details Principal Component Analysis (PCA), which is core to this project. The following section details how PCA was actually used, and how I developed a method that uses PCA not only as a dimensionality reduction technique but also as a source of useful information. I used Matlab to create an algorithm that processed over 20,000 samples and found key features able to identify and classify the population. The process could easily be applied to large data sets and would be an asset to the final goal of automating the method.
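The Matlab algorithm itself is not reproduced in this overview. As a rough illustration of how PCA can act both as dimensionality reduction and as a way to pick out informative features, here is a minimal Python sketch. The function name `pca_feature_ranking`, the ranking rule (absolute loadings weighted by explained variance) and the synthetic data are all assumptions made for this example, not the method used in the project.

```python
import numpy as np


def pca_feature_ranking(X, n_components=2):
    """Project X onto its leading principal components and rank the
    original features by how strongly they load on those components.

    Hypothetical helper for illustration only.
    """
    X = np.asarray(X, dtype=float)

    # Standardise each feature so that measurement units do not dominate.
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # leave constant features unscaled
    Z = (X - mean) / std

    # SVD of the standardised data: rows of Vt are the principal axes.
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = S**2 / np.sum(S**2)

    components = Vt[:n_components]
    scores = Z @ components.T  # reduced representation of each sample

    # Assumed ranking rule: absolute loading weighted by explained variance.
    importance = (np.abs(components) * explained[:n_components, None]).sum(axis=0)
    ranking = np.argsort(importance)[::-1]
    return scores, explained[:n_components], ranking


# Synthetic stand-in for per-cell measurements (not real data):
# features 0 and 1 share a common signal, features 2 to 4 are independent noise.
rng = np.random.default_rng(0)
n_cells = 500
latent = rng.normal(size=n_cells)
X = rng.normal(scale=0.1, size=(n_cells, 5))
X[:, 0] += latent
X[:, 1] += latent
X[:, 2:] += rng.normal(size=(n_cells, 3))

scores, explained, ranking = pca_feature_ranking(X, n_components=1)
print("explained variance ratio:", explained)
print("features ranked by importance:", ranking)
```

In this toy setup the two correlated features dominate the first component, so they appear at the top of the ranking. On real image-derived features the choice of how many components to keep, and how to weight the loadings, would need to be justified separately.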

After identifying the features that make the population unique, the process aims to identify the subpopulations within that population, following the idea of tumour heterogeneity discussed previously. Two disparate clustering algorithms were adapted in Python to help classify the population. Both produced interesting and useful results, which in turn helped create the visualization method discussed in the ninth section.
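This overview does not name the two clustering algorithms. To give a feel for what splitting a population into subpopulations looks like in code, here is a minimal k-means sketch in Python. K-means is used purely as a stand-in, and the function name `kmeans`, its parameters and the synthetic two-group data are assumptions for illustration.

```python
import numpy as np


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means. Hypothetical helper for illustration only;
    not one of the algorithms used in the project."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)

    # Initialise centroids from k distinct samples.
    centroids = X[rng.choice(len(X), size=k, replace=False)]

    for _ in range(n_iter):
        # Assign every sample to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)

        # Move each centroid to the mean of its assigned samples;
        # keep the old centroid if a cluster happens to be empty.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])

        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids

    return labels, centroids


# Synthetic stand-in for two subpopulations in a 2-D feature space (not real data).
rng = np.random.default_rng(1)
group_a = rng.normal(loc=0.0, scale=0.5, size=(100, 2))
group_b = rng.normal(loc=5.0, scale=0.5, size=(100, 2))
cells = np.vstack([group_a, group_b])

labels, centroids = kmeans(cells, k=2)
print("cluster sizes:", np.bincount(labels, minlength=2))
```

The input here could equally be the reduced PCA scores rather than raw features. Real subpopulations are rarely this well separated, which is one reason to compare more than one clustering approach.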

Finally, the conclusion discusses future work and what the next steps might be in fully automating the process that I have nurtured throughout the project.

[Figures: 5colour.png, Workflow.jpg]