NCI 5R01-CA121225





In this R01 project, we propose to develop a new generation of computational tools named G-CELLIQ (Genomic CELLular Imaging Quantitator).
G-CELLIQ aims to fill the gap left by the lack of integrated, automated analysis tools for high-content screening (HCS) data, especially data from RNA interference (RNAi) HCS.

G-CELLIQ provides an integrated cell image processing pipeline that uses advanced computational algorithms to extract the content of RNAi screening images, reducing both the time required by manual analysis and its variability.

G-CELLIQ includes novel classification-controlled feedback systems that refine cell boundaries and increase the accuracy of the scoring method, which reflects the mixture of different cell phenotypes in the screen.

G-CELLIQ provides an innovative and effective scoring method based on a fuzzy set-theoretic approach. The resulting succinct score will allow researchers to easily comprehend the significance of the results and identify the genes of interest.
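
As a rough illustration of how fuzzy memberships might be condensed into one succinct score per well, the sketch below assumes that each cell carries a membership degree in [0, 1] for each phenotype and that each phenotype has a weight. The phenotype names, the weights, the toy memberships and the `well_score` function are all hypothetical and are not taken from G-CELLIQ; the actual scoring model is more elaborate than this weighted average.

```python
from typing import Dict, List

# Hypothetical phenotype weights: how strongly each phenotype contributes
# to the well score (illustrative values, not taken from G-CELLIQ).
PHENOTYPE_WEIGHTS: Dict[str, float] = {"normal": 0.0, "spiky": 0.5, "ruffled": 1.0}


def well_score(cells: List[Dict[str, float]],
               weights: Dict[str, float] = PHENOTYPE_WEIGHTS) -> float:
    """Average, over cells, of the membership-weighted phenotype weights.

    Each cell is a dict mapping phenotype name -> fuzzy membership in [0, 1].
    Memberships are normalised per cell so that they sum to 1.
    """
    if not cells:
        raise ValueError("a well needs at least one cell")
    total = 0.0
    for memberships in cells:
        norm = sum(memberships.values())
        if norm <= 0:
            raise ValueError("each cell needs a positive total membership")
        total += sum(weights[p] * m for p, m in memberships.items()) / norm
    return total / len(cells)


# Two made-up cells: one mostly normal, one mostly ruffled.
example_well = [
    {"normal": 0.8, "spiky": 0.2, "ruffled": 0.0},
    {"normal": 0.1, "spiky": 0.3, "ruffled": 0.6},
]
print(round(well_score(example_well), 3))
```

Because every cell contributes partial memberships rather than a single hard label, a well containing a mixture of phenotypes still receives one graded score.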

The integration of HCS technology with bioinformatics tools like G-CELLIQ has the potential to make large-scale cell biology tractable by generating functional information through automated measurement of the temporal and spatial activities of genes and proteins in living cells.


PI: Stephen Wong, Ph.D., P.E., Department of Radiology, The Methodist Hospital Research Institute.
Co-PI: Norbert Perrimon, Ph.D., Department of Genetics, Harvard Medical School.


We have designed the overall computational architecture of G-CELLIQ, shown in Figure 1, and integrated all available functions into a graphical user interface (GUI), shown in Figure 2. The current version of the software package includes cell segmentation using a two-step seeded watershed with over-segmentation correction, phenotype identification and classification using support vector machine (SVM) based methods, and gene function annotation based on comprehensive repeatability testing and clustering analysis.

Figure 1 Workflow for G-CELLIQ

Figure 2 Current GUI for G-CELLIQ
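
To make the seeded watershed idea concrete, here is a minimal pure-Python sketch of a single flooding pass on a small grid. It is not the G-CELLIQ implementation: it omits the second step and the over-segmentation correction, and the function name `seeded_watershed`, the 4-connectivity and the toy grid are assumptions made for illustration. In cell images the flooded landscape would typically be derived from the image (for example inverted intensity), with one seed per detected nucleus.

```python
import heapq
from typing import List, Tuple

Grid = List[List[int]]


def seeded_watershed(image: Grid, seeds: Grid) -> Grid:
    """Flood a 2D landscape from labelled seeds (4-connectivity).

    `image` holds the landscape to flood (low values are flooded first);
    `seeds` holds positive integer labels at seed pixels and 0 elsewhere.
    Returns a grid of labels covering every pixel reachable from a seed.
    """
    rows, cols = len(image), len(image[0])
    labels = [row[:] for row in seeds]
    heap: List[Tuple[int, int, int, int]] = []  # (level, insertion order, row, col)
    order = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] > 0:
                heapq.heappush(heap, (image[r][c], order, r, c))
                order += 1
    while heap:
        level, _, r, c = heapq.heappop(heap)
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and labels[nr][nc] == 0:
                # The neighbour joins the basin that reaches it first.
                labels[nr][nc] = labels[r][c]
                # The water level never drops below the level already reached.
                heapq.heappush(heap, (max(level, image[nr][nc]), order, nr, nc))
                order += 1
    return labels


# Toy landscape: two basins separated by a ridge of height 9.
landscape = [[1, 2, 3, 9, 2, 1] for _ in range(3)]
seeds = [[0] * 6 for _ in range(3)]
seeds[1][0] = 1  # seed for the left basin
seeds[1][5] = 2  # seed for the right basin
for row in seeded_watershed(landscape, seeds):
    print(row)
```

In this toy case the three left-hand columns receive label 1 and the rest receive label 2; the ridge column joins the right-hand basin because the flood reaches it from that side at a lower water level.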

To validate the capability of G-CELLIQ in large-scale cell biology research, we set up a pilot RNAi screen using 1,565 dsRNAs to inhibit the KP set (genes encoding all known and predicted kinases and phosphatases) in Drosophila Kc187 cells. To illustrate the ability of G-CELLIQ, we selected 32 dsRNAs/wells at random from the dataset. Each dsRNA/well was assigned a quantitative morphological score, and hierarchical clustering was used to group genes/wells. As shown in Figure 3, the dsRNAs in this analysis clustered into two broad groups. One group of 19 conditions included 10/10 control conditions, as well as dsRNAs targeting the Insulin receptor (InR). Strikingly, the other large cluster of 13 conditions included 3/3 dsRNAs previously identified in a genome-wide screen for regulators of MAPK/ERK activation downstream of EGF/EGFR activity. These results demonstrate that automated high-throughput image analysis can discriminate distinct morphologies and can be used to model functional relationships between signaling molecules.

Figure 3 Clustering of cellular morphologies results in the identification of functional relationships between genes.
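
The grouping step described above can be sketched as a small agglomerative clustering of per-well score profiles. The example below uses average linkage with Euclidean distance, which is an assumption: the text does not state which linkage or distance was used. The well names and score values are invented toy data, not results from the screen, and `agglomerate` is a hypothetical helper.

```python
from itertools import combinations
from math import dist
from typing import Dict, List, Sequence


def agglomerate(profiles: Dict[str, Sequence[float]], n_clusters: int) -> List[List[str]]:
    """Average-linkage agglomerative clustering, stopped at `n_clusters` groups."""
    clusters: List[List[str]] = [[name] for name in profiles]

    def linkage(a: List[str], b: List[str]) -> float:
        # Mean Euclidean distance over all cross-cluster pairs of wells.
        return sum(dist(profiles[x], profiles[y]) for x in a for y in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Merge the two clusters that are currently closest to each other.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return [sorted(c) for c in clusters]


# Invented morphological score profiles for four hypothetical wells.
toy_profiles = {
    "control_1": (0.10, 0.20),
    "control_2": (0.12, 0.18),
    "dsRNA_A": (0.80, 0.90),
    "dsRNA_B": (0.85, 0.88),
}
groups = agglomerate(toy_profiles, n_clusters=2)
print(groups)
```

With these toy values the two control wells end up in one group and the two dsRNA wells in the other, mirroring the kind of two-cluster split described for the pilot screen.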

The current version of the G-CELLIQ package is available online, and we will continue to modify and expand its functionality based on user feedback. A module for unsupervised identification of novel phenotypes is coming soon.


1. Yang, X., H. Li, and X. Zhou, Nuclei Segmentation Using Marker-Controlled Watershed, Tracking Using Mean-Shift, and Kalman Filter in Time-Lapse Microscopy. IEEE Transactions on Circuits and Systems I: Regular Papers, 2006. 53(11): p. 2405-2414.

2. Xiong, G., et al., Automated segmentation of Drosophila RNAi fluorescence cellular images using deformable models. IEEE Transactions on Circuits and Systems I: Regular Papers, 2006. 53: p. 2415-2424.

3. Li, F.H., et al., High content image analysis for human H4 neuroglioma cells exposed to CuO nanoparticles. BMC Biotechnology, 2007. 7: p. 66.

4. Li, F.H., X. Zhou, and S.T.C. Wong, An automated feedback system with the hybrid model of scoring and classification for solving over-segmentation problems in RNAi high content screening. Journal of Microscopy, 2007. 226(2): p. 121-132.

5. Wang, J., et al., Cellular Phenotype Recognition for High-Content RNA Interference Genome-Wide Screening. Journal of Biomolecular Screening, 2008. 13(1): p. 29-39.

6. Yin, Z., et al., Using iterative cluster merging with improved gap statistics to perform online phenotype discovery in the context of high-throughput RNAi screens. BMC Bioinformatics, 2008. 9(1): p. 264.

7. Yan, P., et al., Automatic segmentation of RNAi fluorescent cellular images with interaction model. IEEE Transactions on Information Technology in Biomedicine, 2008. 12(1): p. 109-117.

8. Yin, Z., et al., Online phenotype discovery based on minimum classification error model. Pattern Recognition, 2009. 42(4): p. 509-522.

9. Wang, J., et al., An image score inference system for RNAi genome-wide screening based on fuzzy mixture regression modeling. Journal of Biomedical Informatics, 2009. 42(1): p. 32-40.

Software Link: