Experiments: 1. Phenotypic Visualization and Integrative Analysis of Cellular Morphometric Types

The proposed approach was applied to the TCGA LGG cohort, which includes 215 WSIs from 209 patients; clinical annotations are available for 203 of these patients. For quality control, the background and border portions of each whole slide image were detected and removed from the analysis.

The TCGA LGG cohort consists of ~80 million segmented nuclear regions, from which 2 million were randomly selected for the construction of cellular morphometric types. The cellular morphometric context representation for each patient is a 64-dimensional vector, where each dimension represents the normalized frequency of a specific cellular morphometric type appearing in the WSIs of that patient. Initial integrative analysis is performed by linking individual cellular morphometric types to clinical outcomes and molecular data. The frequency of each cellular morphometric type is used, together with the age of the patient, as a predictor variable in a Cox proportional hazards (PH) regression model (implemented with the R survival package). For each cellular morphometric type, the frequencies are further correlated with the gene expression values across all patients. The top-ranked positively correlated genes and the top-ranked negatively correlated genes are submitted separately to MSigDB for gene set enrichment analysis. Table 1 summarizes the cellular morphometric types that best predict the survival distribution, together with the corresponding enriched gene sets. Fig. 1 shows the top-ranked examples for these cellular morphometric types.
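The per-patient representation and the gene-ranking step described above can be illustrated with a minimal sketch. It assumes that every segmented nucleus has already been assigned an integer type label in 0..63, and it uses Pearson correlation, which is an assumption because the text does not state which correlation measure was used. The function names (`context_vector`, `pearson`, `rank_genes`) and the toy data are hypothetical and are not taken from the original implementation.

```python
from collections import Counter
from math import sqrt

N_TYPES = 64  # number of cellular morphometric types stated in the text


def context_vector(type_labels, n_types=N_TYPES):
    """Normalized frequency of each morphometric type among a patient's nuclei."""
    total = len(type_labels)
    if total == 0:
        raise ValueError("patient has no segmented nuclei")
    counts = Counter(type_labels)
    return [counts.get(k, 0) / total for k in range(n_types)]


def pearson(x, y):
    """Pearson correlation (assumed measure); 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def rank_genes(type_freq, expression, top_k=2):
    """Return the top positively and top negatively correlated gene names."""
    corr = {gene: pearson(type_freq, values) for gene, values in expression.items()}
    ordered = sorted(corr, key=corr.get, reverse=True)
    positive = [g for g in ordered if corr[g] > 0][:top_k]
    negative = [g for g in reversed(ordered) if corr[g] < 0][:top_k]
    return positive, negative


# Toy data: three hypothetical patients with a handful of labelled nuclei each.
patients = {"P1": [0, 0, 1, 2], "P2": [0, 1, 1, 1], "P3": [2, 2, 2, 0]}
vectors = {p: context_vector(labels) for p, labels in patients.items()}

# Frequency of type 1 across patients, paired with hypothetical expression values.
freq_type1 = [vectors[p][1] for p in ("P1", "P2", "P3")]
expression = {"GENE_A": [2.0, 6.0, 0.0], "GENE_B": [5.0, 1.0, 7.0]}
pos_genes, neg_genes = rank_genes(freq_type1, expression, top_k=1)
```

In this sketch the two returned gene lists play the role of the positively and negatively correlated gene sets that would then be passed to an enrichment tool; the enrichment step itself is not reproduced here.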

As shown in Table 1, 8 out of 64 cellular morphometric types are significantly associated with survival (FDR-adjusted p-value < 0.01). The first four cellular morphometric types in Fig. 1 all have a hazard ratio > 1, indicating that a higher frequency of these cellular morphometric types is associated with a worse prognosis. A common phenotypic property of these cellular morphometric types is the loss of chromatin content in the nuclear regions, which may be associated with poor prognosis in lower grade glioma. The last four cellular morphometric types in Fig. 1 all have a hazard ratio < 1, indicating that a higher frequency of these cellular morphometric types is associated with a better prognosis.
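The selection rule above (FDR-adjusted p-value below 0.01, then interpretation by hazard ratio) can be sketched as follows. The text does not name the FDR procedure, so the Benjamini-Hochberg adjustment used here is an assumption, and the function names (`bh_adjust`, `significant_types`) and the example numbers are hypothetical rather than values from Table 1.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values (assumed FDR method), in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values are monotone in the rank of the raw p-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant_types(pvalues, hazard_ratios, alpha=0.01):
    """Map each significant type index to the prognosis direction of its hazard ratio."""
    adjusted = bh_adjust(pvalues)
    result = {}
    for k, (q, hr) in enumerate(zip(adjusted, hazard_ratios)):
        if q < alpha:
            # Hazard ratio > 1: higher frequency goes with higher risk.
            result[k] = "worse" if hr > 1 else "better"
    return result


# Hypothetical per-type Cox p-values and hazard ratios for three types.
example = significant_types([0.0001, 0.5, 0.002], [1.8, 1.0, 0.6])
```

With 64 types tested, the adjustment matters: a raw p-value threshold of 0.01 applied to 64 tests would be more permissive than the adjusted threshold used in the text.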

Table 1 also indicates the enrichment of genes up-regulated in response to IFNG in cellular morphometric types #28, #29 and #52. In the glioma microenvironment, tumor cells and local T cells produce abnormally low levels of IFNG. IFNG acts on cell-surface receptors and activates the transcription of genes that offer potential for the treatment of brain tumors by increasing tumor immunogenicity, disrupting proliferative mechanisms, and inhibiting tumor angiogenesis. The observation that IFNG is a positive survival factor supports the prognostic effect of these cellular morphometric types: #28 shows negative correlation and worse prognosis, whereas #29 and #52 show positive correlation and better prognosis.
Other interesting observations are that three cellular morphometric types associated with better prognosis are enriched with genes up-regulated by IL6 via STAT3, and that two cellular morphometric types associated with better prognosis are enriched with genes regulated by NF-kB in response to TNF and with genes up-regulated in response to TGFB1, respectively.