Knapen DG, Hone Lopez S, de Groot DJA, de Haan JJ, de Vries EGE, Dienstmann R, de Jong S, Bhattacharya A, Fehrmann RSN
Abstract
Background
Bulk transcriptional profiles of early colorectal cancer (CRC) can fail to detect biological processes associated with disease-free survival (DFS) if the transcriptional patterns are subtle and/or obscured by other processes’ patterns. Consensus-independent component analysis (c-ICA) can dissect such transcriptomes into statistically independent transcriptional components (TCs), capturing both pronounced and subtle biological processes.
Methods
In this study we (1) integrated transcriptomes (n = 4228) from multiple early CRC studies, (2) performed c-ICA to define the TC landscape within this integrated data set, 3) determined the biological processes captured by these TCs, (4) performed Cox regression to identify DFS-associated TCs, (5) performed random survival forest (RSF) analyses with activity of DFS-associated TCs as classifiers to identify subgroups of patients, and 6) performed a sensitivity analysis to determine the robustness of our results
Results
We identify 191 TCs, 43 of which are associated with DFS, revealing transcriptional diversity among DFS-associated biological processes. A prominent example is the epithelial-mesenchymal transition (EMT), for which we identify an association with nine independent DFS-associated TCs, each with coordinated upregulation or downregulation of various sets of genes.
Conclusions
This finding indicates that early CRC may have nine distinct routes to achieve EMT, each requiring a specific peri-operative treatment strategy. Finally, we stratify patients into DFS patient subgroups with distinct transcriptional patterns associated with stage 2 and stage 3 CRC.