PathAI ML Models Based on Human Interpretable Features Predict Clinically-Relevant Molecular Phenotypes Across Cancers

PathAI developed a machine learning (ML) model-based pipeline to predict tumor molecular phenotypes from hematoxylin and eosin (H&E)-stained biopsies across five tumor types.

PathAI, a global provider of AI-powered technology applied to pathology, reports on research published today in Nature Communications. The study describes the application of PathAI’s ML-based platform, which generates a fully interpretable, quantified description of the tumor and tumor microenvironment using human interpretable features (HIFs).

ML models were trained using more than 1.6 million annotations of cancer cell and tissue histological features, provided by board-certified pathologists on 5,700 H&E-stained samples. When applied to tumor tissue images from five cancer types (skin cutaneous melanoma, stomach adenocarcinoma, breast cancer, lung adenocarcinoma, and lung squamous cell carcinoma), the ML models generated 607 pan-cancer, biologically relevant HIFs.

The HIFs ranged from simple quantified descriptions of cells or tissue regions to more complex multi-component spatial, or clustering, relationships. Simple HIFs include the number of lymphocytes in cancer tissue, the density of fibroblasts in cancer stroma, and the area of necrosis. More complex spatial HIFs include the mean cluster size of fibroblasts in cancer-associated stroma and the proportion of cancer cells within 80 microns of a macrophage. Together, these features provided a high-resolution, quantitative description of the components and composition of the tumor microenvironment.

HIFs were found to correlate with established RNA expression signatures, such as leukocyte infiltration, TGF-beta expression, IgG expression, and wound healing. HIFs were also predictive of molecular markers that are currently used in drug development and clinical decision-making (PD-1, PD-L1, CTLA-4, TIGIT, and HRD), achieving AUROC values ranging from 0.601 to 0.864 on held-out tissue source sites. Some associations between HIFs and RNA expression confirmed previously reported findings (such as the density of tumor-invading lymphocytes with PD-1 and PD-L1 expression), whereas others (such as the morphology of necrotic tissue with TIGIT) were novel.
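To make one of the spatial HIFs named above concrete, the following minimal Python sketch computes the proportion of cancer cells within 80 microns of a macrophage from cell centroid coordinates. It is an illustration only and not PathAI's implementation: the function name `proportion_within_radius`, the coordinate format (x, y in microns), the sample points, and the choice to count a cell at exactly 80 microns as "within" are all assumptions made for this example.

```python
import math


def proportion_within_radius(cancer_cells, macrophages, radius_um=80.0):
    """Return the fraction of cancer cells that have at least one
    macrophage within radius_um microns (hypothetical helper).

    Both inputs are lists of (x, y) centroids in microns. An empty
    list of cancer cells yields 0.0 rather than a division error.
    """
    if not cancer_cells:
        return 0.0
    near = 0
    for cx, cy in cancer_cells:
        # Assumption: "within" is inclusive of the radius itself.
        if any(math.hypot(cx - mx, cy - my) <= radius_um
               for mx, my in macrophages):
            near += 1
    return near / len(cancer_cells)


# Made-up centroids for illustration; not data from the study.
cancer_cells = [(0.0, 0.0), (50.0, 0.0), (300.0, 300.0), (500.0, 100.0)]
macrophages = [(30.0, 40.0), (480.0, 160.0)]

hif_value = proportion_within_radius(cancer_cells, macrophages)
print(hif_value)  # 0.75: three of the four cancer cells have a macrophage nearby
```

A brute-force pairwise check like this is adequate for a handful of cells. On whole-slide images with very large cell counts, a spatial index would likely be needed, which is a design choice outside the scope of this sketch.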

PathAI’s HIF-based approach can be validated by pathologists and traced to patho-biologically relevant features, unlike alternative “black box” methods, which cannot elucidate such features. This interpretability not only provides insight into cancer biology and pathogenesis, but may also accelerate clinical adoption of the technology after appropriate clinical validation.

The complete article is freely available at

Diao, J.A., Wang, J.K., Chui, W.F., et al. Human-interpretable image features derived from densely mapped cancer pathology slides predict diverse molecular phenotypes. Nature Communications 12 (2021). DOI:10.1038/s41467-021-21896-9

About PathAI

PathAI is a leading provider of AI-powered research tools and services for pathology. PathAI’s platform promises substantial improvements to the accuracy of diagnosis and the efficacy of treatment of diseases like cancer, leveraging modern approaches in machine and deep learning. Based in Boston, PathAI works with leading life sciences companies and researchers to advance precision medicine. To learn more, visit

Contact Author

Isabella Canuso
+1 6096821080

Christopher Allman-Bradshaw