A pathology foundation model for cancer diagnosis and prognosis prediction.

  • Additional Information
    • Source:
      Publisher: Nature Publishing Group; Country of Publication: England; NLM ID: 0410462; Publication Model: Print-Electronic; Cited Medium: Internet; ISSN: 1476-4687 (Electronic); Linking ISSN: 00280836; NLM ISO Abbreviation: Nature; Subsets: MEDLINE
    • Publication Information:
      Publication: Basingstoke : Nature Publishing Group
      Original Publication: London, Macmillan Journals Ltd.
    • Subject Terms:
    • Abstract:
      Histopathology image evaluation is indispensable for cancer diagnoses and subtype classification. Standard artificial intelligence methods for histopathology image analyses have focused on optimizing specialized models for each diagnostic task [1,2]. Although such methods have achieved some success, they often have limited generalizability to images generated by different digitization protocols or samples collected from different populations [3]. Here, to address this challenge, we devised the Clinical Histopathology Imaging Evaluation Foundation (CHIEF) model, a general-purpose weakly supervised machine learning framework to extract pathology imaging features for systematic cancer evaluation. CHIEF leverages two complementary pretraining methods to extract diverse pathology representations: unsupervised pretraining for tile-level feature identification and weakly supervised pretraining for whole-slide pattern recognition. We developed CHIEF using 60,530 whole-slide images spanning 19 anatomical sites. Through pretraining on 44 terabytes of high-resolution pathology imaging datasets, CHIEF extracted microscopic representations useful for cancer cell detection, tumour origin identification, molecular profile characterization and prognostic prediction. We successfully validated CHIEF using 19,491 whole-slide images from 32 independent slide sets collected from 24 hospitals and cohorts internationally. Overall, CHIEF outperformed the state-of-the-art deep learning methods by up to 36.1%, showing its ability to address domain shifts observed in samples from diverse populations and processed by different slide preparation methods. CHIEF provides a generalizable foundation for efficient digital pathology evaluation for patients with cancer.
      (© 2024. The Author(s), under exclusive licence to Springer Nature Limited.)
    • References:
      Van der Laak, J., Litjens, G. & Ciompi, F. Deep learning in histopathology: the path to the clinic. Nat. Med. 27, 775–784 (2021). (PMID: 33990804; DOI: 10.1038/s41591-021-01343-4)
      Shmatko, A., Ghaffari Laleh, N., Gerstung, M. & Kather, J. N. Artificial intelligence in histopathology: enhancing cancer research and clinical oncology. Nat. Cancer 3, 1026–1038 (2022). (PMID: 36138135; DOI: 10.1038/s43018-022-00436-4)
      Song, A. H. et al. Artificial intelligence for digital and computational pathology. Nat. Rev. Bioeng. 1, 930–949 (2023). (DOI: 10.1038/s44222-023-00096-8)
      Campanella, G. et al. Clinical-grade computational pathology using weakly supervised deep learning on whole slide images. Nat. Med. 25, 1301–1309 (2019). (PMID: 31308507; PMCID: PMC7418463; DOI: 10.1038/s41591-019-0508-1)
      Bejnordi, B. E. et al. Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. JAMA 318, 2199–2210 (2017). (DOI: 10.1001/jama.2017.14585)
      Lu, M. Y. et al. Data-efficient and weakly supervised computational pathology on whole-slide images. Nat. Biomed. Eng. 5, 555–570 (2021). (PMID: 33649564; PMCID: PMC8711640; DOI: 10.1038/s41551-020-00682-w)
      Coudray, N. et al. Classification and mutation prediction from non–small cell lung cancer histopathology images using deep learning. Nat. Med. 24, 1559–1567 (2018). (PMID: 30224757; PMCID: PMC9847512; DOI: 10.1038/s41591-018-0177-5)
      Nasrallah, M. P. et al. Machine learning for cryosection pathology predicts the 2021 WHO classification of glioma. Med 4, 526–540 (2023). (PMID: 37421953; DOI: 10.1016/j.medj.2023.06.002)
      Tsai, P.-C. et al. Histopathology images predict multi-omics aberrations and prognoses in colorectal cancer patients. Nat. Commun. 14, 2102 (2023). (PMID: 37055393; PMCID: PMC10102208; DOI: 10.1038/s41467-023-37179-4)
      Yu, K.-H. et al. Classifying non-small cell lung cancer types and transcriptomic subtypes using convolutional neural networks. J. Am. Med. Inform. Assoc. 27, 757–769 (2020). (PMID: 32364237; PMCID: PMC7309263; DOI: 10.1093/jamia/ocz230)
      Yu, K.-H. et al. Association of omics features with histopathology patterns in lung adenocarcinoma. Cell Syst. 5, 620–627 (2017). (PMID: 29153840; PMCID: PMC5746468; DOI: 10.1016/j.cels.2017.10.014)
      Chen, R. J. et al. Pan-cancer integrative histology-genomic analysis via multimodal deep learning. Cancer Cell 40, 865–878 (2022). (PMID: 35944502; PMCID: PMC10397370; DOI: 10.1016/j.ccell.2022.07.004)
      Marostica, E. et al. Development of a histopathology informatics pipeline for classification and prediction of clinical outcomes in subtypes of renal cell carcinoma. Clin. Cancer Res. 27, 2868–2878 (2021). (PMID: 33722896; DOI: 10.1158/1078-0432.CCR-20-4119)
      Yu, K.-H. et al. Predicting non-small cell lung cancer prognosis by fully automated microscopic pathology image features. Nat. Commun. 7, 12474 (2016). (PMID: 27527408; PMCID: PMC4990706; DOI: 10.1038/ncomms12474)
      Vanguri, R. S. et al. Multimodal integration of radiology, pathology and genomics for prediction of response to PD-(L)1 blockade in patients with non-small cell lung cancer. Nat. Cancer 3, 1151–1164 (2022). (PMID: 36038778; PMCID: PMC9586871; DOI: 10.1038/s43018-022-00416-8)
      Yu, K.-H. et al. Deciphering serous ovarian carcinoma histopathology and platinum response by convolutional neural networks. BMC Med. 18, 236 (2020). (PMID: 32807164; PMCID: PMC7433108; DOI: 10.1186/s12916-020-01684-w)
      Foersch, S. et al. Multistain deep learning for prediction of prognosis and therapy response in colorectal cancer. Nat. Med. 29, 430–439 (2023). (PMID: 36624314; DOI: 10.1038/s41591-022-02134-1)
      Kather, J. N. et al. Pan-cancer image-based detection of clinically actionable genetic alterations. Nat. Cancer 1, 789–799 (2020). (PMID: 33763651; PMCID: PMC7610412; DOI: 10.1038/s43018-020-0087-6)
      Echle, A. et al. Deep learning in cancer pathology: a new generation of clinical biomarkers. Br. J. Cancer 124, 686–696 (2021). (PMID: 33204028; DOI: 10.1038/s41416-020-01122-x)
      Ektefaie, Y. et al. Integrative multiomics-histopathology analysis for breast cancer classification. NPJ Breast Cancer 7, 147 (2021). (PMID: 34845230; PMCID: PMC8630188; DOI: 10.1038/s41523-021-00357-y)
      Yu, K.-H., Beam, A. L. & Kohane, I. S. Artificial intelligence in healthcare. Nat. Biomed. Eng. 2, 719–731 (2018). (PMID: 31015651; DOI: 10.1038/s41551-018-0305-z)
      Krishnan, R., Rajpurkar, P. & Topol, E. J. Self-supervised learning in medicine and healthcare. Nat. Biomed. Eng. 6, 1346–1352 (2022). (PMID: 35953649; DOI: 10.1038/s41551-022-00914-1)
      Zhou, Y. et al. A foundation model for generalizable disease detection from retinal images. Nature 622, 156–163 (2023).
      Chen, C. et al. Fast and scalable search of whole-slide images via self-supervised deep learning. Nat. Biomed. Eng. 6, 1420–1434 (2022). (PMID: 36217022; PMCID: PMC9792371; DOI: 10.1038/s41551-022-00929-8)
      Wang, X. et al. RetCCL: clustering-guided contrastive learning for whole-slide image retrieval. Med. Image Anal. 83, 102645 (2023). (PMID: 36270093; DOI: 10.1016/j.media.2022.102645)
      Chen, R. J. et al. Towards a general-purpose foundation model for computational pathology. Nat. Med. 30, 850–862 (2024).
      Wagner, S. J. et al. Transformer-based biomarker prediction from colorectal cancer histology: a large-scale multicentric study. Cancer Cell 41, 1650–1661 (2023). (PMID: 37652006; PMCID: PMC10507381; DOI: 10.1016/j.ccell.2023.08.002)
      Huang, Z., Bianchi, F., Yuksekgonul, M., Montine, T. J. & Zou, J. A visual–language foundation model for pathology image analysis using medical twitter. Nat. Med. 29, 2307–2316 (2023). (PMID: 37592105; DOI: 10.1038/s41591-023-02504-3)
      Lu, M. Y. et al. A visual-language foundation model for computational pathology. Nat. Med. 30, 863–874 (2024).
      Wang, X. et al. Transformer-based unsupervised contrastive learning for histopathological image classification. Med. Image Anal. 81, 102559 (2022). (PMID: 35952419; DOI: 10.1016/j.media.2022.102559)
      Koziarski, M. et al. Diagset: a dataset for prostate cancer histopathological image classification. Sci. Rep. 14, 6780 (2024). (PMID: 38514661; PMCID: PMC10958036; DOI: 10.1038/s41598-024-52183-4)
      Yu, G. et al. Accurate recognition of colorectal cancer with semi-supervised deep learning on pathological images. Nat. Commun. 12, 6311 (2021). (PMID: 34728629; PMCID: PMC8563931; DOI: 10.1038/s41467-021-26643-8)
      Loménie, N. et al. Can AI predict epithelial lesion categories via automated analysis of cervical biopsies: the TissueNet challenge? J. Pathol. Inform. 13, 100149 (2022). (PMID: 36605109; PMCID: PMC9808029; DOI: 10.1016/j.jpi.2022.100149)
      Ilse, M., Tomczak, J. & Welling, M. Attention-based deep multiple instance learning. In Proc. 35th International Conference on Machine Learning (eds Dy, J. & Krause, A.) 2127–2136 (PMLR, 2018).
      Li, B., Li, Y. & Eliceiri, K. W. Dual-stream multiple instance learning network for whole slide image classification with self-supervised contrastive learning. In Proc. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition 14313–14323 (IEEE, 2021).
      Fu, Y. et al. Pan-cancer computational histopathology reveals mutations, tumor composition and prognosis. Nat. Cancer 1, 800–810 (2020). (PMID: 35122049; DOI: 10.1038/s43018-020-0085-8)
      Petrini, I. et al. A specific missense mutation in GTF2I occurs at high frequency in thymic epithelial tumors. Nat. Genet. 46, 844–849 (2014). (PMID: 24974848; PMCID: PMC5705185; DOI: 10.1038/ng.3016)
      Carbone, M. et al. Biological mechanisms and clinical significance of BAP1 mutations in human cancer. Cancer Discov. 10, 1103–1120 (2020). (PMID: 32690542; PMCID: PMC8006752; DOI: 10.1158/2159-8290.CD-19-1220)
      Chakravarty, D. et al. OncoKB: a precision oncology knowledge base. JCO Precision Oncology 1, 1–16 (2017). (DOI: 10.1200/PO.17.00011)
      Louis, D. N. et al. The 2021 WHO Classification of Tumors of the Central Nervous System: a summary. Neuro-Oncology 23, 1231–1251 (2021). (PMID: 34185076; PMCID: PMC8328013; DOI: 10.1093/neuonc/noab106)
      Roetzer-Pejrimovsky, T. et al. The Digital Brain Tumour Atlas, an open histopathology resource. Sci. Data 9, 55 (2022). (PMID: 35169150; PMCID: PMC8847577; DOI: 10.1038/s41597-022-01157-0)
      Kim, K. et al. PAIP 2020: microsatellite instability prediction in colorectal cancer. Med. Image Anal. 89, 102886 (2023). (PMID: 37494811; DOI: 10.1016/j.media.2023.102886)
      Amin, M. B. et al. The Eighth Edition AJCC Cancer Staging Manual: continuing to build a bridge from a population-based to a more “personalized” approach to cancer staging. CA Cancer J. Clin. 67, 93–99 (2017). (PMID: 28094848; DOI: 10.3322/caac.21388)
      Achiam, J. et al. GPT-4 technical report. Preprint at https://doi.org/10.48550/arXiv.2303.08774 (2023).
      Team, G. et al. Gemini: a family of highly capable multimodal models. Preprint at https://doi.org/10.48550/arXiv.2312.11805 (2023).
      Azizi, S. et al. Robust and data-efficient generalization of self-supervised machine learning for diagnostic imaging. Nat. Biomed. Eng. 7, 756–779 (2023).
      Cancer Genome Atlas Research Network, J. et al. The Cancer Genome Atlas Pan-Cancer analysis project. Nat. Genet. 45, 1113–1120 (2013). (DOI: 10.1038/ng.2764)
      Lonsdale, J. et al. The genotype-tissue expression (GTEx) project. Nat. Genet. 45, 580–585 (2013). (DOI: 10.1038/ng.2653)
      Bulten, W. et al. Artificial intelligence for diagnosis and Gleason grading of prostate cancer: the PANDA challenge. Nat. Med. 28, 154–163 (2022).
      Yacob, F. et al. Weakly supervised detection and classification of basal cell carcinoma using graph-transformer on whole slide images. Sci. Rep. 13, 7555 (2023). (PMID: 37160953; PMCID: PMC10169852; DOI: 10.1038/s41598-023-33863-z)
      Xu, F. et al. Predicting axillary lymph node metastasis in early breast cancer using deep learning on primary tumor biopsy slides. Front. Oncol. 11, 4133 (2021). (DOI: 10.3389/fonc.2021.759007)
      Weitz, P. et al. A multi-stain breast cancer histological whole-slide-image data set from routine diagnostics. Sci. Data 10, 562 (2023). (PMID: 37620357; PMCID: PMC10449765; DOI: 10.1038/s41597-023-02422-6)
      Wang, C.-W. et al. Histopathological whole slide image dataset for classification of treatment effectiveness to ovarian cancer. Sci. Data 9, 25 (2022). (PMID: 35087101; PMCID: PMC8795433; DOI: 10.1038/s41597-022-01127-6)
      Radford, A. et al. Learning transferable visual models from natural language supervision. In Proc. 38th International Conference on Machine Learning (eds Meila, M. & Zhang, T.) 8748–8763 (PMLR, 2021).
      Otsu, N. A threshold selection method from gray-level histograms. IEEE Trans. Syst. Man Cybern. Syst. 9, 62–66 (1979). (DOI: 10.1109/TSMC.1979.4310076)
      Kingma, D. P. & Ba, J. L. Adam: a method for stochastic optimization. In Proc. 3rd International Conference on Learning Representations (eds Bengio, Y. & LeCun, Y.) (ICLR, 2015).
      Loshchilov, I. & Hutter, F. SGDR: stochastic gradient descent with warm restarts. In Proc. 5th International Conference on Learning Representations 1769–1784 (ICLR, 2017).
      Stadler, C. B. et al. Proactive construction of an annotated imaging database for artificial intelligence training. J. Digit. Imaging 34, 105–115 (2021). (PMID: 33169211; DOI: 10.1007/s10278-020-00384-4)
      Lu, M. Y. et al. AI-based pathology predicts origins for cancers of unknown primary. Nature 594, 106–110 (2021). (PMID: 33953404; DOI: 10.1038/s41586-021-03512-4)
      Black, A. et al. PLCO: evolution of an epidemiologic resource and opportunities for future studies. Rev. Recent Clin. Trials 10, 238–245 (2015). (PMID: 26435289; PMCID: PMC6444921; DOI: 10.2174/157488711003150928130654)
      Shao, Z. et al. TransMIL: Transformer based correlated multiple instance learning for whole slide image classification. Adv. Neural Inf. Process. Syst. 34, 2136–2147 (2021).
      Liang, J. et al. Deep learning supported discovery of biomarkers for clinical prognosis of liver cancer. Nat. Mach. Intell. 5, 408–420 (2023). (DOI: 10.1038/s42256-023-00635-3)
      Courtiol, P. et al. Deep learning-based classification of mesothelioma improves prediction of patient outcome. Nat. Med. 25, 1519–1525 (2019). (PMID: 31591589; DOI: 10.1038/s41591-019-0583-3)
      Graham, S. et al. Hover-Net: simultaneous segmentation and classification of nuclei in multi-tissue histology images. Med. Image Anal. 58, 101563 (2019). (PMID: 31561183; DOI: 10.1016/j.media.2019.101563)
    • Publication Date:
      Date Created: 20240904; Date Completed: 20241024; Latest Revision: 20241107
    • Publication Date:
      20241108
    • Accession Number:
      10.1038/s41586-024-07894-z
    • Accession Number:
      39232164