Background: Detecting microsatellite instability (MSI) in colorectal cancer is crucial for clinical decision making, as it identifies patients with differential treatment response and prognosis. Universal MSI testing is recommended, but many patients remain untested. A critical need exists for broadly accessible, cost-efficient tools to aid patient selection for testing. Here, we investigate the potential of a deep learning-based system for automated MSI prediction directly from haematoxylin and eosin (H&E)-stained whole-slide images (WSIs).

Methods: Our deep learning model (MSINet) was developed using 100 H&E-stained WSIs (50 with microsatellite stability [MSS] and 50 with MSI) scanned at 40× magnification, each from a patient randomly selected in a class-balanced manner from the pool of 343 patients who underwent primary colorectal cancer resection at Stanford University Medical Center (Stanford, CA, USA; internal dataset) between Jan 1, 2015, and Dec 31, 2017. We internally validated the model on a holdout test set (15 H&E-stained WSIs from 15 patients; seven cases with MSS and eight with MSI) and externally validated the model on 484 H&E-stained WSIs (402 cases with MSS and 77 with MSI; 479 patients) from The Cancer Genome Atlas, containing WSIs scanned at 40× and 20× magnification. Performance was primarily evaluated using the sensitivity, specificity, negative predictive value (NPV), and area under the receiver operating characteristic curve (AUROC). We compared the model's performance with that of five gastrointestinal pathologists on a class-balanced, randomly selected subset of 40× magnification WSIs from the external dataset (20 with MSS and 20 with MSI).

Findings: The MSINet model achieved an AUROC of 0·931 (95% CI 0·771–1·000) on the holdout test set from the internal dataset and 0·779 (0·720–0·838) on the external dataset. On the external dataset, using a sensitivity-weighted operating point, the model achieved an NPV of 93·7% (95% CI 90·3–96·2), sensitivity of 76·0% (64·8–85·1), and specificity of 66·6% (61·8–71·2). On the reader experiment (40 cases), the model achieved an AUROC of 0·865 (95% CI 0·735–0·995). The mean AUROC performance of the five pathologists was 0·605 (95% CI 0·453–0·757).

Interpretation: Our deep learning model exceeded the performance of experienced gastrointestinal pathologists at predicting MSI on H&E-stained WSIs. Within the current universal MSI testing paradigm, such a model might contribute value as an automated screening tool to triage patients for confirmatory testing, potentially reducing the number of tested patients, thereby resulting in substantial test-related labour and cost savings.

Funding: Stanford Cancer Institute and Stanford Departments of Pathology and Biomedical Data Science.
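
The link between the reported sensitivity, specificity, and NPV on the external dataset can be illustrated with a short calculation. The sketch below is not part of the study's analysis code; it assumes patient-level MSI prevalence in the external cohort (77 of 479) and uses the rounded point estimates from the abstract, so its NPV should land close to, but not exactly on, the reported 93·7%. The function name `npv_and_negative_fraction` is hypothetical.

```python
def npv_and_negative_fraction(sensitivity, specificity, prevalence):
    """Derive NPV and the share of model-negative cases from summary metrics.

    All quantities are proportions in [0, 1]. This is the standard Bayes
    relationship between sensitivity, specificity, prevalence, and NPV.
    """
    # Expected fraction of the cohort that is a true negative (MSS, predicted MSS)
    true_negative = specificity * (1.0 - prevalence)
    # Expected fraction of the cohort that is a false negative (MSI, predicted MSS)
    false_negative = (1.0 - sensitivity) * prevalence
    # Everyone the model labels negative
    predicted_negative = true_negative + false_negative
    npv = true_negative / predicted_negative
    return npv, predicted_negative


# Assumption: prevalence taken at patient level from the external cohort counts.
prevalence = 77 / 479
npv, predicted_negative = npv_and_negative_fraction(
    sensitivity=0.760,   # rounded point estimate from the abstract
    specificity=0.666,   # rounded point estimate from the abstract
    prevalence=prevalence,
)

print(f"Approximate NPV: {npv:.3f}")
print(f"Approximate fraction predicted MSS: {predicted_negative:.3f}")
```

Under these assumptions, the fraction predicted MSS gives a rough sense of how many patients a triage tool at this operating point might spare from confirmatory testing. The true figure depends on the MSI prevalence in the target population, which may differ from that of the external cohort.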