There can be significant uncertainty when identifying cervical lymph node (LN) metastases in patients with oropharyngeal squamous cell carcinoma (OPSCC) despite the use of modern imaging modalities such as positron emission tomography (PET) and computed tomography (CT) scans. Grossly involved LNs are readily identifiable during routine imaging, but smaller and less PET-avid LNs are harder to classify. We trained a convolutional neural network (CNN) to detect malignant LNs in patients with OPSCC and used quantitative measures of uncertainty to identify the most reliable predictions. Our dataset consisted of images of 791 LNs from 129 patients with OPSCC who had preoperative PET/CT imaging and detailed pathological reports after neck dissections. These LNs were segmented on PET/CT imaging and then labeled according to the pathology reports. An AlexNet-like CNN was trained to classify LNs as malignant or benign. We estimated epistemic and aleatoric uncertainty by using dropout variational inference and test-time augmentation, respectively. CNN performance was stratified according to the median epistemic and aleatoric uncertainty values calculated using the validation cohort. Our model achieved an area under the receiver operating characteristic (ROC) curve (AUC) of 0.99 on the testing dataset. Sensitivity and specificity were 0.94 and 0.90, respectively. Epistemic and aleatoric uncertainty values were significantly larger for false negative and false positive predictions than for true negative and true positive predictions (p < 0.001). Model sensitivity and specificity were 1.0 and 0.98, respectively, for cases with epistemic uncertainty lower than the median value of the incorrect predictions in the validation dataset. For cases with higher epistemic uncertainty, sensitivity and specificity were 0.67 and 0.41, respectively.
Model sensitivity and specificity were 1.0 and 0.98, respectively, for cases with aleatoric uncertainty lower than the median value of the incorrect predictions in the validation dataset. For cases with higher aleatoric uncertainty, sensitivity and specificity were 0.67 and 0.37, respectively. We used a CNN to predict the malignant status of LNs in patients with OPSCC with high accuracy, and we showed that uncertainty can be used to quantify a prediction's reliability. Assigning measures of uncertainty to predictions could improve the accuracy of LN classification by efficiently identifying instances where expert evaluation is needed to corroborate a model's prediction.
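The uncertainty-based stratification described above can be illustrated with a minimal sketch. It is not the authors' implementation: all function names are hypothetical, and the use of the variance across repeated stochastic predictions as the uncertainty score is an assumption, since the abstract does not state which measure was computed. The repeated predictions would come from dropout-enabled forward passes (epistemic) or from augmented copies of the same input (aleatoric).

```python
from statistics import mean, median, pvariance


def summarize_passes(sampled_probs):
    """Collapse repeated stochastic predictions for one LN into
    (mean malignancy probability, uncertainty score).
    Assumption: uncertainty is the population variance of the samples."""
    return mean(sampled_probs), pvariance(sampled_probs)


def threshold_from_validation(labels, preds, uncertainty):
    """Median uncertainty of the incorrect validation predictions."""
    wrong = [u for y, p, u in zip(labels, preds, uncertainty) if y != p]
    return median(wrong)


def sens_spec(pairs):
    """Sensitivity and specificity from (label, prediction) pairs.
    Returns None for a metric whose denominator is zero."""
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    sens = tp / (tp + fn) if (tp + fn) else None
    spec = tn / (tn + fp) if (tn + fp) else None
    return sens, spec


def stratified_performance(labels, preds, uncertainty, threshold):
    """Report sensitivity/specificity separately for cases below the
    threshold ("low") and at or above it ("high")."""
    rows = list(zip(labels, preds, uncertainty))
    low = [(y, p) for y, p, u in rows if u < threshold]
    high = [(y, p) for y, p, u in rows if u >= threshold]
    return {"low": sens_spec(low), "high": sens_spec(high)}
```

In this sketch the threshold is fixed on the validation cohort and then applied unchanged to the testing cohort, so that high-uncertainty cases can be flagged for expert review without using test labels.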
- convolutional neural network
- oropharyngeal cancer
- radiation oncology
ASJC Scopus subject areas
- Radiological and Ultrasound Technology
- Radiology, Nuclear Medicine and Imaging