Using machine learning to predict radiation pneumonitis in patients with stage I non-small cell lung cancer treated with stereotactic body radiation therapy

Gilmer Valdes, Timothy D. Solberg, Marina Heskel, Lyle Ungar, Charles B. Simone

Research output: Contribution to journal › Article

33 Citations (Scopus)

Abstract

To develop a patient-specific 'big data' clinical decision tool to predict pneumonitis in stage I non-small cell lung cancer (NSCLC) patients after stereotactic body radiation therapy (SBRT). 61 features were recorded for 201 consecutive patients with stage I NSCLC treated with SBRT, in whom 8 (4.0%) developed radiation pneumonitis. Pneumonitis thresholds were found for each feature individually using decision stumps. The performance of three different algorithms (Decision Trees, Random Forests, RUSBoost) was evaluated. Learning curves were developed and the training error analyzed and compared to the testing error in order to evaluate the factors needed to obtain a cross-validated error smaller than 0.1. These included the addition of new features, increasing the complexity of the algorithm and enlarging the sample size and number of events. In the univariate analysis, the most important feature selected was the diffusion capacity of the lung for carbon monoxide (DLCO adj%). On multivariate analysis, the three most important features selected were the dose to 15 cc of the heart, dose to 4 cc of the trachea or bronchus, and race. Higher accuracy could be achieved if the RUSBoost algorithm was used with regularization. To predict radiation pneumonitis within an error smaller than 10%, we estimate that a sample size of 800 patients is required. Clinically relevant thresholds that put patients at risk of developing radiation pneumonitis were determined in a cohort of 201 stage I NSCLC patients treated with SBRT. The consistency of these thresholds can provide radiation oncologists with an estimate of their reliability and may inform treatment planning and patient counseling. The accuracy of the classification is limited by the number of patients in the study and not by the features gathered or the complexity of the algorithm.
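The abstract states that per-feature thresholds were found with decision stumps, i.e. one-split decision trees. The following is a minimal sketch of that idea, not the authors' implementation: the function name `fit_stump`, the balanced-error criterion (chosen here as an assumption because only about 4% of patients had an event), and the toy dose values are all hypothetical illustrations.

```python
from typing import List, Tuple


def fit_stump(values: List[float], labels: List[int]) -> Tuple[float, int, float]:
    """Find the best single threshold on one feature.

    Returns (threshold, polarity, balanced_error), where
    polarity = +1 predicts an event when value > threshold and
    polarity = -1 predicts an event when value <= threshold.
    Balanced error is the mean of the miss rates on events and non-events
    (an assumption made for this sketch, not taken from the paper).
    """
    if len(values) != len(labels) or not values:
        raise ValueError("values and labels must be non-empty and equal in length")
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")

    # Candidate thresholds: midpoints between consecutive distinct values.
    uniq = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(uniq, uniq[1:])]
    if not candidates:
        raise ValueError("feature is constant; no split is possible")

    best = None
    for t in candidates:
        for polarity in (1, -1):
            false_neg = false_pos = 0
            for v, y in zip(values, labels):
                pred = int(v > t) if polarity == 1 else int(v <= t)
                if y == 1 and pred == 0:
                    false_neg += 1
                elif y == 0 and pred == 1:
                    false_pos += 1
            err = 0.5 * (false_neg / n_pos + false_pos / n_neg)
            if best is None or err < best[2]:
                best = (t, polarity, err)
    return best


if __name__ == "__main__":
    # Hypothetical toy data: a dose-like feature and a 0/1 event label.
    dose = [10.0, 12.0, 15.0, 20.0, 22.0, 30.0]
    event = [0, 0, 0, 1, 0, 1]
    print(fit_stump(dose, event))
```

Repeating such a fit on resampled subsets of the cohort and comparing the resulting thresholds is one way to gauge the kind of threshold consistency the abstract mentions, though the exact procedure used in the study is not given here.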

Original language: English (US)
Pages (from-to): 6105-6120
Number of pages: 16
Journal: Physics in Medicine and Biology
Volume: 61
Issue number: 16
DOIs: 10.1088/0031-9155/61/16/6105
State: Published - Jul 27 2016

Fingerprint

Radiation Pneumonitis
Non-Small Cell Lung Carcinoma
Radiotherapy
Sample Size
Pneumonia
Lung Volume Measurements
Decision Trees
Learning Curve
Machine Learning
Bronchi
Carbon Monoxide
Trachea
Counseling
Multivariate Analysis

Keywords

  • machine learning
  • Decision Trees
  • non-small cell lung cancer
  • radiation pneumonitis
  • Random Forests
  • RUSBoost
  • stereotactic body radiation therapy (SBRT)

ASJC Scopus subject areas

  • Radiological and Ultrasound Technology
  • Medicine(all)
  • Radiology, Nuclear Medicine and Imaging

Cite this

Using machine learning to predict radiation pneumonitis in patients with stage I non-small cell lung cancer treated with stereotactic body radiation therapy. / Valdes, Gilmer; Solberg, Timothy D.; Heskel, Marina; Ungar, Lyle; Simone, Charles B.

In: Physics in Medicine and Biology, Vol. 61, No. 16, 27.07.2016, p. 6105-6120.

@article{21a328a9ab31476b9096d87ab65ac637,
title = "Using machine learning to predict radiation pneumonitis in patients with stage I non-small cell lung cancer treated with stereotactic body radiation therapy",
abstract = "To develop a patient-specific 'big data' clinical decision tool to predict pneumonitis in stage I non-small cell lung cancer (NSCLC) patients after stereotactic body radiation therapy (SBRT). 61 features were recorded for 201 consecutive patients with stage I NSCLC treated with SBRT, in whom 8 (4.0{\%}) developed radiation pneumonitis. Pneumonitis thresholds were found for each feature individually using decision stumps. The performance of three different algorithms (Decision Trees, Random Forests, RUSBoost) was evaluated. Learning curves were developed and the training error analyzed and compared to the testing error in order to evaluate the factors needed to obtain a cross-validated error smaller than 0.1. These included the addition of new features, increasing the complexity of the algorithm and enlarging the sample size and number of events. In the univariate analysis, the most important feature selected was the diffusion capacity of the lung for carbon monoxide (DLCO adj{\%}). On multivariate analysis, the three most important features selected were the dose to 15 cc of the heart, dose to 4 cc of the trachea or bronchus, and race. Higher accuracy could be achieved if the RUSBoost algorithm was used with regularization. To predict radiation pneumonitis within an error smaller than 10{\%}, we estimate that a sample size of 800 patients is required. Clinically relevant thresholds that put patients at risk of developing radiation pneumonitis were determined in a cohort of 201 stage I NSCLC patients treated with SBRT. The consistency of these thresholds can provide radiation oncologists with an estimate of their reliability and may inform treatment planning and patient counseling. The accuracy of the classification is limited by the number of patients in the study and not by the features gathered or the complexity of the algorithm.",
keywords = "machine learning, Decision Trees, non-small cell lung cancer, radiation pneumonitis, Random Forests, RUSBoost, stereotactic body radiation therapy (SBRT)",
author = "Valdes, Gilmer and Solberg, {Timothy D.} and Heskel, Marina and Ungar, Lyle and Simone, {Charles B.}",
year = "2016",
month = "7",
day = "27",
doi = "10.1088/0031-9155/61/16/6105",
language = "English (US)",
volume = "61",
pages = "6105--6120",
journal = "Physics in Medicine and Biology",
issn = "0031-9155",
publisher = "IOP Publishing Ltd.",
number = "16",

}

TY - JOUR

T1 - Using machine learning to predict radiation pneumonitis in patients with stage I non-small cell lung cancer treated with stereotactic body radiation therapy

AU - Valdes, Gilmer

AU - Solberg, Timothy D.

AU - Heskel, Marina

AU - Ungar, Lyle

AU - Simone, Charles B.

PY - 2016/7/27

Y1 - 2016/7/27

AB - To develop a patient-specific 'big data' clinical decision tool to predict pneumonitis in stage I non-small cell lung cancer (NSCLC) patients after stereotactic body radiation therapy (SBRT). 61 features were recorded for 201 consecutive patients with stage I NSCLC treated with SBRT, in whom 8 (4.0%) developed radiation pneumonitis. Pneumonitis thresholds were found for each feature individually using decision stumps. The performance of three different algorithms (Decision Trees, Random Forests, RUSBoost) was evaluated. Learning curves were developed and the training error analyzed and compared to the testing error in order to evaluate the factors needed to obtain a cross-validated error smaller than 0.1. These included the addition of new features, increasing the complexity of the algorithm and enlarging the sample size and number of events. In the univariate analysis, the most important feature selected was the diffusion capacity of the lung for carbon monoxide (DLCO adj%). On multivariate analysis, the three most important features selected were the dose to 15 cc of the heart, dose to 4 cc of the trachea or bronchus, and race. Higher accuracy could be achieved if the RUSBoost algorithm was used with regularization. To predict radiation pneumonitis within an error smaller than 10%, we estimate that a sample size of 800 patients is required. Clinically relevant thresholds that put patients at risk of developing radiation pneumonitis were determined in a cohort of 201 stage I NSCLC patients treated with SBRT. The consistency of these thresholds can provide radiation oncologists with an estimate of their reliability and may inform treatment planning and patient counseling. The accuracy of the classification is limited by the number of patients in the study and not by the features gathered or the complexity of the algorithm.

KW - machine learning

KW - Decision Trees

KW - non-small cell lung cancer

KW - radiation pneumonitis

KW - Random Forests

KW - RUSBoost

KW - stereotactic body radiation therapy (SBRT)

UR - http://www.scopus.com/inward/record.url?scp=84984698490&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84984698490&partnerID=8YFLogxK

U2 - 10.1088/0031-9155/61/16/6105

DO - 10.1088/0031-9155/61/16/6105

M3 - Article

C2 - 27461154

AN - SCOPUS:84984698490

VL - 61

SP - 6105

EP - 6120

JO - Physics in Medicine and Biology

JF - Physics in Medicine and Biology

SN - 0031-9155

IS - 16

ER -