Abstract

Purpose/objective: The aim of this study is to predict early distant failure in early stage non-small cell lung cancer (NSCLC) treated with stereotactic body radiation therapy (SBRT) from clinical parameters using machine learning algorithms. Materials/methods: The dataset includes 81 early stage NSCLC patients with at least 6 months of follow-up who underwent SBRT between 2006 and 2012 at a single institution. The clinical parameters (n = 18) for each patient include demographic parameters, tumor characteristics, treatment fraction schemes, and pretreatment medications. Three predictive models were constructed based on different machine learning algorithms: (1) artificial neural network (ANN), (2) logistic regression (LR) and (3) support vector machine (SVM). Furthermore, to select an optimal clinical parameter set for model construction, three strategies were adopted: (1) a clonal selection algorithm (CSA) based selection strategy; (2) the sequential forward selection (SFS) method; and (3) a statistical analysis (SA) based strategy. Five-fold cross-validation was used to validate the performance of each predictive model. Accuracy was assessed by the area under the receiver operating characteristic (ROC) curve (AUC); the sensitivity and specificity of the system were also evaluated. Results: The AUCs for ANN, LR and SVM were 0.75, 0.73, and 0.80, respectively. The sensitivity values for ANN, LR and SVM were 71.2%, 72.9% and 83.1%, while the specificity values were 59.1%, 63.6% and 63.6%, respectively. Meanwhile, the CSA based strategy outperformed SFS and SA in terms of AUC, sensitivity and specificity. Conclusions: Based on clinical parameters, the SVM with the CSA optimal parameter set selection strategy achieves better performance than the other strategies for predicting distant failure in lung SBRT patients.
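
The evaluation quantities named in the abstract (cross-validation folds, AUC, sensitivity, specificity) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `k_fold_indices`, `auc` and `sensitivity_specificity`, the interleaved (unshuffled) fold assignment, the 0.5 decision threshold and the toy labels and scores are all assumptions made for illustration. The abstract does not state how folds were drawn or how thresholds were chosen.

```python
from typing import List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, test) index pairs with near-equal test fold sizes.

    Assumption: samples are assigned to folds by interleaving (i, i+k, i+2k, ...);
    a real study would typically shuffle or stratify first.
    """
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    splits = []
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        splits.append((train, folds[i]))
    return splits


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    labels: Sequence[int], predictions: Sequence[int]
) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in labels")
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Toy data (hypothetical): 1 = distant failure, 0 = no distant failure.
    toy_labels = [1, 1, 1, 0, 0]
    toy_scores = [0.9, 0.8, 0.3, 0.4, 0.1]
    toy_predictions = [1 if s >= 0.5 else 0 for s in toy_scores]
    print("AUC:", auc(toy_labels, toy_scores))
    print("sens, spec:", sensitivity_specificity(toy_labels, toy_predictions))
    # Fold sizes for a cohort of 81 patients split five ways.
    print([len(test) for _, test in k_fold_indices(81, 5)])
```

On the toy data, five of the six positive/negative pairs are ranked correctly, so the AUC is 5/6; at the assumed 0.5 threshold the sensitivity is 2/3 and the specificity is 1.0. With 81 samples and five folds, one test fold holds 17 patients and the other four hold 16 each.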

Original language: English (US)
Journal: Radiotherapy and Oncology
DOI: 10.1016/j.radonc.2016.04.029
State: Accepted/In press - Nov 25 2015

Fingerprint

Non-Small Cell Lung Carcinoma
Radiotherapy
Logistic Models
Area Under Curve
Sensitivity and Specificity
ROC Curve
Demography
Support Vector Machine
Lung
Neoplasms
Machine Learning
Therapeutics

Keywords

  • Clinical parameter
  • Distant failure
  • Feature selection
  • Machine learning
  • SBRT

ASJC Scopus subject areas

  • Oncology
  • Radiology Nuclear Medicine and imaging
  • Hematology

Cite this

@article{bf92564de61e42ecaeeee8902234c485,
title = "Predicting distant failure in early stage NSCLC treated with SBRT using clinical parameters",
abstract = "Purpose/objective: The aim of this study is to predict early distant failure in early stage non-small cell lung cancer (NSCLC) treated with stereotactic body radiation therapy (SBRT) from clinical parameters using machine learning algorithms. Materials/methods: The dataset includes 81 early stage NSCLC patients with at least 6 months of follow-up who underwent SBRT between 2006 and 2012 at a single institution. The clinical parameters (n = 18) for each patient include demographic parameters, tumor characteristics, treatment fraction schemes, and pretreatment medications. Three predictive models were constructed based on different machine learning algorithms: (1) artificial neural network (ANN), (2) logistic regression (LR) and (3) support vector machine (SVM). Furthermore, to select an optimal clinical parameter set for model construction, three strategies were adopted: (1) a clonal selection algorithm (CSA) based selection strategy; (2) the sequential forward selection (SFS) method; and (3) a statistical analysis (SA) based strategy. Five-fold cross-validation was used to validate the performance of each predictive model. Accuracy was assessed by the area under the receiver operating characteristic (ROC) curve (AUC); the sensitivity and specificity of the system were also evaluated. Results: The AUCs for ANN, LR and SVM were 0.75, 0.73, and 0.80, respectively. The sensitivity values for ANN, LR and SVM were 71.2{\%}, 72.9{\%} and 83.1{\%}, while the specificity values were 59.1{\%}, 63.6{\%} and 63.6{\%}, respectively. Meanwhile, the CSA based strategy outperformed SFS and SA in terms of AUC, sensitivity and specificity. Conclusions: Based on clinical parameters, the SVM with the CSA optimal parameter set selection strategy achieves better performance than the other strategies for predicting distant failure in lung SBRT patients.",
keywords = "Clinical parameter, Distant failure, Feature selection, Machine learning, SBRT",
author = "Zhiguo Zhou and Michael Folkert and Nathan Cannon and Puneeth Iyengar and Kenneth Westover and Yuanyuan Zhang and Hak Choy and Robert Timmerman and Jingsheng Yan and Xie, {Xian J.} and Steve Jiang and Jing Wang",
year = "2015",
month = "11",
day = "25",
doi = "10.1016/j.radonc.2016.04.029",
language = "English (US)",
journal = "Radiotherapy and Oncology",
issn = "0167-8140",
publisher = "Elsevier Ireland Ltd",
}

TY - JOUR

T1 - Predicting distant failure in early stage NSCLC treated with SBRT using clinical parameters

AU - Zhou, Zhiguo

AU - Folkert, Michael

AU - Cannon, Nathan

AU - Iyengar, Puneeth

AU - Westover, Kenneth

AU - Zhang, Yuanyuan

AU - Choy, Hak

AU - Timmerman, Robert

AU - Yan, Jingsheng

AU - Xie, Xian J.

AU - Jiang, Steve

AU - Wang, Jing

PY - 2015/11/25

Y1 - 2015/11/25

AB - Purpose/objective: The aim of this study is to predict early distant failure in early stage non-small cell lung cancer (NSCLC) treated with stereotactic body radiation therapy (SBRT) from clinical parameters using machine learning algorithms. Materials/methods: The dataset includes 81 early stage NSCLC patients with at least 6 months of follow-up who underwent SBRT between 2006 and 2012 at a single institution. The clinical parameters (n = 18) for each patient include demographic parameters, tumor characteristics, treatment fraction schemes, and pretreatment medications. Three predictive models were constructed based on different machine learning algorithms: (1) artificial neural network (ANN), (2) logistic regression (LR) and (3) support vector machine (SVM). Furthermore, to select an optimal clinical parameter set for model construction, three strategies were adopted: (1) a clonal selection algorithm (CSA) based selection strategy; (2) the sequential forward selection (SFS) method; and (3) a statistical analysis (SA) based strategy. Five-fold cross-validation was used to validate the performance of each predictive model. Accuracy was assessed by the area under the receiver operating characteristic (ROC) curve (AUC); the sensitivity and specificity of the system were also evaluated. Results: The AUCs for ANN, LR and SVM were 0.75, 0.73, and 0.80, respectively. The sensitivity values for ANN, LR and SVM were 71.2%, 72.9% and 83.1%, while the specificity values were 59.1%, 63.6% and 63.6%, respectively. Meanwhile, the CSA based strategy outperformed SFS and SA in terms of AUC, sensitivity and specificity. Conclusions: Based on clinical parameters, the SVM with the CSA optimal parameter set selection strategy achieves better performance than the other strategies for predicting distant failure in lung SBRT patients.

KW - Clinical parameter

KW - Distant failure

KW - Feature selection

KW - Machine learning

KW - SBRT

UR - http://www.scopus.com/inward/record.url?scp=84965017496&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84965017496&partnerID=8YFLogxK

U2 - 10.1016/j.radonc.2016.04.029

DO - 10.1016/j.radonc.2016.04.029

M3 - Article

JO - Radiotherapy and Oncology

JF - Radiotherapy and Oncology

SN - 0167-8140

ER -