Machine learning algorithms outperform conventional regression models in predicting development of hepatocellular carcinoma

Amit G. Singal, Ashin Mukherjee, B. Joseph Elmunzer, Peter D R Higgins, Anna S. Lok, Ji Zhu, Jorge A. Marrero, Akbar K. Waljee

Research output: Contribution to journal › Article

48 Citations (Scopus)

Abstract

OBJECTIVES: Predictive models for hepatocellular carcinoma (HCC) have been limited by modest accuracy and lack of validation. Machine-learning algorithms offer a novel methodology, which may improve HCC risk prognostication among patients with cirrhosis. Our study's aim was to develop and compare predictive models for HCC development among cirrhotic patients, using conventional regression analysis and machine-learning algorithms.

METHODS: We enrolled 442 patients with Child A or B cirrhosis at the University of Michigan between January 2004 and September 2006 (UM cohort) and prospectively followed them until HCC development, liver transplantation, death, or study termination. Regression analysis and machine-learning algorithms were used to construct predictive models for HCC development, which were tested on an independent validation cohort from the Hepatitis C Antiviral Long-term Treatment against Cirrhosis (HALT-C) Trial. Both models were also compared with the previously published HALT-C model. Discrimination was assessed using receiver operating characteristic curve analysis, and diagnostic accuracy was assessed with net reclassification improvement and integrated discrimination improvement statistics.

RESULTS: After a median follow-up of 3.5 years, 41 patients developed HCC. The UM regression model had a c-statistic of 0.61 (95% confidence interval (CI) 0.56-0.67), whereas the machine-learning algorithm had a c-statistic of 0.64 (95% CI 0.60-0.69) in the validation cohort. The HALT-C model had a c-statistic of 0.60 (95% CI 0.50-0.70) in the validation cohort and was outperformed by the machine-learning algorithm. The machine-learning algorithm had significantly better diagnostic accuracy as assessed by net reclassification improvement (P<0.001) and integrated discrimination improvement (P=0.04).

CONCLUSIONS: Machine-learning algorithms improve the accuracy of risk stratifying patients with cirrhosis and can be used to accurately identify patients at high-risk for developing HCC.
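
The two kinds of statistic named in the abstract can be illustrated with a short sketch. The c-statistic is the proportion of (event, non-event) patient pairs in which the patient who developed the outcome received the higher predicted risk, and net reclassification improvement measures whether a new model moves predicted risks in the right direction relative to an old one. The Python below is a minimal sketch under stated assumptions: the function names are hypothetical, the risk scores are made up for illustration and are not study data, and the category-free form of NRI is used only because the abstract does not say which variant the authors applied.

```python
from itertools import product


def c_statistic(risks, outcomes):
    """Share of (event, non-event) pairs ranked correctly; ties count as 0.5."""
    events = [r for r, y in zip(risks, outcomes) if y == 1]
    non_events = [r for r, y in zip(risks, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for e, n in product(events, non_events):
        if e > n:
            score += 1.0
        elif e == n:
            score += 0.5
    return score / (len(events) * len(non_events))


def continuous_nri(old_risks, new_risks, outcomes):
    """Category-free NRI (an assumed variant, not necessarily the study's).

    Net proportion of events whose risk moved up, minus the net proportion
    of non-events whose risk moved up.
    """
    def net_up(pairs):
        up = sum(1 for old, new in pairs if new > old)
        down = sum(1 for old, new in pairs if new < old)
        return (up - down) / len(pairs)

    rows = list(zip(old_risks, new_risks, outcomes))
    events = [(o, n) for o, n, y in rows if y == 1]
    non_events = [(o, n) for o, n, y in rows if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    return net_up(events) - net_up(non_events)


# Made-up illustration data: 1 = developed the outcome, 0 = did not.
outcomes = [1, 1, 0, 0, 0, 1, 0, 0]
old_risks = [0.30, 0.20, 0.25, 0.10, 0.35, 0.40, 0.15, 0.20]
new_risks = [0.45, 0.30, 0.20, 0.05, 0.30, 0.35, 0.10, 0.25]

print("c-statistic (old model):", round(c_statistic(old_risks, outcomes), 3))
print("c-statistic (new model):", round(c_statistic(new_risks, outcomes), 3))
print("continuous NRI:", round(continuous_nri(old_risks, new_risks, outcomes), 3))
```

A c-statistic of 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking, which is the scale on which the reported values of 0.60 to 0.64 sit.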

Original language: English (US)
Pages (from-to): 1723-1730
Number of pages: 8
Journal: American Journal of Gastroenterology
Volume: 108
Issue number: 11
DOIs: 10.1038/ajg.2013.332
State: Published - Nov 2013

Fingerprint

Hepatocellular Carcinoma
Fibrosis
Hepatitis C
Antiviral Agents
Confidence Intervals
Regression Analysis
Machine Learning
ROC Curve
Liver Transplantation
Therapeutics

ASJC Scopus subject areas

  • Gastroenterology

Cite this

Machine learning algorithms outperform conventional regression models in predicting development of hepatocellular carcinoma. / Singal, Amit G.; Mukherjee, Ashin; Elmunzer, B. Joseph; Higgins, Peter D R; Lok, Anna S.; Zhu, Ji; Marrero, Jorge A.; Waljee, Akbar K.

In: American Journal of Gastroenterology, Vol. 108, No. 11, 11.2013, p. 1723-1730.

@article{540252d8a357455d908f69f5225d9d7b,
title = "Machine learning algorithms outperform conventional regression models in predicting development of hepatocellular carcinoma",
author = "Singal, {Amit G.} and Ashin Mukherjee and Elmunzer, {B. Joseph} and Higgins, {Peter D R} and Lok, {Anna S.} and Ji Zhu and Marrero, {Jorge A.} and Waljee, {Akbar K.}",
year = "2013",
month = "11",
doi = "10.1038/ajg.2013.332",
language = "English (US)",
volume = "108",
pages = "1723--1730",
journal = "American Journal of Gastroenterology",
issn = "0002-9270",
publisher = "Nature Publishing Group",
number = "11",

}

TY - JOUR
T1 - Machine learning algorithms outperform conventional regression models in predicting development of hepatocellular carcinoma
AU - Singal, Amit G.
AU - Mukherjee, Ashin
AU - Elmunzer, B. Joseph
AU - Higgins, Peter D R
AU - Lok, Anna S.
AU - Zhu, Ji
AU - Marrero, Jorge A.
AU - Waljee, Akbar K.
PY - 2013/11
Y1 - 2013/11
UR - http://www.scopus.com/inward/record.url?scp=84887246574&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84887246574&partnerID=8YFLogxK
U2 - 10.1038/ajg.2013.332
DO - 10.1038/ajg.2013.332
M3 - Article
C2 - 24169273
AN - SCOPUS:84887246574
VL - 108
SP - 1723
EP - 1730
JO - American Journal of Gastroenterology
JF - American Journal of Gastroenterology
SN - 0002-9270
IS - 11
ER -