TY - JOUR
T1 - Natural language processing for cohort discovery in a discharge prediction model for the neonatal ICU
AU - Temple, Michael W.
AU - Lehmann, Christoph U.
AU - Fabbri, Daniel
N1 - Publisher Copyright:
© Schattauer 2016.
Copyright:
Copyright 2016 Elsevier B.V., All rights reserved.
PY - 2016/2/24
Y1 - 2016/2/24
N2 - Objectives: Discharging patients from the Neonatal Intensive Care Unit (NICU) can be delayed for non-medical reasons including the procurement of home medical equipment, parental education, and the need for children’s services. We previously created a model to identify patients that will be medically ready for discharge in the subsequent 2–10 days. In this study we use Natural Language Processing to improve upon that model and discern why the model performed poorly on certain patients. Methods: We retrospectively examined the text of the Assessment and Plan section from daily progress notes of 4,693 patients (103,206 patient-days) from the NICU of a large, academic children’s hospital. A matrix was constructed using words from NICU notes (single words and bigrams) to train a supervised machine learning algorithm to determine the most important words differentiating poorly performing patients compared to well performing patients in our original discharge prediction model. Results: NLP using a bag of words (BOW) analysis revealed several cohorts that performed poorly in our original model. These included patients with surgical diagnoses, pulmonary hypertension, retinopathy of prematurity, and psychosocial issues. Discussion: The BOW approach aided in cohort discovery and will allow further refinement of our original discharge model prediction. Adequately identifying patients discharged home on g-tube feeds alone could improve the AUC of our original model by 0.02. Additionally, this approach identified social issues as a major cause for delayed discharge. Conclusion: A BOW analysis provides a method to improve and refine our NICU discharge prediction model and could potentially avoid over 900 (0.9%) hospital days.
KW - Area under curve
KW - Neonatal intensive care units
KW - Patient discharge
KW - ROC curve
UR - http://www.scopus.com/inward/record.url?scp=84959344798&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84959344798&partnerID=8YFLogxK
U2 - 10.4338/ACI-2015-09-RA-0114
DO - 10.4338/ACI-2015-09-RA-0114
M3 - Article
C2 - 27081410
AN - SCOPUS:84959344798
VL - 7
SP - 101
EP - 115
JO - Applied Clinical Informatics
JF - Applied Clinical Informatics
SN - 1869-0327
IS - 1
ER -