TY - JOUR
T1 - Data electronically extracted from the electronic health record require validation
AU - Scheid, Lisa M.
AU - Brown, L. Steven
AU - Clark, Christopher
AU - Rosenfeld, Charles R.
N1 - Publisher Copyright:
© 2019, Springer Nature America, Inc.
PY - 2019/3/1
Y1 - 2019/3/1
N2 - Objectives: Determine sources of error in electronically extracted data from electronic health records. Study design: Categorical and continuous variables related to early-onset neonatal hypoglycemia were preselected and electronically extracted from records of 100 randomly selected neonates within 3479 births with laboratory-proven early-onset hypoglycemia. Extraction language was written by an information technologist and data validated by blinded manual chart review. Kappa coefficient assessed categorical variables and percent validity continuous variables. Results: 8/23 (35%) categorical variables had acceptable Kappa (1–0.81); 5/23 (22%) had fair-slight agreement, Kappa < 0.40. Notably, “hypoglycemia” had poor agreement, Kappa 0.16. In contrast, 6/8 continuous variables had validity ≥ 94%. After correcting extraction language, 6/9 variables were corrected and inter-rater validation improved. However, “hypoglycemia” was not corrected, remaining an issue. Conclusions: Data extraction without validation procedures, especially categorical variables using International Classification of Diseases-9 (ICD-9) codes, often results in incorrect data identification. Electronically extracted data must incorporate built-in validating processes.
AB - Objectives: Determine sources of error in electronically extracted data from electronic health records. Study design: Categorical and continuous variables related to early-onset neonatal hypoglycemia were preselected and electronically extracted from records of 100 randomly selected neonates within 3479 births with laboratory-proven early-onset hypoglycemia. Extraction language was written by an information technologist and data validated by blinded manual chart review. Kappa coefficient assessed categorical variables and percent validity continuous variables. Results: 8/23 (35%) categorical variables had acceptable Kappa (1–0.81); 5/23 (22%) had fair-slight agreement, Kappa < 0.40. Notably, “hypoglycemia” had poor agreement, Kappa 0.16. In contrast, 6/8 continuous variables had validity ≥ 94%. After correcting extraction language, 6/9 variables were corrected and inter-rater validation improved. However, “hypoglycemia” was not corrected, remaining an issue. Conclusions: Data extraction without validation procedures, especially categorical variables using International Classification of Diseases-9 (ICD-9) codes, often results in incorrect data identification. Electronically extracted data must incorporate built-in validating processes.
UR - http://www.scopus.com/inward/record.url?scp=85060606104&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85060606104&partnerID=8YFLogxK
U2 - 10.1038/s41372-018-0311-8
DO - 10.1038/s41372-018-0311-8
M3 - Article
C2 - 30679823
AN - SCOPUS:85060606104
SN - 0743-8346
VL - 39
SP - 468
EP - 474
JO - Journal of Perinatology
JF - Journal of Perinatology
IS - 3
ER -