Data electronically extracted from the electronic health record require validation

Lisa M. Scheid, L. Steven Brown, Christopher Clark, Charles R. Rosenfeld

Research output: Contribution to journal, Article

Abstract

Objectives: To determine sources of error in data electronically extracted from electronic health records.

Study design: Categorical and continuous variables related to early-onset neonatal hypoglycemia were preselected and electronically extracted from the records of 100 randomly selected neonates among 3479 births with laboratory-proven early-onset hypoglycemia. An information technologist wrote the extraction language, and the extracted data were validated by blinded manual chart review. Agreement was assessed with the kappa coefficient for categorical variables and with percent validity for continuous variables.

Results: Of 23 categorical variables, 8 (35%) had acceptable agreement (kappa 0.81–1) and 5 (22%) had fair-to-slight agreement (kappa < 0.40). Notably, “hypoglycemia” had poor agreement (kappa 0.16). In contrast, 6 of 8 continuous variables had validity ≥ 94%. After the extraction language was corrected, 6 of 9 variables were corrected and inter-rater validation improved. However, “hypoglycemia” was not corrected and remained an issue.

Conclusions: Data extraction without validation procedures, especially for categorical variables that use International Classification of Diseases-9 (ICD-9) codes, often results in incorrect data identification. Electronically extracted data must incorporate built-in validating processes.
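The two agreement measures named in the study design can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: the function names, the sample values, and the `tolerance` parameter are hypothetical, and "percent validity" is assumed here to mean the share of extracted values that match the chart-review value, which the abstract does not define.

```python
from collections import Counter


def cohen_kappa(extracted, reviewed):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(extracted) != len(reviewed) or len(extracted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(extracted)
    # Observed agreement: fraction of records where both sources match.
    p_o = sum(a == b for a, b in zip(extracted, reviewed)) / n
    # Chance agreement expected from each source's marginal label counts.
    freq_a = Counter(extracted)
    freq_b = Counter(reviewed)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        # Both sources used one identical label throughout.
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


def percent_validity(extracted, reviewed, tolerance=0.0):
    """Percent of extracted continuous values within `tolerance` of review.

    Assumed definition; the abstract does not spell out the calculation.
    """
    if len(extracted) != len(reviewed) or len(extracted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(abs(a - b) <= tolerance for a, b in zip(extracted, reviewed))
    return 100.0 * matches / len(extracted)


# Hypothetical data: 1 = diagnosis flagged, 0 = not flagged.
extracted_flags = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
chart_flags = [1, 0, 0, 0, 1, 0, 1, 1, 1, 0]
kappa = cohen_kappa(extracted_flags, chart_flags)

# Hypothetical continuous values (for example, a laboratory measurement).
extracted_values = [3.1, 2.5, 4.0, 2.0]
chart_values = [3.1, 2.5, 4.2, 2.0]
validity = percent_validity(extracted_values, chart_values)
```

Kappa corrects raw agreement for the agreement expected by chance, which is why a variable can match in most records and still score poorly when one category dominates.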

Original language: English (US)
Journal: Journal of Perinatology
DOI: 10.1038/s41372-018-0311-8
State: Accepted/In press - Jan 1 2019

Fingerprint

Electronic Health Records
Hypoglycemia
Language
International Classification of Diseases
Research Design
Parturition

ASJC Scopus subject areas

  • Pediatrics, Perinatology, and Child Health
  • Obstetrics and Gynecology

Cite this

Data electronically extracted from the electronic health record require validation. / Scheid, Lisa M.; Brown, L. Steven; Clark, Christopher; Rosenfeld, Charles R.

In: Journal of Perinatology, 01.01.2019.

@article{71d1644f14bd4a08b882039f09951e7b,
title = "Data electronically extracted from the electronic health record require validation",
author = "Scheid, {Lisa M.} and Brown, {L. Steven} and Clark, Christopher and Rosenfeld, {Charles R.}",
year = "2019",
month = "1",
day = "1",
doi = "10.1038/s41372-018-0311-8",
language = "English (US)",
journal = "Journal of Perinatology",
issn = "0743-8346",
publisher = "Nature Publishing Group",
}

TY  - JOUR
T1  - Data electronically extracted from the electronic health record require validation
AU  - Scheid, Lisa M.
AU  - Brown, L. Steven
AU  - Clark, Christopher
AU  - Rosenfeld, Charles R.
PY  - 2019/1/1
Y1  - 2019/1/1
UR  - http://www.scopus.com/inward/record.url?scp=85060606104&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85060606104&partnerID=8YFLogxK
U2  - 10.1038/s41372-018-0311-8
DO  - 10.1038/s41372-018-0311-8
M3  - Article
C2  - 30679823
AN  - SCOPUS:85060606104
JO  - Journal of Perinatology
JF  - Journal of Perinatology
SN  - 0743-8346
ER  -