Automated Classification of Radiology Reports to Facilitate Retrospective Study in Radiology

Yihua Zhou, Per K. Amundson, Fang Yu, Marcus M. Kessler, Tammie L.S. Benzinger, Franz J. Wippold

Research output: Contribution to journal, Article

10 Citations (Scopus)

Abstract

Retrospective research is an important tool in radiology. Identifying imaging examinations appropriate for a given research question from unstructured radiology reports is extremely useful but labor-intensive. Using the machine learning text-mining methods implemented in LingPipe [1], we evaluated the performance of the dynamic language model (DLM) and Naïve Bayesian (NB) classifiers in classifying radiology reports to facilitate identification of radiological examinations for research projects. The training dataset consisted of 14,325 sentences from 11,432 radiology reports randomly selected from a database of 5,104,594 reports in all disciplines of radiology. The training sentences were manually assigned to six categories (Positive, Differential, Post Treatment, Negative, Normal, and History). A 10-fold cross-validation [2] was used to evaluate the performance of the models, which were then tested on the classification of radiology reports for cases of sellar or suprasellar masses and colloid cysts. The average accuracies for the DLM and NB classifiers were 88.5 % with a 95 % confidence interval (CI) of 1.9 % and 85.9 % with a 95 % CI of 2.0 %, respectively. The DLM performed slightly better and was used to classify 1,397 radiology reports containing the keywords “sellar or suprasellar mass” or “colloid cyst”. The DLM produced an accuracy of 88.2 % with a 95 % CI of 2.1 % for 959 reports that contain “sellar or suprasellar mass” and an accuracy of 86.3 % with a 95 % CI of 2.5 % for 437 reports of “colloid cyst”. We conclude that automated classification of radiology reports using machine learning techniques can effectively facilitate the identification of cases suitable for retrospective research.
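The evaluation design described in the abstract (sentence-level Naïve Bayes classification scored by k-fold cross-validation) can be illustrated with a minimal Python sketch. This is not the authors' implementation and does not use LingPipe, which is a Java library; the whitespace tokenizer, the add-one smoothing, the interleaved fold assignment, and the six toy sentences are all assumptions made purely for illustration, and only three of the six categories are shown. The dynamic language model classifier is not sketched here.

```python
import math
from collections import Counter, defaultdict


def tokenize(sentence):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return sentence.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, sentences, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for sentence, label in zip(sentences, labels):
            self.word_counts[label].update(tokenize(sentence))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, sentence):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)  # log prior
            for word in tokenize(sentence):
                if word in self.vocab:  # skip words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def cross_validate(sentences, labels, k=10):
    """Return one accuracy per fold; fold i holds out items i, i+k, i+2k, ..."""
    if not 2 <= k <= len(sentences):
        raise ValueError("k must be between 2 and the number of sentences")
    accuracies = []
    for fold in range(k):
        test_idx = set(range(fold, len(sentences), k))
        train = [(s, y) for i, (s, y) in enumerate(zip(sentences, labels))
                 if i not in test_idx]
        model = NaiveBayes().fit(*zip(*train))
        correct = sum(model.predict(sentences[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return accuracies


# Hypothetical toy sentences, invented for this sketch (not from the study).
SENTENCES = [
    "a colloid cyst is present",
    "there is a suprasellar mass",
    "no evidence of colloid cyst",
    "no sellar mass is seen",
    "history of colloid cyst",
    "history of pituitary adenoma resection",
]
LABELS = ["Positive", "Positive", "Negative", "Negative", "History", "History"]

model = NaiveBayes().fit(SENTENCES, LABELS)
fold_acc = cross_validate(SENTENCES, LABELS, k=3)  # the study used k=10
mean_acc = sum(fold_acc) / len(fold_acc)
```

With a realistically sized labeled set, `cross_validate(..., k=10)` would mirror the 10-fold design mentioned above. The abstract does not say how the confidence intervals were derived, so the sketch stops at per-fold and mean accuracy.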

Original language: English (US)
Pages (from-to): 730-736
Number of pages: 7
Journal: Journal of Digital Imaging
Volume: 27
Issue number: 6
DOI: 10.1007/s10278-014-9708-x
State: Published - Nov 6 2014
Externally published: Yes

Fingerprint

Radiology
Retrospective Studies
Colloid Cysts
Language
Central Nervous System Cysts
Colloids
Confidence Intervals
Research
Learning systems
Classifiers
Data Mining
Identification (control systems)
History
Personnel
Databases
Imaging techniques

Keywords

  • Computer analysis
  • Machine learning
  • Natural language processing
  • Radiology Information Systems (RIS)
  • Radiology report classification
  • Radiology reporting
  • Retrospective studies

ASJC Scopus subject areas

  • Radiological and Ultrasound Technology
  • Radiology, Nuclear Medicine and Imaging
  • Computer Science Applications

Cite this

Automated Classification of Radiology Reports to Facilitate Retrospective Study in Radiology. / Zhou, Yihua; Amundson, Per K.; Yu, Fang; Kessler, Marcus M.; Benzinger, Tammie L.S.; Wippold, Franz J.

In: Journal of Digital Imaging, Vol. 27, No. 6, 06.11.2014, p. 730-736.

@article{2cb05a66308b4dfda7796830af0bf5a3,
title = "Automated Classification of Radiology Reports to Facilitate Retrospective Study in Radiology",
keywords = "Computer analysis, Machine learning, Natural language processing, Radiology Information Systems (RIS), Radiology report classification, Radiology reporting, Retrospective studies",
author = "Yihua Zhou and Amundson, {Per K.} and Fang Yu and Kessler, {Marcus M.} and Benzinger, {Tammie L.S.} and Wippold, {Franz J.}",
year = "2014",
month = "11",
day = "6",
doi = "10.1007/s10278-014-9708-x",
language = "English (US)",
volume = "27",
pages = "730--736",
journal = "Journal of Digital Imaging",
issn = "0897-1889",
publisher = "Springer New York",
number = "6",

}

TY - JOUR

T1 - Automated Classification of Radiology Reports to Facilitate Retrospective Study in Radiology

AU - Zhou, Yihua

AU - Amundson, Per K.

AU - Yu, Fang

AU - Kessler, Marcus M.

AU - Benzinger, Tammie L.S.

AU - Wippold, Franz J.

PY - 2014/11/6

Y1 - 2014/11/6

KW - Computer analysis

KW - Machine learning

KW - Natural language processing

KW - Radiology Information Systems (RIS)

KW - Radiology report classification

KW - Radiology reporting

KW - Retrospective studies

UR - http://www.scopus.com/inward/record.url?scp=84911970862&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84911970862&partnerID=8YFLogxK

U2 - 10.1007/s10278-014-9708-x

DO - 10.1007/s10278-014-9708-x

M3 - Article

C2 - 24874407

AN - SCOPUS:84911970862

VL - 27

SP - 730

EP - 736

JO - Journal of Digital Imaging

JF - Journal of Digital Imaging

SN - 0897-1889

IS - 6

ER -