Performance of four computer-based diagnostic systems

Eta S. Berner; George D. Webster; Alwyn A. Shugerman; James R. Jackson; James Algina; Alfred L. Baker; Eugene V. Ball; C. Glenn Cobbs; Vincent W. Dennis; Eugene P. Frenkel; Leonard D. Hudson; Elliott L. Mancall; Charles E. Rackley; O. David Taunton

doi:10.1056/NEJM199406233302506

Research output: Contribution to journal › Article › peer-review

226 Scopus citations

Abstract

Computer-based diagnostic systems are available commercially, but there has been limited evaluation of their performance. We assessed the diagnostic capabilities of four internal medicine diagnostic systems: Dxplain, Iliad, Meditel, and QMR. Ten expert clinicians created a set of 105 diagnostically challenging clinical case summaries involving actual patients. Clinical data were entered into each program with the vocabulary provided by the program's developer. Each of the systems produced a ranked list of possible diagnoses for each patient, as did the group of experts. We calculated scores on several performance measures for each computer program. No single computer program scored better than the others on all performance measures. Among all cases and all programs, the proportion of correct diagnoses ranged from 0.52 to 0.71, and the mean proportion of relevant diagnoses ranged from 0.19 to 0.37. On average, less than half the diagnoses on the experts' original list of reasonable diagnoses were suggested by any of the programs. However, each program suggested an average of approximately two additional diagnoses per case that the experts found relevant but had not originally considered. The results provide a profile of the strengths and limitations of these computer programs. The programs should be used by physicians who can identify and use the relevant information and ignore the irrelevant information that can be produced.

Original language	English (US)
Pages (from-to)	1792-1796
Number of pages	5
Journal	New England Journal of Medicine
Volume	330
Issue number	25
DOIs	https://doi.org/10.1056/NEJM199406233302506
State	Published - Jun 23 1994

ASJC Scopus subject areas

General Medicine

Cite this

Berner, E. S., Webster, G. D., Shugerman, A. A., Jackson, J. R., Algina, J., Baker, A. L., Ball, E. V., Cobbs, C. G., Dennis, V. W., Frenkel, E. P., Hudson, L. D., Mancall, E. L., Rackley, C. E., & Taunton, O. D. (1994). Performance of four computer-based diagnostic systems. New England Journal of Medicine, 330(25), 1792-1796. https://doi.org/10.1056/NEJM199406233302506

@article{17e256efaf22409a93178a3edb179622,

title = "Performance of four computer-based diagnostic systems",

abstract = "Computer-based diagnostic systems are available commercially, but there has been limited evaluation of their performance. We assessed the diagnostic capabilities of four internal medicine diagnostic systems: Dxplain, Iliad, Meditel, and QMR. Ten expert clinicians created a set of 105 diagnostically challenging clinical case summaries involving actual patients. Clinical data were entered into each program with the vocabulary provided by the program's developer. Each of the systems produced a ranked list of possible diagnoses for each patient, as did the group of experts. We calculated scores on several performance measures for each computer program. No single computer program scored better than the others on all performance measures. Among all cases and all programs, the proportion of correct diagnoses ranged from 0.52 to 0.71, and the mean proportion of relevant diagnoses ranged from 0.19 to 0.37. On average, less than half the diagnoses on the experts' original list of reasonable diagnoses were suggested by any of the programs. However, each program suggested an average of approximately two additional diagnoses per case that the experts found relevant but had not originally considered. The results provide a profile of the strengths and limitations of these computer programs. The programs should be used by physicians who can identify and use the relevant information and ignore the irrelevant information that can be produced.",

author = "Berner, {Eta S.} and Webster, {George D.} and Shugerman, {Alwyn A.} and Jackson, {James R.} and Algina, James and Baker, {Alfred L.} and Ball, {Eugene V.} and Cobbs, {C. Glenn} and Dennis, {Vincent W.} and Frenkel, {Eugene P.} and Hudson, {Leonard D.} and Mancall, {Elliott L.} and Rackley, {Charles E.} and Taunton, {O. David}",

year = "1994",

month = jun,

day = "23",

doi = "10.1056/NEJM199406233302506",

language = "English (US)",

volume = "330",

pages = "1792--1796",

journal = "New England Journal of Medicine",

issn = "0028-4793",

publisher = "Massachusetts Medical Society",

number = "25",

}

TY - JOUR

T1 - Performance of four computer-based diagnostic systems

AU - Berner, Eta S.

AU - Webster, George D.

AU - Shugerman, Alwyn A.

AU - Jackson, James R.

AU - Algina, James

AU - Baker, Alfred L.

AU - Ball, Eugene V.

AU - Cobbs, C. Glenn

AU - Dennis, Vincent W.

AU - Frenkel, Eugene P.

AU - Hudson, Leonard D.

AU - Mancall, Elliott L.

AU - Rackley, Charles E.

AU - Taunton, O. David

PY - 1994/6/23

Y1 - 1994/6/23

AB - Computer-based diagnostic systems are available commercially, but there has been limited evaluation of their performance. We assessed the diagnostic capabilities of four internal medicine diagnostic systems: Dxplain, Iliad, Meditel, and QMR. Ten expert clinicians created a set of 105 diagnostically challenging clinical case summaries involving actual patients. Clinical data were entered into each program with the vocabulary provided by the program's developer. Each of the systems produced a ranked list of possible diagnoses for each patient, as did the group of experts. We calculated scores on several performance measures for each computer program. No single computer program scored better than the others on all performance measures. Among all cases and all programs, the proportion of correct diagnoses ranged from 0.52 to 0.71, and the mean proportion of relevant diagnoses ranged from 0.19 to 0.37. On average, less than half the diagnoses on the experts' original list of reasonable diagnoses were suggested by any of the programs. However, each program suggested an average of approximately two additional diagnoses per case that the experts found relevant but had not originally considered. The results provide a profile of the strengths and limitations of these computer programs. The programs should be used by physicians who can identify and use the relevant information and ignore the irrelevant information that can be produced.

UR - http://www.scopus.com/inward/record.url?scp=0028234740&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0028234740&partnerID=8YFLogxK

U2 - 10.1056/NEJM199406233302506

DO - 10.1056/NEJM199406233302506

M3 - Article

C2 - 8190157

AN - SCOPUS:0028234740

SN - 0028-4793

VL - 330

SP - 1792

EP - 1796

JO - New England Journal of Medicine

JF - New England Journal of Medicine

IS - 25

ER -
