A robust biomarker discovery pipeline for high-performance mass spectrometry data

Wayne G. Fisher, Kevin P. Rosenblatt, David A. Fishman, Gordon R. Whitteley, Alvydas Mikulskis, Scott A. Kuzdzal, Mary F. Lopez, Niclas Chiang Tan, Dwight C. German, Harold R. Garner

Research output: Contribution to journal › Article

8 Citations (Scopus)

Abstract

A high-throughput software pipeline for analyzing high-performance mass spectral data sets has been developed to facilitate rapid and accurate biomarker determination. The software exploits the mass precision and resolution of high-performance instrumentation, bypasses peak-finding steps, and instead uses discrete m/z data points to identify putative biomarkers. The technique is insensitive to peak shape, and works on overlapping and non-Gaussian peaks which can confound peak-finding algorithms. Methods are presented to assess data set quality and the suitability of groups of m/z values that map to peaks as potential biomarkers. The algorithm is demonstrated with serum mass spectra from patients with and without ovarian cancer. Biomarker candidates are identified and ranked by their ability to discriminate between cancer and noncancer conditions. Their discriminating power is tested by classifying unknowns using a simple distance calculation, and a sensitivity of 95.6% and a specificity of 97.1% are obtained. In contrast, the sensitivity of the ovarian cancer blood marker CA125 is ∼50% for stage I/II and ∼80% for stage III/IV cancers. While the generalizability of these markers is currently unknown, we have demonstrated the ability of our analytical package to extract biomarker candidates from high-performance mass spectral data.
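The abstract's workflow (score individual m/z data points, rank them by how well they separate the two groups, then classify unknowns with a simple distance calculation and report sensitivity and specificity) can be illustrated with a minimal sketch. The abstract does not give the ranking statistic or the distance measure, so everything here is an assumption made for illustration: the separation score, the nearest-centroid rule with squared Euclidean distance, the function names, and the toy intensity values. It is not the authors' implementation.

```python
from statistics import mean, stdev


def rank_points(cancer, control):
    """Rank m/z point indices by an assumed separation score:
    |difference of group means| / (sum of group standard deviations)."""
    scored = []
    for j in range(len(cancer[0])):
        a = [s[j] for s in cancer]
        b = [s[j] for s in control]
        spread = stdev(a) + stdev(b)
        score = abs(mean(a) - mean(b)) / spread if spread > 0 else 0.0
        scored.append((score, j))
    # Highest score first
    return [j for _, j in sorted(scored, reverse=True)]


def centroid(spectra, points):
    """Mean intensity of each selected m/z point across a group."""
    return [mean([s[j] for s in spectra]) for j in points]


def classify(spectrum, points, c_cancer, c_control):
    """Assign the label of the nearer centroid (squared Euclidean distance)."""
    d_cancer = sum((spectrum[j] - c) ** 2 for j, c in zip(points, c_cancer))
    d_control = sum((spectrum[j] - c) ** 2 for j, c in zip(points, c_control))
    return "cancer" if d_cancer < d_control else "control"


def sensitivity_specificity(truth, predicted):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    Assumes both classes occur at least once in `truth`."""
    pairs = list(zip(truth, predicted))
    tp = sum(t == "cancer" and p == "cancer" for t, p in pairs)
    fn = sum(t == "cancer" and p == "control" for t, p in pairs)
    tn = sum(t == "control" and p == "control" for t, p in pairs)
    fp = sum(t == "control" and p == "cancer" for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Toy intensities at three m/z points (made-up numbers, not data from the paper)
cancer_train = [[1.0, 5.0, 2.0], [1.2, 5.5, 2.1], [0.9, 4.8, 1.9]]
control_train = [[1.1, 2.0, 2.0], [1.0, 2.2, 2.2], [0.8, 1.9, 1.8]]

ranked = rank_points(cancer_train, control_train)
top = ranked[:1]  # keep only the best-separating point
c_cancer = centroid(cancer_train, top)
c_control = centroid(control_train, top)

unknowns = [[1.0, 5.2, 2.0], [1.0, 2.1, 2.0]]
truth = ["cancer", "control"]
predictions = [classify(u, top, c_cancer, c_control) for u in unknowns]
sens, spec = sensitivity_specificity(truth, predictions)
```

Working on individual data points rather than fitted peaks is what lets an approach like this ignore peak shape. In practice, the number of retained points and the train/test split would need to be chosen carefully to avoid overfitting.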

Original language: English (US)
Pages (from-to): 1023-1045
Number of pages: 23
Journal: Journal of Bioinformatics and Computational Biology
Volume: 5
Issue number: 5
DOI: https://doi.org/10.1142/S021972000700303X
State: Published - Oct 2007

Fingerprint

Biomarkers
Mass spectrometry
Pipelines
Ovarian Neoplasms
Software
Neoplasms
Blood
Throughput
Serum
Datasets

Keywords

  • Biomarkers
  • Mass spectra
  • Ovarian cancer

ASJC Scopus subject areas

  • Medicine(all)
  • Cell Biology

Cite this

Fisher, W. G., Rosenblatt, K. P., Fishman, D. A., Whitteley, G. R., Mikulskis, A., Kuzdzal, S. A., ... Garner, H. R. (2007). A robust biomarker discovery pipeline for high-performance mass spectrometry data. Journal of Bioinformatics and Computational Biology, 5(5), 1023-1045. https://doi.org/10.1142/S021972000700303X

A robust biomarker discovery pipeline for high-performance mass spectrometry data. / Fisher, Wayne G.; Rosenblatt, Kevin P.; Fishman, David A.; Whitteley, Gordon R.; Mikulskis, Alvydas; Kuzdzal, Scott A.; Lopez, Mary F.; Tan, Niclas Chiang; German, Dwight C.; Garner, Harold R.

In: Journal of Bioinformatics and Computational Biology, Vol. 5, No. 5, 10.2007, p. 1023-1045.

Fisher, WG, Rosenblatt, KP, Fishman, DA, Whitteley, GR, Mikulskis, A, Kuzdzal, SA, Lopez, MF, Tan, NC, German, DC & Garner, HR 2007, 'A robust biomarker discovery pipeline for high-performance mass spectrometry data', Journal of Bioinformatics and Computational Biology, vol. 5, no. 5, pp. 1023-1045. https://doi.org/10.1142/S021972000700303X

Fisher, Wayne G.; Rosenblatt, Kevin P.; Fishman, David A.; Whitteley, Gordon R.; Mikulskis, Alvydas; Kuzdzal, Scott A.; Lopez, Mary F.; Tan, Niclas Chiang; German, Dwight C.; Garner, Harold R. / A robust biomarker discovery pipeline for high-performance mass spectrometry data. In: Journal of Bioinformatics and Computational Biology. 2007; Vol. 5, No. 5. pp. 1023-1045.

@article{ea26b7a57f84489191c8931058ea4706,
title = "A robust biomarker discovery pipeline for high-performance mass spectrometry data",
abstract = "A high-throughput software pipeline for analyzing high-performance mass spectral data sets has been developed to facilitate rapid and accurate biomarker determination. The software exploits the mass precision and resolution of high-performance instrumentation, bypasses peak-finding steps, and instead uses discrete m/z data points to identify putative biomarkers. The technique is insensitive to peak shape, and works on overlapping and non-Gaussian peaks which can confound peak-finding algorithms. Methods are presented to assess data set quality and the suitability of groups of m/z values that map to peaks as potential biomarkers. The algorithm is demonstrated with serum mass spectra from patients with and without ovarian cancer. Biomarker candidates are identified and ranked by their ability to discriminate between cancer and noncancer conditions. Their discriminating power is tested by classifying unknowns using a simple distance calculation, and a sensitivity of 95.6{\%} and a specificity of 97.1{\%} are obtained. In contrast, the sensitivity of the ovarian cancer blood marker CA125 is ∼50{\%} for stage I/II and ∼80{\%} for stage III/IV cancers. While the generalizability of these markers is currently unknown, we have demonstrated the ability of our analytical package to extract biomarker candidates from high-performance mass spectral data.",
keywords = "Biomarkers, Mass spectra, Ovarian cancer",
author = "Fisher, {Wayne G.} and Rosenblatt, {Kevin P.} and Fishman, {David A.} and Whitteley, {Gordon R.} and Mikulskis, Alvydas and Kuzdzal, {Scott A.} and Lopez, {Mary F.} and Tan, {Niclas Chiang} and German, {Dwight C.} and Garner, {Harold R.}",
year = "2007",
month = "10",
doi = "10.1142/S021972000700303X",
language = "English (US)",
volume = "5",
pages = "1023--1045",
journal = "Journal of Bioinformatics and Computational Biology",
issn = "0219-7200",
publisher = "World Scientific Publishing Co. Pte Ltd",
number = "5",
}

TY - JOUR
T1 - A robust biomarker discovery pipeline for high-performance mass spectrometry data
AU - Fisher, Wayne G.
AU - Rosenblatt, Kevin P.
AU - Fishman, David A.
AU - Whitteley, Gordon R.
AU - Mikulskis, Alvydas
AU - Kuzdzal, Scott A.
AU - Lopez, Mary F.
AU - Tan, Niclas Chiang
AU - German, Dwight C.
AU - Garner, Harold R.
PY - 2007/10
Y1 - 2007/10
AB - A high-throughput software pipeline for analyzing high-performance mass spectral data sets has been developed to facilitate rapid and accurate biomarker determination. The software exploits the mass precision and resolution of high-performance instrumentation, bypasses peak-finding steps, and instead uses discrete m/z data points to identify putative biomarkers. The technique is insensitive to peak shape, and works on overlapping and non-Gaussian peaks which can confound peak-finding algorithms. Methods are presented to assess data set quality and the suitability of groups of m/z values that map to peaks as potential biomarkers. The algorithm is demonstrated with serum mass spectra from patients with and without ovarian cancer. Biomarker candidates are identified and ranked by their ability to discriminate between cancer and noncancer conditions. Their discriminating power is tested by classifying unknowns using a simple distance calculation, and a sensitivity of 95.6% and a specificity of 97.1% are obtained. In contrast, the sensitivity of the ovarian cancer blood marker CA125 is ∼50% for stage I/II and ∼80% for stage III/IV cancers. While the generalizability of these markers is currently unknown, we have demonstrated the ability of our analytical package to extract biomarker candidates from high-performance mass spectral data.

KW - Biomarkers
KW - Mass spectra
KW - Ovarian cancer
UR - http://www.scopus.com/inward/record.url?scp=35348901937&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=35348901937&partnerID=8YFLogxK
U2 - 10.1142/S021972000700303X
DO - 10.1142/S021972000700303X
M3 - Article
C2 - 17933009
AN - SCOPUS:35348901937
VL - 5
SP - 1023
EP - 1045
JO - Journal of Bioinformatics and Computational Biology
JF - Journal of Bioinformatics and Computational Biology
SN - 0219-7200
IS - 5
ER -