Performance evaluation of existing de novo sequencing algorithms

Sergey Pevtsov, Irina Fedulova, Hamid Mirzaei, Charles Buck, Xiang Zhang

Research output: Contribution to journal (Article)

78 Citations (Scopus)

Abstract

Two methods have been developed for protein identification from tandem mass spectra: database searching and de novo sequencing. De novo sequencing identifies peptides directly from tandem mass spectra. Among the many proposed algorithms, we evaluated the performance of five de novo sequencing algorithms: AUDENS, Lutefisk, NovoHMM, PepNovo, and PEAKS. Our evaluation methods are based on the calculation of relative sequence distance (RSD), algorithm sensitivity, and spectrum quality. We found that the de novo sequencing algorithms perform differently on QSTAR and LCQ mass spectrometer data but, in general, perform better on QSTAR data than on LCQ data. For the QSTAR data, the performance order of the five algorithms is PEAKS > Lutefisk, PepNovo > AUDENS, NovoHMM. The performance of PEAKS, Lutefisk, and PepNovo depends strongly on spectrum quality and improves as spectrum quality increases, whereas AUDENS and NovoHMM are not sensitive to spectrum quality. Compared with the other four algorithms, PEAKS has the best sensitivity and the best performance over the entire range of spectrum quality. For the LCQ data, the performance order is NovoHMM > PepNovo, PEAKS > Lutefisk > AUDENS. NovoHMM has the best sensitivity, and its performance is the best over the entire range of spectrum quality; however, its overall performance is not significantly different from that of PEAKS and PepNovo. AUDENS does not perform well on either QSTAR or LCQ data.
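
The abstract names relative sequence distance (RSD) as one of the evaluation measures but does not define it here. Purely as an illustration, the sketch below scores a predicted peptide against a reference peptide with a length-normalized Levenshtein edit distance. This stand-in metric, the function names, and the decision to treat I and L as one residue (they have the same mass) are assumptions made for the example, not the authors' definition of RSD.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def normalize(peptide: str) -> str:
    """Assumption: leucine (L) and isoleucine (I) are isobaric, so merge them."""
    return peptide.upper().replace("I", "L")


def relative_distance(predicted: str, reference: str) -> float:
    """Hypothetical stand-in for RSD: edit distance divided by the longer length.

    0.0 means identical sequences (up to I/L); 1.0 means nothing in common.
    """
    p, r = normalize(predicted), normalize(reference)
    longest = max(len(p), len(r))
    if longest == 0:
        return 0.0
    return edit_distance(p, r) / longest


if __name__ == "__main__":
    # Made-up peptides for illustration only, not data from the paper.
    for pred in ("PEPTIDE", "PEPTLDE", "PEPTADE"):
        print(pred, relative_distance(pred, "PEPTIDE"))
```

Under these assumptions a lower score means the de novo prediction is closer to the reference sequence, so averaging the score over many spectra would give one way to rank algorithms.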

Original language: English (US)
Pages (from-to): 3018-3028
Number of pages: 11
Journal: Journal of Proteome Research
Volume: 5
Issue number: 11
DOI: https://doi.org/10.1021/pr060222h
State: Published - Nov 2006

Fingerprint

  • Mass spectrometers
  • Databases
  • Peptides
  • Proteins

Keywords

  • De novo sequencing
  • Mass spectral quality
  • Mass spectrometry
  • Peptide identification

ASJC Scopus subject areas

  • Genetics
  • Biotechnology
  • Biochemistry

Cite this

Pevtsov, S., Fedulova, I., Mirzaei, H., Buck, C., & Zhang, X. (2006). Performance evaluation of existing de novo sequencing algorithms. Journal of Proteome Research, 5(11), 3018-3028. https://doi.org/10.1021/pr060222h

@article{5df2246a1a824ae4bf3030539b1fe00e,
title = "Performance evaluation of existing de novo sequencing algorithms",
keywords = "De novo sequencing, Mass spectral quality, Mass spectrometry, Peptide identification",
author = "Sergey Pevtsov and Irina Fedulova and Hamid Mirzaei and Charles Buck and Xiang Zhang",
year = "2006",
month = "11",
doi = "10.1021/pr060222h",
language = "English (US)",
volume = "5",
pages = "3018--3028",
journal = "Journal of Proteome Research",
issn = "1535-3893",
publisher = "American Chemical Society",
number = "11",

}

TY - JOUR
T1 - Performance evaluation of existing de novo sequencing algorithms
AU - Pevtsov, Sergey
AU - Fedulova, Irina
AU - Mirzaei, Hamid
AU - Buck, Charles
AU - Zhang, Xiang
PY - 2006/11
Y1 - 2006/11
KW - De novo sequencing
KW - Mass spectral quality
KW - Mass spectrometry
KW - Peptide identification
UR - http://www.scopus.com/inward/record.url?scp=33751068791&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33751068791&partnerID=8YFLogxK
U2 - 10.1021/pr060222h
DO - 10.1021/pr060222h
M3 - Article
C2 - 17081053
AN - SCOPUS:33751068791
VL - 5
SP - 3018
EP - 3028
JO - Journal of Proteome Research
JF - Journal of Proteome Research
SN - 1535-3893
IS - 11
ER -