Strategies for metagenomic-guided whole-community proteomics of complex microbial environments

Brandi L. Cantarel, Alison R. Erickson, Nathan C. VerBerkmoes, Brian K. Erickson, Patricia A. Carey, Chongle Pan, Manesh Shah, Emmanuel F. Mongodin, Janet K. Jansson, Claire M. Fraser-Liggett, Robert L. Hettich

Research output: Contribution to journal (Article)

40 Citations (Scopus)

Abstract

Accurate protein identification in large-scale proteomics experiments relies upon a detailed, accurate protein catalogue, which is derived from predictions of open reading frames based on genome sequence data. Integration of mass spectrometry-based proteomics data with computational proteome predictions from environmental metagenomic sequences has been challenging because of the variable overlap between proteomic datasets and corresponding short-read nucleotide sequence data. In this study, we have benchmarked several strategies for increasing microbial peptide spectral matching in metaproteomic datasets using protein predictions generated from matched metagenomic sequences from the same human fecal samples. Additionally, we investigated the impact of mass spectrometry-based filters (high mass accuracy, delta correlation), and de novo peptide sequencing on the number and robustness of peptide-spectrum assignments in these complex datasets. In summary, we find that high mass accuracy peptide measurements searched against non-assembled reads from DNA sequencing of the same samples significantly increased identifiable proteins without sacrificing accuracy.
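The two mass spectrometry-based filters named in the abstract (high mass accuracy and delta correlation) can be illustrated with a minimal sketch. This is not the authors' pipeline. The `PSM` record, the function names, and both thresholds (10 ppm, 0.08) are hypothetical values chosen for illustration. The delta correlation here follows the common definition of the normalized score gap between the best and second-best candidate peptide for a spectrum.

```python
from dataclasses import dataclass


@dataclass
class PSM:
    """Hypothetical peptide-spectrum match record (not from the paper)."""
    peptide: str
    observed_mass: float     # measured peptide mass, in Da
    theoretical_mass: float  # mass computed from the candidate sequence, in Da
    top_score: float         # search score of the best candidate
    second_score: float      # search score of the runner-up candidate


def ppm_error(observed: float, theoretical: float) -> float:
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def delta_correlation(top: float, second: float) -> float:
    """Normalized gap between the best and second-best scores."""
    if top <= 0:
        return 0.0
    return (top - second) / top


def passes_filters(psm: PSM, max_ppm: float = 10.0, min_delta: float = 0.08) -> bool:
    """Keep a match only if it is both mass-accurate and well separated.

    Both thresholds are illustrative assumptions, not values from the study.
    """
    accurate = abs(ppm_error(psm.observed_mass, psm.theoretical_mass)) <= max_ppm
    separated = delta_correlation(psm.top_score, psm.second_score) >= min_delta
    return accurate and separated


if __name__ == "__main__":
    matches = [
        PSM("PEPTIDEK", 1000.005, 1000.0, 4.0, 3.0),  # about 5 ppm, clear winner
        PSM("PEPTIDER", 1000.050, 1000.0, 4.0, 3.0),  # about 50 ppm, too far off
        PSM("PEPTIDEA", 1000.005, 1000.0, 4.0, 3.9),  # accurate but ambiguous
    ]
    for m in matches:
        print(m.peptide, passes_filters(m))
```

Applying both conditions together reflects the idea in the abstract that such filters affect both the number and the robustness of peptide-spectrum assignments: a tight mass tolerance removes candidates with the wrong mass, and the score gap removes spectra with no clear best match.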

Original language: English (US)
Article number: e27173
Journal: PLoS One
Volume: 6
Issue number: 11
DOI: https://doi.org/10.1371/journal.pone.0027173
State: Published - Nov 23 2011

Fingerprint

Metagenomics
Proteomics
Peptides
Mass Spectrometry
Prediction
Proteins
Proteome
DNA Sequence Analysis
Open Reading Frames
Nucleotides
Sequence Analysis
Genes
Genome

ASJC Scopus subject areas

  • Agricultural and Biological Sciences (all)
  • Biochemistry, Genetics and Molecular Biology (all)
  • Medicine (all)

Cite this

Cantarel, B. L., Erickson, A. R., VerBerkmoes, N. C., Erickson, B. K., Carey, P. A., Pan, C., ... Hettich, R. L. (2011). Strategies for metagenomic-guided whole-community proteomics of complex microbial environments. PLoS One, 6(11), [e27173]. https://doi.org/10.1371/journal.pone.0027173

@article{5526ebb3888a4c70b829cec44154079c,
title = "Strategies for metagenomic-guided whole-community proteomics of complex microbial environments",
author = "Cantarel, {Brandi L.} and Erickson, {Alison R.} and VerBerkmoes, {Nathan C.} and Erickson, {Brian K.} and Carey, {Patricia A.} and Chongle Pan and Manesh Shah and Mongodin, {Emmanuel F.} and Jansson, {Janet K.} and Fraser-Liggett, {Claire M.} and Hettich, {Robert L.}",
year = "2011",
month = "11",
day = "23",
doi = "10.1371/journal.pone.0027173",
language = "English (US)",
volume = "6",
journal = "PLoS One",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "11",

}

TY - JOUR
T1 - Strategies for metagenomic-guided whole-community proteomics of complex microbial environments
AU - Cantarel, Brandi L.
AU - Erickson, Alison R.
AU - VerBerkmoes, Nathan C.
AU - Erickson, Brian K.
AU - Carey, Patricia A.
AU - Pan, Chongle
AU - Shah, Manesh
AU - Mongodin, Emmanuel F.
AU - Jansson, Janet K.
AU - Fraser-Liggett, Claire M.
AU - Hettich, Robert L.
PY - 2011/11/23
Y1 - 2011/11/23
UR - http://www.scopus.com/inward/record.url?scp=81755176064&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=81755176064&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0027173
DO - 10.1371/journal.pone.0027173
M3 - Article
C2 - 22132090
AN - SCOPUS:81755176064
VL - 6
JO - PLoS One
JF - PLoS One
SN - 1932-6203
IS - 11
M1 - e27173
ER -