Comparative genomics of the archaea (Euryarchaeota): Evolution of conserved protein families, the stable core, and the variable shell

Kira S. Makarova, L. Aravind, Michael Y. Galperin, Nick V. Grishin, Roman L. Tatusov, Yuri I. Wolf, Eugene V. Koonin

Research output: Contribution to journal (Article)

221 Citations (Scopus)

Abstract

Comparative analysis of the protein sequences encoded in the four euryarchaeal species whose genomes have been sequenced completely (Methanococcus jannaschii, Methanobacterium thermoautotrophicum, Archaeoglobus fulgidus, and Pyrococcus horikoshii) revealed 1326 orthologous sets, of which 543 are represented in all four species. The proteins that belong to these conserved euryarchaeal families comprise 31%-35% of the gene complement and may be considered the evolutionarily stable core of the archaeal genomes. The core gene set includes the great majority of genes coding for proteins involved in genome replication and expression, but only a relatively small subset of metabolic functions. For many gene families that are conserved in all euryarchaea, previously undetected orthologs in bacteria and eukaryotes were identified. A number of euryarchaeal synapomorphies (unique shared characters) were identified; these are protein families that possess sequence signatures or domain architectures that are conserved in all euryarchaea but are not found in bacteria or eukaryotes. In addition, euryarchaea-specific expansions of several protein and domain families were detected. In terms of their apparent phylogenetic affinities, the archaeal protein families split into bacterial and eukaryotic families. The majority of the proteins that have only eukaryotic orthologs or show the greatest similarity to their eukaryotic counterparts belong to the core set. The families of euryarchaeal genes that are conserved in only two or three species constitute a relatively mobile component of the genomes whose evolution should have involved multiple events of lineage-specific gene loss and horizontal gene transfer. Frequently these proteins have detectable orthologs only in bacteria or show the greatest similarity to the bacterial homologs, which might suggest a significant role of horizontal gene transfer from bacteria in the evolution of the euryarchaeota.
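The core/shell distinction in the abstract can be illustrated with a minimal sketch: an orthologous set found in all four species counts toward the stable core, and one found in only two or three species counts toward the variable shell. The set names and memberships below are hypothetical toy data, not results from the paper, and `classify` is an illustrative helper rather than the authors' method.

```python
from collections import Counter

# The four euryarchaeal genomes compared in the abstract (abbreviated names).
SPECIES = frozenset({
    "M. jannaschii",
    "M. thermoautotrophicum",
    "A. fulgidus",
    "P. horikoshii",
})

# Hypothetical toy data: ortholog-set ID -> species in which a member was found.
ortholog_sets = {
    "set_A": set(SPECIES),
    "set_B": {"M. jannaschii", "A. fulgidus"},
    "set_C": {"M. thermoautotrophicum", "A. fulgidus", "P. horikoshii"},
    "set_D": set(SPECIES),
}


def classify(members):
    """Label a set 'core' (all four species) or 'shell' (two or three)."""
    n = len(set(members) & SPECIES)
    if n == len(SPECIES):
        return "core"
    if n >= 2:
        return "shell"
    # A single-species family is not an orthologous set in this scheme.
    return "unclassified"


counts = Counter(classify(m) for m in ortholog_sets.values())
print(dict(counts))
```

By the abstract's figures, 543 of the 1326 orthologous sets (about 41%) fall in the all-four category; the 31%-35% figure refers to the share of each genome's genes that belong to those sets, not to the share of sets.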

Original language: English (US)
Pages (from-to): 608-628
Number of pages: 21
Journal: Genome Research
Volume: 9
Issue number: 7
State: Published - 1999

Fingerprint

Euryarchaeota
Archaea
Genomics
Bacteria
Horizontal Gene Transfer
Genes
Proteins
Eukaryota
Pyrococcus horikoshii
Archaeal Genome
Archaeal Proteins
Archaeoglobus fulgidus
Genome Components
Methanocaldococcus
Methanobacterium
Genome
Protein Sequence Analysis

ASJC Scopus subject areas

  • Genetics

Cite this

Makarova, K. S., Aravind, L., Galperin, M. Y., Grishin, N. V., Tatusov, R. L., Wolf, Y. I., & Koonin, E. V. (1999). Comparative genomics of the archaea (Euryarchaeota): Evolution of conserved protein families, the stable core, and the variable shell. Genome Research, 9(7), 608-628.

@article{9bb326421dd345b8b5c5c513bab868c5,
title = "Comparative genomics of the archaea (Euryarchaeota): Evolution of conserved protein families, the stable core, and the variable shell",
author = "Makarova, {Kira S.} and L. Aravind and Galperin, {Michael Y.} and Grishin, {Nick V.} and Tatusov, {Roman L.} and Wolf, {Yuri I.} and Koonin, {Eugene V.}",
year = "1999",
language = "English (US)",
volume = "9",
pages = "608--628",
journal = "Genome Research",
issn = "1088-9051",
publisher = "Cold Spring Harbor Laboratory Press",
number = "7",

}

TY - JOUR

T1 - Comparative genomics of the archaea (Euryarchaeota)

T2 - Evolution of conserved protein families, the stable core, and the variable shell

AU - Makarova, Kira S.

AU - Aravind, L.

AU - Galperin, Michael Y.

AU - Grishin, Nick V.

AU - Tatusov, Roman L.

AU - Wolf, Yuri I.

AU - Koonin, Eugene V.

PY - 1999

Y1 - 1999

UR - http://www.scopus.com/inward/record.url?scp=0032766490&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0032766490&partnerID=8YFLogxK

M3 - Article

C2 - 10413400

AN - SCOPUS:0032766490

VL - 9

SP - 608

EP - 628

JO - Genome Research

JF - Genome Research

SN - 1088-9051

IS - 7

ER -