Phylogeny based on whole genome as inferred from complete information set analysis

W. Li, W. Fang, L. Ling, J. Wang, Z. Xuan, R. Chen

Research output: Contribution to journal > Article

23 Citations (Scopus)

Abstract

Previous molecular phylogeny algorithms rely mainly on multiple sequence alignments of carefully selected characteristic sequences, and are thus not directly applicable to whole-genome phylogeny, where events such as rearrangements make full-length alignments impossible. We introduce the concept of the Complete Information Set (CIS) and its implementation as a measure of evolutionary distance without reference to sequence size. As a proof of concept, the 16S rRNA sequences of 22 completely sequenced Bacteria and Archaea species are used to reconstruct a phylogenetic tree, which is generally consistent with the commonly accepted one. Applied to whole genomes, the method yields a highly robust whole-genome phylogenetic tree that supports separate monophyletic clusters of species with similar phenotypes, as well as the early evolution of thermophilic Bacteria and the late divergence of Eukarya. The purpose of this work is not to contradict or confirm previous phylogenetic standards but rather to offer a new algorithm and tool to the phylogeny research community. The software for estimating sequence distance and the materials used in this study are available upon request from the corresponding author.
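The abstract does not spell out how the CIS distance is computed, so the following is only a minimal sketch of the general idea of an alignment-free, size-independent distance. It is not the authors' method: it swaps in a Jensen-Shannon divergence between k-mer frequency profiles as a stand-in for their information-based measure. The function names, the choice of k, and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
from math import log2


def kmer_profile(seq, k):
    """Return the relative frequency of every overlapping k-mer in seq."""
    seq = seq.upper()
    if len(seq) < k:
        raise ValueError("sequence is shorter than k")
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    # Normalising by the total removes the dependence on sequence length.
    return {kmer: n / total for kmer, n in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two profiles, in [0, 1].

    Assumption: used here as a stand-in for the paper's own measure.
    """
    div = 0.0
    for kmer in set(p) | set(q):
        pi, qi = p.get(kmer, 0.0), q.get(kmer, 0.0)
        mi = 0.5 * (pi + qi)  # mixture of the two profiles
        if pi > 0:
            div += 0.5 * pi * log2(pi / mi)
        if qi > 0:
            div += 0.5 * qi * log2(qi / mi)
    return div


def distance_matrix(seqs, k=3):
    """Pairwise distances for a dict of {name: sequence} (hypothetical helper)."""
    profiles = {name: kmer_profile(s, k) for name, s in seqs.items()}
    return {
        (a, b): js_divergence(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }


if __name__ == "__main__":
    # Toy sequences, not data from the study.
    toy = {"s1": "ACGTACGTACGT", "s2": "ACGTACGTTTGT", "s3": "GGGGCCCCGGGG"}
    for pair, d in distance_matrix(toy, k=2).items():
        print(pair, round(d, 3))
```

A matrix like this could then be passed to any standard distance-based tree-building method such as neighbor joining. Because the profiles are relative frequencies, two sequences of very different lengths can be compared directly without an alignment, which is the property the abstract emphasises.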

Original language: English (US)
Pages (from-to): 439-447
Number of pages: 9
Journal: Journal of Biological Physics
Volume: 28
Issue number: 3
DOI: 10.1023/A:1020316706928
State: Published - Nov 13 2002
Externally published: Yes

Keywords

  • Comparative genomics
  • Information discrepancy
  • Molecular evolution
  • Sequence analysis

ASJC Scopus subject areas

  • Biophysics
  • Atomic and Molecular Physics, and Optics
  • Molecular Biology
  • Cell Biology

Cite this

Phylogeny based on whole genome as inferred from complete information set analysis. / Li, W.; Fang, W.; Ling, L.; Wang, J.; Xuan, Z.; Chen, R.

In: Journal of Biological Physics, Vol. 28, No. 3, 13.11.2002, p. 439-447.

@article{e6f2e4137d2245749ac0612ba5dc8dee,
title = "Phylogeny based on whole genome as inferred from complete information set analysis",
keywords = "Comparative genomics, Information discrepancy, Molecular evolution, Sequence analysis",
author = "W. Li and W. Fang and L. Ling and J. Wang and Z. Xuan and R. Chen",
year = "2002",
month = "11",
day = "13",
doi = "10.1023/A:1020316706928",
language = "English (US)",
volume = "28",
pages = "439--447",
journal = "Journal of Biological Physics",
issn = "0092-0606",
publisher = "Springer Netherlands",
number = "3",
}

TY - JOUR
T1 - Phylogeny based on whole genome as inferred from complete information set analysis
AU - Li, W.
AU - Fang, W.
AU - Ling, L.
AU - Wang, J.
AU - Xuan, Z.
AU - Chen, R.
PY - 2002/11/13
Y1 - 2002/11/13
KW - Comparative genomics
KW - Information discrepancy
KW - Molecular evolution
KW - Sequence analysis
UR - http://www.scopus.com/inward/record.url?scp=0036034976&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0036034976&partnerID=8YFLogxK
U2 - 10.1023/A:1020316706928
DO - 10.1023/A:1020316706928
M3 - Article
C2 - 23345787
AN - SCOPUS:0036034976
VL - 28
SP - 439
EP - 447
JO - Journal of Biological Physics
JF - Journal of Biological Physics
SN - 0092-0606
IS - 3
ER -