Abstract
Previous molecular phylogeny algorithms rely mainly on multiple sequence alignments of carefully selected characteristic sequences and are therefore not directly applicable to whole-genome phylogeny, where events such as rearrangements make full-length alignments impossible. We introduce the concept of the Complete Information Set (CIS) and a measure based on it that serves as an evolutionary distance independent of sequence size. As a proof of concept, the 16S rRNA sequences of 22 completely sequenced Bacteria and Archaea species are used to reconstruct a phylogenetic tree, which is generally consistent with the commonly accepted one. Applied to whole genomes, the method yields a highly robust whole-genome phylogenetic tree that supports separate monophyletic clusters of species with similar phenotypes, as well as the early evolution of thermophilic Bacteria and the late divergence of Eukarya. The purpose of this work is not to contradict or confirm previous phylogeny standards but to offer the phylogeny research community a new algorithm and tool. The software for estimating sequence distances and the materials used in this study are available from the corresponding author upon request.
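The abstract does not define the CIS measure itself, so the following is only a minimal sketch of the general idea of an alignment-free, length-normalized sequence distance, not the authors' algorithm. It assumes that each sequence is summarized by the relative frequencies of all its substrings of length 1 to `max_k`, and that two summaries are compared with the Jensen-Shannon divergence; the function names `kmer_profile`, `js_divergence`, and `profile_distance` and the parameter `max_k` are hypothetical.

```python
from collections import Counter
from math import log2


def kmer_profile(seq, max_k):
    """Relative frequencies of all substrings of length 1..max_k, one table per length."""
    seq = seq.upper()
    profiles = {}
    for k in range(1, max_k + 1):
        counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
        total = sum(counts.values())
        # Dividing by the total makes the profile independent of sequence length.
        profiles[k] = {w: c / total for w, c in counts.items()} if total else {}
    return profiles


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two sparse frequency tables."""
    div = 0.0
    for w in set(p) | set(q):
        pw, qw = p.get(w, 0.0), q.get(w, 0.0)
        m = (pw + qw) / 2
        if pw > 0:
            div += 0.5 * pw * log2(pw / m)
        if qw > 0:
            div += 0.5 * qw * log2(qw / m)
    return div


def profile_distance(a, b, max_k=3):
    """Illustrative alignment-free distance: sum of per-length divergences.

    This is an assumed stand-in, not the CIS-based measure from the paper.
    """
    pa, pb = kmer_profile(a, max_k), kmer_profile(b, max_k)
    return sum(js_divergence(pa[k], pb[k]) for k in range(1, max_k + 1))


if __name__ == "__main__":
    # Toy sequences of different lengths; no alignment is performed.
    print(profile_distance("ACGTACGTACGT", "ACGTTGCAACGTAC"))
    print(profile_distance("ACGTACGTACGT", "AAAAAAAAAAAAAAAA"))
```

Because only substring frequencies are compared, sequences of different lengths and with rearranged segments can be scored without an alignment. A matrix of such pairwise distances could then be passed to a standard tree-building method such as neighbor joining.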
Original language | English (US) |
---|---|
Pages (from-to) | 439-447 |
Number of pages | 9 |
Journal | Journal of Biological Physics |
Volume | 28 |
Issue number | 3 |
DOIs | |
State | Published - 2002 |
Externally published | Yes |
Keywords
- Comparative genomics
- Information discrepancy
- Molecular evolution
- Sequence analysis
ASJC Scopus subject areas
- Biophysics
- Atomic and Molecular Physics, and Optics
- Molecular Biology
- Cell Biology