ECOD: An Evolutionary Classification of Protein Domains

Hua Cheng, R. Dustin Schaeffer, Yuxing Liao, Lisa N. Kinch, Jimin Pei, Shuoyong Shi, Bong Hyun Kim, Nick V. Grishin

Research output: Contribution to journal (Article)

86 Citations (Scopus)

Abstract

Understanding the evolution of a protein, including both close and distant relationships, often reveals insight into its structure and function. Fast and easy access to such up-to-date information facilitates research. We have developed a hierarchical evolutionary classification of all proteins with experimentally determined spatial structures, and presented it as an interactive and updatable online database. ECOD (Evolutionary Classification of protein Domains) is distinct from other structural classifications in that it groups domains primarily by evolutionary relationships (homology), rather than topology (or “fold”). This distinction highlights cases of homology between domains of differing topology to aid in understanding of protein structure evolution. ECOD uniquely emphasizes distantly related homologs that are difficult to detect, and thus catalogs the largest number of evolutionary links among structural domain classifications. Placing distant homologs together underscores the ancestral similarities of these proteins and draws attention to the most important regions of sequence and structure, as well as conserved functional sites. ECOD also recognizes closer sequence-based relationships between protein domains. Currently, approximately 100,000 protein structures are classified in ECOD into 9,000 sequence families clustered into close to 2,000 evolutionary groups. The classification is assisted by an automated pipeline that quickly and consistently classifies weekly releases of PDB structures and allows for continual updates. This synchronization with PDB uniquely distinguishes ECOD among all protein classifications. Finally, we present several case studies of homologous proteins not recorded in other classifications, illustrating the potential of how ECOD can be used to further biological and evolutionary studies.
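The grouping principle described in the abstract (domains organized first by homology, then by sequence family, with topology treated as a secondary attribute) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Domain` record, its field names, and the labels `H1`, `T1`, `F1`, and so on are hypothetical placeholders, not ECOD identifiers, and the code does not reflect the actual ECOD schema or pipeline.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    """Hypothetical domain record; all labels are illustrative placeholders."""
    domain_id: str
    homology_group: str  # evolutionary group (assumed label)
    topology: str        # fold/topology (assumed label)
    family: str          # sequence family (assumed label)


# Made-up toy data, not taken from ECOD.
DOMAINS = [
    Domain("d1", "H1", "T1", "F1"),
    Domain("d2", "H1", "T2", "F2"),
    Domain("d3", "H2", "T3", "F3"),
    Domain("d4", "H1", "T1", "F1"),
]


def group_by_homology(domains):
    """Build a homology group -> sequence family -> domain IDs hierarchy."""
    tree = defaultdict(lambda: defaultdict(list))
    for d in domains:
        tree[d.homology_group][d.family].append(d.domain_id)
    # Convert nested defaultdicts to plain dicts for easy inspection.
    return {group: dict(families) for group, families in tree.items()}


def homologs_with_differing_topology(domains):
    """Return homology groups whose members span more than one topology."""
    topologies = defaultdict(set)
    for d in domains:
        topologies[d.homology_group].add(d.topology)
    return sorted(g for g, t in topologies.items() if len(t) > 1)


if __name__ == "__main__":
    print(group_by_homology(DOMAINS))
    print(homologs_with_differing_topology(DOMAINS))
```

In this toy data, group `H1` contains domains with two different topology labels, which is the kind of case a homology-first hierarchy keeps together and a topology-first hierarchy would split apart.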

Original language: English (US)
Journal: PLoS Computational Biology
Volume: 10
Issue number: 12
DOI: 10.1371/journal.pcbi.1003926
State: Published - Dec 1 2014

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Modeling and Simulation
  • Ecology, Evolution, Behavior and Systematics
  • Genetics
  • Molecular Biology
  • Ecology
  • Cellular and Molecular Neuroscience

Cite this

ECOD: An Evolutionary Classification of Protein Domains. / Cheng, Hua; Schaeffer, R. Dustin; Liao, Yuxing; Kinch, Lisa N.; Pei, Jimin; Shi, Shuoyong; Kim, Bong Hyun; Grishin, Nick V.

In: PLoS Computational Biology, Vol. 10, No. 12, 01.12.2014.

Cheng, Hua; Schaeffer, R. Dustin; Liao, Yuxing; Kinch, Lisa N.; Pei, Jimin; Shi, Shuoyong; Kim, Bong Hyun; Grishin, Nick V. / ECOD: An Evolutionary Classification of Protein Domains. In: PLoS Computational Biology. 2014; Vol. 10, No. 12.
@article{c4cc8122e2334ecaa95863b564b5debb,
title = "ECOD: An Evolutionary Classification of Protein Domains",
abstract = "Understanding the evolution of a protein, including both close and distant relationships, often reveals insight into its structure and function. Fast and easy access to such up-to-date information facilitates research. We have developed a hierarchical evolutionary classification of all proteins with experimentally determined spatial structures, and presented it as an interactive and updatable online database. ECOD (Evolutionary Classification of protein Domains) is distinct from other structural classifications in that it groups domains primarily by evolutionary relationships (homology), rather than topology (or “fold”). This distinction highlights cases of homology between domains of differing topology to aid in understanding of protein structure evolution. ECOD uniquely emphasizes distantly related homologs that are difficult to detect, and thus catalogs the largest number of evolutionary links among structural domain classifications. Placing distant homologs together underscores the ancestral similarities of these proteins and draws attention to the most important regions of sequence and structure, as well as conserved functional sites. ECOD also recognizes closer sequence-based relationships between protein domains. Currently, approximately 100,000 protein structures are classified in ECOD into 9,000 sequence families clustered into close to 2,000 evolutionary groups. The classification is assisted by an automated pipeline that quickly and consistently classifies weekly releases of PDB structures and allows for continual updates. This synchronization with PDB uniquely distinguishes ECOD among all protein classifications. Finally, we present several case studies of homologous proteins not recorded in other classifications, illustrating the potential of how ECOD can be used to further biological and evolutionary studies.",
author = "Hua Cheng and Schaeffer, {R. Dustin} and Yuxing Liao and Kinch, {Lisa N.} and Jimin Pei and Shuoyong Shi and Kim, {Bong Hyun} and Grishin, {Nick V.}",
year = "2014",
month = "12",
day = "1",
doi = "10.1371/journal.pcbi.1003926",
language = "English (US)",
volume = "10",
journal = "PLoS Computational Biology",
issn = "1553-734X",
publisher = "Public Library of Science",
number = "12",

}

TY - JOUR

T1 - ECOD

T2 - An Evolutionary Classification of Protein Domains

AU - Cheng, Hua

AU - Schaeffer, R. Dustin

AU - Liao, Yuxing

AU - Kinch, Lisa N.

AU - Pei, Jimin

AU - Shi, Shuoyong

AU - Kim, Bong Hyun

AU - Grishin, Nick V.

PY - 2014/12/1

Y1 - 2014/12/1

AB - Understanding the evolution of a protein, including both close and distant relationships, often reveals insight into its structure and function. Fast and easy access to such up-to-date information facilitates research. We have developed a hierarchical evolutionary classification of all proteins with experimentally determined spatial structures, and presented it as an interactive and updatable online database. ECOD (Evolutionary Classification of protein Domains) is distinct from other structural classifications in that it groups domains primarily by evolutionary relationships (homology), rather than topology (or “fold”). This distinction highlights cases of homology between domains of differing topology to aid in understanding of protein structure evolution. ECOD uniquely emphasizes distantly related homologs that are difficult to detect, and thus catalogs the largest number of evolutionary links among structural domain classifications. Placing distant homologs together underscores the ancestral similarities of these proteins and draws attention to the most important regions of sequence and structure, as well as conserved functional sites. ECOD also recognizes closer sequence-based relationships between protein domains. Currently, approximately 100,000 protein structures are classified in ECOD into 9,000 sequence families clustered into close to 2,000 evolutionary groups. The classification is assisted by an automated pipeline that quickly and consistently classifies weekly releases of PDB structures and allows for continual updates. This synchronization with PDB uniquely distinguishes ECOD among all protein classifications. Finally, we present several case studies of homologous proteins not recorded in other classifications, illustrating the potential of how ECOD can be used to further biological and evolutionary studies.

UR - http://www.scopus.com/inward/record.url?scp=84919623580&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84919623580&partnerID=8YFLogxK

U2 - 10.1371/journal.pcbi.1003926

DO - 10.1371/journal.pcbi.1003926

M3 - Article

VL - 10

JO - PLoS Computational Biology

JF - PLoS Computational Biology

SN - 1553-734X

IS - 12

ER -