Manual classification strategies in the ECOD database

Hua Cheng, Yuxing Liao, R. Dustin Schaeffer, Nick V. Grishin

Research output: Contribution to journal › Article › peer-review

41 Scopus citations


ABSTRACT: ECOD (Evolutionary Classification Of protein Domains) is a comprehensive and up-to-date protein structure classification database. The majority of new structures released from the PDB (Protein Data Bank) each week already have close homologs in the ECOD hierarchy and thus can be reliably partitioned into domains and classified by software without manual intervention. However, those proteins that lack confidently detectable homologs require careful analysis by experts. Although many bioinformatics resources rely on expert curation to some degree, specific examples of how this curation occurs and in what cases it is necessary are not always described. Here, we illustrate the manual classification strategy in ECOD by example, focusing on two major issues in protein classification: domain partitioning and the relationship between homology and similarity scores. Most examples show recently released and manually classified PDB structures. We discuss multi-domain proteins, discordance between sequence and structural similarities, difficulties with assessing homology with scores, and integral membrane proteins homologous to soluble proteins. By timely assimilation of newly available structures into its hierarchy, ECOD strives to provide a most accurate and updated view of the protein structure world as a result of combined computational and expert-driven analysis.

Original language: English (US)
Pages (from-to): 1238-1251
Number of pages: 14
Journal: Proteins: Structure, Function and Bioinformatics
Issue number: 7
State: Published - Jul 1 2015


Keywords

  • Classification
  • Database
  • Domain
  • Evolution
  • Homology
  • Protein
  • Sequence
  • Structure

ASJC Scopus subject areas

  • Structural Biology
  • Biochemistry
  • Molecular Biology

