TY - JOUR
T1 - MALISAM: A database of structurally analogous motifs in proteins
AU - Cheng, Hua
AU - Kim, Bong Hyun
AU - Grishin, Nick V.
N1 - Funding Information: We are grateful to Sara Cheek for useful scripts. We thank Lisa Kinch for critical reading of the manuscript and S. Sri Krishna for helpful discussions. This work was supported by NIH grant GM67165 to N.V.G. Funding to pay the Open Access publication charges for this article was provided by Howard Hughes Medical Institute.
PY - 2008/1
Y1 - 2008/1
N2 - MALISAM (manual alignments for structurally analogous motifs) represents the first database containing pairs of structural analogs and their alignments. To find reliable analogs, we developed an approach based on three ideas. First, an insertion together with a part of the evolutionary core of one domain family (a hybrid motif) is analogous to a similar motif contained within the core of another domain family. Second, a motif at an interface, formed by secondary structural elements (SSEs) contributed by two or more domains or subunits contacting along that interface, is analogous to a similar motif present in the core of a single domain. Third, an artificial protein obtained through selection from random peptides or in sequence design experiments not biased by sequences of a particular homologous family, is analogous to a structurally similar natural protein. Each analogous pair is superimposed and aligned manually, as well as by several commonly used programs. Applications of this database may range from protein evolution studies, e.g. development of remote homology inference tools and discriminators between homologs and analogs, to protein-folding research, since in the absence of evolutionary reasons, similarity between proteins is caused by structural and folding constraints. The database is publicly available at http://prodata.swmed.edu/malisam.
UR - http://www.scopus.com/inward/record.url?scp=38549170925&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=38549170925&partnerID=8YFLogxK
U2 - 10.1093/nar/gkm698
DO - 10.1093/nar/gkm698
M3 - Article
C2 - 17855399
AN - SCOPUS:38549170925
SN - 0305-1048
VL - 36
SP - D211-D217
JO - Nucleic acids research
JF - Nucleic acids research
IS - SUPPL. 1
ER -