TY - JOUR
T1 - Saturated BLAST
T2 - An automated multiple intermediate sequence search used to detect distant homology
AU - Li, Weizhong
AU - Pio, Frederic
AU - Pawłowski, Krzysztof
AU - Godzik, Adam
N1 - Funding Information:
We thank Dr Kutbuddin S. Doctor for the suggestions and testing of this software. This research was partly supported by the NIH grant GM60049.
PY - 2000
Y1 - 2000
AB - Motivation: Two proteins can have a similar three-dimensional structure and biological function, but have sequences sufficiently different that traditional protein sequence comparison algorithms do not identify their relationship. The desire to identify such relations has led to the development of more sensitive sequence alignment strategies. One such strategy is the Intermediate Sequence Search (ISS), which connects two proteins through one or more intermediate sequences. In its brute-force implementation, ISS repeatedly uses the results of the previous query as new search seeds, making it time-consuming and difficult to analyze. Results: Saturated BLAST is a package that performs ISS in an efficient and automated manner. It was developed using Perl and Perl/Tk and implemented on the Linux operating system. Starting with a protein sequence, Saturated BLAST runs a BLAST search and identifies representative sequences for the next generation of searches. The procedure is run until convergence or until some predefined criteria are met. Saturated BLAST has a friendly graphical user interface, a built-in BLAST result parser, several multiple alignment tools, clustering algorithms and various filters for the elimination of false positives, thereby providing an easy way to edit, visualize, analyze, monitor and control the search. Besides detecting remote homologies, Saturated BLAST can be used to maintain protein family databases and to search for new genes in genomic databases.
UR - http://www.scopus.com/inward/record.url?scp=0034491129&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0034491129&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/16.12.1105
DO - 10.1093/bioinformatics/16.12.1105
M3 - Article
C2 - 11159329
AN - SCOPUS:0034491129
SN - 1367-4803
VL - 16
SP - 1105
EP - 1110
JO - Bioinformatics
JF - Bioinformatics
IS - 12
ER -