Probabilistic scoring measures for profile-profile comparison yield more accurate short seed alignments

David Mittelman, Ruslan Sadreyev, Nick Grishin

Research output: Contribution to journal › Article › peer-review

65 Scopus citations

Abstract

Motivation: The development of powerful automatic methods for the comparison of protein sequences has become increasingly important. Profile-to-profile comparisons make use of broader information about protein families, resulting in more sensitive and accurate comparisons of distantly related sequences. A key part of comparing two profiles is the method used to calculate scores for position matches. A number of methods based on various theoretical considerations have been proposed. We implemented several previously reported scoring functions as well as our own functions, and compared them on the basis of their ability to produce accurate short ungapped alignments of a given length.

Results: Our results suggest that the family of probabilistic methods (log-odds based methods and prof_sim) may be the more appropriate choice for generating initial 'seeds' as the first step in producing local profile-profile alignments. The most effective scoring systems were closely related modifications of functions previously implemented in the COMPASS and Picasso methods.
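As a rough illustration of the idea described in the abstract, the sketch below scores pairs of profile columns with a generic symmetric log-odds-style function and then picks the highest-scoring ungapped segment pair of a fixed length as a seed. It is a minimal sketch under stated assumptions, not the scoring functions evaluated in the paper: the formula is a textbook-style stand-in rather than the exact COMPASS, Picasso, or prof_sim definitions, the names `column_score` and `best_ungapped_seed` are hypothetical, and profile columns are assumed to be dictionaries of strictly positive residue frequencies (for example, after pseudocounts).

```python
import math


def column_score(p, q, background):
    """Illustrative symmetric log-odds-style score for two profile columns.

    p, q: dicts mapping residue -> frequency (assumed strictly positive).
    background: dict mapping residue -> background frequency.
    This is a generic stand-in, not the exact formula of any published method.
    """
    score = 0.0
    for residue, bg in background.items():
        score += p[residue] * math.log(q[residue] / bg)
        score += q[residue] * math.log(p[residue] / bg)
    return score


def best_ungapped_seed(profile_a, profile_b, background, length):
    """Return (score, start_a, start_b) for the best ungapped segment pair.

    Every pair of start positions is tried, and the column scores along
    the diagonal are summed over a window of the given fixed length.
    Returns None if either profile is shorter than the window.
    """
    best = None
    for i in range(len(profile_a) - length + 1):
        for j in range(len(profile_b) - length + 1):
            total = sum(
                column_score(profile_a[i + k], profile_b[j + k], background)
                for k in range(length)
            )
            if best is None or total > best[0]:
                best = (total, i, j)
    return best


# Toy two-letter alphabet, purely for demonstration.
background = {"A": 0.5, "B": 0.5}
mostly_a = {"A": 0.9, "B": 0.1}
mostly_b = {"A": 0.1, "B": 0.9}

profile_a = [mostly_a, mostly_b]
profile_b = [mostly_b, mostly_b, mostly_a, mostly_b]

seed = best_ungapped_seed(profile_a, profile_b, background, length=2)
print(seed)  # (score, start in profile_a, start in profile_b)
```

In this toy input the only window of `profile_b` whose columns both agree with `profile_a` starts at position 2, so that pair is selected as the seed. The exhaustive double loop is chosen for clarity; a sliding-window sum along each diagonal would avoid recomputing column scores.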

Original language: English (US)
Pages (from-to): 1531-1539
Number of pages: 9
Journal: Bioinformatics
Volume: 19
Issue number: 12
State: Published - Aug 12 2003

ASJC Scopus subject areas

  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics
