Abstract
In this paper, we develop a probabilistic model that addresses two realistic scenarios in the singular haplotype reconstruction problem: the incompleteness and inconsistency that occur in the DNA sequencing process that generates the input haplotype fragments, and the common practice used to generate synthetic data in experimental algorithm studies. Within this model, we design three algorithms that reconstruct the two unknown haplotypes from the given matrix of haplotype fragments with provably high probability and in time linear in the size of the input matrix. We also present experimental results that are consistent with the theoretically efficient performance of those algorithms. The software for our algorithms is available for public access and for real-time online demonstration.
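As an illustration of the kind of synthetic input the abstract refers to, the sketch below generates a fragment matrix from two haplotypes. It is a minimal sketch under stated assumptions, not the paper's generator: the two-letter alphabet `A`/`B`, the gap symbol `-`, the function name `generate_fragments`, and the parameters `miss_rate` and `err_rate` are all hypothetical choices made for this example.

```python
import random


def generate_fragments(h1, h2, m, miss_rate, err_rate, seed=0):
    """Return m fragments, each copied from h1 or h2 with gaps and flips.

    Assumptions (not taken from the paper): haplotypes are strings over
    {"A", "B"}, "-" marks a missing SNP value, and each position is
    corrupted independently of the others.
    """
    rng = random.Random(seed)  # seeded for reproducible output
    fragments = []
    for _ in range(m):
        # Each fragment is drawn from one of the two haplotypes at random.
        source = h1 if rng.random() < 0.5 else h2
        row = []
        for allele in source:
            if rng.random() < miss_rate:
                row.append("-")  # incompleteness: value not read
            elif rng.random() < err_rate:
                row.append("B" if allele == "A" else "A")  # inconsistency: value flipped
            else:
                row.append(allele)
        fragments.append("".join(row))
    return fragments


if __name__ == "__main__":
    for fragment in generate_fragments("AABBA", "BBAAB", m=6, miss_rate=0.2, err_rate=0.1):
        print(fragment)
```

Each row of the result plays the role of one row of the input fragment matrix; a reconstruction algorithm would receive only these rows and would have to recover the two source haplotypes.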
Original language | English (US) |
---|---|
Pages (from-to) | 535-546 |
Number of pages | 12 |
Journal | Journal of Computational Biology |
Volume | 15 |
Issue number | 5 |
State | Published - Jun 1 2008 |
Keywords
- Inconsistency and incompleteness errors
- Linear time probabilistic algorithm
- Probabilistic modeling and analysis
- SNP fragments
- Singular haplotype reconstruction
ASJC Scopus subject areas
- Modeling and Simulation
- Molecular Biology
- Genetics
- Computational Mathematics
- Computational Theory and Mathematics