We discuss the problem of how to infer haplotypes from genotypes on pedigree data under the Mendelian law of inheritance and the minimum recombination principle. The problem is important for the construction of haplotype maps and genetic linkage/association analysis. We prove that the problem of finding a minimum-recombinant haplotype configuration is in general NP-hard. This is the first complexity result concerning the problem to our knowledge. An iterative algorithm based on blocks of consecutive resolved marker loci (called block-extension) is proposed. It is very efficient and can be used for large pedigrees with a large number of markers, especially for those data sets requiring few recombinants (or recombination events). A polynomial-time exact algorithm for haplotype reconstruction without recombinants is also presented. The algorithm first identifies all the necessary constraints based on the Mendelian law and the zero-recombinant assumption, and represents them as a system of linear equations over the cyclic group Z2. By using a simple method based on Gaussian elimination, we could obtain all possible feasible haplotype configurations. Finally, we describe an integrated approach to haplotype inference and missing allele imputation based on integer linear programming (ILP). We have implemented the block-extension ILP algorithms and tested them on simulated data and real data. The results show that the programs perform very well on both types of data and will be useful for large scale haplotype inference projects.
(This is joint work with Jing Li)
About the speaker: Prof. Tao Jiang received B.S. in Computer Science and Technology from the University of Science and Technology of China (USTC), Hefei, P.R.China in 1984 and Ph.D. in Computer Science from the University of Minnesota in 1988. During Jan. 1989 - June 2001, he was a faculty member at Department of Computing and Software, McMaster University, Hamilton, Ontario, Canada. During 1995-96, he was on a research leave at University of Washington, Seattle, and at Gunma University, Kiryu, Japan. He joined University of California - Riverside as Professor of Computer Science in Sept. 1999, and is a member of the Algorithms and Computational Biology Lab, Genetics Graduate Program, Institute for Integrative Genome Biology, and Center for Plant Cell Biology. He has published actively in many computer science and bioinformatics/computational biology journals and conferences. He is presently serving on the editorial boards of International Journal of Foundations of Computer Science (IJFCS), Journal of Combinatorial Optimization (JOCO), Journal of Computer Science and Technology (JCST), Journal of Bioinformatics and Computational Biology (JBCB), and IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB), and program/technical committees of RECOMB'2004, ICALP'2004, IEEE BIBE'2004, IEEE CSB'2004, COCOON'2004, and CIAA'2004.Host: Bhaskar DasGupta