Genotyping Common and Rare Variation Using Overlapping Pool Sequencing

Background: Recent advances in sequencing technologies set the stage for large, population-based studies, in which the DNA or RNA of thousands of individuals will be sequenced. Currently, however, such studies are still infeasible using a straightforward sequencing approach. As a result, a few multiplexing schemes have recently been suggested, in which a small number of DNA pools are sequenced and the results are then deconvoluted using compressed sensing or similar approaches. These methods, however, are limited to the detection of rare variants.

Results: In this paper we provide a new algorithm for the deconvolution of DNA pool multiplexing schemes. The presented algorithm utilizes a likelihood model and linear programming. The approach allows for the addition of external data, particularly imputation data, resulting in a flexible environment that is suitable for different applications.

Conclusions: In particular, we demonstrate that both low and high allele frequency SNPs can be accurately genotyped when the DNA pooling scheme is performed in conjunction with microarray genotyping and imputation. Additionally, we demonstrate the use of our framework for the detection of cancer fusion genes from RNA sequences.

