Population Haplotyping (Beta)
The Population Haplotyping tool is a beta tool introduced in the Illumina DRAGEN Bio-IT Platform v4.2. It estimates haplotypes from a population-scale dataset and is packaged from the SHAPEIT5 software (Hofmeister RJ, Ribeiro DM, Rubinacci S, Delaneau O, 2022). It phases common variants first and then rare variants, in separate steps. The following workflow must be repeated for each chromosome of the genome under study.
1. Phase Common step to estimate the haplotypes of common variants (variants with allele frequency above a given allele frequency threshold) on defined regions.
2. Common Ligate step to ligate the phased common variants from the previous step into a single chromosome.
3. Phase Rare step to add the haplotypes of rare variants (variants with allele frequency below a given allele frequency threshold) on defined regions to the common variant scaffold obtained in the previous step.
4. Concat All step to concatenate the haplotype regions obtained in the previous step into a single chromosome.
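The split between common and rare variants in the steps above can be illustrated with a minimal Python sketch. It is not part of the tool: the `Variant` class, the function names, and the example threshold of 0.001 are hypothetical. The sketch also assumes that the minor allele frequency is the quantity compared against the threshold, and that a variant exactly at the threshold counts as common. Neither detail is stated here, so check the tool documentation before relying on them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    pos: int  # 1-based position on the chromosome
    ac: int   # alternate allele count across all samples
    an: int   # total number of called alleles across all samples


def minor_allele_frequency(v: Variant) -> float:
    """Return the frequency of the less frequent allele (assumed biallelic)."""
    if v.an <= 0:
        raise ValueError(f"variant at {v.pos} has no called alleles")
    af = v.ac / v.an
    return min(af, 1.0 - af)


def split_by_frequency(variants, threshold):
    """Partition variants into (common, rare) around the threshold.

    Assumption: variants exactly at the threshold are treated as common.
    """
    common, rare = [], []
    for v in variants:
        if minor_allele_frequency(v) >= threshold:
            common.append(v)
        else:
            rare.append(v)
    return common, rare


# Hypothetical example: 1000 diploid samples, so 2000 called alleles per site.
variants = [
    Variant(pos=100, ac=500, an=2000),
    Variant(pos=200, ac=1, an=2000),
    Variant(pos=300, ac=1999, an=2000),
]
common, rare = split_by_frequency(variants, threshold=0.001)
print("common:", [v.pos for v in common])
print("rare:", [v.pos for v in rare])
```

In this illustration the variant at position 300 lands in the rare set even though its alternate allele is nearly fixed, because its minor (reference) allele is rare.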
The tool is most accurate on population-scale datasets with thousands of samples. It is intended to be run on multiple nodes so that processes can be parallelized. A common use case of the Population Haplotyping tool is the generation of a custom reference panel for the VCF Imputation pipeline.
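To show where multi-node parallelism fits into the four-step workflow, the following Python sketch builds a dependency plan for one chromosome and groups the jobs into waves that could run concurrently. It rests on stated assumptions: the job labels and region strings are hypothetical placeholders, not DRAGEN command-line options, and the sketch assumes that the region-level Phase Common jobs are independent of each other, as are the region-level Phase Rare jobs once the ligated scaffold exists.

```python
def plan_chromosome(chrom, common_regions, rare_regions):
    """Return a list of (job_name, dependencies) for one chromosome.

    Job names are illustrative labels only.
    """
    jobs = []

    # Step 1: one Phase Common job per region, with no dependencies.
    phase_common = [f"{chrom}:phase_common:{r}" for r in common_regions]
    jobs.extend((name, []) for name in phase_common)

    # Step 2: Common Ligate waits for every Phase Common job.
    ligate = f"{chrom}:common_ligate"
    jobs.append((ligate, phase_common))

    # Step 3: one Phase Rare job per region, each needing the scaffold.
    phase_rare = [f"{chrom}:phase_rare:{r}" for r in rare_regions]
    jobs.extend((name, [ligate]) for name in phase_rare)

    # Step 4: Concat All waits for every Phase Rare job.
    jobs.append((f"{chrom}:concat_all", phase_rare))
    return jobs


def parallel_waves(jobs):
    """Group jobs into waves; jobs in the same wave can run concurrently."""
    remaining = dict(jobs)
    done, waves = set(), []
    while remaining:
        ready = [n for n, deps in remaining.items() if all(d in done for d in deps)]
        if not ready:
            raise ValueError("cyclic or unsatisfiable dependencies")
        waves.append(ready)
        done.update(ready)
        for name in ready:
            del remaining[name]
    return waves


# Hypothetical regions; real region boundaries depend on the dataset.
jobs = plan_chromosome(
    "chr20",
    common_regions=["1-30000000", "25000000-64444167"],
    rare_regions=["1-20000000", "20000001-40000000", "40000001-64444167"],
)
waves = parallel_waves(jobs)
for i, wave in enumerate(waves, start=1):
    print(f"wave {i}: {len(wave)} job(s)")
```

Under these assumptions, only the Common Ligate and Concat All steps act as synchronization points. The region-level phasing jobs within a wave can be spread across nodes, and different chromosomes can be planned independently.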
The tool supports autosomes and mixed-ploidy chromosomes, for diploid species only. It does not use FPGA acceleration, so it can run on a generic, software-only compute node.
The Population Haplotyping tool only supports msVCF input files produced by the DRAGEN gVCF Genotyper tool.