Population Mode
DRAGEN provides a population-based analysis option to jointly analyze samples from unrelated individuals. To initiate population mode, use the following genotypers.
• gVCF Genotyper: Uses a set of single or multisample gVCFs as input and outputs a multisample VCF that contains one entry for any variant seen in any of the input gVCFs. The variants are genotyped across all input samples using information from the hom-ref blocks as necessary. The gVCF genotyper does not adjust genotypes based on population information. See gVCF Genotyper Options for information on the available command line options.
• Joint Genotyper: Uses information from the whole cohort to improve the accuracy of individual genotypes. You can input a multisample VCF, a multisample gVCF, or a set of single-sample gVCFs. To receive output as a multisample gVCF, set --enable-multi-sample-gvcf to true. See Joint Genotyper Options for information on the available command line options.
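The gVCF Genotyper behavior described above (one output entry per variant seen in any input, with hom-ref blocks supplying genotypes for samples that do not carry the variant) can be illustrated with a small Python sketch. This is a toy model only, not the DRAGEN implementation: the data layout, the function name genotype_across_samples, and the rule of emitting ./. when no hom-ref block covers the site are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Toy per-sample gVCF content (hypothetical layout, for illustration only):
# variant calls keyed by (chrom, pos, ref, alt), and hom-ref blocks given as
# (chrom, start, end) with inclusive 1-based coordinates.
Variant = Tuple[str, int, str, str]
Block = Tuple[str, int, int]


def genotype_across_samples(
    variants: Dict[str, Dict[Variant, str]],
    hom_ref_blocks: Dict[str, List[Block]],
) -> Dict[Variant, Dict[str, str]]:
    """Return one row per variant seen in any sample, with a genotype per sample."""
    samples = sorted(variants)
    all_sites = sorted({site for calls in variants.values() for site in calls})
    merged: Dict[Variant, Dict[str, str]] = {}
    for site in all_sites:
        chrom, pos, _ref, _alt = site
        row: Dict[str, str] = {}
        for sample in samples:
            if site in variants[sample]:
                # The sample carries the variant: reuse its own genotype.
                row[sample] = variants[sample][site]
            elif any(
                c == chrom and start <= pos <= end
                for c, start, end in hom_ref_blocks.get(sample, [])
            ):
                # A hom-ref block covers the site: report homozygous reference.
                row[sample] = "0/0"
            else:
                # No information at this site (assumed to be reported as missing).
                row[sample] = "./."
        merged[site] = row
    return merged


variants = {
    "S1": {("chr1", 100, "A", "G"): "0/1"},
    "S2": {("chr1", 250, "C", "T"): "1/1"},
    "S3": {},
}
hom_ref_blocks = {
    "S1": [("chr1", 101, 300)],
    "S2": [("chr1", 1, 249)],
    "S3": [("chr1", 1, 150)],
}

merged = genotype_across_samples(variants, hom_ref_blocks)
for site, row in merged.items():
    print(site, row)
```

In this sketch, no genotype is adjusted using information from other samples, which mirrors the statement that the gVCF Genotyper does not use population information.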
The following figure displays the different pathways and data flows between the gVCF Genotyper and Joint Genotyper.
The gVCF pathway is suitable only for small data sets, such as pedigrees or cohorts with 3–15 samples. The VCF pathways can scale to larger data sets. If using a VCF pathway, you can analyze 1000 samples on a single server in about 24 hours.
To receive a list of the variants present in the cohort and the genotype of each variant in each cohort member, run the gVCF Genotyper. Optionally, you can then run the Joint Genotyper to build a second multisample VCF in which the sample genotypes are refined based on the population information. If you use only the gVCF Genotyper output, you can reduce noise by filtering out infrequent variants that have low depth or low genotype quality. Use an open-source utility such as bcftools on the output file to filter the variants.
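The filtering idea above can be sketched in Python as follows. This is a minimal illustration of the logic only and does not replace a utility such as bcftools. The thresholds MIN_DP, MIN_GQ, and MIN_CARRIERS, the function name filter_cohort_vcf, and the exact rule (drop a variant when it has fewer than MIN_CARRIERS carriers and none of its carrier calls passes both thresholds) are assumptions; choose values that suit your data.

```python
from typing import Iterable, Iterator, Optional

# Hypothetical thresholds, chosen for illustration only.
MIN_DP = 10        # minimum per-sample depth for a well-supported call
MIN_GQ = 20        # minimum per-sample genotype quality
MIN_CARRIERS = 2   # variants with fewer carriers are treated as infrequent


def _as_int(value: Optional[str]) -> int:
    """Convert a FORMAT value to int, treating missing values ('.') as 0."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return 0


def filter_cohort_vcf(
    lines: Iterable[str],
    min_dp: int = MIN_DP,
    min_gq: int = MIN_GQ,
    min_carriers: int = MIN_CARRIERS,
) -> Iterator[str]:
    """Yield header lines and the variant records that survive the filter."""
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#"):
            yield line
            continue
        fields = line.split("\t")
        keys = fields[8].split(":")
        carriers = 0
        good_carriers = 0
        for sample_field in fields[9:]:
            values = dict(zip(keys, sample_field.split(":")))
            alleles = values.get("GT", "./.").replace("|", "/").split("/")
            if all(a in ("0", ".") for a in alleles):
                continue  # hom-ref or missing: not a carrier
            carriers += 1
            if _as_int(values.get("DP")) >= min_dp and _as_int(values.get("GQ")) >= min_gq:
                good_carriers += 1
        # Drop only variants that are both infrequent and poorly supported.
        if carriers < min_carriers and good_carriers == 0:
            continue
        yield line


header = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO", "FORMAT", "S1", "S2", "S3"]
records = [
    ["chr1", "100", ".", "A", "G", ".", "PASS", ".", "GT:DP:GQ", "0/1:4:8", "0/0:30:60", "0/0:28:55"],
    ["chr1", "250", ".", "C", "T", ".", "PASS", ".", "GT:DP:GQ", "0/1:25:50", "0/0:31:62", "./.:.:."],
    ["chr1", "400", ".", "G", "A", ".", "PASS", ".", "GT:DP:GQ", "0/1:5:9", "1/1:6:12", "0/0:30:60"],
]
vcf_lines = ["##fileformat=VCFv4.2", "\t".join(header)] + ["\t".join(r) for r in records]

kept = list(filter_cohort_vcf(vcf_lines))
kept_positions = [l.split("\t")[1] for l in kept if not l.startswith("#")]
print(kept_positions)
```

In the toy input, the variant at position 100 has a single carrier with low depth and low genotype quality, so it is removed. The variants at positions 250 and 400 are kept because the first has a well-supported carrier and the second is seen in two samples.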
To compare multiple pedigrees, you can run gVCF Genotyper on the output of Joint Genotyper and merge multiple joint-called pedigrees into a single multisample VCF.
To configure the Joint Genotyper to write a multisample gVCF, set the --enable-multi-sample-gvcf option to true.