DRAGEN Graph Mapper

The graph mapper in DRAGEN improves variant calling accuracy in segmental duplications and other regions that are difficult to map with Illumina reads. The graph-based method uses alt-aware mapping for population haplotypes that are stitched into the reference with known alignments. The method establishes alternate graph paths that reads can seed-map and align to. The graph mapper reduces mapping ambiguity because reads that contain population variants are attracted to the specific regions where the variants are observed.

To turn the FASTA reference into a graph reference, DRAGEN augments it with around 900,000 short alternate contigs derived from population haplotypes of phased variants. The mapper is alt-aware: when a read matches a population haplotype, the mapper projects the read onto the corresponding position in the primary assembly using a precise lift-over alignment.
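The lift-over step can be illustrated with a minimal sketch. It assumes that each alternate contig stores its known alignment to the primary assembly as a start coordinate plus a CIGAR string. That representation, and the function name liftover_position, are hypothetical and chosen for illustration. They are not taken from DRAGEN, whose internal format is not described here.

```python
import re

def liftover_position(alt_offset, primary_start, cigar):
    """Map a 0-based offset in an alt contig to a 0-based primary coordinate.

    Assumption: `cigar` describes the alt contig aligned to the primary
    assembly using M/=/X (aligned bases), I (bases only in the alt contig)
    and D (bases only in the primary assembly).
    """
    alt_pos = 0                 # bases of the alt contig consumed so far
    ref_pos = primary_start     # current coordinate on the primary assembly
    for length, op in re.findall(r"(\d+)([MIDX=])", cigar):
        length = int(length)
        if op in "M=X":
            if alt_offset < alt_pos + length:
                return ref_pos + (alt_offset - alt_pos)
            alt_pos += length
            ref_pos += length
        elif op == "I":
            # Inserted bases have no primary counterpart; report the
            # primary coordinate at which the insertion occurs.
            if alt_offset < alt_pos + length:
                return ref_pos
            alt_pos += length
        elif op == "D":
            ref_pos += length
    raise ValueError("offset lies beyond the alt contig alignment")

# Alt contig anchored at primary position 2, with a 1-base deletion after 5 bases.
print(liftover_position(0, 2, "5M1D2M"))  # 2
print(liftover_position(5, 2, "5M1D2M"))  # 8
```

A read that aligns to an alt contig can be projected the same way: lift its start offset to the primary assembly, then compose the read's own alignment with the contig's alignment.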

When given a set of population variants (VCF) or haplotypes, DRAGEN modifies the FASTA reference in one of the following two ways:

Alternate contigs: This type represents population haplotypes. An alt contig can contain a single variant or a combination of nearby phased variants.
Ambiguous codes (IUPAC codes): This type represents SNPs. To improve alignment, isolated population SNPs are written into the reference FASTA as IUPAC ambiguity codes.
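The two modification types can be sketched on a toy sequence. This is a simplified illustration and not DRAGEN's reference builder. The function names, the flank parameter, and the tuple-based variant format are assumptions made for the example. The IUPAC table itself is the standard two-base ambiguity nomenclature.

```python
# Standard IUPAC ambiguity codes for two-base combinations.
IUPAC = {
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}

def apply_isolated_snps(reference, snps):
    """Replace each isolated SNP site with the IUPAC code for {ref, alt}.

    Assumption: `snps` is a list of (0-based position, alt base), bases are
    uppercase A/C/G/T, and each alt base differs from the reference base.
    """
    seq = list(reference)
    for pos, alt in snps:
        seq[pos] = IUPAC[frozenset((seq[pos], alt))]
    return "".join(seq)

def build_alt_contig(reference, phased_variants, flank):
    """Build one alt contig from nearby variants phased on one haplotype.

    Assumption: `phased_variants` is a sorted, non-overlapping list of
    (0-based position, ref allele, alt allele). Returns the primary
    interval [start, end) that the contig replaces and the contig sequence.
    """
    start = max(0, phased_variants[0][0] - flank)
    last_pos, last_ref, _ = phased_variants[-1]
    end = min(len(reference), last_pos + len(last_ref) + flank)
    pieces, cursor = [], start
    for pos, ref_allele, alt_allele in phased_variants:
        # Guard against variants that do not match the reference.
        assert reference[pos:pos + len(ref_allele)] == ref_allele
        pieces.append(reference[cursor:pos])   # unchanged bases before the variant
        pieces.append(alt_allele)              # haplotype allele
        cursor = pos + len(ref_allele)
    pieces.append(reference[cursor:end])       # trailing flank
    return start, end, "".join(pieces)

reference = "ACGTACGTACGT"
print(apply_isolated_snps(reference, [(2, "A")]))
# ACRTACGTACGT
print(build_alt_contig(reference, [(4, "A", "T"), (6, "GT", "G")], flank=2))
# (2, 10, 'GTTCGAC')
```

In this sketch, the isolated SNP stays in the primary sequence as a single ambiguity code, while the two phased variants become a separate short contig whose primary interval is recorded so that alignments to it can be lifted over.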