ALT-Aware Mapping

The GRCh38 human reference contains many more alternate haplotypes, or ALT contigs, than previous versions of the reference. Generally, including ALT contigs for complex regions in the mapping reference improves mapping and variant calling specificity. Without them, reads that match an ALT contig but score poorly against the corresponding location in the primary assembly can be misaligned. However, when many reads align equally well to an ALT contig and to the corresponding position in the primary assembly, mapping without special treatment degrades variant calling sensitivity. DRAGEN ALT-aware mapping eliminates the sensitivity degradation issue while maintaining specificity improvements from ALT contigs.

ALT-aware mapping requires hash tables that are built with ALT liftover alignments specified. For more information, see ALT-Aware Hash Tables. If a hash table built with liftover alignments is provided, DRAGEN automatically runs with ALT-aware mapping. To disable ALT-aware mapping with a liftover reference, set the --alt-aware option to false.

If ALT contigs are detected in an hg19 or GRCh38 reference, DRAGEN requires ALT-aware hash tables. To disable this requirement, set the --ht-alt-aware-validate option to false.
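The interaction between the hash table contents and the two options above can be summarized as a small decision function. The following Python sketch is illustrative only and is not DRAGEN code: HashTableInfo and resolve_alt_aware are hypothetical names, the hg19/GRCh38 check is reduced to a single has_alt_contigs field, and the behavior when ALT-aware mapping is requested without a liftover hash table is an assumption, because it is not described here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HashTableInfo:
    """Hypothetical summary of a reference hash table (not a DRAGEN structure)."""
    has_alt_contigs: bool  # ALT contigs were detected in the reference
    has_liftover: bool     # built with ALT liftover alignments specified


def resolve_alt_aware(ht: HashTableInfo,
                      alt_aware: Optional[bool] = None,
                      ht_alt_aware_validate: bool = True) -> bool:
    """Return True if ALT-aware mapping would be active under the rules above."""
    # ALT contigs without liftover are rejected unless validation is disabled.
    if ht.has_alt_contigs and not ht.has_liftover and ht_alt_aware_validate:
        raise ValueError(
            "ALT contigs detected but the hash table is not ALT-aware; "
            "rebuild it with liftover alignments or set "
            "--ht-alt-aware-validate to false"
        )
    # Assumption: without liftover alignments, ALT-aware mapping cannot run.
    if not ht.has_liftover:
        return False
    # With liftover, ALT-aware mapping is on unless --alt-aware false is given.
    return alt_aware is not False
```

With a liftover hash table, the function returns True by default and False only when alt_aware is explicitly False, which mirrors the automatic enablement described above.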

When ALT-aware mapping is enabled, the DRAGEN mapper and aligner are aware of the liftover relationship between ALT contig positions and corresponding primary assembly positions. DRAGEN uses seed matches within ALT contigs to obtain corresponding primary assembly alignments, even if the latter score poorly. Liftover groups are formed, each containing a primary assembly alignment candidate and zero or more ALT alignment candidates that lift to the same location. Each liftover group is scored according to its best-matching alignments, taking properly paired alignments into account. The highest-scoring liftover group provides its primary assembly representative as the primary output alignment, with MAPQ calculated based on the score difference to the second-best liftover group. Emitting primary alignments within the primary assembly maintains normal aligned coverage and facilitates variant calling there. If the --Aligner.en-alt-hap-aln option is set to 1 and --Aligner.supp-aligns is greater than 0, then corresponding alternate haplotype alignments can also be output and flagged as supplementary alignments.
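The liftover grouping described above can be illustrated with a simplified Python sketch. It is not the DRAGEN implementation: the Candidate type, the pick_primary function, the contig name chr6_example_alt, and all positions and scores are hypothetical. The sketch assumes a non-empty candidate list, assumes each group contains a primary assembly candidate, ignores paired-end scoring, and returns the score gap between the two best groups rather than a MAPQ value, because the MAPQ formula is not given here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    contig: str    # contig the read aligned to
    pos: int       # position on that contig
    score: int     # alignment score
    is_alt: bool   # True if the contig is an ALT contig
    lifted: tuple  # (primary contig, primary position) this candidate lifts to


def pick_primary(candidates):
    """Group candidates by lifted location; return (representative, score_gap)."""
    groups = defaultdict(list)
    for c in candidates:
        groups[c.lifted].append(c)

    # Score each group by its best-matching member, ALT or primary.
    ranked = sorted(groups.values(),
                    key=lambda g: max(c.score for c in g),
                    reverse=True)
    best = ranked[0]
    best_score = max(c.score for c in best)
    second_score = max(c.score for c in ranked[1]) if len(ranked) > 1 else 0

    # The output alignment is the group's primary assembly member,
    # even when an ALT member of the same group scored higher.
    representative = next(c for c in best if not c.is_alt)
    return representative, best_score - second_score


# Made-up example: the read matches the ALT contig far better than the
# primary assembly, and also has a weaker hit elsewhere in the genome.
cands = [
    Candidate("chr6", 1000, 90, False, ("chr6", 1000)),
    Candidate("chr6_example_alt", 200, 150, True, ("chr6", 1000)),
    Candidate("chr2", 5000, 100, False, ("chr2", 5000)),
]
rep, gap = pick_primary(cands)
print(rep.contig, rep.pos, gap)  # chr6 1000 50
```

In this example the chr6 group wins on the strength of its ALT member (score 150), but the emitted alignment is the chr6 primary assembly candidate, even though on its own score of 90 it would have lost to the chr2 candidate.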

The following is a comparison of alternative approaches for dealing with alternate haplotypes.

Mapping without ALT contigs in the reference:
  - Reads matching ALT contigs can misalign and result in a false-positive variant call.
  - Poor mapping and variant calling sensitivity where reads matching an ALT contig differ greatly from the primary assembly.
Mapping with ALT contigs but no ALT awareness:
  - Reads matching ALT contigs do not misalign and related false-positive variant calls are prevented.
  - Low or zero aligned coverage in primary assembly regions covered by alternate haplotypes, because some reads are mapping to ALT contigs.
  - Low or zero MAPQ in regions covered by alternate haplotypes, where they are similar or identical to the primary assembly.
  - Variant calling sensitivity is reduced throughout regions covered by alternate haplotypes.
Mapping with ALT contigs and ALT awareness:
  - Reads matching ALT contigs do not misalign and related false-positive variant calls are prevented.
  - Normal aligned coverage in regions covered by alternate haplotypes, because primary alignments are mapping to the primary assembly.
  - Normal MAPQs are assigned because alignment candidates within a liftover group are not considered in competition.
  - Good mapping and variant calling sensitivity where reads matching an ALT contig differ greatly from the primary assembly.
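
The MAPQ difference between the last two approaches can be seen in a toy Python example. This is a sketch under stated assumptions, not DRAGEN behavior: the function names and all scores are made up, and the score gap between the best and second-best competitor stands in for MAPQ, because the actual MAPQ formula is not described here.

```python
def score_gap_naive(candidates):
    """Gap between the two best candidates when every candidate competes."""
    scores = sorted((score for _, score in candidates), reverse=True)
    return scores[0] - (scores[1] if len(scores) > 1 else 0)


def score_gap_grouped(candidates):
    """Gap between the two best liftover groups; group members do not compete."""
    best_per_group = {}
    for lifted, score in candidates:
        best_per_group[lifted] = max(score, best_per_group.get(lifted, score))
    scores = sorted(best_per_group.values(), reverse=True)
    return scores[0] - (scores[1] if len(scores) > 1 else 0)


# (lifted primary location, alignment score); values are illustrative only.
candidates = [
    (("chr6", 1000), 150),  # primary assembly candidate
    (("chr6", 1000), 150),  # ALT candidate, identical sequence, same lifted location
    (("chr2", 5000), 100),  # unrelated, lower-scoring candidate
]
print(score_gap_naive(candidates))    # 0
print(score_gap_grouped(candidates))  # 50
```

Without ALT awareness, the identical ALT and primary candidates tie, so the gap collapses to zero. With liftover grouping, the two candidates fall into one group and the gap is measured against the unrelated chr2 candidate instead.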