DRAGEN HLA Caller
DRAGEN provides an HLA aligner and HLA-specific analysis components for class I HLA allele typing. To enable HLA typing, set --enable-hla to true.
The HLA Caller accepts standard input files, as well as an HLA region BED file, an HLA allele reference file, and an HLA allele frequency file. The additional input files help align HLA reads and break ties when typing HLA alleles.

You can use the HLA region BED input file to specify the regions to extract HLA reads from. To specify an HLA region BED file, use the command line option --hla-bed-file. DRAGEN parses the regions listed in the BED file, extracts the reads from those regions, and aligns the extracted reads against the HLA allele reference.
The following is an example of a valid BED file.
chr6 29942554 29942627 hla_a 1 +
chr6 29942757 29943027 hla_a 2 +
chr6 29943268 29943544 hla_a 3 +
chr6 29944122 29944398 hla_a 4 +
chr6 29944500 29944617 hla_a 5 +
chr6 29945059 29945092 hla_a 6 +
chr6 29945234 29945282 hla_a 7 +
chr6 31357086 31357159 hla_b 1 -
chr6 31356688 31356958 hla_b 2 -
chr6 31356167 31356443 hla_b 3 -
chr6 31355317 31355593 hla_b 4 -
chr6 31355107 31355224 hla_b 5 -
chr6 31354633 31354666 hla_b 6 -
chr6 31354483 31354527 hla_b 7 -
chr6 31271999 31272072 hla_c 1 -
chr6 31271599 31271869 hla_c 2 -
chr6 31271073 31271349 hla_c 3 -
chr6 31270210 31270486 hla_c 4 -
chr6 31269966 31270086 hla_c 5 -
chr6 31269493 31269526 hla_c 6 -
chr6 31269338 31269386 hla_c 7 -
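
As a rough illustration of the layout above, the following Python sketch reads such a BED file into typed records. It assumes the six whitespace-separated columns are chromosome, start, end, gene name, exon number, and strand; the meaning of the fifth column is inferred from the example and is not stated by DRAGEN. The names HlaRegion, parse_hla_bed, and overlaps are hypothetical helpers, not part of DRAGEN.

```python
from typing import List, NamedTuple


class HlaRegion(NamedTuple):
    chrom: str
    start: int   # BED start coordinate (0-based)
    end: int     # BED end coordinate (exclusive)
    gene: str
    exon: int    # assumption: fifth column is the exon number
    strand: str


def parse_hla_bed(text: str) -> List[HlaRegion]:
    """Parse BED text into HlaRegion records, skipping blank and comment lines."""
    regions = []
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue
        chrom, start, end, gene, exon, strand = fields[:6]
        regions.append(HlaRegion(chrom, int(start), int(end), gene, int(exon), strand))
    return regions


def overlaps(region: HlaRegion, chrom: str, start: int, end: int) -> bool:
    """Return True if the half-open interval [start, end) overlaps the region."""
    return region.chrom == chrom and start < region.end and end > region.start


SAMPLE_BED = """\
chr6 29942554 29942627 hla_a 1 +
chr6 31357086 31357159 hla_b 1 -
"""

regions = parse_hla_bed(SAMPLE_BED)
```

A check such as overlaps(regions[0], "chr6", 29942600, 29942750) mirrors the idea of selecting reads by region; how DRAGEN itself selects reads is not described here.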

You can use the HLA allele reference file to specify the reference allele exons that the extracted HLA reads are aligned against. To specify an HLA allele reference file, use the command line option --hla-reference-file. The input HLA allele reference file must be in FASTA format and contain the protein sequence of each reference HLA allele, separated into exons.
The following is an example of a valid HLA allele reference file.
>A*01:01-E1
MAVMAPRTLLLLLSGALALTQTWAG
>A*01:01-E2
SHSMRYFFTSVSRPGRGEPRFIAVGYVDDTQFVRFDSDAASQKMEPRAPWIEQEGPEYWDQETRNMKAHSQTDRANLGTLRGYYNQSEDG
>A*01:01-E3
SHTIQIMYGCDVGPDGRFLRGYRQDAYDGKDYIALNEDLRSWTAADMAAQITKRKWEAVHAAEQRRVYLEGRCVDGLRRYLENGKETLQRTD
>A*01:01-E4
PPKTHMTHHPISDHEATLRCWALGFYPAEITLTWQRDGEDQTQDTELVETRPAGDGTFQKWAAVVVPSGEEQRYTCHVQHEGLPKPLTLRWE
>A*01:01-E5
LSSQPTIPIVGIIAGLVLLGAVITGAVVAAVMWRRKSSD
>A*01:01-E6
RKGGSYTQAAS
>A*01:01-E7
SDSAQGSDVSLTACKV
>A*01:03-E1
MAVMAPRTLLLLLSGALALTQTWAG
>A*01:03-E2
SHSMRYFFTSVSRPGRGEPRFIAVGYVDDTQFVRFDSDAASQKMEPRAPWIEQEGPEYWDQETRNMKAHSQTDRANLGTLRGYYNQSEDG
>A*01:03-E3
SHTIQMMYGCDVGPDGRFLRGYRQDAYDGKDYIALNEDLRSWTAADMAAQITKRKWEAVHAAEQRRVYLEGRCVDGLRRYLENGKETLQRTD
>A*01:03-E4
PPKTHMTHHPISDHEATLRCWALGFYPAEITLTWQRDGEDQTQDTELVETRPAGDGTFQKWAAVVVPSGEEQRYTCHVQHEGLPKPLTLRWE
>A*01:03-E5
LSSQPTIPIVGIIAGLVLLGAVITGAVVAAVMWRRKSSD
>A*01:03-E6
RKGGSYTQAAS
>A*01:03-E7
...
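
The header convention in the example above (allele name, then -E and an exon number) can be read with a short Python sketch like the one below. It assumes every header follows the >ALLELE-En pattern shown; parse_hla_reference is a hypothetical helper name, and a header without the -E suffix raises ValueError in this sketch.

```python
from collections import defaultdict
from typing import Dict


def parse_hla_reference(text: str) -> Dict[str, Dict[int, str]]:
    """Map each allele name to {exon number: sequence} from per-exon FASTA text."""
    exons: Dict[str, Dict[int, str]] = defaultdict(dict)
    allele, exon = None, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Split ">A*01:01-E6" into allele "A*01:01" and exon label "6".
            allele, _, exon_label = line[1:].rpartition("-E")
            exon = int(exon_label)
            exons[allele][exon] = ""
        elif allele is not None:
            # Sequence lines may wrap, so append to the current record.
            exons[allele][exon] += line
    return dict(exons)


SAMPLE_FASTA = """\
>A*01:01-E6
RKGGSYTQAAS
>A*01:01-E7
SDSAQGSDVSLTACKV
>A*01:03-E6
RKGGSYTQAAS
"""

reference = parse_hla_reference(SAMPLE_FASTA)
```

Keying the result by allele and exon number makes it easy to compare the same exon across alleles, for example reference["A*01:01"][6] against reference["A*01:03"][6].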

You can use the HLA allele frequency file to break ties if two or more HLA alleles produce the same or similar results. To specify an HLA allele frequency file, use the command line option --hla-allele-frequency-file. The input HLA allele frequency file must be in CSV format and contain each HLA allele and its occurrence frequency in the population.
The following is an example of a valid HLA allele frequency file:
A*01:01,305
A*01:02,140
A*01:03,100
A*01:04N,13
A*01:06,58
A*01:07,17
A*01:08,14
A*01:09,25
...
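
The sketch below shows one way to load this two-column CSV in Python and pick the most frequent allele from a set of tied candidates. The rule used here (highest frequency wins, alleles missing from the file count as zero) is an assumption for illustration; the exact rule DRAGEN applies is not described in this section. The function names are hypothetical.

```python
import csv
import io
from typing import Dict, Iterable


def parse_allele_frequencies(text: str) -> Dict[str, float]:
    """Read 'allele,frequency' rows; rows without two columns are skipped."""
    freqs = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 2:
            continue
        freqs[row[0].strip()] = float(row[1])
    return freqs


def most_frequent(candidates: Iterable[str], freqs: Dict[str, float]) -> str:
    """Assumed tie-break rule: return the candidate with the highest frequency."""
    return max(candidates, key=lambda allele: freqs.get(allele, 0.0))


SAMPLE_CSV = """\
A*01:01,305
A*01:02,140
A*01:03,100
A*01:04N,13
"""

frequencies = parse_allele_frequencies(SAMPLE_CSV)
winner = most_frequent(["A*01:02", "A*01:03"], frequencies)
```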

You can use the following options to configure the HLA Caller.
• --hla-tiebreaker-threshold: If several alleles have a similar number of aligned reads and no allele is clearly best, the alleles are treated as tied. The HLA Caller places tied alleles into a candidate set and breaks the tie using the population allele frequency. An allele joins the candidate set if its aligned read count exceeds this fraction of the read count of the top hit.
• --hla-zygosity-threshold: The ILP process that determines the best HLA alleles prefers heterozygosity. The HLA Caller therefore checks zygosity afterwards by comparing the number of reads aligned to each allele in a locus. If the read count of the lesser allele is below this fraction of the read count of the greater allele, the HLA Caller assumes homozygosity.
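
The two threshold comparisons described above can be sketched in Python as follows. This is only a reading of the option descriptions, not DRAGEN's implementation: the function names are hypothetical, the read counts are made up, and the threshold values 0.95 and 0.1 are illustrative rather than DRAGEN defaults.

```python
from typing import Dict, List


def tie_candidates(read_counts: Dict[str, int], tiebreaker_threshold: float) -> List[str]:
    """Alleles whose read count exceeds the given fraction of the top hit's count.

    Assumes a threshold strictly below 1, so the top hit is always included.
    """
    top = max(read_counts.values())
    return [allele for allele, n in read_counts.items() if n > tiebreaker_threshold * top]


def is_homozygous(reads_first: int, reads_second: int, zygosity_threshold: float) -> bool:
    """True if the lesser allele has fewer reads than the given fraction of the greater."""
    lesser, greater = sorted((reads_first, reads_second))
    return lesser < zygosity_threshold * greater


# Made-up read counts for three alleles at one locus.
counts = {"A*01:01": 100, "A*01:03": 98, "A*02:01": 40}
tied = tie_candidates(counts, tiebreaker_threshold=0.95)
homozygous = is_homozygous(100, 5, zygosity_threshold=0.1)
```

With these made-up numbers, the two alleles with 100 and 98 reads fall into the candidate set, and a 100-to-5 read split is treated as homozygous.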

The DRAGEN HLA Caller reports typing results for six class I alleles (two each for HLA-A, HLA-B, and HLA-C) in TSV format. The output file contains a header row with one column for each of the six alleles and a body row with the HLA type called for each allele.
The following is an example output file.
A1 A2 B1 B2 C1 C2
A*26:01 A*29:02 B*44:02 B*44:03 C*05:01 C*16:01
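
For downstream use, the two-row result file can be read with a short Python sketch such as the one below. It assumes tab-separated columns with the header names shown above; parse_hla_result and group_by_locus are hypothetical helper names.

```python
import csv
import io
from typing import Dict, List


def parse_hla_result(text: str) -> Dict[str, str]:
    """Map each header column (A1, A2, ...) to the allele in the body row."""
    rows = [row for row in csv.reader(io.StringIO(text), delimiter="\t") if row]
    header, body = rows[0], rows[1]
    return dict(zip(header, body))


def group_by_locus(calls: Dict[str, str]) -> Dict[str, List[str]]:
    """Group calls by locus, using the first letter of the column name."""
    loci: Dict[str, List[str]] = {}
    for column, allele in calls.items():
        loci.setdefault(column[0], []).append(allele)
    return loci


SAMPLE_TSV = (
    "A1\tA2\tB1\tB2\tC1\tC2\n"
    "A*26:01\tA*29:02\tB*44:02\tB*44:03\tC*05:01\tC*16:01\n"
)

calls = parse_hla_result(SAMPLE_TSV)
by_locus = group_by_locus(calls)
```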