Maximum Hit Frequency
The --ht-max-seed-freq option sets a limit on the number of seed hits (reference genome locations) that can be populated for any primary or extended seed. If a given primary seed maps to more reference positions than the specified limit, the primary seed must be extended far enough that the extended seeds subdivide into smaller groups of identical seeds, each within the limit. If a group of identical reference seeds is still larger than this limit at the maximum extended seed length (--ht-max-ext-seed-len), the reference positions of that group are not populated into the hash table. Instead, DRAGEN populates a single high-frequency record.
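The extension rule above can be illustrated with a toy model. The following is a minimal sketch and not DRAGEN's implementation: the function name, the `HIGH_FREQ` marker, the `step` parameter, and the right-only extension are all assumptions made for illustration, and the real hash table stores different record types.

```python
from collections import defaultdict

HIGH_FREQ = "HIGH_FREQUENCY"  # hypothetical stand-in for a high-frequency record


def build_toy_table(reference, primary_len, max_ext_len, max_seed_freq, step=2):
    """Toy model: map each seed string to its positions, or to HIGH_FREQ."""
    table = {}

    def place(positions, length):
        # Keep only positions where a seed of this length fits in the reference.
        positions = [p for p in positions if p + length <= len(reference)]
        groups = defaultdict(list)
        for pos in positions:
            groups[reference[pos:pos + length]].append(pos)
        for seed, group in groups.items():
            if len(group) <= max_seed_freq:
                table[seed] = group              # within the limit: populate hits
            elif length + step <= max_ext_len:
                place(group, length + step)      # over the limit: extend the seed
            else:
                table[seed] = HIGH_FREQ          # still over at maximum length

    place(range(len(reference)), primary_len)
    return table


# "AC" and "CG" each occur twice, so with a limit of 1 they are extended.
print(build_toy_table("ACGTACGA", primary_len=2, max_ext_len=4, max_seed_freq=1))
```

In this sketch, raising `max_seed_freq` to 2 keeps "AC" as a primary seed with both positions, which mirrors the point that a higher limit requires fewer extensions.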
The maximum hit frequency can be set to any value from 1 to 256. If the value is too low, hash table construction can fail because too many seed extensions are needed. The recommended minimum for a whole human genome reference is 8.
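A pre-flight check along these lines could catch out-of-range values before a hash table build is started. This is a sketch only: the helper name and the warning text are hypothetical, and the constants simply restate the range and recommendation given above.

```python
MIN_FREQ, MAX_FREQ = 1, 256      # configurable range stated above
RECOMMENDED_HUMAN_MIN = 8        # recommended minimum for a whole human genome


def check_max_seed_freq(value, whole_human_genome=True):
    """Return a list of warnings; raise ValueError if the value is out of range."""
    if not MIN_FREQ <= value <= MAX_FREQ:
        raise ValueError(
            f"--ht-max-seed-freq must be between {MIN_FREQ} and {MAX_FREQ}, got {value}"
        )
    warnings = []
    if whole_human_genome and value < RECOMMENDED_HUMAN_MIN:
        warnings.append(
            f"--ht-max-seed-freq {value} is below the recommended minimum of "
            f"{RECOMMENDED_HUMAN_MIN}; hash table construction may fail"
        )
    return warnings


print(check_max_seed_freq(16))   # no warnings expected for an in-range value
print(check_max_seed_freq(4))    # in range, but below the recommendation
```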

A higher maximum hit frequency can lead to more successful mapping for the following reasons:
• A higher limit excludes fewer reference positions from the hash table, because fewer seed groups exceed it.
• A higher limit allows seed extensions to be shorter, which improves the odds of exact seed matching without overlapping variants or sequencing errors.
However, as with very short seeds, allowing high hit counts can sometimes lower mapping accuracy. Most of the seed hits in a large group do not correspond to the true mapping location, and occasionally one of these noise hits can be reported because of imperfect scoring models. The mapper also limits the total number of reference positions it considers, so allowing very high hit counts can crowd out the actual best match from consideration.
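The crowding-out effect can be shown with a small model. The source does not describe how the mapper chooses which positions to keep, so the first-come ordering and the `max_candidates` cap below are assumptions for illustration only, and the function name is hypothetical.

```python
def candidates_considered(seed_hits, max_candidates):
    """Collect distinct reference positions, seed by seed, until the cap is hit."""
    considered = []
    for hits in seed_hits:
        for pos in hits:
            if len(considered) == max_candidates:
                return considered        # cap reached: later hits are never seen
            if pos not in considered:
                considered.append(pos)
    return considered


TRUE_POS = 42

# High hit limit: a repetitive seed contributes ten noise hits before the
# seed that hits the true location is reached.
high_limit = [list(range(1000, 1010)), [TRUE_POS]]
# Lower hit limit: the repetitive seed was extended and now has a single hit.
low_limit = [[1003], [TRUE_POS]]

print(TRUE_POS in candidates_considered(high_limit, max_candidates=8))  # False
print(TRUE_POS in candidates_considered(low_limit, max_candidates=8))   # True
```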

Higher maximum hit frequencies also slow down read mapping. Seed mapping finds more reference locations, and each additional location requires further work, such as Smith-Waterman alignment, to determine the best result.