Maximum Hit Frequency

The --ht-max-seed-freq option limits the number of seed hits (reference genome locations) that can be populated for any primary or extended seed. If a primary seed maps to more reference positions than this limit, it must be extended until the resulting extended seeds subdivide into smaller groups of identical seeds, each within the limit. If a group of identical reference seeds still exceeds the limit at the maximum extended seed length (--ht-max-ext-seed-len), its reference positions are not populated into the hash table. Instead, DRAGEN populates a single high-frequency record.
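The extend-or-give-up rule described above can be illustrated with a toy model. The following Python sketch is not DRAGEN's implementation: the function name populate, the HIFREQ marker, the step parameter, and the choice to extend seeds only to the right are all assumptions made for illustration. It only shows how a hit limit interacts with a maximum extended seed length.

```python
from collections import defaultdict

# Hypothetical marker standing in for the single high-frequency record.
HIFREQ = "HIFREQ"


def populate(reference, primary_len, max_ext_len, max_seed_freq, step=2):
    """Toy model: map each seed to its hit positions, or to HIFREQ.

    Assumptions (not taken from DRAGEN): seeds are extended to the right
    only, by `step` bases at a time, and only start positions where a
    maximum-length seed fits are considered.
    """
    table = {}

    def place(positions, length):
        # Group candidate positions by the seed sequence of this length.
        groups = defaultdict(list)
        for pos in positions:
            groups[reference[pos:pos + length]].append(pos)

        for seed, hits in groups.items():
            if len(hits) <= max_seed_freq:
                # Within the limit: populate the positions.
                table[seed] = hits
            elif length < max_ext_len:
                # Over the limit: extend the seed and subdivide the group.
                place(hits, min(length + step, max_ext_len))
            else:
                # Still over the limit at maximum length: one record only.
                table[seed] = HIFREQ

    starts = range(len(reference) - max_ext_len + 1)
    place(starts, primary_len)
    return table


if __name__ == "__main__":
    # "AC" occurs three times, so with a limit of 1 it must be extended.
    print(populate("ACACGTACTT", primary_len=2, max_ext_len=4, max_seed_freq=1))
    # A homopolymer run cannot be subdivided, so it becomes HIFREQ.
    print(populate("AAAAAAAAAA", primary_len=2, max_ext_len=4, max_seed_freq=3))
```

In this model, raising max_seed_freq lets more seeds be stored at their primary length, while lowering it forces more extensions and, for highly repetitive sequence, more high-frequency records.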

The maximum hit frequency can be set to any value from 1 to 256. If the value is too low, hash table construction can fail because too many seed extensions are needed. The recommended minimum for a whole human genome reference is 8.
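A wrapper script could check a proposed value against these bounds before starting a hash table build. The helper below is a hypothetical sketch, not part of DRAGEN: the function name and the whole_human_genome flag are assumptions, and the numbers come from the text above.

```python
def check_max_seed_freq(value, whole_human_genome=True):
    """Validate a proposed --ht-max-seed-freq value (hypothetical helper).

    Raises ValueError outside the documented range of 1 to 256.
    Returns a warning string for values below the recommended minimum
    of 8 on a whole human genome reference, otherwise None.
    """
    if not isinstance(value, int) or not 1 <= value <= 256:
        raise ValueError("--ht-max-seed-freq must be an integer from 1 to 256")
    if whole_human_genome and value < 8:
        return ("value is below the recommended minimum of 8; "
                "hash table construction can fail")
    return None


if __name__ == "__main__":
    print(check_max_seed_freq(16))  # no warning
    print(check_max_seed_freq(4))   # warning about the recommended minimum
```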