Downsampling
DRAGEN can use downsampling to reserve a random subset of reads that is kept separate from the normal alignment outputs. You can use downsampling to generate data sets for comparisons between samples or between replicates. DRAGEN samples reads after performing any hardware-accelerated trimming or filtering functions, which enables DRAGEN to rapidly create analysis-ready test data sets.
To enable downsampling, set the --enable-down-sampler command line option to true.
You can use any valid sequencing data format that is compatible with the DRAGEN Host Software. For more information on compatible input options, see Input Options.
DRAGEN downsampling outputs the reserved subset of data in FASTQ format. If the input is paired-end, DRAGEN outputs two FASTQ files that contain the subsampled data. If the input is unpaired, DRAGEN outputs one FASTQ file.

In addition to enabling the downsampling command line option, you must set the quantity of reads to downsample. To set the quantity of reads, use either --down-sampler-reads or --down-sampler-coverage.
If you specify a coverage level, you must also either specify a reference genome using --ref-dir or manually specify the genome size using --down-sampler-genome-size. If you specify both a read limit and a coverage limit, DRAGEN applies both quantity limits and keeps whichever result is smaller.
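To illustrate how a read limit and a coverage limit interact, the following sketch estimates which limit would be the smaller one. It assumes the common approximation coverage = reads × read length / genome size; how DRAGEN converts a coverage target into a read count internally is not described here, so treat the formula and all of the values (including the read length) as hypothetical.

```shell
#!/bin/sh
# Hypothetical inputs, for illustration only.
TARGET_COVERAGE=30        # value you would pass to --down-sampler-coverage
GENOME_SIZE=3100000000    # bases; value you would pass to --down-sampler-genome-size
READ_LENGTH=150           # assumed mean read length in bases
READ_LIMIT=500000000      # value you would pass to --down-sampler-reads

# Assumed approximation: coverage = reads * read_length / genome_size,
# so reads = coverage * genome_size / read_length (integer division).
COVERAGE_READS=$(( TARGET_COVERAGE * GENOME_SIZE / READ_LENGTH ))

# Keep whichever limit yields fewer reads.
if [ "$COVERAGE_READS" -lt "$READ_LIMIT" ]; then
  EXPECTED_READS=$COVERAGE_READS
else
  EXPECTED_READS=$READ_LIMIT
fi

printf 'reads implied by coverage target: %s\n' "$COVERAGE_READS"
printf 'estimated reads kept: %s\n' "$EXPECTED_READS"
```

With these example values, the coverage target implies more reads than the read limit allows, so the read limit would be the effective cap.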
Option | Description |
---|---|
--enable-down-sampler | Set to true to enable downsampling. The default value is false. If enabled, you must set either --down-sampler-reads or --down-sampler-coverage. |
--down-sampler-num-threads | Specify the number of threads to use for downsampling reads. The default value is 8. |
--down-sampler-random-seed | Set the random seed for downsampling reads. The default value is 42. |
--down-sampler-genome-size | Set the target genome size for downsampling coverage. The default value is 0. The --down-sampler-genome-size option is not compatible with the --ref-dir option. |
--down-sampler-reads | Specify the target number of reads for downsampling. The default value is 0. |
--down-sampler-coverage | Set the target genomic coverage for downsampling. The default value is 0. If set, you must also set either --ref-dir or --down-sampler-genome-size. |
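
As a rough sketch of how the downsampling options might be combined on one command line, the script below assembles and prints a paired-end FASTQ run with a coverage target, a read cap, and an explicit seed. All paths, the sample and read group names, and the numeric values are placeholders. The input and output options (-1, -2, --RGID, --RGSM, --output-directory, --output-file-prefix) are standard DRAGEN options that are not covered in this section, and a real run may need further options.

```shell
#!/bin/sh
# Placeholder paths and names: replace with values for your environment.
REF_DIR="/path/to/reference_hash_table"
FASTQ_R1="/path/to/sample_R1.fastq.gz"
FASTQ_R2="/path/to/sample_R2.fastq.gz"
OUT_DIR="/path/to/output"

# Assemble the command as positional parameters so each argument stays intact.
# --ref-dir supplies the genome for the coverage target, so
# --down-sampler-genome-size is not set (the two are not compatible).
set -- dragen \
  --ref-dir "$REF_DIR" \
  -1 "$FASTQ_R1" \
  -2 "$FASTQ_R2" \
  --RGID "rg1" \
  --RGSM "sample1" \
  --output-directory "$OUT_DIR" \
  --output-file-prefix "sample1" \
  --enable-down-sampler true \
  --down-sampler-coverage 30 \
  --down-sampler-reads 500000000 \
  --down-sampler-random-seed 42

# Print the command for review. To execute it, run "$@" instead.
DRAGEN_CMD="$*"
printf '%s\n' "$DRAGEN_CMD"
```

Because both --down-sampler-coverage and --down-sampler-reads are set here, DRAGEN would keep whichever limit produces the smaller result. Setting the seed explicitly documents the value used for the random subset.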