Epigenome Map and Align Examples
Prior to performing an epigenome (methylation) map and align run with bisulfite sequencing data, you must first create methylation-specific reference hash tables, as follows:
mkdir -p /staging/human/reference/hg19_epigenome
dragen --build-hash-table true \
--ht-reference /staging/human/reference/hg19/hg19.fa \
--ht-max-seed-freq 64 --ht-seed-len 27 --ht-methylated true \
--output-directory /staging/human/reference/hg19_epigenome \
--ht-alt-liftover /opt/edico/liftover/hg19_alt_liftover.sam
The preceding dragen command produces two hash table directories under /staging/human/reference/hg19_epigenome: CT_converted and GA_converted. The CT_converted hash table is built from the reference sequences with every C converted to T, and the GA_converted hash table is built from the reference sequences with every G converted to A. Because the base-converted references have lower sequence complexity, the hash table seed length (--ht-seed-len) is typically increased to 27 for mammalian genomes to compensate (the default seed length is 21).
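The two conversions can be illustrated on a short sequence. The following is a minimal sketch only; bisulfite_convert is a hypothetical helper written for this illustration, not a DRAGEN command, and it says nothing about how DRAGEN performs the conversion internally.

```shell
#!/usr/bin/env bash
# Illustration only: apply the C->T or G->A base conversion to a short sequence.
# bisulfite_convert is a hypothetical helper, not part of DRAGEN.
bisulfite_convert() {
  local mode="$1" seq="$2"
  case "$mode" in
    CT) printf '%s\n' "$seq" | tr 'Cc' 'Tt' ;;   # every C becomes T
    GA) printf '%s\n' "$seq" | tr 'Gg' 'Aa' ;;   # every G becomes A
    *)  echo "unknown mode: $mode" >&2; return 1 ;;
  esac
}

bisulfite_convert CT ACGTCCGA   # prints ATGTTTGA
bisulfite_convert GA ACGTCCGA   # prints ACATCCAA
```

Each converted sequence uses only three distinct bases, which is why short seeds become less unique and a longer seed length helps.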
The --ht-alt-aware-validate false option can be used in place of --ht-alt-liftover. However, DRAGEN mapping quality is then significantly degraded, because the hg19.fa reference contains alternate contigs.
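Before starting a map and align run, it can be useful to confirm that both converted hash table directories are present. The following sketch assumes only the two directory names given above; check_epigenome_ref is a hypothetical helper, not part of DRAGEN, and it does not validate the contents of the directories.

```shell
#!/usr/bin/env bash
# check_epigenome_ref is a hypothetical helper, not part of DRAGEN.
# It returns 0 only if both converted hash table directories exist.
check_epigenome_ref() {
  local ref_dir="$1" sub missing=0
  for sub in CT_converted GA_converted; do
    if [ ! -d "$ref_dir/$sub" ]; then
      echo "missing hash table directory: $ref_dir/$sub" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example usage with the reference directory from the build command
if check_epigenome_ref /staging/human/reference/hg19_epigenome; then
  echo "both converted hash tables found"
else
  echo "rebuild the methylation hash tables before aligning"
fi
```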
Epigenome Map/Align, Directional-protocol, Single-Ended FASTQ Input, BAM Output
The directional (Lister) protocol produces reads from two of the four possible bisulfite sequencing strands. Therefore, when the --methylation-protocol=directional option is used, DRAGEN aligns each read or read pair twice with different constraints corresponding to the two possible strands. The following DRAGEN command produces two separate BAM files:
mkdir -p /staging/epigenome/directional
dragen -f -r /staging/human/reference/hg19_epigenome \
-1 /staging/epigenome/reads/sample_10_R1.fastq.gz \
--RGID Illumina_RGID \
--RGSM sample_10 \
--RGPL illumina \
--output-directory /staging/epigenome/directional \
--output-file-prefix sample_10 \
--methylation-protocol=directional \
--enable-sort false
Epigenome Map/Align, Nondirectional-protocol, Paired-Ended FASTQ Input, BAM Output
The nondirectional protocol produces reads from all four possible bisulfite sequencing strands. Therefore, when the --methylation-protocol=non-directional option is used, DRAGEN aligns each read or read pair four times, once for each possible strand. The following DRAGEN command produces four separate BAM files:
mkdir -p /staging/epigenome/non-directional
dragen -f -r /staging/human/reference/hg19_epigenome \
-1 /staging/epigenome/reads/sample_10_R1.fastq.gz \
-2 /staging/epigenome/reads/sample_10_R2.fastq.gz \
--RGID Illumina_RGID \
--RGSM sample_10 \
--RGPL illumina \
--output-directory /staging/epigenome/non-directional \
--output-file-prefix sample_10 \
--methylation-protocol non-directional \
--enable-sort false
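After a run completes, the number of BAM files in the output directory can be compared with the number expected for the protocol (two for directional, four for non-directional). The following is a minimal sketch under stated assumptions: expected_bam_count and check_bam_count are hypothetical helpers, not DRAGEN commands, and the check assumes that the BAM file names begin with the --output-file-prefix value and end in .bam.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of DRAGEN.

# Number of BAM files described for each protocol in this section.
expected_bam_count() {
  case "$1" in
    directional)     echo 2 ;;
    non-directional) echo 4 ;;
    *) echo "unhandled protocol: $1" >&2; return 1 ;;
  esac
}

# Compare the BAM files found in an output directory with the expected count.
check_bam_count() {
  local out_dir="$1" prefix="$2" protocol="$3" expected found
  expected="$(expected_bam_count "$protocol")" || return 1
  # Assumption: BAM file names start with the output prefix and end in .bam.
  found="$(find "$out_dir" -maxdepth 1 -type f -name "${prefix}*.bam" | wc -l)"
  found="$((found))"   # normalize padding that some wc implementations add
  if [ "$found" -ne "$expected" ]; then
    echo "expected $expected BAM files for $protocol, found $found" >&2
    return 1
  fi
}
```

For the non-directional example, the check could be invoked as check_bam_count /staging/epigenome/non-directional sample_10 non-directional.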