Somatic UMI Tumor Normal
This recipe processes sequencing data with unique molecular identifiers (UMIs) for somatic tumor-normal workflows.
For Somatic UMI Tumor Normal inputs, run the tumor and normal samples separately through the Map/Align stage, then start Variant Calling from the resulting tumor and normal UMI-collapsed BAM files.
The following are partial templates that can be used as starting points. Adjust them accordingly for your specific use case.
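The hand-off between the two stages can be sketched as follows. This is a minimal illustration, not part of the recipe: `collapsed_bam_path` is a hypothetical helper, the directory and prefix values are placeholders, and the sketch assumes that each Map/Align run writes its BAM as `<output-directory>/<output-file-prefix>.bam`. Check the actual file names in your output directory before relying on this.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical helper (not a DRAGEN tool): builds the path of the BAM that a
# Map/Align run is ASSUMED to write, namely <output-directory>/<prefix>.bam.
collapsed_bam_path() {
  local out_dir="$1" prefix="$2"
  # Strip one trailing slash so that "/a/" and "/a" give the same result.
  printf '%s/%s.bam\n' "${out_dir%/}" "$prefix"
}

# Placeholder values for the two separate Map/Align runs.
TUMOR_BAM="$(collapsed_bam_path /staging/tumor_out tumor_umi)"
BAM="$(collapsed_bam_path /staging/normal_out normal_umi)"

# These two paths become the BAM inputs of the Variant Calling stage.
echo "--tumor-bam-input $TUMOR_BAM --bam-input $BAM"
```

Deriving both paths from the same values used for `--output-directory` and `--output-file-prefix` keeps the two stages in sync when a prefix changes.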
Map/Align stage
• Configure the INPUT options
• Configure the OUTPUT options
• Configure MAP/ALIGN
• Configure UMI options
#!/bin/bash
set -euo pipefail
# Path to DRAGEN hashtable
DRAGEN_HASH_TABLE=<REF_DIR>
# Path to output directory for the DRAGEN run
OUTPUT=<OUT_DIR>
# File prefix for DRAGEN output files
PREFIX=<OUT_PREFIX>
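# Define the input source: FASTQ list, FASTQ, BAM, or CRAM.
# Run this stage once per sample, using either the tumor or the normal UMI input, to generate a collapsed BAM.
# This example uses the tumor input options. Keep only the block that matches your input type:
# with set -u, a block that references an unset variable aborts the script.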
INPUT_FASTQ_LIST="
--tumor-fastq-list $TUMOR_FASTQ_LIST \
--tumor-fastq-list-sample-id $TUMOR_FASTQ_LIST_SAMPLE_ID \
"
INPUT_FASTQ="
--tumor-fastq1 $TUMOR_FASTQ1 \
--tumor-fastq2 $TUMOR_FASTQ2 \
--RGSM-tumor $RGSM_TUMOR \
--RGID-tumor $RGID_TUMOR \
"
INPUT_BAM="
--tumor-bam-input $TUMOR_BAM \
"
INPUT_CRAM="
--tumor-cram-input $TUMOR_CRAM \
"
# Select the input source. This example uses INPUT_FASTQ_LIST.
INPUT_OPTIONS="
--ref-dir $DRAGEN_HASH_TABLE \
$INPUT_FASTQ_LIST \
"
OUTPUT_OPTIONS="
--output-directory $OUTPUT \
--output-file-prefix $PREFIX \
"
MA_OPTIONS="
--enable-map-align true \
--enable-sort true \
"
UMI_OPTIONS="
--enable-umi true \
--umi-source $UMI_SOURCE \
--umi-library-type $UMI_LIBRARY_TYPE \
"
# Construct final command line
CMD="
dragen \
$INPUT_OPTIONS \
$OUTPUT_OPTIONS \
$MA_OPTIONS \
$UMI_OPTIONS \
"
# Execute
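echo $CMD
# CMD contains embedded newlines. Replace them with spaces and pass the command to bash as a single quoted argument.
bash -c "${CMD//$'\n'/ }"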
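Choosing one input source can also be written as a small selector, so that only the variables for the chosen source need to be set. The following is a sketch under stated assumptions: `build_tumor_input_args`, `INPUT_ARGS`, and the type names `fastq_list`, `fastq`, `bam`, and `cram` are hypothetical names used only here, and the file paths are placeholders. The DRAGEN options themselves are the ones from the template above.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical selector: fills the global array INPUT_ARGS with the tumor
# input options for one source type. Only the selected branch is evaluated,
# so variables of the other branches may stay unset even with set -u.
build_tumor_input_args() {
  local input_type="$1"
  case "$input_type" in
    fastq_list)
      INPUT_ARGS=(--tumor-fastq-list "$TUMOR_FASTQ_LIST"
                  --tumor-fastq-list-sample-id "$TUMOR_FASTQ_LIST_SAMPLE_ID") ;;
    fastq)
      INPUT_ARGS=(--tumor-fastq1 "$TUMOR_FASTQ1" --tumor-fastq2 "$TUMOR_FASTQ2"
                  --RGSM-tumor "$RGSM_TUMOR" --RGID-tumor "$RGID_TUMOR") ;;
    bam)
      INPUT_ARGS=(--tumor-bam-input "$TUMOR_BAM") ;;
    cram)
      INPUT_ARGS=(--tumor-cram-input "$TUMOR_CRAM") ;;
    *)
      echo "unknown input type: $input_type" >&2
      return 1 ;;
  esac
}

# Placeholder values for the FASTQ list case.
TUMOR_FASTQ_LIST=/staging/fastq_list.csv
TUMOR_FASTQ_LIST_SAMPLE_ID=tumor_sample

build_tumor_input_args fastq_list
printf '%s\n' "${INPUT_ARGS[*]}"
```

Because the options are held in an array, they can be appended to a command as `"${INPUT_ARGS[@]}"` without re-splitting a string.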
Variant Calling stage
• Configure the INPUT options
• Configure the OUTPUT options
• Configure the VARIANT CALLERs based on the application
• Configure any additional options
• Build up the necessary options for each component separately, so that they can be re-used in the final command line.
#!/bin/bash
set -euo pipefail
# Path to DRAGEN hashtable
DRAGEN_HASH_TABLE=<REF_DIR>
# Path to output directory for the DRAGEN run
OUTPUT=<OUT_DIR>
# File prefix for DRAGEN output files
PREFIX=<OUT_PREFIX>
# UMI-collapsed BAMs from the Map/Align stage: tumor (--tumor-bam-input) and normal (--bam-input)
INPUT_BAM="
--tumor-bam-input $TUMOR_BAM \
--bam-input $BAM \
"
INPUT_OPTIONS="
--ref-dir $DRAGEN_HASH_TABLE \
$INPUT_BAM \
"
OUTPUT_OPTIONS="
--output-directory $OUTPUT \
--output-file-prefix $PREFIX \
"
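# Enable exactly one UMI-aware mode: --vc-enable-umi-solid or --vc-enable-umi-liquid.
# This example uses --vc-enable-umi-solid. Replace it with --vc-enable-umi-liquid if that mode matches your samples.
SNV_OPTIONS="
--enable-variant-caller true \
--vc-enable-umi-solid true \
--vc-target-bed $VC_TARGET_BED \
"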
CNV_OPTIONS="
--enable-cnv true \
--cnv-target-bed $CNV_TARGET_BED \
--cnv-normals-list $CNV_PANEL_OF_NORMALS \
"
SV_OPTIONS="
--enable-sv true \
--sv-exome true \
--sv-call-regions-bed $SV_TARGET_BED \
"
# Construct final command line
CMD="
dragen \
$INPUT_OPTIONS \
$OUTPUT_OPTIONS \
$SNV_OPTIONS \
$CNV_OPTIONS \
$SV_OPTIONS \
"
# Execute
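echo $CMD
# CMD contains embedded newlines. Replace them with spaces and pass the command to bash as a single quoted argument.
bash -c "${CMD//$'\n'/ }"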
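The templates above assemble the command as a single string, which the shell must split back into arguments. An alternative is to hold the command in a bash array, one element per argument, so that paths containing spaces stay intact. The sketch below is an illustration only: `umi_vc_flag`, its `solid` and `liquid` labels, and all paths are hypothetical, and the command is only printed, not run. Only the SNV options are shown; CNV and SV options would be appended in the same way.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical helper: returns the UMI variant-calling flag for a mode label,
# so that exactly one of the two flags ends up on the command line.
umi_vc_flag() {
  case "$1" in
    solid)  echo "--vc-enable-umi-solid" ;;
    liquid) echo "--vc-enable-umi-liquid" ;;
    *)      echo "unknown UMI mode: $1" >&2; return 1 ;;
  esac
}

# Placeholder values; replace them with real paths.
DRAGEN_HASH_TABLE=/staging/ref
OUTPUT=/staging/vc_out
PREFIX=tumor_normal_umi
TUMOR_BAM=/staging/tumor_out/tumor_umi.bam
BAM=/staging/normal_out/normal_umi.bam
VC_TARGET_BED=/staging/targets.bed

# One array element per argument: no re-splitting, no embedded newlines.
CMD=(
  dragen
  --ref-dir "$DRAGEN_HASH_TABLE"
  --output-directory "$OUTPUT"
  --output-file-prefix "$PREFIX"
  --tumor-bam-input "$TUMOR_BAM"
  --bam-input "$BAM"
  --enable-variant-caller true
  "$(umi_vc_flag solid)" true
  --vc-target-bed "$VC_TARGET_BED"
)

# Dry run: print the command in shell-quoted form without executing it.
printf '%q ' "${CMD[@]}"; echo
# To execute on a DRAGEN server, run: "${CMD[@]}"
```

Printing with `%q` shows exactly how each argument would be passed, which makes it easier to review the command before launching a run.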
