Gene Fusion Detection
The DRAGEN Gene Fusion module uses the DRAGEN RNA spliced aligner for detection of gene fusion events. It performs a split-read analysis on the supplementary (chimeric) alignments to detect potential breakpoints. The putative fusion events then go through various filtering stages to mitigate potential false positives. In addition to the final results, all potential candidates (unfiltered) are output, which can be used to maximize sensitivity.

You can run the DRAGEN Gene Fusion module together with a regular RNA-Seq map/align job. To enable the DRAGEN Gene Fusion module, set --enable-rna-gene-fusion to true in your current RNA-Seq command-line scripts. The DRAGEN Gene Fusion module requires a gene annotations file in GTF or GFF format.
The following is an example command line for running an end-to-end RNA-Seq experiment.
/opt/edico/bin/dragen \
-r <HASHTABLE> \
-1 <FASTQ1> \
-2 <FASTQ2> \
-a <GTF_FILE> \
--output-dir <OUT_DIRECTORY> \
--output-file-prefix <PREFIX> \
--RGID <READ_GROUP_ID> \
--RGSM <SAMPLE_NAME> \
--enable-rna true \
--enable-rna-gene-fusion true
At the end of a run, a summary of detected gene fusion events is output, which is similar to the following example.
==================================================================
Loading gene annotations file
==================================================================
Input annotations file: ref_annot.gtf
Number of genes: 27459
Number of transcripts: 196520
Number of exons: 1196293
==================================================================
Launching DRAGEN Gene Fusion Detection
==================================================================
annotation-file: ref_annot.gtf
rna-gf-blast-pairs: blast_pairs.outfmt6
rna-gf-exon-snap: 50
rna-gf-min-anchor: 25
rna-gf-min-neighbor-dist: 15
rna-gf-max-partners: 3
rna-gf-min-score-ratio: 0.15
rna-gf-min-support: 2
rna-gf-min-support-be: 10
rna-gf-restrict-genes true
==================================================================
Completed DRAGEN Gene Fusion Detection
==================================================================
Chimeric alignments: 107923
Total fusion candidates: 38 (2116 before filters)
Time loading annotations: 00:00:08.543
Time running gene fusion: 00:00:18.470
Total runtime: 00:00:27.760
***********************************************************
DRAGEN finished normally

The DRAGEN Gene Fusion module can be run as a standalone utility, taking the *.Chimeric.out.junction file as input and the gene annotations file as a GTF/GFF file. Running the Gene Fusion module standalone is useful for trying out various configuration options at the gene fusion detection stage, without having to map and align the RNA-Seq data multiple times.
To execute the DRAGEN Gene Fusion module as a standalone utility, use the --rna-gf-input-file option to specify the already generated *.Chimeric.out.junction file.
The following is an example command line for running the gene fusion module as a standalone utility.
/opt/edico/bin/dragen \
-a <GTF_FILE> \
--rna-gf-input-file <INPUT_CHIMERIC> \
--output-dir <OUT_DIRECTORY> \
--output-file-prefix <PREFIX> \
--enable-rna true \
--enable-rna-gene-fusion true
Standalone mode does not produce results identical to those of a run that starts from reads.
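
When trying several configurations in standalone mode, it can help to generate the command lines programmatically. The following Python sketch only assembles and prints the commands; it does not run DRAGEN. The helper name `build_standalone_command`, the file paths, and the output directory names are hypothetical. Passing an explicit `true` or `false` value to `--rna-gf-restrict-genes` is an assumption based on the documented default value of true.

```python
import shlex


def build_standalone_command(gtf_file, chimeric_file, out_dir, prefix,
                             restrict_genes=True,
                             dragen_bin="/opt/edico/bin/dragen"):
    """Assemble the argument list for a standalone Gene Fusion run.

    Only options that appear in this section are used. The command is
    returned as a list and is not executed here.
    """
    return [
        dragen_bin,
        "-a", gtf_file,
        "--rna-gf-input-file", chimeric_file,
        "--output-dir", out_dir,
        "--output-file-prefix", prefix,
        "--enable-rna", "true",
        "--enable-rna-gene-fusion", "true",
        # Assumption: the option accepts an explicit true/false value.
        "--rna-gf-restrict-genes", "true" if restrict_genes else "false",
    ]


# Hypothetical inputs: one command per setting of --rna-gf-restrict-genes,
# each writing to its own output directory.
for restrict in (True, False):
    cmd = build_standalone_command(
        "ref_annot.gtf",
        "sample.Chimeric.out.junction",
        f"out_restrict_{str(restrict).lower()}",
        "sample",
        restrict_genes=restrict,
    )
    print(shlex.join(cmd))
```

Keeping the command as a list and joining it with `shlex.join` only for display avoids quoting problems if a path contains spaces.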

The <outputPrefix>fusion_candidates.features.csv file lists the detected gene fusion events. The output CSV file includes the following columns. Any additional columns describe additional features of the fusion candidates.
• #FusionGene: Parent gene names (in 5' to 3' order of the transcript) participating in the fusion. If a fusion breakend overlaps multiple genes, all are listed.
• Score: Fusion call confidence score based on the number of supporting split reads and read pairs, as well as other fusion features. The score ranges from 0 (low confidence) to 1 (high confidence).
• LeftBreakpoint: Gene 1 breakpoint formatted as <Chromosome>:<Position>:<Strand>.
• RightBreakpoint: Gene 2 breakpoint formatted as <Chromosome>:<Position>:<Strand>.
• Filter: Semicolon-separated list of filters. Each filter is either a Confidence filter or an Information Only filter. The Filter value is PASS if none of the confidence filters are triggered. Otherwise, the value is FAIL.
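
As a rough illustration of the columns listed above, the following Python sketch reads the CSV and splits each breakpoint into its parts. It assumes the file is comma-delimited with a header row that uses exactly these column names. The function names are hypothetical, and the sample rows (including the gene names and the way they are joined) are invented placeholders, not real DRAGEN output.

```python
import csv
import io


def parse_breakpoint(text):
    """Split '<Chromosome>:<Position>:<Strand>' into (chrom, pos, strand).

    Splitting from the right keeps contig names that themselves contain
    colons intact.
    """
    chrom, pos, strand = text.rsplit(":", 2)
    return chrom, int(pos), strand


def load_fusions(handle):
    """Read fusion candidates from an open CSV handle into a list of dicts."""
    rows = []
    for row in csv.DictReader(handle):
        row["Score"] = float(row["Score"])
        row["LeftBreakpoint"] = parse_breakpoint(row["LeftBreakpoint"])
        row["RightBreakpoint"] = parse_breakpoint(row["RightBreakpoint"])
        rows.append(row)  # any additional feature columns stay as strings
    return rows


# Invented sample content for illustration only.
SAMPLE = """\
#FusionGene,Score,LeftBreakpoint,RightBreakpoint,Filter
GENE_A--GENE_B,0.93,chr1:1000:+,chr2:2000:-,PASS
GENE_C--GENE_D,0.12,chr3:500:-,chr3:90000:-,FAIL
"""

for fusion in load_fusions(io.StringIO(SAMPLE)):
    print(fusion["#FusionGene"], fusion["Score"], fusion["LeftBreakpoint"])
```

For a real run, the `io.StringIO(SAMPLE)` argument would be replaced with the open output file, for example `open(path, newline="")`.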
The following are the available filters.
Filter | Type | Description |
---|---|---|
DOUBLE_BROKEN_EXON | Confidence | Both breakpoints are more than 50 bp from annotated exon boundaries, and the number of supporting reads does not meet the higher threshold (≥ 10 supporting reads). |
LOW_MAPQ | Confidence | All fusion supporting read alignments at either of the breakpoints have MAPQ < 20. |
LOW_UNIQUE_ALIGNMENTS | Confidence | All fusion supporting read alignments near at least one of the two breakpoints have the same start and end position. |
MIN_SCORE | Confidence | The fusion candidate has a low probabilistic score (< 0.5) as determined by the features of the candidate. |
MIN_SUPPORT | Confidence | The fusion candidate has < 2 fusion supporting read pairs. |
UNENRICHED_GENES | Confidence | If an enrichment list is provided, neither of the two parent genes is enriched. |
READ_THROUGH | Confidence | The breakpoints are cis neighbors (< 200,000 bp) on the reference genome. |
ANCHOR_SUPPORT | Information only | Read alignments of fusion supporting reads are short (12 bp threshold) at either of the two breakpoints. |
HOMOLOGOUS | Information only | The candidate is likely a false positive generated because the two genes involved have high gene homology. |
LOW_ALT_TO_REF | Information only | The number of fusion supporting reads is < 1% of the number of reads supporting the reference transcript at either of the two breakpoints. |
LOW_GENE_COVERAGE | Information only | Either of the two breakpoints has fewer than 125 bp with nonzero read coverage. |
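
The PASS/FAIL rule can be sketched in Python using the filter names and types from the table above. This is an illustration of the rule as described, not DRAGEN's implementation. The function names are hypothetical, and the handling of literal PASS or FAIL tokens inside the semicolon-separated field is an assumption about the file layout.

```python
CONFIDENCE_FILTERS = {
    "DOUBLE_BROKEN_EXON", "LOW_MAPQ", "LOW_UNIQUE_ALIGNMENTS", "MIN_SCORE",
    "MIN_SUPPORT", "UNENRICHED_GENES", "READ_THROUGH",
}
INFORMATION_ONLY_FILTERS = {
    "ANCHOR_SUPPORT", "HOMOLOGOUS", "LOW_ALT_TO_REF", "LOW_GENE_COVERAGE",
}


def split_filter_field(field):
    """Split a semicolon-separated Filter value into filter names.

    Assumption: PASS/FAIL tokens and empty entries may appear in the field
    and are not filter names, so they are dropped.
    """
    return [name for name in field.split(";")
            if name and name not in ("PASS", "FAIL")]


def overall_status(triggered):
    """Return 'PASS' unless at least one confidence filter is triggered."""
    triggered = set(triggered)
    unknown = triggered - CONFIDENCE_FILTERS - INFORMATION_ONLY_FILTERS
    if unknown:
        raise ValueError(f"unrecognized filter names: {sorted(unknown)}")
    return "FAIL" if triggered & CONFIDENCE_FILTERS else "PASS"


# Information-only filters alone do not fail a candidate.
print(overall_status(["HOMOLOGOUS", "LOW_GENE_COVERAGE"]))
# A single confidence filter is enough to fail it.
print(overall_status(split_filter_field("LOW_MAPQ;HOMOLOGOUS")))
```

Raising an error on unrecognized names is a deliberate choice here, so that filters added in other software versions are noticed rather than silently treated as information only.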

The following options can be used to configure the fusion caller:
Option | Description |
---|---|
--rna-gf-enriched-genes | For RNA enrichment assays, a list of targeted genes specified as one gene name per line. Only fusion calls involving at least one gene on the list are reported. |
--rna-gf-blast-pairs | A file listing gene pairs that have a high level of similarity. This list of gene pairs is used as a homology filter to reduce false positives. For information on generating this file, visit the Fusion Filter GitHub page. Use the ref_annot.cdsplus.fa.allvsall.outfmt6.genesym.gz file produced by CTAT. For runs on human genome assemblies GRCh38 and hg19, DRAGEN automatically applies a default file generated using Gencode version 32 annotations for primary chromosomes if no other file is specified on the command line. |
--rna-repeat-intervals | BED file that contains a target list of repeat intervals for sensitive fusion detection. Mutually exclusive with --rna-repeat-genes. This option overrides the default files, which contain the genes CIC, DUX4, and SEPTIN14 for GRCh38 and hg19 reference genomes. |
--rna-repeat-genes | Text file that contains the names or IDs (from the annotation GTF file) of targeted repetitive genes for sensitive fusion detection. Mutually exclusive with --rna-repeat-intervals. This option overrides the default BED file. |
--enable-variant-annotation, --variant-annotation-assembly, --variant-annotation-data | Enable Illumina Annotation Engine (IAE) to report fusion annotations in JSON format. --enable-variant-annotation must be set to true. For more information, see Illumina Annotation Engine. |
--rna-gf-restrict-genes | When parsing the gene annotations file (GTF/GFF) for use in the DRAGEN Gene Fusion module, use this option to restrict the entries of interest to protein-coding regions. Restricting the GTF to the protein-coding and lincRNA genes reduces false positive rates in currently studied fusion events. The default value is true. |