DRAGEN SV Caller Overview

The DRAGEN SV Caller divides the SV and indel discovery process into the following steps.

1. Reads the input files to estimate alignment statistics, including the fragment size distribution and chromosome-level depth. For more information on the SV Caller input options, see Command-Line Options.
2. Scans the genome, or a subset of the genome specified by the call regions, to build genome-wide data structures, including a breakend association graph of all SV-associated regions. The graph contains edges that connect all regions of the genome that have a possible breakend association. An edge can connect two different regions of the genome to represent evidence of a long-range association, or it can connect a region to itself to capture a local indel or small SV association. These associations are more general than a specific SV hypothesis, so multiple breakend candidates might be found on one edge, although typically only one or two candidates are found per edge. Instead of an inclusion region BED file, an exclusion region BED file can be passed to DRAGEN so that any SV breakend that overlaps these regions is removed from downstream analyses. Excluded regions are left out of the graph building process, but during the refinement step an active region that lies close to the boundary of an excluded region can be extended into that region. As a result, the final SV calls can still extend into excluded regions.
3. Processes each graph edge as follows.
a. Infers SV candidates that are associated with the given graph edge.
b. Assembles the SV breakends.
c. Merges discovered SV candidates with any known SV candidates included in the input data.
d. Scores, genotypes, and filters each SV candidate under various biological models (currently germline and somatic).
e. Outputs scored SVs to VCF.
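
The breakend association graph built during the genome scan can be pictured with a small Python sketch. It is illustrative only and does not reflect DRAGEN internals: the names Region, BreakendGraph, add_evidence, and is_local are hypothetical, the coordinates are made up, and the rule of dropping any observation that touches an excluded region is a simplifying assumption about how exclusion regions are applied during graph building.

```python
from collections import defaultdict
from typing import NamedTuple


class Region(NamedTuple):
    """Hypothetical genomic interval (0-based, half-open)."""
    chrom: str
    start: int
    end: int


def overlaps(a: Region, b: Region) -> bool:
    """True when two intervals on the same chromosome intersect."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


class BreakendGraph:
    """Toy undirected graph; each edge counts its supporting observations."""

    def __init__(self, excluded=()):
        self.excluded = list(excluded)   # regions from an exclusion BED (assumed)
        self.edges = defaultdict(int)    # (Region, Region) -> evidence count

    def add_evidence(self, a: Region, b: Region) -> bool:
        """Record one observation linking a and b.

        Returns False, and records nothing, if either region overlaps
        an excluded region (simplifying assumption).
        """
        if any(overlaps(r, x) for r in (a, b) for x in self.excluded):
            return False
        key = tuple(sorted((a, b)))      # undirected: order does not matter
        self.edges[key] += 1
        return True

    @staticmethod
    def is_local(key) -> bool:
        """A self-edge models a local indel or small SV association."""
        return key[0] == key[1]


graph = BreakendGraph(excluded=[Region("chr1", 5_000, 6_000)])

# Long-range association between two different regions, seen twice.
graph.add_evidence(Region("chr1", 1_000, 1_500), Region("chr7", 20_000, 20_500))
graph.add_evidence(Region("chr7", 20_000, 20_500), Region("chr1", 1_000, 1_500))

# Local association: the region is connected to itself.
graph.add_evidence(Region("chr2", 300, 400), Region("chr2", 300, 400))

# Overlaps the excluded region, so it never enters the graph.
dropped = graph.add_evidence(Region("chr1", 5_500, 5_800), Region("chr3", 10, 90))

for (a, b), count in sorted(graph.edges.items()):
    kind = "local" if BreakendGraph.is_local((a, b)) else "long-range"
    print(f"{kind}: {a} <-> {b} (evidence={count})")
```

In this sketch an edge only records that two regions may be associated. It carries no specific SV hypothesis, which is why one edge can later yield more than one breakend candidate.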