DRAGEN SV Caller Overview
The DRAGEN SV Caller divides the SV and indel discovery process into the following steps.
1. Reads input files to estimate alignment statistics, including fragment size distribution and chromosome-level depth. For more information on the SV Caller input options, see Command-Line Options.
2. Scans the genome, or a subset of the genome specified by the call regions, to build genome-wide data structures, including a breakend association graph of all SV-associated regions. The graph contains edges that connect all regions of the genome that have a possible breakend association. An edge can connect two different regions of the genome to represent evidence of a long-range association, or it can connect a region to itself to capture a local indel or small SV association. These associations are more general than a specific SV hypothesis, so multiple breakend candidates might be found on one edge, although typically only one or two candidates are found per edge. Instead of passing an inclusion region BED file, you can pass an exclusion region BED file to DRAGEN so that any SV breakend that overlaps these regions is removed from downstream analysis. The excluded regions are removed from the graph building process. However, active regions that are close to the boundaries of the excluded regions can be extended into the excluded regions during the refinement step, so the final SV calls can still extend into these regions.
3. Analyzes the breakend association graph to discover candidate SVs, then scores the discovered candidate SVs and any known SVs from the input. Analysis and scoring are performed as follows:
    a. Infers SV candidates that are associated with the given graph edge.
    b. Assembles the SV breakends.
    c. Merges discovered SV candidates with any known SV candidates included in the input data.
    d. Scores, genotypes, and filters each SV candidate under various biological models (currently germline, tumor-normal, and tumor-only).
    e. Outputs scored SVs to VCF.
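
The breakend association graph described in step 2 can be pictured with a small Python sketch. This is an illustration only and not DRAGEN's implementation: the fixed-width binning, the `BIN_SIZE` value, and the names `Region`, `Evidence`, and `build_graph` are all assumptions made for the example. It shows two ideas from the text: an edge either joins two different regions (long-range association) or joins a region to itself (local indel or small SV association), and evidence that touches an excluded region is left out of graph building.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple

BIN_SIZE = 1000  # hypothetical region width in bases, chosen only for this example


class Region(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (BED-style)
    end: int    # 0-based, exclusive (BED-style)


class Evidence(NamedTuple):
    """One hypothetical read-level observation linking two genomic positions."""
    chrom_a: str
    pos_a: int
    chrom_b: str
    pos_b: int


Edge = Tuple[Region, Region]


def to_bin(chrom: str, pos: int) -> Region:
    """Map a position to the fixed-width region (graph node) that contains it."""
    start = (pos // BIN_SIZE) * BIN_SIZE
    return Region(chrom, start, start + BIN_SIZE)


def overlaps(a: Region, b: Region) -> bool:
    """Half-open interval overlap on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def build_graph(evidence: Iterable[Evidence],
                excluded: List[Region]) -> Dict[Edge, int]:
    """Count supporting observations per undirected edge.

    An edge whose two nodes are identical is a self-edge, which stands for
    a local indel or small SV association.
    """
    edges: Dict[Edge, int] = defaultdict(int)
    for ev in evidence:
        node_a = to_bin(ev.chrom_a, ev.pos_a)
        node_b = to_bin(ev.chrom_b, ev.pos_b)
        # Assumption: drop evidence if either end falls in an excluded region.
        if any(overlaps(n, x) for n in (node_a, node_b) for x in excluded):
            continue
        # Sort the two nodes so that (a, b) and (b, a) share one edge key.
        first, second = sorted((node_a, node_b))
        edges[(first, second)] += 1
    return dict(edges)


evidence = [
    Evidence("chr1", 1500, "chr1", 250300),  # long-range association
    Evidence("chr1", 1700, "chr1", 250900),  # supports the same edge
    Evidence("chr2", 5100, "chr2", 5150),    # local association (self-edge)
    Evidence("chr3", 100, "chr1", 1200),     # one end lies in an excluded region
]
excluded = [Region("chr3", 0, 1000)]

graph = build_graph(evidence, excluded)
for (node_a, node_b), support in sorted(graph.items()):
    kind = "self-edge" if node_a == node_b else "long-range"
    print(kind, node_a, node_b, support)
```

The sketch does not model the refinement step, so it cannot reproduce the case noted in step 2 where an active region near an exclusion boundary is extended into the excluded region. Edge counts here only stand in for the idea that one edge can collect evidence for more than one breakend candidate.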