Segmentation

After a case sample has been normalized, it goes through a segmentation stage. There are multiple segmentation algorithms implemented in DRAGEN, including the following:

CBS (Circular Binary Segmentation)
SLM (Shifting Level Models)

The SLM algorithm has three variants: SLM, HSLM, and ASLM. HSLM (Heterogeneous SLM) is intended for exome analysis and handles target capture kits whose targets are not equally spaced. ASLM (Adaptive SLM) adds a sample-specific estimate of the technical variability in depth of coverage (as opposed to variability caused by changes in copy number). The estimate is based on the median variance within fixed windows or within a preliminary set of segments derived from b-allele ratios, and it can make segmentation more robust for "noisy" or "wavy" samples.
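The idea of a median-of-window-variances estimate can be illustrated with a minimal sketch. This is not DRAGEN's implementation: the function name, the window size, and the use of non-overlapping windows are assumptions made for illustration only. It shows why a median over windows is less sensitive to a copy number change than a single variance over the whole sample.

```python
from statistics import median, pvariance


def median_window_variance(values, window=50):
    """Median of per-window variances over non-overlapping fixed windows.

    Hypothetical helper for illustration; `window` is an assumed parameter,
    not a documented DRAGEN setting.
    """
    if window < 2:
        raise ValueError("window must contain at least two values")
    # Variance of each full window; a trailing partial window is ignored.
    variances = [
        pvariance(values[i:i + window])
        for i in range(0, len(values) - window + 1, window)
    ]
    if not variances:
        raise ValueError("need at least one full window of values")
    # The median ignores the few windows inflated by a real level change.
    return median(variances)


# Two quiet windows and one window with large swings.
depth_signal = [0.0, 0.2, 0.0, 0.2, 1.0, 1.2, 1.0, 1.2, 0.0, 5.0, 0.0, 5.0]
noise_estimate = median_window_variance(depth_signal, window=4)
```

In this toy input, the third window has a much larger variance than the other two, yet the median stays at the level of the quiet windows.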

The default segmentation algorithm is SLM for germline whole genome processing, ASLM for somatic whole genome processing, and HSLM for whole exome processing.

For targeted sequencing workflows, you can also supply a segment BED file with the --cnv-segmentation-bed option. The option predefines the segments for which copy numbers are estimated and skips the segmentation step of the workflow. See Targeted Segmentation (Segment BED).

Option

Description

--cnv-segmentation-mode

Specifies the segmentation algorithm to use. The following values are available:

segment-bed
cbs
slm—The default for germline WGS analysis.
aslm—The default for somatic WGS analysis.
hslm—The default for targeted/WES analysis.

--cnv-merge-distance

Specifies the maximum number of base pairs between two segments that still allows them to be merged. The default value is 0 for WGS, which means the segments must be directly adjacent. For WES analysis, this parameter is disabled by default because of the spacing of targeted intervals.

--cnv-merge-threshold

Specifies the maximum segment mean difference at which two adjacent segments should be merged. The segment mean is represented as a linear copy ratio value. The default is 0.2 for WGS and 0.4 for WES. To disable merging, set the value to 0.
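How the two merge options interact can be sketched as follows. This is an illustrative model only, not DRAGEN's code: the Segment class, the merge_segments function, the half-open coordinate convention, and the length-weighted mean for a merged segment are all assumptions. The sketch merges a segment into its predecessor when the gap between them is at most the merge distance and their means differ by at most the merge threshold, and it treats a threshold of 0 as "merging disabled", as described above.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    chrom: str
    start: int   # assumed 0-based, inclusive
    end: int     # assumed exclusive
    mean: float  # segment mean as a linear copy ratio


def merge_segments(segments, merge_distance=0, merge_threshold=0.2):
    """Merge neighboring segments; input is assumed sorted by chrom and start."""
    if merge_threshold == 0:
        # A threshold of 0 disables merging.
        return list(segments)
    merged = []
    for seg in segments:
        prev = merged[-1] if merged else None
        if (
            prev is not None
            and prev.chrom == seg.chrom
            and seg.start - prev.end <= merge_distance
            and abs(seg.mean - prev.mean) <= merge_threshold
        ):
            # Assumption: the merged mean is weighted by segment length.
            prev_len = prev.end - prev.start
            seg_len = seg.end - seg.start
            mean = (prev.mean * prev_len + seg.mean * seg_len) / (prev_len + seg_len)
            merged[-1] = Segment(prev.chrom, prev.start, seg.end, mean)
        else:
            merged.append(seg)
    return merged


example = [
    Segment("chr1", 0, 1000, 1.0),
    Segment("chr1", 1000, 2000, 1.1),   # adjacent, similar mean: merged
    Segment("chr1", 2000, 3000, 1.6),   # adjacent, mean differs too much
    Segment("chr1", 3500, 4000, 1.6),   # similar mean, but 500 bp gap
]
wgs_defaults = merge_segments(example)
with_distance = merge_segments(example, merge_distance=500)
no_merging = merge_segments(example, merge_threshold=0)
```

With the WGS defaults, only the first two segments are merged. Raising the merge distance to 500 also joins the last two segments, and a threshold of 0 leaves all four segments unchanged.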