Specify B-Allele Loci
The Somatic WGS CNV Caller requires a source of heterozygous SNP sites to measure b-allele counts of the tumor sample. The following are the available modes.
Option | Description |
---|---|
cnv-normal-b-allele-vcf | Specify a matched normal SNV VCF. Use when a matched normal sample and the matched normal SNV VCF are available. To use this option, you must run the matched normal sample through the DRAGEN Germline workflow. |
cnv-population-b-allele-vcf | Specify a population SNP VCF. Use when a matched normal sample is not available and analysis must be performed in tumor-only mode. |
cnv-use-somatic-vc-baf | Set to true to bypass the DRAGEN Germline workflow. Use if tumor and matched normal input are available. Enable the Somatic SNV Caller to use this option. |
To specify a matched normal sample SNV VCF, use the --cnv-normal-b-allele-vcf option. The VCF file should come from processing the matched normal sample through the DRAGEN germline small variant caller with filters applied. Typically, this file name has a *.hard-filtered.vcf.gz extension. All records marked as PASS that are determined to be heterozygous in the normal sample are used to measure the b-allele counts of the tumor sample. You can also use the equivalent gVCF file (*.hard-filtered.gvcf.gz), but the processing time is significantly longer because of the number of records, most of which are not heterozygous sites.
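As a rough preview of how many sites a matched normal VCF might contribute, the following sketch counts PASS, heterozygous, single-base SNV records in a single-sample VCF. It only approximates the selection described above and is not DRAGEN's internal logic. The function name count_het_pass is hypothetical, and the sketch assumes the sample genotype is in column 10.

```shell
# Count PASS heterozygous SNV records in a single-sample VCF read from stdin.
# Hypothetical helper for illustration; assumes the sample is in column 10.
count_het_pass() {
  awk -F'\t' '
    /^#/ { next }                                   # skip header lines
    $7 != "PASS" { next }                           # keep PASS records only
    $4 !~ /^[ACGT]$/ || $5 !~ /^[ACGT]$/ { next }   # keep single-base SNVs only
    {
      gt = ""
      n = split($9, fmt, ":")                       # FORMAT keys
      split($10, smp, ":")                          # sample values
      for (i = 1; i <= n; i++) if (fmt[i] == "GT") gt = smp[i]
      k = split(gt, a, "[/|]")                      # unphased or phased alleles
      if (k == 2 && a[1] != a[2] && a[1] != "." && a[2] != ".") het++
    }
    END { print het + 0 }
  '
}
```

For example, `gzip -dc normal.hard-filtered.vcf.gz | count_het_pass` prints the number of candidate sites, where the file name is a placeholder.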
To specify a population SNP VCF, use the --cnv-population-b-allele-vcf option. To obtain a population SNP VCF, process an appropriate catalog of population variation, such as dbSNP, the 1000 Genomes Project, or other large cohort discovery efforts. A suitable example file for this parameter is 1000G_phase1.snps.high_confidence.vcf.gz from the GATK resource bundle. Only high-frequency SNPs should be included. For example, include SNPs with a population minor allele frequency ≥ 10% to limit run time impact and reduce artifacts. Specify the ALT allele frequency by adding AF=<alt frequency> to the INFO section of each record. Additional INFO fields might be present, but DRAGEN only parses and uses the AF field. Sites specified with --cnv-population-b-allele-vcf can be either heterozygous or homozygous in the germline genome from which the tumor genome derives.
The following is an example of a valid population SNP record:
chr1 51479 . T A 1000 PASS AF=0.3253
DRAGEN considers the following requirements when parsing records from the b-allele VCF:
• Only simple SNV sites are used.
• Records must be marked PASS in the FILTER field.
• If multiple records share the same CHROM and POS, DRAGEN uses the first record that occurs.
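The record requirements and the frequency guidance above can be applied ahead of time when preparing a population SNP VCF. The following is a minimal sketch, not DRAGEN's own parser: the function name filter_population_snps is hypothetical, the 0.10 default mirrors the 10% example above, and the sketch assumes that "first record" means the first record that survives the PASS and SNV checks.

```shell
# Pre-filter an uncompressed population VCF read from stdin.
# Hypothetical helper; the argument is the minimum minor allele frequency.
filter_population_snps() {
  min_maf=${1:-0.10}
  awk -F'\t' -v min="$min_maf" '
    /^#/ { print; next }                            # pass header lines through
    $7 != "PASS" { next }                           # FILTER must be PASS
    $4 !~ /^[ACGT]$/ || $5 !~ /^[ACGT]$/ { next }   # simple single-base SNVs only
    seen[$1 FS $2]++ { next }                       # keep first record per CHROM and POS
    {
      af = ""
      n = split($8, info, ";")                      # find AF=<value> in INFO
      for (i = 1; i <= n; i++) if (info[i] ~ /^AF=/) af = substr(info[i], 4)
      if (af == "") next                            # AF is required
      # minor allele frequency >= min means AF lies in [min, 1 - min]
      if (af + 0 >= min + 0 && af + 0 <= 1 - min) print
    }
  '
}
```

For example, `gzip -dc population.vcf.gz | filter_population_snps 0.10 > population.filtered.vcf` writes the retained records, where both file names are placeholders. The filtered file could then be supplied with --cnv-population-b-allele-vcf.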
If a tumor sample and matched normal input are available, use --cnv-use-somatic-vc-baf true. You must enable the Somatic SNV Caller. If using this option, DRAGEN determines the germline heterozygous sites from the matched normal input and measures the b-allele counts of the tumor sample. The information is passed to the Somatic WGS CNV Caller to simplify the overall somatic workflow.
To enable --cnv-use-somatic-vc-baf, enter the following command-line options.
• --tumor-bam-input <TUMOR_BAM>: Specify the tumor input.
• --bam-input <NORMAL_BAM>: Specify the matched normal input.
• --enable-variant-caller true: Enable the Somatic SNV Caller.
• --cnv-use-somatic-vc-baf true: Enable somatic VC BAF.
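Put together, the options above might appear in a tumor-normal command like the one this sketch prints. It is a dry run that only echoes the command for review. The function name somatic_vc_baf_cmd and the output prefix are hypothetical. The -r, --enable-map-align, --enable-cnv, --output-directory, and --output-file-prefix options are not described in this section and are assumed here from typical DRAGEN usage, so check them against your DRAGEN version.

```shell
# Print (do not run) a tumor-normal DRAGEN command with somatic VC BAF enabled.
# Arguments: tumor BAM, matched normal BAM, reference directory, output directory.
somatic_vc_baf_cmd() {
  tumor_bam=$1
  normal_bam=$2
  ref_dir=$3
  out_dir=$4
  # Options other than the four listed above are assumptions.
  cat <<EOF
dragen \\
  -r $ref_dir \\
  --tumor-bam-input $tumor_bam \\
  --bam-input $normal_bam \\
  --enable-map-align false \\
  --enable-variant-caller true \\
  --enable-cnv true \\
  --cnv-use-somatic-vc-baf true \\
  --output-directory $out_dir \\
  --output-file-prefix tumor_normal
EOF
}
```

Calling `somatic_vc_baf_cmd tumor.bam normal.bam /path/to/reference /path/to/output` prints the command so that it can be inspected before it is run on a DRAGEN server.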