Joint Analysis Output Format
There are two available joint analysis output files:
• | Multisample VCF—A VCF file containing a column with genotype information for each of the input samples according to the input variants. |
• | Multisample gVCF—A gVCF file augmenting the content of a multisample VCF file, similar to how a gVCF file augments a VCF file for a single sample. In between variant sites, the multisample gVCF contains statistics that describe the level of confidence that each sample is homozygous to the reference genome. Multisample gVCF is a convenient format for combining the results from a pedigree or small cohort into a single file. With a large number of samples, any fluctuation in coverage or any variation in one of the input samples starts a new hom-ref block. The result is a highly fragmented block structure and a large output file that can be slow to create. |

In hom-ref blocks, the following FORMAT fields are calculated uniquely.
• | FORMAT/DP—Values represent the median DP across all positions within the band. |
• | FORMAT/AD—Values represent the position in the band where DP=median DP. |
• | FORMAT/AF—Values are based on FORMAT/AD. |
• | FORMAT/PL—Values represent the Phred likelihoods per genotype hypothesis. For hom-ref blocks, each value in FORMAT/PL represents the minimum value across all positions within the band. |
• | FORMAT/SPL and FORMAT/ICNT—Parameters reported in the gVCF records, including both hom-ref blocks and variant records. The parameters are used to compute the confidence score of a variant being de novo in the proband of a trio. For SNPs, FORMAT/PL and FORMAT/SPL are both used as input to the DeNovo Caller. FORMAT/PL represents Phred likelihoods obtained from the genotyper, if the genotyper is called. FORMAT/SPL represents Phred likelihoods obtained from column-wise estimation, pregraph. Each value in FORMAT/SPL represents the minimum across all positions within the band. For indels, the PL value is computed in the joint pedigree calling step based on the FORMAT/ICNT reported in the gVCF file. FORMAT/ICNT consists of two values. The first value is the number of reads with no indels at the position, and the second value is the number of reads with indels at the position. Each value in FORMAT/ICNT represents the maximum of that value across all positions within the band. |
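The per-band rules for AD, PL, SPL, and ICNT can be sketched as follows. This is a minimal illustration, not DRAGEN's implementation: the `summarize_band` function, the dictionary keys, and the per-position numbers are hypothetical. It assumes the lower median is used for even-length bands so that the median DP always corresponds to an observed position.

```python
from statistics import median_low


def summarize_band(positions):
    """Collapse per-position values into one hom-ref block summary.

    `positions` is a list of dicts with hypothetical keys:
    dp (int), ad, pl, spl, icnt (lists of int).
    """
    dps = [p["dp"] for p in positions]
    # median_low always returns an observed value, so a matching position exists.
    median_dp = median_low(dps)
    anchor = next(p for p in positions if p["dp"] == median_dp)
    return {
        # AD is taken from the position whose DP equals the median DP.
        "AD": list(anchor["ad"]),
        # PL and SPL: element-wise minimum across the band.
        "PL": [min(col) for col in zip(*(p["pl"] for p in positions))],
        "SPL": [min(col) for col in zip(*(p["spl"] for p in positions))],
        # ICNT: element-wise maximum across the band.
        "ICNT": [max(col) for col in zip(*(p["icnt"] for p in positions))],
    }


# Hypothetical per-position values for a three-base band.
band = [
    {"dp": 132, "ad": [130, 2], "pl": [0, 69, 1100], "spl": [0, 130, 255], "icnt": [20, 1]},
    {"dp": 135, "ad": [131, 4], "pl": [0, 75, 1035], "spl": [0, 125, 255], "icnt": [23, 0]},
    {"dp": 140, "ad": [138, 2], "pl": [0, 80, 1200], "spl": [0, 140, 255], "icnt": [22, 1]},
]
summary = summarize_band(band)
print(summary)
```

Note that the two ICNT values can come from different positions in the band, because each value is maximized independently.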
In the following example, ICNT indicates whether each sample contains an indel at the position of interest. If the proband contains an indel at the position and the ICNT values of the parents do not indicate any reads supporting an indel, then the confidence score for a de novo indel call in the proband at that position is high.
chr1 10288 . C <NON_REF> . PASS END=10290 GT:AD:DP:GQ:MIN_DP:PL:SPL:ICNT 0/0:131,4:135:69:132:0,69,1035:0,125,255:23,1
chr1 10291 . C T,<NON_REF> 38.45 PASS DP=100;MQ=24.72;MQRankSum=0.733;ReadPosRankSum=4.112;FractionInformativeReads=0.600;R2_5P_bias=0.000 GT:AD:AF:DP:F1R2:F2R1:GQ:PL:SPL:ICNT:GP:PRI:SB:MB 0/1:28,32,0:0.533,0.000:60:20,21,0:8,11,0:15:73,0,12,307,157,464:255,0,255:23,10:3.8452e+01,1.3151e-01,1.5275e+01,3.0757e+02,1.9173e+02,4.5000e+02:0.00,34.77,37.77,34.77,69.54,37.77:4,24,7,25:8,20,14,18
SPL and ICNT values are specific to DRAGEN. The GATK variant caller does not output SPL and ICNT values.
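As a rough illustration of how ICNT can be read from a gVCF record, the sketch below pairs the FORMAT keys with one sample column and applies the qualitative rule described above. The function names are hypothetical, the parent sample strings are invented for the example, and the boolean check is only a stand-in: it does not reproduce the DeNovo Caller's confidence score.

```python
def parse_sample(format_str, sample_str):
    """Pair FORMAT keys with the values of one sample column."""
    return dict(zip(format_str.split(":"), sample_str.split(":")))


def indel_read_count(sample_fields):
    """Return the second ICNT value: the number of reads with indels."""
    _no_indel, with_indel = (int(x) for x in sample_fields["ICNT"].split(","))
    return with_indel


def parents_lack_indel_support(proband, mother, father):
    """True when only the proband has reads supporting an indel."""
    return (
        indel_read_count(proband) > 0
        and indel_read_count(mother) == 0
        and indel_read_count(father) == 0
    )


fmt = "GT:AD:DP:GQ:MIN_DP:PL:SPL:ICNT"
# Sample column taken from the hom-ref block in the example above.
block = parse_sample(fmt, "0/0:131,4:135:69:132:0,69,1035:0,125,255:23,1")
# Hypothetical parent columns with no indel-supporting reads.
mother = parse_sample(fmt, "0/0:40,0:40:99:38:0,99,1200:0,110,255:40,0")
father = parse_sample(fmt, "0/0:35,0:35:90:33:0,90,1100:0,100,255:35,0")

print(indel_read_count(block))
print(parents_lack_indel_support(block, mother, father))
```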

If merging gVCF files, the gVCF Genotyper copies and combines the gVCF fields from each input file into the output msVCF. The gVCF fields are combined into the output fields as follows.
• | FORMAT/FT—The FILTER field from each input file is copied to the output field for each sample. |
• | QUAL—The output file value is the maximum of the QUAL value in each input file. |
• | INFO/MQ—The output file value is the average of the INFO/MQ value in each input file, weighted by INFO/DP in each input file. |
• | INFO/MQRankSum, INFO/ReadPosRankSum, and INFO/R2_5P_bias—The output file value is the median of the field values in the input files. |
• | INFO/DP—The output file value is the sum of the INFO/DP value in each input file. |
The FORMAT fields are copied from each input file into the output file, but because the output file contains alleles that do not occur in all samples, the genotyper estimates missing values or uses a placeholder value.
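The combination rules for the INFO fields listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the gVCF Genotyper's code: `merge_site_info` and the input dictionaries are hypothetical, and the DP weighting of MQ is assumed to be normalized by the total depth so that the result stays on the MQ scale.

```python
from statistics import median


def merge_site_info(inputs):
    """Combine per-file INFO values for one site into msVCF-style values.

    `inputs` is a list of dicts with keys DP, MQ, MQRankSum,
    ReadPosRankSum, and R2_5P_bias (one dict per input gVCF).
    """
    total_dp = sum(r["DP"] for r in inputs)
    merged = {"DP": total_dp}  # INFO/DP: sum across input files
    # INFO/MQ: DP-weighted combination, normalized by total DP (assumption).
    merged["MQ"] = (
        sum(r["MQ"] * r["DP"] for r in inputs) / total_dp if total_dp else None
    )
    # Rank-sum and bias statistics: median across input files.
    for key in ("MQRankSum", "ReadPosRankSum", "R2_5P_bias"):
        merged[key] = median(r[key] for r in inputs)
    return merged


# Hypothetical INFO values from three input gVCF files at the same site.
site = [
    {"DP": 100, "MQ": 24.0, "MQRankSum": 0.733, "ReadPosRankSum": 4.112, "R2_5P_bias": 0.0},
    {"DP": 300, "MQ": 40.0, "MQRankSum": -0.2, "ReadPosRankSum": 1.0, "R2_5P_bias": 0.5},
    {"DP": 100, "MQ": 36.0, "MQRankSum": 0.1, "ReadPosRankSum": 2.0, "R2_5P_bias": 1.0},
]
merged = merge_site_info(site)
print(merged)
```

Weighting by depth lets a deeply covered sample contribute more to the merged MQ than a shallow one, whereas the median keeps the rank-sum statistics robust to a single outlying sample.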

DRAGEN variant calls on the mitochondrial chromosome (chrM) differ from autosomal variant calls as follows.
• | The genotype quality (GQ) and genotype likelihood (PL) metrics are not calculated on chrM. DRAGEN uses Phred-scaled quality scores SQ instead. If using DRAGEN v3.8 or earlier, LOD is used. |
• | Multiallelic calls are split onto several lines in the VCF files, instead of merged into one line. Each variant is located on a new line. |