Single Cell RNA Sample Demultiplexing
DRAGEN implements genotype-based scRNA demultiplexing for datasets that represent mixtures of cells from different individuals, such as cells from different individuals pooled in one library prep or microfluidic run. Because individuals have different genetic variants, DRAGEN can assign a sample identity to each cell based on the alleles observed in the reads of that cell. DRAGEN only takes SNVs into account. Additionally, DRAGEN can flag doublets, such as droplets that contain multiple cells from different individuals.
To use sample demultiplexing, you must provide a VCF file with genotypes for each sample in the dataset. The sample genotypes are represented by the GT field.
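Before starting a run, it can help to confirm that the VCF actually carries a GT entry for every record and to see which sample names it declares. The following is a minimal sketch, not part of DRAGEN: the function name `vcf_samples_with_gt` and the sample names `donorA` and `donorB` are hypothetical, and the input is assumed to be an uncompressed, tab-separated VCF text stream.

```python
import io

def vcf_samples_with_gt(handle):
    """Return (sample_names, records_missing_gt) for an uncompressed VCF text stream."""
    samples = []
    missing_gt = 0
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # The first nine columns are fixed; sample names start at column 10.
            samples = fields[9:]
            continue
        # FORMAT is the ninth column; count records that lack a GT key or any sample column.
        if len(fields) < 10 or "GT" not in fields[8].split(":"):
            missing_gt += 1
    return samples, missing_gt

# Hypothetical two-sample VCF used only for illustration.
example = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tdonorA\tdonorB\n"
    "chr1\t1000\t.\tA\tG\t.\tPASS\t.\tGT\t0/0\t0/1\n"
    "chr1\t2000\t.\tC\tT\t.\tPASS\t.\tGT:DP\t1/1:12\t0/0:9\n"
)
samples, missing_gt = vcf_samples_with_gt(example)
print(samples, missing_gt)
```

A nonzero second value indicates records without genotypes, which would be worth fixing before the file is passed to the pipeline.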

You can use the following command line options for scRNA demultiplexing.
• --single-cell-demux-sample-vcf: Specify the VCF file that contains the sample genotypes.
• --single-cell-demux-detect-doublets: Enable doublet detection in genotype-based sample demultiplexing. The default value is false.
The following is an example command line to run the DRAGEN Single Cell RNA Pipeline with demultiplexing.
dragen --enable-rna=true --enable-single-cell-rna=true --umi-source=fastq --single-cell-barcode 0_15 --single-cell-umi 16_25 -r reference_genomes/Mus_musculus/mm10/DRAGEN/8 -a reference_genomes/Mus_musculus/mm10/gtf/gencode.vM23.annotation.gtf.gz -1 lib1_S7_L001_R2_001.fastq.gz --umi-fastq lib1_S7_L001_R1_001.fastq.gz --RGID=1 --RGSM=sample1 --output-dir=/staging/out --output-file-prefix=sample1 --single-cell-demux-detect-doublets=true --single-cell-demux-sample-vcf=sample.vcf

You can find information related to the output of genotype-based scRNA sample demultiplexing in the following three files.
The <prefix>.scRNA.barcodeSummary.tsv file contains per-cell metrics, including cell barcodes. The following columns contain per-cell demultiplexing information. See Outputs for more information on <prefix>.scRNA.barcodeSummary.tsv metrics.
Column | Description |
---|---|
SampleIdentity | The SampleIdentity column can contain the following values: |
IdentityQscore | The IdentityQscore column contains the value used to estimate the confidence of the sample identity call. After DRAGEN determines the doublet status of the cell as singlet, ambiguous, or doublet, the identity Q-score is defined as -10 * log10(Probability that the assigned identity is incorrect, given the second most likely identity and the doublet status). Higher identity Q-scores correspond to more confident sample identity calls. |
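The identity Q-score can be read on the usual Phred scale. The sketch below assumes the logarithm is taken of the probability that the identity call is wrong, which is the reading under which higher scores mean more confident calls. It also shows one way to count confidently assigned cells per sample from the two columns above. The function names, the threshold of 20, and the `donorA`/`donorB` values are hypothetical; the file is assumed to be tab-separated with a header row, and rows with a non-numeric score are skipped.

```python
import csv
import io
import math

def qscore_to_error_probability(q):
    """Convert a Phred-style Q-score to the probability that the call is wrong."""
    return 10 ** (-q / 10.0)

def error_probability_to_qscore(p_error):
    """Convert the probability that the call is wrong to a Phred-style Q-score."""
    return -10.0 * math.log10(p_error)

def confident_cells_per_sample(handle, min_qscore):
    """Count rows per SampleIdentity whose IdentityQscore is at least min_qscore."""
    counts = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        try:
            q = float(row["IdentityQscore"])
        except ValueError:
            continue  # assumption: skip rows without a numeric score
        if q >= min_qscore:
            identity = row["SampleIdentity"]
            counts[identity] = counts.get(identity, 0) + 1
    return counts

# Hypothetical excerpt; the real file contains additional per-cell columns.
example = io.StringIO(
    "SampleIdentity\tIdentityQscore\n"
    "donorA\t30\n"
    "donorA\t5\n"
    "donorB\t20\n"
)
counts = confident_cells_per_sample(example, min_qscore=20)
print(qscore_to_error_probability(20))
print(counts)
```

Under this reading, a Q-score of 20 corresponds to roughly a 1 in 100 chance that the assigned identity is wrong, and a Q-score of 30 to roughly 1 in 1000.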
The <prefix>.scRNA.demux.tsv file contains sample demultiplexing statistics that were used to infer sample identity of each cell.
Column | Description |
---|---|
Barcode | The cell barcode associated with the sample. |
DemuxSNPCount | The number of SNPs that the reads of the cell barcode intersect. |
DemuxReadCount | The number of UMIs of the cell barcode that intersect at least one SNP. |
Pure Samples | Samples from the VCF file. |
BestMixtureIdentity | Mixture sample with the highest log likelihood. Only available if --single-cell-demux-detect-doublets=true. |
BestMixtureLogLikelihood | The log likelihood of the best mixture sample. Only available if --single-cell-demux-detect-doublets=true. |
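A barcode whose reads cover only a few SNPs gives the demultiplexer little evidence to work with, so it can be useful to list such barcodes when reviewing identity calls. The following is a minimal sketch that uses only the Barcode and DemuxSNPCount columns described above. The function name, the threshold of 10, and the example barcodes are hypothetical, and the file is assumed to be tab-separated with a header row.

```python
import csv
import io

def barcodes_with_few_snps(handle, min_snps):
    """Return barcodes whose DemuxSNPCount is below min_snps."""
    flagged = []
    for row in csv.DictReader(handle, delimiter="\t"):
        if int(row["DemuxSNPCount"]) < min_snps:
            flagged.append(row["Barcode"])
    return flagged

# Hypothetical excerpt; the real file also has per-sample and mixture columns.
example = io.StringIO(
    "Barcode\tDemuxSNPCount\tDemuxReadCount\n"
    "AAACCTGA\t42\t120\n"
    "AAACGGGT\t3\t7\n"
)
flagged = barcodes_with_few_snps(example, min_snps=10)
print(flagged)
```

The threshold is a judgment call that depends on sequencing depth and on how many variants distinguish the pooled individuals.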
The <prefix>.scRNA.metrics.demuxSamples.csv file contains per-sample metrics, similar to the metrics reported for the overall dataset in <prefix>.scRNA.metrics.csv.
Column | Description |
---|---|
Passing cells | The number of cell barcodes that passed. |
Fraction genic reads in cells | Counted reads assigned to the cells that passed. |
Median reads per cell | Total counted reads per cell that passed the filters. |
Median UMIs per cell | Total counted UMIs per cell that passed the filters. |
Median genes per cell | The median number of genes detected per cell among the cells that passed the filters. |