Outputs

Single-cell RNA outputs are written to the standard DRAGEN output location, with file names of the form <prefix>.scRNA.*.

Counts

The following three files provide per-cell gene expression levels in Matrix Market (*.mtx) format:

File	Description
<prefix>.scRNA.matrix.mtx.gz	Count of unique UMIs for each cell/gene pair in sparse matrix format.
<prefix>.scRNA.barcodes.tsv.gz	Cell-barcode sequence for each cell from the matrix. This includes all cell-barcodes.
<prefix>.scRNA.genes.tsv.gz	Gene name and ID for each gene in the matrix.

To find the subset of barcodes that correspond to passing cells, use the Filter column in <prefix>.scRNA.barcodeSummary.tsv. Passing barcodes have the value PASS; the available values are listed under Per-Cell Metrics.
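As a minimal sketch of selecting passing cells, the following reads the barcode summary with the Python standard library and keeps the barcodes whose Filter value is PASS. It assumes the file is tab-separated with a header row that contains the Barcode and Filter columns described under Per-Cell Metrics. The function name and the sample rows are illustrative and are not DRAGEN output.

```python
import csv
import io


def passing_barcodes(handle):
    """Return the barcodes whose Filter column equals PASS.

    Assumes a tab-separated file with a header row that includes
    'Barcode' and 'Filter' columns (an assumption, not a documented schema).
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return [row["Barcode"] for row in reader if row["Filter"] == "PASS"]


# Illustrative rows only. A real run would open <prefix>.scRNA.barcodeSummary.tsv.
example = io.StringIO(
    "Barcode\tUMIs\tFilter\n"
    "AAACCTGA\t1500\tPASS\n"
    "AAACGGGT\t12\tLOW\n"
    "AAAGATGC\t2300\tPASS\n"
)
print(passing_barcodes(example))
```

For a real file, pass an open text handle, for example open(path, newline="") on the barcode summary file.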

The output also includes filtered matrix files, which contain the per-cell gene expression only for cells that passed filtering (PASS), in Matrix Market (*.mtx) format. The <prefix>.scRNA.genes.tsv.gz file is shared by the unfiltered and filtered matrices:

File	Description
<prefix>.scRNA.filtered.matrix.mtx.gz	Count of unique UMIs for each filtered cell/gene pair in sparse matrix format.
<prefix>.scRNA.filtered.barcodes.tsv.gz	Cell-barcode sequence for each filtered cell from the matrix.

Loading output in a dense matrix

Loading the matrix into a dense pandas DataFrame in Python produces output in a readable format. However, Illumina recommends keeping a sparse representation of the matrix, because dense matrices use significantly more memory and disk space. Several tools are available to work efficiently with sparse representations of single-cell matrices. Illumina tested loading the matrix in Python 3.10.0 using scanpy 1.9.3 and pandas 1.5.3.

Follow the steps below:

Enter the following command to install the required libraries:

> pip install -U scanpy pandas

Use the following Python commands to load the matrix in a dense representation:

# import libraries
import pandas as pd
import scanpy as sc

# define path to input files
matrix_path = "path/to/matrix.mtx.gz"
features_path = "path/to/genes.tsv.gz"
barcodes_path = "path/to/barcodes.tsv.gz"

# load matrix through scanpy and transpose so that rows are cells
adata = sc.read_mtx(matrix_path).T
# second column of the genes file holds the gene names
adata.var_names = pd.read_csv(features_path, sep="\t", header=None, compression="gzip")[1]
# first column of the barcodes file holds the cell-barcode sequences
adata.obs_names = pd.read_csv(barcodes_path, sep="\t", header=None, compression="gzip")[0]

# convert scanpy internal format (AnnData) to dense pandas DataFrame
df = pd.DataFrame(adata.X.todense(), index=adata.obs_names, columns=adata.var_names)

# save it as CSV file
df.to_csv("output_matrix.csv")

The dense matrix can be saved in various output formats (eg, CSV), but this is not recommended because of the high disk usage.
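To illustrate why the sparse representation is compact, the following dependency-free sketch parses Matrix Market coordinate text, where only nonzero cell/gene pairs are stored, and sums the UMI counts per cell without building a dense matrix. It assumes that rows are genes and columns are cells, which is inferred from the transpose in the scanpy commands above. The function names and the tiny example matrix are hypothetical.

```python
from collections import defaultdict


def read_mtx(lines):
    """Parse Matrix Market coordinate text into (n_rows, n_cols, entries).

    entries maps 0-based (row, col) to the stored count. Pairs that are
    absent from the file are implicit zeros.
    """
    entries = {}
    shape = None
    for line in lines:
        line = line.strip()
        if not line or line.startswith("%"):
            continue  # skip the banner, comment lines, and blank lines
        fields = line.split()
        if shape is None:
            # first data line: number of rows, columns, and stored entries
            shape = (int(fields[0]), int(fields[1]))
            continue
        row, col, value = int(fields[0]), int(fields[1]), int(float(fields[2]))
        entries[(row - 1, col - 1)] = value  # indices in the file are 1-based
    if shape is None:
        raise ValueError("no size line found")
    return shape[0], shape[1], entries


def umis_per_cell(entries):
    """Sum counts per column (assumption: columns are cells)."""
    totals = defaultdict(int)
    for (_gene, cell), count in entries.items():
        totals[cell] += count
    return dict(totals)


# Tiny illustrative matrix: 3 genes x 2 cells with 3 stored entries.
example = """%%MatrixMarket matrix coordinate integer general
3 2 3
1 1 5
3 1 2
2 2 7
""".splitlines()
n_genes, n_cells, entries = read_mtx(example)
print(n_genes, n_cells, umis_per_cell(entries))
```

For a real file, the lines could come from a handle opened with gzip.open(path, "rt"). For routine analysis, the tested scanpy workflow remains the more practical choice.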

Overall Metrics

The <prefix>.scRNA.metrics.csv file contains per-sample scRNA metrics.

Barcode Read Metrics

Metric	Description
Invalid barcode read	Overall barcode sequence (cell barcode + UMI) failed basic checks. For example, the barcode read was missing or too short.
Error free cell-barcode	Reads with cell-barcode sequences that were not altered during error correction. For example, if the read was an exact match to the allow list.
Error corrected cell-barcode	Reads with cell-barcode sequences successfully corrected to a valid sequence.
Filtered cell-barcode	Reads with cell-barcode sequences that could not be corrected to a valid sequence. For example, the sequence does not match any allow list entry with at most one mismatch.
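The three cell-barcode categories above can be sketched as a simple classification against the allow list. This is an illustration of the idea only and is not DRAGEN's implementation. It assumes that mismatches are counted as Hamming distance, and it treats a sequence that is within one mismatch of more than one allow list entry as filtered; the source does not say how such ties are resolved. All names are hypothetical.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def classify_barcode(seq, allow_list):
    """Return (category, barcode) for one observed cell-barcode sequence."""
    if seq in allow_list:
        # exact match: the sequence was not altered
        return "error_free", seq
    candidates = [
        entry for entry in allow_list
        if len(entry) == len(seq) and hamming(seq, entry) == 1
    ]
    if len(candidates) == 1:
        # exactly one allow list entry is one mismatch away
        return "corrected", candidates[0]
    # No entry within one mismatch, or more than one (assumed ambiguous).
    return "filtered", None


allow_list = {"ACGT", "TTTT"}  # illustrative allow list
for observed in ("ACGT", "ACGA", "GGGG"):
    print(observed, classify_barcode(observed, allow_list))
```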

Transcript Read Metrics

Metric	Description
Unique exon match	Reads with valid cell-barcode and UMI that match a unique gene.
Unique intron match	Reads that do not match exons but match introns of exactly one gene. Applies when using --single-cell-count-introns=true.
Ambiguous match	Reads match to multiple genes.
Wrong strand	Reads overlap a gene on the strand opposite to the one defined by the library type.
Mitochondrial reads	Reads map to the mitochondrial genome. For example, if there is a matching gene.
No gene	Reads do not match to any gene. Includes intronic reads unless using --single-cell-count-introns=true.
Filtered multimapper	Reads excluded due to multiple alignment positions in the genome.
Feature reads	Reads matching to features, when using feature counting.

UMI Count Metrics

Metric	Description
Total counted reads	Reads with valid cell-barcode and UMI that match a unique gene.
Reads with error-corrected UMI	Counted reads where the UMI was error-corrected to match another similar UMI sequence.
Reads with invalid UMI	Reads that were not counted due to invalid UMI sequence. For example, pure homopolymer reads or reads containing Ns.
Sequencing saturation	Fraction of reads with duplicate UMIs, computed as 1 - (UMIs / Reads).
Unique cell-barcodes	Overall number of unique cell-barcode sequences in counted reads only.
Unique UMIs	Overall number of unique cell-barcode and UMI combinations counted.
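The sequencing saturation formula can be written as a short function. This sketch assumes that "UMIs" in the formula corresponds to the Unique UMIs metric and "Reads" to the Total counted reads metric; the table does not state this mapping explicitly. The function name and the input numbers are illustrative.

```python
def sequencing_saturation(unique_umis, counted_reads):
    """Compute 1 - (UMIs / Reads), the fraction of reads with duplicate UMIs."""
    if counted_reads <= 0:
        raise ValueError("counted_reads must be positive")
    return 1 - unique_umis / counted_reads


# Illustrative numbers: 1000 counted reads that collapse to 750 unique UMIs.
print(sequencing_saturation(750, 1000))
```

A value near 0 means that most reads carry a new UMI, and a value near 1 means that most reads are duplicates of UMIs that were already observed.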

Cell Metrics

Metric	Description
UMI threshold for passing cells	Number of UMIs required for a cell-barcode to pass filtering.
Passing cells	Number of cell-barcodes that passed the filters.
Fraction genic reads in cells	Counted reads assigned to cells that passed the filters.
Fraction reads putative cells	All counted reads assigned to cells that passed the filters.
Median reads per cells	Median of the total counted reads per cell, across cells that passed the filters.
Median UMIs per cells	Median of the total counted UMIs per cell, across cells that passed the filters.
Median genes per cells	Median number of genes with at least one UMI per cell, across cells that passed the filters.
Total genes detected	Genes with at least one UMI in at least one cell that passed the filters.
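As a rough cross-check of the cell metrics, similar summaries can be derived from per-cell rows such as those in <prefix>.scRNA.barcodeSummary.tsv. The sketch below assumes that the medians are taken over cells with the Filter value PASS, using the UMIs and Genes columns. Whether DRAGEN computes the reported metrics exactly this way is not stated here, so the results are approximate. The function name and the rows are hypothetical.

```python
from statistics import median


def summarize_cells(rows):
    """Summarize per-cell rows (dicts with 'UMIs', 'Genes', 'Filter' keys)."""
    passing = [row for row in rows if row["Filter"] == "PASS"]
    if not passing:
        raise ValueError("no passing cells")
    return {
        "Passing cells": len(passing),
        "Median UMIs per cells": median([int(row["UMIs"]) for row in passing]),
        "Median genes per cells": median([int(row["Genes"]) for row in passing]),
    }


# Illustrative rows only, not DRAGEN output.
rows = [
    {"UMIs": "100", "Genes": "50", "Filter": "PASS"},
    {"UMIs": "300", "Genes": "120", "Filter": "PASS"},
    {"UMIs": "200", "Genes": "80", "Filter": "PASS"},
    {"UMIs": "5", "Genes": "3", "Filter": "LOW"},
]
print(summarize_cells(rows))
```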

Per-Cell Metrics

The <prefix>.scRNA.barcodeSummary.tsv file contains summary statistics for each unique cell-barcode after error correction.

Metric	Description
ID	Unique numeric ID for the cell-barcode. The ID matches the line in the UMI count matrix (*.mtx) output.
Barcode	The cell-barcode sequence.
TotalReads	Total reads with the cell-barcode sequence. This includes error-corrected reads.
GeneReads	Reads counted towards a gene.
UMIs	Total number of UMIs in counted reads.
Genes	Unique genes detected.
MitochondrialReads	Reads mapped to the mitochondrial genome.
Filter	The available filter values are PASS (cell-barcode passes the filter) and LOW (UMI count is below the threshold).