Outputs

Single-cell ATAC outputs are found in the standard DRAGEN output location using the prefix scATAC.

Counts

The following three files provide the per-cell chromatin accessibility level in matrix market (*.mtx) format:

File	Description
<prefix>.scATAC.matrix.mtx.gz	Count of unique UMIs for each cell/peak pair in sparse matrix format.
<prefix>.scATAC.barcodes.tsv.gz	Cell-barcode sequence for each cell in the matrix. This file includes all cell-barcodes.
<prefix>.scATAC.peaks.tsv.gz	Peak name and ID for each peak in the matrix.

To identify the subset of barcodes that correspond to passing cells, use the Filter column in <prefix>.scATAC.barcodeSummary.tsv. Passing cells have the value PASS.
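The selection of passing barcodes can be sketched in a few lines of Python. This is a minimal illustration, not part of DRAGEN. It assumes the barcode summary file has a header row with columns named Barcode and Filter. The function name passing_barcodes and the example rows are made up for demonstration.

```python
import csv
import io


def passing_barcodes(summary_tsv):
    """Return the cell-barcodes whose Filter value is PASS.

    summary_tsv is an open text file (or file-like object) holding the
    tab-separated barcode summary. Assumption: the first row is a header
    that names the Barcode and Filter columns.
    """
    reader = csv.DictReader(summary_tsv, delimiter="\t")
    return [row["Barcode"] for row in reader if row["Filter"] == "PASS"]


# Made-up rows standing in for <prefix>.scATAC.barcodeSummary.tsv
example = io.StringIO(
    "Barcode\tTotalFragments\tFilter\n"
    "AAACGGTC\t900\tPASS\n"
    "GGGTTACA\t12\tLOW\n"
)
print(passing_barcodes(example))
```

For a real run, pass an open file handle for the summary file instead of the in-memory example.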

The output also includes filtered matrix files, which contain the per-cell chromatin accessibility level for the filtered cells only, in matrix market (*.mtx) format. The scATAC.peaks.tsv.gz file is common to the unfiltered and filtered matrices:

File	Description
<prefix>.scATAC.filtered.matrix.mtx.gz	Count of unique UMIs for each filtered cell/peak pair in sparse matrix format.
<prefix>.scATAC.filtered.barcodes.tsv.gz	Cell-barcode sequence for each filtered cell from the matrix.

Loading Output in a Dense Matrix

Loading the matrix into a dense pandas DataFrame with Python produces output in a human-readable format. However, dense matrices use significantly more memory and disk space, so Illumina recommends a sparse representation of the matrix. Several tools are available to work efficiently with sparse representations of single-cell matrices. Illumina tested loading the matrix in Python 3.10.0 using scanpy 1.9.3 and pandas 1.5.3.

Follow the steps below:

Enter the following command to install the required libraries:

> pip install -U scanpy pandas

Use the following Python commands to load the matrix in dense representation:

# import libraries
import pandas as pd
import scanpy as sc

# define path to input files
matrix_path = "path/to/matrix.mtx.gz"
features_path = "path/to/peaks.tsv.gz"
barcodes_path = "path/to/barcodes.tsv.gz"

# load matrix through scanpy
adata = sc.read_mtx(matrix_path).T
adata.var_names = pd.read_csv(features_path, sep="\t", header=None, compression="gzip")[1]
adata.obs_names = pd.read_csv(barcodes_path, sep="\t", header=None, compression="gzip")[0]

# convert scanpy internal format (AnnData) to dense pandas DataFrame
df = pd.DataFrame(adata.X.todense(), index=adata.obs_names, columns=adata.var_names)

# save it as CSV file
df.to_csv("output_matrix.csv")

The dense matrix can be saved in different output formats (eg, CSV), although this is not recommended due to high disk usage.
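As an alternative to the dense conversion, the matrix file can be processed entry by entry without ever building a dense table. The sketch below uses only the Python standard library and reads the coordinate-format Matrix Market file as (row, column, value) triplets. It assumes that rows are peaks and columns are cell-barcodes, as implied by the transpose in the loading commands above. The function names are hypothetical.

```python
import gzip
from collections import defaultdict


def read_mtx_triplets(path):
    """Yield (row, col, value) with 0-based indices from a gzipped
    coordinate-format Matrix Market file."""
    with gzip.open(path, "rt") as handle:
        size_line_seen = False
        for line in handle:
            # Skip the %%MatrixMarket banner, comments, and blank lines
            if line.startswith("%") or not line.strip():
                continue
            # The first remaining line holds: n_rows n_cols n_entries
            if not size_line_seen:
                size_line_seen = True
                continue
            row, col, value = line.split()
            # Matrix Market indices are 1-based
            yield int(row) - 1, int(col) - 1, float(value)


def counts_per_column(path):
    """Sum the counts per column (per cell-barcode, under the
    orientation assumed above) without densifying the matrix."""
    totals = defaultdict(float)
    for _, col, value in read_mtx_triplets(path):
        totals[col] += value
    return dict(totals)
```

Calling counts_per_column("path/to/matrix.mtx.gz") returns a dictionary keyed by 0-based column index, which corresponds to the line order of the barcodes file.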

Overall Metrics

The <prefix>.scATAC.metrics.csv file contains per-sample scATAC metrics.

Barcode Read Metrics

Metric	Description
Invalid barcode read	Overall barcode sequence failed basic checks. For example, the barcode read was missing or too short.
Error free cell-barcode	Reads with cell-barcode sequences that were not altered during error correction. For example, the barcode read was an exact match to a sequence in the allow list.
Error corrected cell-barcode	Reads with cell-barcode sequences successfully corrected to a valid sequence.
Filtered cell-barcode	Reads with cell-barcode sequences that could not be corrected to a valid sequence. For example, the sequence does not match any allow list sequence with at most one mismatch.
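The four barcode read categories can be illustrated with a small allow-list lookup. This is a simplified sketch of correction with at most one mismatch, not the DRAGEN implementation. The function name, the category labels, and the choice to filter a barcode that is one mismatch away from several allow-list entries are assumptions made for illustration.

```python
def classify_barcode(barcode, allow_list, barcode_length):
    """Return (corrected_barcode, category) for one barcode read.

    allow_list is a set of valid barcode sequences of length barcode_length.
    """
    # Basic checks: the barcode read is missing or has the wrong length
    if not barcode or len(barcode) != barcode_length:
        return None, "invalid"
    # Exact match: nothing to correct
    if barcode in allow_list:
        return barcode, "error_free"
    # Try every single-base substitution and keep those in the allow list
    candidates = set()
    for i, original_base in enumerate(barcode):
        for base in "ACGT":
            if base != original_base:
                candidate = barcode[:i] + base + barcode[i + 1:]
                if candidate in allow_list:
                    candidates.add(candidate)
    # Assumption: accept the correction only when it is unambiguous
    if len(candidates) == 1:
        return candidates.pop(), "error_corrected"
    return None, "filtered"


allow_list = {"AAAA", "CCCC"}
print(classify_barcode("AAAT", allow_list, 4))
```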

Genomic Fragment Metrics

Metric	Description
Fragments passing filters	Non-chimeric non-mitochondrial fragments that align to primary contigs with a high mapping quality (greater than 30 by default).
Non-primary contig fragments	Fragments that align to non-primary contigs (any contig that is not an autosome, X, or Y).
Chimeric fragments	Fragments with the two reads aligning to different contigs.
Mitochondrial fragments	Fragments aligning to the mitochondrial contigs.
Low mapping quality fragments	Fragments whose two reads align with a mapping quality that fails the mapping quality threshold (default is 30).
Improperly mapped fragments	Fragments whose two reads are not mapped as a proper pair (SAM flag "read mapped in proper pair" is set to 0).
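The fragment categories above can be read as a sequence of checks on each read pair. The sketch below is one possible ordering of those checks, written for illustration only. The Fragment class, the order in which categories are tested, the mitochondrial contig name chrM, and the reading of the threshold as "mapping quality must be greater than 30" are all assumptions. The proper-pair bit 0x2 is the standard SAM flag.

```python
from dataclasses import dataclass

PROPER_PAIR_FLAG = 0x2  # SAM flag bit: read mapped in proper pair


@dataclass
class Fragment:
    contig1: str  # contig of the first read
    contig2: str  # contig of the second read
    mapq1: int    # mapping quality of the first read
    mapq2: int    # mapping quality of the second read
    flag: int     # SAM flag of the first read


def classify_fragment(fragment, primary_contigs, mito_contigs=("chrM",), mapq_threshold=30):
    """Assign one fragment to a single category (assumed check order)."""
    if fragment.contig1 != fragment.contig2:
        return "chimeric"
    if fragment.contig1 in mito_contigs:
        return "mitochondrial"
    if fragment.contig1 not in primary_contigs:
        return "non_primary_contig"
    if not fragment.flag & PROPER_PAIR_FLAG:
        return "improperly_mapped"
    # Assumption: both reads must exceed the threshold to pass
    if min(fragment.mapq1, fragment.mapq2) <= mapq_threshold:
        return "low_mapq"
    return "passing"


primary = {"chr1", "chr2", "chrX", "chrY"}
print(classify_fragment(Fragment("chr1", "chr1", 60, 60, 99), primary))
```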

Cell Metrics

Metric	Description
Fragment threshold for passing cells	Number of fragments required for a cell-barcode to pass filtering.
Passing cells	Number of cell-barcodes that passed the filters.
Fraction peak fragments in passing cells	Percentage of counted fragments intersecting peaks assigned to cells that passed the filters.
Fraction fragments in passing cells	Percentage of all counted fragments assigned to cells that passed the filters.
Median fragments per cells	Median number of counted fragments per cell, across the cells that passed the filters.
Median peaks per cells	Median number of peaks with at least one fragment per cell, across the cells that passed the filters.
Total peaks detected	Peaks with at least one fragment in at least one cell that passed the filters.
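The peak-based cell metrics can be reproduced approximately from per-cell peak counts. The sketch below is illustrative only and assumes a hypothetical input layout: a dictionary that maps each cell-barcode to its per-peak counts, plus the set of passing barcodes. It returns a fraction between 0 and 1 rather than a percentage, and it does not claim to match the exact values DRAGEN reports.

```python
from statistics import median


def peak_metrics(counts, passing):
    """Compute peak-based metrics.

    counts maps barcode -> {peak name: fragment count}.
    passing is the set of barcodes that passed the filters.
    """
    passing_counts = {bc: peaks for bc, peaks in counts.items() if bc in passing}
    # Peaks with at least one fragment, per passing cell
    peaks_per_cell = [
        sum(1 for n in peaks.values() if n > 0) for peaks in passing_counts.values()
    ]
    # Peaks with at least one fragment in at least one passing cell
    detected = {
        peak
        for peaks in passing_counts.values()
        for peak, n in peaks.items()
        if n > 0
    }
    total = sum(sum(peaks.values()) for peaks in counts.values())
    in_passing = sum(sum(peaks.values()) for peaks in passing_counts.values())
    return {
        "median_peaks_per_cell": median(peaks_per_cell) if peaks_per_cell else 0,
        "total_peaks_detected": len(detected),
        "fraction_peak_fragments_in_passing_cells": in_passing / total if total else 0.0,
    }


counts = {"A": {"p1": 2, "p2": 1}, "B": {"p2": 3}, "C": {"p3": 4}}
print(peak_metrics(counts, passing={"A", "B"}))
```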

Per-Cell Metrics

The <prefix>.scATAC.barcodeSummary.tsv file contains summary statistics for each unique cell-barcode after error correction.

Metric	Description
	Unique numeric ID for the cell-barcode.
Barcode	The cell-barcode sequence.
TotalFragments	Total fragments with the cell-barcode sequence.
UniqueFragments	Unique fragments counted towards a peak.
NonPrimaryContigFragments	Unique non-primary contig fragments.
ChimericFragments	Unique chimeric fragments.
LowMapqFragments	Unique low mapping quality fragments.
MitochondrialFragments	Unique fragments mapped to the mitochondrial genome.
Peaks	Unique peaks detected.
Filter	Filter status of the cell-barcode. The available values are PASS (cell-barcode passes the filter) and LOW (UMI count is below threshold).