Peak Annotation
DRAGEN annotates each peak with respect to a gene symbol as promoter, distal, or intergenic depending on the genomic position of both the peak and the gene. The following rules are used to determine the annotation of a peak:
• If a peak overlaps the promoter region (-1000 bp to +100 bp) of any transcription start site (TSS), it is annotated as a promoter peak of the corresponding gene.
• If a peak is within 200 kb of the closest TSS and is not a promoter peak of that TSS's gene, it is annotated as a distal peak of that gene.
• If a peak overlaps the body of a transcript and is neither a promoter nor a distal peak of that gene, it is annotated as a distal peak of that gene with the distance set to zero.
• If a peak has not been mapped to any gene by the preceding rules, it is annotated as an intergenic peak with no gene symbol assigned.
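The rules above can be sketched in Python as follows. This is an illustrative sketch only, not DRAGEN's implementation. Several details are assumptions that the rules do not state: coordinates are inclusive, the promoter window is oriented by strand, distance is measured from the nearest peak edge to the TSS, "within 200 kb" is inclusive, and a promoter peak also reports its TSS distance. The names `Transcript` and `annotate_peak` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

PROMOTER_UPSTREAM = 1000    # bp upstream of the TSS (from the rules above)
PROMOTER_DOWNSTREAM = 100   # bp downstream of the TSS (from the rules above)
MAX_DISTAL = 200_000        # 200 kb; treated as inclusive (assumption)


@dataclass(frozen=True)
class Transcript:
    gene: str
    start: int    # leftmost coordinate, inclusive (assumption)
    end: int      # rightmost coordinate, inclusive (assumption)
    strand: str   # "+" or "-"

    @property
    def tss(self) -> int:
        return self.start if self.strand == "+" else self.end

    def promoter(self) -> Tuple[int, int]:
        # Assumption: "upstream" follows the transcript's strand.
        if self.strand == "+":
            return self.tss - PROMOTER_UPSTREAM, self.tss + PROMOTER_DOWNSTREAM
        return self.tss - PROMOTER_DOWNSTREAM, self.tss + PROMOTER_UPSTREAM


def overlaps(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    return a_start <= b_end and b_start <= a_end


def distance_to_point(start: int, end: int, pos: int) -> int:
    # Distance from the nearest peak edge to pos; 0 if pos lies inside the peak.
    if start <= pos <= end:
        return 0
    return start - pos if pos < start else pos - end


def annotate_peak(
    start: int, end: int, transcripts: List[Transcript]
) -> List[Tuple[Optional[str], Optional[int], str]]:
    """Return (gene, distance, annotation) tuples for one peak on one chromosome."""
    found = {}  # gene -> (distance, annotation)

    # Rule 1: the peak overlaps the promoter region of any TSS.
    for t in transcripts:
        p_start, p_end = t.promoter()
        if overlaps(start, end, p_start, p_end):
            d = distance_to_point(start, end, t.tss)
            if t.gene not in found or d < found[t.gene][0]:
                found[t.gene] = (d, "promoter")

    # Rule 2: within 200 kb of the closest TSS and not a promoter peak of its gene.
    if transcripts:
        closest = min(transcripts, key=lambda t: distance_to_point(start, end, t.tss))
        d = distance_to_point(start, end, closest.tss)
        if d <= MAX_DISTAL and closest.gene not in found:
            found[closest.gene] = (d, "distal")

    # Rule 3: overlaps a transcript body and is not yet annotated for that gene.
    for t in transcripts:
        if overlaps(start, end, t.start, t.end) and t.gene not in found:
            found[t.gene] = (0, "distal")

    # Rule 4: nothing matched, so the peak is intergenic with no gene symbol.
    if not found:
        return [(None, None, "intergenic")]
    return [(gene, d, label) for gene, (d, label) in found.items()]


# Made-up transcripts on a single chromosome, for illustration only.
transcripts = [
    Transcript("GENE_A", 10_000, 20_000, "+"),
    Transcript("GENE_B", 500_000, 600_000, "-"),
    Transcript("GENE_C", 1_000_000, 1_500_000, "+"),
]
print(annotate_peak(9_500, 9_800, transcripts))  # [('GENE_A', 200, 'promoter')]
```

In this sketch a peak can receive annotations for more than one gene, because the promoter rule is checked against every TSS while the distal rule only considers the closest one.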
To enable peak annotation in the DRAGEN scATAC-seq workflow, specify a gene annotation file (GTF) using the option -a. Peak annotations are written to a file named <prefix>.scATAC.peaks.tsv, and each annotation is represented as a row with the following 6 columns:
• Distance from peak to gene
• Peak annotation (i.e., promoter, distal, or intergenic)
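As a quick check on the output, the annotation labels in the file can be tallied with a few lines of Python. This is a minimal sketch that assumes the file is tab-separated and that the peak annotation is the last field of each row; rows whose last field is not one of the three labels (for example a header, if one is present) are skipped. The function name `count_annotations` is hypothetical.

```python
import csv
from collections import Counter
from typing import Iterable

LABELS = {"promoter", "distal", "intergenic"}


def count_annotations(lines: Iterable[str]) -> Counter:
    """Count peak annotations, reading the label from the last tab-separated field."""
    counts = Counter()
    for row in csv.reader(lines, delimiter="\t"):
        # Assumption: the annotation label is the final column of each row.
        if row and row[-1] in LABELS:
            counts[row[-1]] += 1
    return counts
```

The function accepts any iterable of lines, so an open file handle for <prefix>.scATAC.peaks.tsv (opened with `newline=""`, as the `csv` module recommends) can be passed in directly.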