Download Data Files

Annotation data files are stored under a single top-level directory that you create. After the download completes, this directory contains three subdirectories:

Cache contains gene models.
SupplementaryAnnotation contains external data sources like dbSNP and gnomAD.
References contains the reference genome.
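As a quick sanity check after a download, the layout described above can be verified with a small shell function. This is a minimal sketch: check_annotation_dirs is a hypothetical helper name and not part of the Downloader, and the only thing it assumes is the three subdirectory names listed above.

```shell
# check_annotation_dirs is a hypothetical helper, not part of the Downloader.
# It reports which of the three expected subdirectories exist under the
# given top-level data directory and returns non-zero if any is missing.
check_annotation_dirs() {
    local data_dir="$1"
    local missing=0
    local sub
    for sub in Cache SupplementaryAnnotation References; do
        if [ -d "$data_dir/$sub" ]; then
            echo "found:   $sub"
        else
            echo "missing: $sub"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, check_annotation_dirs ~/Data prints one line per subdirectory and returns a non-zero status if the download has not yet populated all three.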

The following command line options are used:

Option   Value                     Example   Description
--ga     GRCh37, GRCh38, or both   GRCh38    Genome assembly
--out    Output directory          ~/Data    Top-level output directory

Download data files as follows.

If the DRAGEN Server does not have an internet connection, copy the Downloader executable to a non-DRAGEN server that is connected to the internet and download the annotation data there. After the download completes, copy the annotation data to the DRAGEN Server for subsequent annotation.
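The copy step for an offline DRAGEN Server could be sketched as follows. This is an illustration under stated assumptions, not a documented procedure: it assumes rsync and SSH access between the two machines, copy_to_dragen is a hypothetical helper name, and the destination host shown in the usage line is a placeholder.

```shell
# Assumption: rsync is installed on the connected machine and the DRAGEN
# Server is reachable over SSH. RSYNC can be overridden for testing.
RSYNC="${RSYNC:-rsync}"

# copy_to_dragen is a hypothetical helper, not part of DRAGEN.
#   $1  local data directory that the Downloader populated
#   $2  rsync destination, for example user@dragen-server:Data (placeholder)
copy_to_dragen() {
    local src_dir="$1"
    local dest="$2"
    if [ ! -d "$src_dir" ]; then
        echo "no such directory: $src_dir" >&2
        return 1
    fi
    # -a preserves the directory tree and file attributes; --partial keeps
    # partially transferred files so an interrupted copy can be rerun.
    # The trailing slash copies the contents of src_dir into dest.
    "$RSYNC" -a --partial "$src_dir/" "$dest/"
}
```

A call such as copy_to_dragen ~/Data user@dragen-server:Data would then mirror the downloaded tree, including its three subdirectories, onto the DRAGEN Server.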

1. To create a data directory, enter the following command.

This example creates the Data directory in your home directory.

mkdir ~/Data

2. Download the files for a genome assembly.

This example downloads the genome assembly GRCh38.

/opt/edico/share/nirvana/Downloader --ga GRCh38 --out ~/Data

You can rerun the same command at any time to resynchronize the data sources with the Illumina Annotation Engine servers. Resynchronizing performs the following actions:

Remove obsolete files, such as old versions of data sources, from the output directory.
Download newer files.
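Because the initial download and a later resynchronization use the same command, both steps can be wrapped in one function. The sketch below assumes the Downloader path from the example above and the literal --ga values listed in the options table. sync_annotation_data is a hypothetical wrapper name.

```shell
# Path taken from the example above; override DOWNLOADER if yours differs.
DOWNLOADER="${DOWNLOADER:-/opt/edico/share/nirvana/Downloader}"

# sync_annotation_data is a hypothetical wrapper, not part of the Downloader.
#   $1  genome assembly: GRCh37, GRCh38, or both
#   $2  top-level output directory
sync_annotation_data() {
    local assembly="$1"
    local out_dir="$2"
    case "$assembly" in
        GRCh37|GRCh38|both) ;;
        *)
            echo "unsupported assembly: $assembly" >&2
            return 2
            ;;
    esac
    # Create the output directory on first use; harmless on later runs.
    mkdir -p "$out_dir" || return 1
    "$DOWNLOADER" --ga "$assembly" --out "$out_dir"
}
```

Running sync_annotation_data GRCh38 ~/Data a second time is then the resynchronization step: the Downloader itself decides which files are obsolete or outdated.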

The command produces output similar to the following:

---------------------------------------------------------------------------
Downloader (c) 2020 Illumina, Inc.
Stromberg, Roy, Lajugie, Jiang, Li, and Kang 3.9.1-0-gc823805
---------------------------------------------------------------------------

- downloading manifest... 37 files.

- downloading file metadata:
- finished (00:00:00.8).

- downloading files (22.123 GB):
- downloading 1000_Genomes_Project_Phase_3_v3_plus_refMinor.rma.idx (GRCh38)
- downloading MITOMAP_20200224.nsa.idx (GRCh38)
- downloading ClinVar_20200302.nsa.idx (GRCh38)
- downloading REVEL_20160603.nsa.idx (GRCh38)
- downloading phyloP_hg38.npd.idx (GRCh38)
- downloading ClinGen_Dosage_Sensitivity_Map_20200131.nsi (GRCh38)
- downloading MITOMAP_SV_20200224.nsi (GRCh38)
- downloading dbSNP_151_globalMinor.nsa.idx (GRCh38)
- downloading ClinGen_Dosage_Sensitivity_Map_20190507.nga (GRCh38)
- downloading PrimateAI_0.2.nsa.idx (GRCh38)
- downloading ClinGen_disease_validity_curations_20191202.nga (GRCh38)
- downloading 1000_Genomes_Project_Phase_3_v3_plus.nsa.idx (GRCh38)
- downloading SpliceAi_1.3.nsa.idx (GRCh38)
- downloading dbSNP_153.nsa.idx (GRCh38)
- downloading TOPMed_freeze_5.nsa.idx (GRCh38)
- downloading MITOMAP_20200224.nsa (GRCh38)
- downloading gnomAD_2.1.nsa.idx (GRCh38)
- downloading ClinGen_20160414.nsi (GRCh38)
- downloading gnomAD_gene_scores_2.1.nga (GRCh38)
- downloading 1000_Genomes_Project_(SV)_Phase_3_v5a.nsi (GRCh38)
- downloading MultiZ100Way_20171006.pcs (GRCh38)
- downloading 1000_Genomes_Project_Phase_3_v3_plus_refMinor.rma (GRCh38)
- downloading ClinVar_20200302.nsa (GRCh38)
- downloading OMIM_20200409.nga (GRCh38)
- downloading Both.transcripts.ndb (GRCh38)
- downloading REVEL_20160603.nsa (GRCh38)
- downloading PrimateAI_0.2.nsa (GRCh38)
- downloading dbSNP_151_globalMinor.nsa (GRCh38)
- downloading Both.sift.ndb (GRCh38)
- downloading Both.polyphen.ndb (GRCh38)
- downloading Homo_sapiens.GRCh38.Nirvana.dat
- downloading 1000_Genomes_Project_Phase_3_v3_plus.nsa (GRCh38)
- downloading phyloP_hg38.npd (GRCh38)
- downloading SpliceAi_1.3.nsa (GRCh38)
- downloading TOPMed_freeze_5.nsa (GRCh38)
- downloading dbSNP_153.nsa (GRCh38)
- downloading gnomAD_2.1.nsa (GRCh38)
- finished (00:04:10.1).

Description                                                 Status
---------------------------------------------------------------------------
1000_Genomes_Project_(SV)_Phase_3_v5a.nsi (GRCh38)          OK
1000_Genomes_Project_Phase_3_v3_plus.nsa (GRCh38)           OK
1000_Genomes_Project_Phase_3_v3_plus.nsa.idx (GRCh38)       OK
1000_Genomes_Project_Phase_3_v3_plus_refMinor.rma (GRCh38)  OK
1000_Genomes_Project_Phase_3_v3_plus_refMinor.rma.idx (...  OK
Both.polyphen.ndb (GRCh38)                                  OK
Both.sift.ndb (GRCh38)                                      OK
Both.transcripts.ndb (GRCh38)                               OK
ClinGen_20160414.nsi (GRCh38)                               OK
ClinGen_Dosage_Sensitivity_Map_20190507.nga (GRCh38)        OK
ClinGen_Dosage_Sensitivity_Map_20200131.nsi (GRCh38)        OK
ClinGen_disease_validity_curations_20191202.nga (GRCh38)    OK
ClinVar_20200302.nsa (GRCh38)                               OK
ClinVar_20200302.nsa.idx (GRCh38)                           OK
Homo_sapiens.GRCh38.Nirvana.dat                             OK
MITOMAP_20200224.nsa (GRCh38)                               OK
MITOMAP_20200224.nsa.idx (GRCh38)                           OK
MITOMAP_SV_20200224.nsi (GRCh38)                            OK
MultiZ100Way_20171006.pcs (GRCh38)                          OK
OMIM_20200409.nga (GRCh38)                                  OK
PrimateAI_0.2.nsa (GRCh38)                                  OK
PrimateAI_0.2.nsa.idx (GRCh38)                              OK
REVEL_20160603.nsa (GRCh38)                                 OK
REVEL_20160603.nsa.idx (GRCh38)                             OK
SpliceAi_1.3.nsa (GRCh38)                                   OK
SpliceAi_1.3.nsa.idx (GRCh38)                               OK
TOPMed_freeze_5.nsa (GRCh38)                                OK
TOPMed_freeze_5.nsa.idx (GRCh38)                            OK
dbSNP_151_globalMinor.nsa (GRCh38)                          OK
dbSNP_151_globalMinor.nsa.idx (GRCh38)                      OK
dbSNP_153.nsa (GRCh38)                                      OK
dbSNP_153.nsa.idx (GRCh38)                                  OK
gnomAD_2.1.nsa (GRCh38)                                     OK
gnomAD_2.1.nsa.idx (GRCh38)                                 OK
gnomAD_gene_scores_2.1.nga (GRCh38)                         OK
phyloP_hg38.npd (GRCh38)                                    OK
phyloP_hg38.npd.idx (GRCh38)                                OK
---------------------------------------------------------------------------

Peak memory usage: 52.3 MB
Time: 00:04:12.2
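If the Downloader output is saved to a file, the status table can be checked automatically. The following sketch lists every table row whose last column is not OK. It assumes the table layout shown above (a "Description Status" header, a dashed rule, the file rows, and a closing dashed rule). The example output shows only OK rows, so the wording of a failure status is unknown and the check makes no assumption about it. failed_downloads is a hypothetical helper name.

```shell
# failed_downloads is a hypothetical helper, not part of the Downloader.
# It reads a saved Downloader log (file argument or standard input), prints
# each status-table row whose last field is not "OK", and returns non-zero
# if any such row exists.
failed_downloads() {
    awk '
        # The table starts at the "Description Status" header line.
        $1 == "Description" && $2 == "Status" { in_table = 1; rules = 0; next }
        # The first dashed rule follows the header; the second ends the table.
        in_table && /^-+$/ { if (++rules == 2) in_table = 0; next }
        # Inside the table, skip blank lines and flag any non-OK status.
        in_table && NF > 0 && $NF != "OK" { print; bad = 1 }
        END { exit (bad ? 1 : 0) }
    ' "$@"
}
```

One way to use it is to capture the output with tee when running the Downloader, for example by appending "| tee download.log" to the download command, and then run failed_downloads download.log.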