Download Data Files

To store annotation data files, create a top-level directory. The created directory contains three subdirectories:

• Cache contains gene models.
• SupplementaryAnnotation contains external data sources like dbSNP and gnomAD.
• References contains the reference genome.
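After a download completes, the layout described above can be checked with a short script. The following is a minimal sketch, not part of the Downloader: the function name check_layout is hypothetical, and it only assumes the three subdirectory names listed above.

```shell
# check_layout is a hypothetical helper, not part of the Downloader.
# It reports any of the three expected subdirectories that are missing
# and returns a nonzero status if at least one is absent.
check_layout() {
    data_dir="$1"
    missing=0
    for sub in Cache SupplementaryAnnotation References; do
        if [ ! -d "$data_dir/$sub" ]; then
            echo "missing: $data_dir/$sub" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, check_layout ~/Data && echo "layout looks complete" prints the message only when all three subdirectories exist.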

The following command line options are used:

Option	Value	Example	Description
--ga	GRCh37, GRCh38, or both	GRCh38	Genome assembly
--out	Output directory	~/Data	Top-level output directory
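The two options can be combined in a small wrapper that rejects an unsupported assembly before any download starts. This is a sketch under stated assumptions: the function name download_annotation_data is hypothetical, the Downloader path is the one used in the example in this section, and the literal value both is assumed to be accepted exactly as written in the table.

```shell
# Path taken from the example in this section; override DOWNLOADER if yours differs.
DOWNLOADER="${DOWNLOADER:-/opt/edico/share/nirvana/Downloader}"

# Hypothetical wrapper: validate the --ga value, create the output
# directory if needed, then run the Downloader with both options.
download_annotation_data() {
    assembly="$1"
    out_dir="$2"
    case "$assembly" in
        GRCh37|GRCh38|both) ;;   # values listed in the options table
        *)
            echo "unsupported genome assembly: $assembly" >&2
            return 2
            ;;
    esac
    mkdir -p "$out_dir" || return 1
    "$DOWNLOADER" --ga "$assembly" --out "$out_dir"
}
```

For example, download_annotation_data both ~/Data would request both assemblies into ~/Data, while a value such as hg19 is rejected without contacting the server.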

Download data files as follows.

If the DRAGEN Server does not have an internet connection, the Downloader executable can be copied to a non-DRAGEN Server that is connected to the internet to download the annotation data. Once the download has completed, the annotation data can then be copied locally to the DRAGEN Server for subsequent annotation.
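One way to carry the downloaded data to an offline DRAGEN Server is to pack it into a single archive with a checksum, so the copy can be verified before it is used. This is only a sketch: the helper names pack_data and unpack_data are hypothetical, the transfer mechanism itself (for example scp or a removable drive) is left open, and the commands assume GNU tar and sha256sum are available on both machines.

```shell
# Hypothetical helpers for moving downloaded data to an offline DRAGEN Server.

# pack_data: run on the internet-connected machine after the download completes.
pack_data() {
    data_dir="$1"   # for example ~/Data
    archive="$2"    # for example /mnt/transfer/Data.tar
    tar -cf "$archive" -C "$(dirname "$data_dir")" "$(basename "$data_dir")" || return 1
    # Record a checksum next to the archive so the copy can be verified later.
    ( cd "$(dirname "$archive")" &&
      sha256sum "$(basename "$archive")" > "$(basename "$archive").sha256" )
}

# unpack_data: run on the DRAGEN Server after copying the archive and its
# .sha256 file into the same directory.
unpack_data() {
    archive="$1"
    dest_parent="$2"   # directory that will contain the extracted data directory
    # Refuse to extract if the archive does not match the recorded checksum.
    ( cd "$(dirname "$archive")" &&
      sha256sum -c "$(basename "$archive").sha256" >/dev/null ) || return 1
    mkdir -p "$dest_parent" && tar -xf "$archive" -C "$dest_parent"
}
```

The archive is left uncompressed on the assumption that the annotation files are already in compact binary formats; add compression if your transfer medium makes it worthwhile.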

To create a data directory, enter the following command.

This example creates the Data directory in your home directory.

mkdir ~/Data

Download the files for a genome assembly.

This example downloads the genome assembly GRCh38.

/opt/edico/share/nirvana/Downloader --ga GRCh38 --out ~/Data

You can use the same command to resynchronize the data sources with the Illumina Annotation Engine servers, including the following actions:

• Remove obsolete files, such as old versions of data sources, from the output directory.
• Download newer files.

The command produces output similar to the following:

---------------------------------------------------------------------------
Stromberg, Roy, Lajugie, Jiang, Li, and Kang 3.9.1-0-gc823805
---------------------------------------------------------------------------

- downloading manifest... 37 files.

- downloading file metadata:
- finished (00:00:00.8).

- downloading files (22.123 GB):
- downloading 1000_Genomes_Project_Phase_3_v3_plus_refMinor.rma.idx (GRCh38)
- downloading MITOMAP_20200224.nsa.idx (GRCh38)
- downloading ClinVar_20200302.nsa.idx (GRCh38)
- downloading REVEL_20160603.nsa.idx (GRCh38)
- downloading phyloP_hg38.npd.idx (GRCh38)
- downloading ClinGen_Dosage_Sensitivity_Map_20200131.nsi (GRCh38)
- downloading MITOMAP_SV_20200224.nsi (GRCh38)
- downloading dbSNP_151_globalMinor.nsa.idx (GRCh38)
- downloading ClinGen_Dosage_Sensitivity_Map_20190507.nga (GRCh38)
- downloading PrimateAI_0.2.nsa.idx (GRCh38)
- downloading ClinGen_disease_validity_curations_20191202.nga (GRCh38)
- downloading 1000_Genomes_Project_Phase_3_v3_plus.nsa.idx (GRCh38)
- downloading SpliceAi_1.3.nsa.idx (GRCh38)
- downloading dbSNP_153.nsa.idx (GRCh38)
- downloading TOPMed_freeze_5.nsa.idx (GRCh38)
- downloading MITOMAP_20200224.nsa (GRCh38)
- downloading gnomAD_2.1.nsa.idx (GRCh38)
- downloading ClinGen_20160414.nsi (GRCh38)
- downloading gnomAD_gene_scores_2.1.nga (GRCh38)
- downloading 1000_Genomes_Project_(SV)_Phase_3_v5a.nsi (GRCh38)
- downloading MultiZ100Way_20171006.pcs (GRCh38)
- downloading 1000_Genomes_Project_Phase_3_v3_plus_refMinor.rma (GRCh38)
- downloading ClinVar_20200302.nsa (GRCh38)
- downloading OMIM_20200409.nga (GRCh38)
- downloading Both.transcripts.ndb (GRCh38)
- downloading REVEL_20160603.nsa (GRCh38)
- downloading PrimateAI_0.2.nsa (GRCh38)
- downloading dbSNP_151_globalMinor.nsa (GRCh38)
- downloading Both.sift.ndb (GRCh38)
- downloading Both.polyphen.ndb (GRCh38)
- downloading Homo_sapiens.GRCh38.Nirvana.dat
- downloading 1000_Genomes_Project_Phase_3_v3_plus.nsa (GRCh38)
- downloading phyloP_hg38.npd (GRCh38)
- downloading SpliceAi_1.3.nsa (GRCh38)
- downloading TOPMed_freeze_5.nsa (GRCh38)
- downloading dbSNP_153.nsa (GRCh38)
- downloading gnomAD_2.1.nsa (GRCh38)
- finished (00:04:10.1).

Description Status
---------------------------------------------------------------------------
1000_Genomes_Project_(SV)_Phase_3_v5a.nsi (GRCh38) OK
1000_Genomes_Project_Phase_3_v3_plus.nsa (GRCh38) OK
1000_Genomes_Project_Phase_3_v3_plus.nsa.idx (GRCh38) OK
1000_Genomes_Project_Phase_3_v3_plus_refMinor.rma (GRCh38) OK
1000_Genomes_Project_Phase_3_v3_plus_refMinor.rma.idx (... OK
Both.polyphen.ndb (GRCh38) OK
Both.sift.ndb (GRCh38) OK
Both.transcripts.ndb (GRCh38) OK
ClinGen_20160414.nsi (GRCh38) OK
ClinGen_Dosage_Sensitivity_Map_20190507.nga (GRCh38) OK
ClinGen_Dosage_Sensitivity_Map_20200131.nsi (GRCh38) OK
ClinGen_disease_validity_curations_20191202.nga (GRCh38) OK
ClinVar_20200302.nsa (GRCh38) OK
ClinVar_20200302.nsa.idx (GRCh38) OK
Homo_sapiens.GRCh38.Nirvana.dat OK
MITOMAP_20200224.nsa (GRCh38) OK
MITOMAP_20200224.nsa.idx (GRCh38) OK
MITOMAP_SV_20200224.nsi (GRCh38) OK
MultiZ100Way_20171006.pcs (GRCh38) OK
OMIM_20200409.nga (GRCh38) OK
PrimateAI_0.2.nsa (GRCh38) OK
PrimateAI_0.2.nsa.idx (GRCh38) OK
REVEL_20160603.nsa (GRCh38) OK
REVEL_20160603.nsa.idx (GRCh38) OK
SpliceAi_1.3.nsa (GRCh38) OK
SpliceAi_1.3.nsa.idx (GRCh38) OK
TOPMed_freeze_5.nsa (GRCh38) OK
TOPMed_freeze_5.nsa.idx (GRCh38) OK
dbSNP_151_globalMinor.nsa (GRCh38) OK
dbSNP_151_globalMinor.nsa.idx (GRCh38) OK
dbSNP_153.nsa (GRCh38) OK
dbSNP_153.nsa.idx (GRCh38) OK
gnomAD_2.1.nsa (GRCh38) OK
gnomAD_2.1.nsa.idx (GRCh38) OK
gnomAD_gene_scores_2.1.nga (GRCh38) OK
phyloP_hg38.npd (GRCh38) OK
phyloP_hg38.npd.idx (GRCh38) OK
---------------------------------------------------------------------------

Peak memory usage: 52.3 MB
Time: 00:04:12.2
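If the output is saved to a file, the status table can be scanned for entries that did not finish with OK. The sketch below is an illustration only: the function name failed_downloads is hypothetical, it assumes the table has the shape shown above (a Description Status header, a dashed rule, one row per file, and a closing dashed rule), and because this section shows no status value other than OK, it simply flags any row whose last column differs from OK.

```shell
# Hypothetical helper: print status-table rows whose last column is not "OK".
# Reads saved Downloader output from the named file(s), or from stdin.
# Returns 1 if any such row is found, 0 otherwise.
failed_downloads() {
    awk '
        # The table starts after the "Description Status" header line.
        /^Description[[:space:]]+Status/ { in_table = 1; next }
        # The first dashed rule opens the rows, the second one closes the table.
        in_table && /^-+[[:space:]]*$/ {
            seen_rule++
            if (seen_rule == 2) in_table = 0
            next
        }
        # Skip blank lines; flag any row that does not end in OK.
        in_table && NF > 0 && $NF != "OK" { print; bad = 1 }
        END { exit (bad ? 1 : 0) }
    ' "$@"
}
```

For example, after saving the output to a file such as download.log (a hypothetical name), failed_downloads download.log prints nothing and returns 0 when every row ends in OK.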