Analysis

How do I run my data through the RNA-Seq Alignment App?

Run the RNA-Seq workflow (FASTQ only) on the MiSeq and stream the data to BaseSpace. The BaseSpace RNA-Seq Alignment App analyzes data from the TruSight RNA Pan-Cancer Panel, providing a simple results summary that includes a fusion table, variant table, and gene expression table.

You can also use your own pipeline for analysis.

What analysis tools are available for Nextera XT?

The appropriate analysis tool depends on the application. MiSeq Reporter includes various analysis workflows, including amplicon, assembly, metagenomics, and resequencing. BaseSpace Sequence Hub also offers various apps for de novo assembly, metagenomics, resequencing, and so on.

Can I create my own manifest file to analyze my data?

No.

How often are IMGT/HLA references updated? Can I update references in Assign 2.0?

IMGT reference updates are made available every 9 months, approximately 2 months after the IMGT release. They are available from the TruSight HLA v2 Support web page. Follow the instructions provided in the Assign 2.0 software guide to update the references in Assign 2.0. These updates are not performed automatically and can be installed at your convenience. Reference updates do not change the software and the software instance remains unchanged.

What is the difference between Call Rate and Poly Call Rate in the Samples Table of the PC Module?

Call Rate is carried over from the Samples Table in the GenomeStudio Genotyping Module in which the original genotyping project (.bsc) was created. Entries in the Call Rate column do not change when SNPs are clustered in the PC Module. In contrast, the Poly Call Rate is calculated from clustering SNPs in the PC Module and represents the percentage of SNPs for which a given sample was assigned to a cluster.

How do I set up a sequencing run on the NextSeq500/550 with TruSight Cardio libraries and analyze the generated data?

When using BaseSpace, set up the run in the BaseSpace Prep tab. Select Nextera Rapid Capture from the dropdown menu under Library Prep kit to associate samples with the indexes provided in the TruSight Cardio Sequencing kit (NextSeq, 48 samples, Mid-output): E502, E503, E505, E506, and N701-12. Set up the run as a dual-index paired-end 151-cycle sequencing run. FASTQ files streamed into BaseSpace can be analyzed using the BWA Enrichment App or the Isaac Enrichment App (v2.0 and v2.1 custom manifest workflow).

When running the NextSeq in Standalone mode, enter the following parameters on the Run Setup Screen:

Read Type: Paired End

Read Length: Read 1: 151; Read 2: 151; Index 1: 8; Index 2: 8

When the run is complete, use the bcl2fastq2 converter for demultiplexing and FASTQ conversion. Create a sample sheet for demultiplexing using Illumina Experiment Manager. Choose NextSeq; NextSeq FASTQ only; Nextera Rapid Capture Enrichment Sample Prep kit. Perform data analysis using third-party software.

Can I install Assign 2.0 on a network drive, rather than on individual PCs?

Assign 2.0 TruSight HLA Analysis Software can be installed on a network drive, server, or individual PC. In addition, multiple types of installations can be installed in the same facility.

If I use BaseSpace Sequence Hub for run monitoring only, what files are sent?

The files that are sent to BaseSpace Sequence Hub are the InterOp folder, RunInfo.xml file, and RunParameters.xml file.

Which parameters can be configured to manipulate the default clustering algorithms?

You can configure the following parameters to manipulate the default clustering algorithms: Minimum Number of Points in a Cluster, Cluster Distance, and Maximum Number of Clusters in the SNP Table, which is in the Clustering Options dialog box.

How do I visualize my TruSight One targets in a genome browser?

TruSight One regions, targets, and probes are provided in *.bed file format and can be found by visiting the TruSight One Product Support page. The *.bed file can be imported into either the UCSC Genome Browser (genome.ucsc.edu) or the Integrated Genome Viewer (IGV) (www.broadinstitute.org/igv/).
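
Before importing a *.bed file into a browser, it can help to check that it parses cleanly. The following is a minimal sketch, assuming the standard BED layout (chromosome, 0-based start, end, optional name). The example lines and the names `BedRegion` and `parse_bed` are made up for illustration and are not taken from the TruSight One files.

```python
from typing import List, NamedTuple


class BedRegion(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (BED convention)
    end: int    # exclusive (BED convention)
    name: str


def parse_bed(text: str) -> List[BedRegion]:
    """Parse BED-formatted text, skipping comments and track/browser lines."""
    regions = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        name = fields[3] if len(fields) > 3 else ""
        regions.append(BedRegion(fields[0], int(fields[1]), int(fields[2]), name))
    return regions


# Hypothetical content, for illustration only.
example = "track name=example\nchr1\t100\t250\ttarget_A\nchr2\t500\t620\ttarget_B\n"
regions = parse_bed(example)
total_bp = sum(r.end - r.start for r in regions)
print(len(regions), total_bp)
```

Because BED start coordinates are 0-based and end coordinates are exclusive, the length of a region is simply `end - start`.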

What is the diff score and how does it relate to the p-value?

The diff score is a transformation of the p-value that provides directionality to the p-value based on the difference between the average signal in the reference group and the comparison group.

The formula is: DiffScore = 10 × sgn(cond − ref) × log10(p).

  • For a p-value of 0.05, DiffScore = 13
  • For a p-value of 0.01, DiffScore = 22
  • For a p-value of 0.001, DiffScore = 33

The p-value column is hidden by default. To display this column, use the Column Chooser.

Which assays and platforms are compatible with the PC Module?

The PC Module is compatible with GenomeStudio Genotyping projects created from Infinium assays run on the iScan System, HiScan System, BeadXpress Reader, or BeadArray Reader.

How do I know if I need to throttle BaseSpace, and how do I apply throttling?

The BaseSpace Broker is designed to upload data to BaseSpace as soon as the data are generated on the HiSeq local drive. It uses as much bandwidth as is necessary to keep up with the data being produced. Under typical HiSeq run conditions, the upload of run data for storage and analysis averages less than 10 Mbit/sec.

In most cases, throttling of the BaseSpace Broker data upload is not necessary. Throttling can be necessary if greater control over network bandwidth usage is required, such as at sites where instruments share the network with other users or at sites with limited upload speed. Throttling might also be necessary in scenarios where the local network connectivity is temporarily lost and then restored. This interruption causes the BaseSpace Broker to suddenly consume more network bandwidth as it attempts to catch up with the transfer of accumulated data. If no throttling is applied in such cases, the BaseSpace Broker might consume all available bandwidth on the network until the backlog of data is cleared. If throttling is applied and if the local network allows, Illumina recommends throttling to higher than the 10 Mbit/sec minimum specification. A recommended value of about 20 Mbit/sec (in practice 3 Mbytes/sec, which equals 24 Mbit/sec) allows the BaseSpace Broker enough bandwidth to recover, even if some delays in data transfer occur.

If throttling is needed, provide the following instructions to your local IT administrator:

Throttling of BaseSpace is performed on the HiSeq computer by application, rather than by IP address, as follows:

  1. In Windows, open a cmd window and run gpedit.msc to open the Local Group Policy Editor.
  2. Expand the Computer Configuration / Windows Settings nodes.
  3. Select Policy-based QoS.
  4. Right-click and select Create new policy.
  5. Enter a name, such as Limit BaseSpace upload.
  6. Clear the Specify DSCP Value checkbox.
  7. Select Specify Outbound Throttle Rate and enter 3 MBps (3 Mbytes/sec, or 24 Mbit/sec), which is sufficient to allow data transfer to catch up.
  8. Click Next.
  9. Select Only applications with this executable name and enter Illumina.BaseSpace.Broker.exe.
  10. This policy applies to any source IP and target IP addresses. Click Next.
  11. This policy applies to all ports and protocols. Click Finish.
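
The throttle rate is entered in Mbytes/sec, while the bandwidth figures above are given in Mbit/sec. As a quick check of the conversion (1 byte = 8 bits), here is a minimal sketch. The function names are illustrative and are not part of any Illumina or Windows tool.

```python
def mbytes_to_mbits(mbytes_per_sec: float) -> float:
    """Convert megabytes per second to megabits per second (1 byte = 8 bits)."""
    return mbytes_per_sec * 8


def mbits_to_mbytes(mbits_per_sec: float) -> float:
    """Convert megabits per second to megabytes per second."""
    return mbits_per_sec / 8


MINIMUM_SPEC_MBITS = 10  # minimum specification stated in the text

throttle_mbits = mbytes_to_mbits(3)  # the 3 Mbytes/sec entered in the policy
print(throttle_mbits)                       # 24
print(throttle_mbits > MINIMUM_SPEC_MBITS)  # True
```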

What tools are offered for data analysis with TruSight DNA Amplicon?

The following software tools are available for TruSight DNA Amplicon: Illumina Experiment Manager, MiSeq Reporter, Illumina Amplicon Viewer, and VariantStudio. The MiSeq Reporter is simple, on-instrument software that demultiplexes indexes and samples, performs alignment of reads, and creates a variant report immediately upon completion of each sequencing run. The MiSeq Reporter also creates an assembly for each amplicon (on a per sample basis), performs SNP calling, and generates graphs and reports including the sequencing coverage per amplicon. The Illumina Amplicon Viewer performs data visualization from multiple runs and custom report generation off the instrument.

The Illumina VariantStudio desktop application is available with purchase of TruSight panels and includes a one-year license to the VariantStudio desktop application. VariantStudio imports variant call files generated during analysis of sequencing data and provides commands to annotate variants. VariantStudio filters results using various filtering options, classifies variants according to their biological impact, and exports results to a report.

What is a high confidence fusion call?

A high confidence fusion call means that a fusion meets the threshold filters, which are based on scores from calculated values of split read scores, paired read scores, break-end homology, and several other features.

For more information on the calculations, see the Local Run Manager RNA Fusion Analysis Module Workflow Guide.

How do I view my analysis results?

In BaseSpace, click your Projects folder. Select Analysis and select the app session name that you saved your analysis under.

What version of bcl2fastq is supported for the NovaSeq System?

bcl2fastq v2.19 is supported for processing NovaSeq data.

What does the Fusion Score mean?

The score reflects the confidence in the fusion call, where 0 is low and 1 is high. Scores > 0.6 are reported as high confidence fusion calls.

Is there sample data I can view?

Data sets are available on the BaseSpace Public Data page under the "Methyl-Seq" category.

Where can I find information about integrating probes on a methylation BeadChip with probes on the HumanHT-12 Expression BeadChip?

Will there be issues demultiplexing current TruSeq v2/LT (single-index) libraries if a dual index run is done?

No. To enable demultiplexing, specify the correct use-bases-mask and use the appropriate sample sheet.

How reproducible is the data?

We performed experiments to test for reproducible detection of fusions in control samples across multiple replicates and library preps and demonstrated that the expected fusions (such as BCR-ABL1 in UHR) are consistently detected as high confidence calls. The exact values of the fusion confidence scores determined by the RNA Fusion Module can vary slightly from prep to prep. Therefore, fusions that score just above the default 0.6 score threshold may fall just below it in a replicate sample. In this case, it can be helpful to lower the fusion score threshold slightly below 0.6.

Can I import raw intensity files (idats) into the Polyploid Genotyping Module?

No, it is not possible to generate a new project in the Polyploid Genotyping Module directly from idats. The polyploid workflow requires that you first generate a genotyping project in the GenomeStudio Genotyping Module from the idats. The genotyping project (.bsc) can then be opened in the Polyploid Genotyping Module for polyploidy clustering.

What types of variants can TruSight Multiple Myeloma 42 detect?

The TruSight Multiple Myeloma 42 kit can detect single nucleotide variants (SNVs) and small indels. Use BaseSpace Variant Interpreter to filter and annotate variants.

Does the TruSight Myeloid Sequencing Panel detect translocations?

No, this panel is designed for the detection of SNVs and small insertions and deletions only. Translocations will not be detected with this assay.

Can I compare data from the TruSeq Stranded mRNA protocol to either the TruSeq RNA protocol or the Directional mRNA-Seq Sample Prep unsupported protocol?

The data is comparable to TruSeq RNA Library Prep v2 data. The TruSeq Stranded mRNA and Directional mRNA-Seq protocols are based on distinct ligation chemistries and the libraries they produce are expected to differ as a result. Direct sample-to-sample comparison is not recommended for these library types.

Where do I find the expression data?

In the RNA-Seq Alignment project, select a sample. In the Important Files for Download section, click the reference gene counts file.

What are my options for analyzing HiSeq 4000 data?

BaseSpace is the preferred analysis solution for the HiSeq 4000. For third-party analysis packages, you can use the bcl2fastq converter. CASAVA and HAS are not supported.

Which workflow do I use in Illumina Experiment Manager to create a sample sheet for my Nextera Rapid Capture Enrichment project?

Use the Enrichment workflow to create a sample sheet formatted for analysis of Nextera Rapid Capture Enrichment data. For instructions, see the Illumina Experiment Manager Software Guide.

What types of variants can TruSight Acute Lymphoblastic Leukemia 54 detect?

The TruSight Acute Lymphoblastic Leukemia 54 kit can detect single nucleotide variants (SNVs) and small indels. Use BaseSpace Variant Interpreter to filter and annotate variants.

What criteria should I use for LOH/CN analysis?

You should consider density, physical spacing, and Minor Allele Frequency (MAF). Loci with high MAF (~ 0.5) may be more informative than rare polymorphisms for this type of analysis. The control set must match the test sample population (MAF). You must have control samples with minimal amounts of chromosomal aberrations.

Is it recommended to normalize mRNA and miRNA together in the same project?

Illumina recommends normalizing mRNA data separately from miRNA data.

How do TruSight HLA and Assign deal with the intron sequence given that the IMGT/HLA database is missing the intron reference for 90% of the alleles?

Assign 2.0 breaks the sequence for each allele into 3 parts. The Core includes exons 2, 3, and 4 in Class I and exons 2 and 3 in Class II as well as sequence from known expression variants. The second part, Exons, includes all the other exons. By default, users only see sequence and phase mismatches from these 2 parts. It is simple to expand the view to see the third part, N-C, which includes the noncoding UTR and intron sequences. Even though the noncoding sequence is hidden from view by default, heterozygote positions within these regions are used for phase alignment. We also allow users to select the number of fields to review and report (Two, Three, or All). Ultimately, Illumina is supporting both IMGT and the 17th Workshop to solve the root of the issue, which is the missing noncoding reference.

What software can analyze the libraries?

The BaseSpace RNA-Seq Alignment App performs alignment and analyzes for gene fusions with STAR or TopHat. STAR is recommended for optimal fusion calling.

The BaseSpace TopHat Alignment App can also be used for alignment and fusion calling. The BaseSpace Cufflinks Assembly & DE App can be used to run differential expression analysis.

Third-party analysis tools are also available.

After what cycle is TruSeq control information available?

  • Control information for a read is processed after alignment (cycle 25) but is not reported until cycle 52. For an indexed read, control information is reported after cycle 52 of R2.
  • For runs shorter than 52 cycles, control reporting is accurate for all controls except CTE, which reports counts in all band sizes.

Can I edit a sequence in Assign 2.0 and can the edit be viewed in the report?

Yes, sequences can be edited using the Navigator functionality. Edits are flagged in the Coverage View and can be found using the forward and reverse arrows in the Navigator. There is an option to include sequence edits on the report, which shows the edit, base position, and user.

Why might unexpected fusions be called in my samples?

The Local Run Manager RNA Fusion module was designed to be a discovery tool. As such, it does not search for specific fusions, but performs an unbiased, whole genome alignment. To perform this whole genome alignment on a desktop sequencer or on a PC, it was necessary to create a compacted human genome reference, the Minimal Genome Reference. Although this provides the advantage of a local analysis solution, the Minimal Genome Reference does not retain the context that a contiguous whole genome reference has. Therefore, in some cases, read-through transcripts may be called as fusions. In addition, an unbiased, whole genome alignment approach may inherently identify unexpected transcripts. It is always recommended to confirm findings with an orthogonal approach, such as RT-PCR.

How do I analyze my TruSight Multiple Myeloma 42 data?

Analyze TruSight Multiple Myeloma 42 data using either of the following options:

  • MiSeq Reporter TruSeq Amplicon workflow
  • TruSeq Amplicon BaseSpace App

Where can I find my GenomeStudio software license keys?

Unique GenomeStudio Software 2011.1 license keys were discontinued in July 2016. License keys for each of the GenomeStudio 2011.1 modules are included in a text file with the software installer download without charge. See GenomeStudio Software Downloads.

Will this assay detect chromosomal translocations that do not generate a fusion gene?

Chromosomal translocations resulting in overexpression or deletion of a transcript can be reflected in gene expression levels but would not create a fusion gene. The Local Run Manager RNA Fusion module is not designed for detection of gene expression changes. To detect these changes, the RNA-Seq Alignment App in BaseSpace Sequence Hub is recommended. In addition, it is recommended to confirm these findings in DNA.

I see lower coverage of GC regions than I was expecting with the TruSeq Nano DNA Library Prep protocol. What could be a factor?

Coverage of GC regions can be impacted by the model, settings, and performance of the thermal cycler used. Illumina has validated the Bio-Rad DNA Engine Tetrad 2, the Bio-Rad S1000, and the MJ Research PTC-225 DNA Engine Tetrad. Other thermal cyclers may differ in their performance across the genome.

What file extensions does Assign 2.0 generate?

Assign 2.0 has four output options:

  • Full Report—Shows all selected loci and samples with all perfect matches with the option to include edits, sequences, and mismatches. Available as MS Excel (*.xls, *.xlsx, or *.xlsm), text (*.txt), and XML (*.xml).
  • Summary Report—Shows only unambiguous typings for all the samples in the project, quality scores for each locus, and coverage for each locus. Available as MS Excel (*.xls, *.xlsx, or *.xlsm) and text (*.txt).
  • Project Files—Allows the project to be reloaded without reanalysis (alignment, phasing, and typing). Output as *.cgp, an XML-based file specific to Assign 2.0.
  • FASTA—Output as *.fasta.

How many fusion-supporting reads are required to call a fusion?

The BaseSpace RNA-Seq Alignment App with the STAR aligner may call a fusion if there are at least three unique reads that meet all the quality metrics, including the following threshold filters.

  • Split Reads + Paired Reads ≥ 3
  • Alt/Ref Reads ≥0.01
  • Fusion Contig Align Length (bp) > 16
  • Break-end Homology (bp) ≤ 10
  • Alternative Local Contig Align Fraction < 0.8
  • Coverage after fusion (bp) ≥100

However, a high number of nonfusion supporting reads (ie, wild type transcript) in that region would be expected to cause noise that can affect fusion calling.
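
The threshold filters listed above can be expressed as a single boolean check. This is a minimal sketch of the listed criteria only, not the app's actual implementation. The class and field names are hypothetical, and "Alt/Ref Reads" is assumed here to be a ratio.

```python
from dataclasses import dataclass


@dataclass
class FusionCandidate:
    # Hypothetical field names mirroring the filters listed in the text.
    split_reads: int
    paired_reads: int
    alt_ref_ratio: float
    contig_align_len_bp: int
    breakend_homology_bp: int
    alt_local_contig_align_fraction: float
    coverage_after_fusion_bp: int


def passes_threshold_filters(c: FusionCandidate) -> bool:
    """Return True only if every listed threshold filter is met."""
    return (
        c.split_reads + c.paired_reads >= 3
        and c.alt_ref_ratio >= 0.01
        and c.contig_align_len_bp > 16
        and c.breakend_homology_bp <= 10
        and c.alt_local_contig_align_fraction < 0.8
        and c.coverage_after_fusion_bp >= 100
    )


# Made-up candidates for illustration.
supported = FusionCandidate(2, 1, 0.05, 40, 3, 0.2, 150)
weak = FusionCandidate(1, 1, 0.05, 40, 3, 0.2, 150)  # only 2 supporting reads
print(passes_threshold_filters(supported), passes_threshold_filters(weak))
```

Passing these filters is a necessary condition in this sketch; the confidence score described elsewhere in this FAQ is a separate calculation.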

If a network outage to the central file server storing our runs occurs, can the HiSeq 4000 System cache an entire 2 × 150 run?

Yes. The server has enough storage to hold one run during a network outage. Real-Time Analysis continues processing and resumes data transfer when the network is restored.

What are my options for analyzing NovaSeq sequencing data?

BaseSpace Sequence Hub is the preferred analysis solution for the NovaSeq Sequencing System. On-instrument analysis is not available. For third-party and user-developed analysis packages, you can use the bcl2fastq2 converter.

What tools are available for data analysis?

Use Sequence Genotyper to convert variant call files (VCF) into a genotype call file.

What is deconvolution?

It is the ability to distinguish between two or more clusters that are in close proximity to each other.

Can I compare data between the TruSeq Small RNA protocol and earlier Illumina Small RNA protocols?

As with any molecular biology techniques, a change in protocol will give a change in absolute results. We do not recommend comparing absolute counts (or counts normalized to total counts) between different preparation protocols. However, comparison of fold change between the two protocols should yield good correlation.

What is the error rate on MiSeq?

Based on the latest advances in Illumina's reversible terminator SBS chemistry, the MiSeq System is the most accurate sequencing instrument available, providing the world's highest output of perfect, error-free reads. For examples of MiSeq performance, see the MiSeq resources page.

How many swaths and tiles are on a 2-lane rapid run flow cell?

Scanning and analysis of a 2-lane rapid run flow cell creates two swaths per surface on two surfaces per lane. Each swath is divided into 16 tiles. For a 2-lane flow cell, there are a total of 128 tiles per flow cell.
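
The tile total follows directly from multiplying the counts given above. A short sketch of the arithmetic, using an illustrative function name:

```python
def tiles_per_flow_cell(lanes: int, surfaces_per_lane: int,
                        swaths_per_surface: int, tiles_per_swath: int) -> int:
    """Total tiles = lanes x surfaces x swaths x tiles per swath."""
    return lanes * surfaces_per_lane * swaths_per_surface * tiles_per_swath


total_tiles = tiles_per_flow_cell(lanes=2, surfaces_per_lane=2,
                                  swaths_per_surface=2, tiles_per_swath=16)
print(total_tiles)  # 128
```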

What if I want to do an experiment that does not fit a MiSeq Reporter analysis workflow?

Set up your sample sheet to use the GenerateFASTQ workflow, which generates FASTQ files without proceeding to further analysis. FASTQ files can be exported for analysis with third-party software.

How do I set up a sequencing run on the MiSeq with TruSight Cardio libraries and analyze the generated data?

For sequencing of TruSight Cardio libraries on the MiSeq, use the MiSeq Reporter Enrichment workflow or the Generate FASTQ workflow. Illumina recommends preparing the sample sheet in the Illumina Experiment Manager (IEM). Libraries prepared with TruSight Cardio Sequencing kit (MiSeq, 12 samples) are dual-indexed, but only require a single i7 Index Read for demultiplexing. Set up the run for paired-end 151-cycle reads and a single 8 bp Index Read. FASTQ files can be uploaded to BaseSpace and analyzed using the BWA Enrichment App or the Isaac Enrichment App (v2.0 and v2.1 custom manifest workflow). Alternatively, use the MiSeq Reporter Enrichment workflow.

Does Assign 2.0 exclude primer sequences from consideration in the assignment of alleles?

Yes, the primer sequences are excluded from the analysis and allele assignment.

Can I use Assign to analyze data generated by non-TruSight HLA kit or non-Illumina platforms?

The Assign software was specifically developed to work with data generated using TruSight HLA kits and Illumina MiSeq Systems. Other sequencing kits and platforms have not been tested and are not supported.

Which reference genome version was used to design this kit?

Human UCSC version hg19, which is the same as Genome Reference Consortium build 37 (GRCh37).

Where can I find technical information describing HiSeq v4 data compression options?

See the Illumina whitepaper, Reducing Whole-Genome Data Storage Footprint.

Why are regions outside the listed 15 genes included in the manifest and genome.vcf files?

These regions are used for normalization of amplification analysis for EGFR, ERBB2, and MET. This amplification software will be released at a later date.

Which variants are included in the PDF report?

Included variants are available in the report definition file. Download the file from TruSight Tumor 15 Product Files.

Can I import data from non-Illumina platforms into GenomeStudio software?

No. GenomeStudio software is intended for use only with Illumina data.

What specificity and coverage uniformity can I expect with this kit?

  • Specificity > 85%
  • Coverage Uniformity > 85%

What kind of quality scoring method does Illumina use?

A quality score (or Q-score) is a prediction of the probability of an incorrect base call. Based on the Phred scale, the Q-score is a compact way to communicate small error probabilities.

Given a base call, X, the probability that X is not true, P(~X), is expressed by a quality score, Q(X), according to the relationship:
Q(X) = -10 log10(P(~X))
where P(~X) is the estimated probability of the base call being wrong.

A quality score of 10 indicates an error probability of 0.1, a quality score of 20 indicates an error probability of 0.01, a quality score of 30 indicates an error probability of 0.001, and so on.

During analysis, base call quality scores are written to FASTQ files in an encoded compact form, which uses only one byte per quality value. This method represents the quality score with an ASCII code equal to the value + 33.
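
The relationship between Q-scores, error probabilities, and the FASTQ encoding can be sketched in a few lines. This is an illustration of the formula and the +33 ASCII offset described above, with function names chosen for this example.

```python
import math
from typing import List


def phred_to_error_prob(q: float) -> float:
    """P(~X) = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Q(X) = -10 * log10(P(~X))."""
    return -10 * math.log10(p)


def encode_quality(scores: List[int]) -> str:
    """Encode Q-scores as one ASCII character each (value + 33)."""
    return "".join(chr(q + 33) for q in scores)


def decode_quality(encoded: str) -> List[int]:
    """Decode a FASTQ quality string back into Q-scores."""
    return [ord(ch) - 33 for ch in encoded]


print(encode_quality([10, 20, 30, 40]))  # +5?I
print(decode_quality("+5?I"))            # [10, 20, 30, 40]
```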

What is a low confidence fusion call?

Low confidence fusion calls are fusion calls that are listed as recurrent in the Mitelman database but do not meet the minimum threshold score. These calls might be true positive fusions expressed at lower levels. Assessing low confidence fusion calls might require orthogonal approaches.

Is there a demo data set for the Infinium MethylationEPIC array?

Where is the status.htm file?

This report is not created by the MiSeq System. However, you can view similar information using Sequencing Analysis Viewer (SAV).

Can I compare expression data from different TruSeq Targeted RNA Expression runs?

Yes, if the data from different runs is normalized with the same normalizer genes. The raw data from different runs cannot be merged together into a single dataset. For more information, see the MiSeq Reporter Software Guide.

What are the storage requirements using HCS v1.4 and Flow Cell v3?

Storage requirements for raw data are approximately 60% greater than current runs based on additional swath data and increased cluster density.

Do I need a specific version of the bcl2fastq software package?

Yes. To convert zipped .bcl files, use bcl2fastq v1.8.4. This version can also convert non-zipped .bcl files.

How does 3+1 quality binning work, and how does it compare to 7+1 without binning?

The 3+1 quality binning permits a reduction in the data file sizes generated by the NovaSeq System. The mapped Q-scores for each bin are Q12, Q23, and Q37. Testing has shown highly comparable performance in HWGS between NovaSeq 3+1 qualities and HiSeq X 7+1 qualities. Although third-party analysis solutions have shown comparable performance, Illumina recommends recalibrating any settings used.

Are there restrictions to the Maximum Number of Clusters the software can call?

There are no restrictions; however, we recommend that the Maximum Number of Clusters is set to match the biology of your samples.

I tried to cluster a set of SNPs, but no clusters are visible in the SNP Graph. What went wrong?

SNPs are only clustered for samples selected in the Samples Table (marked in blue). If no samples are selected in the Samples Table, SNPs are clustered for all samples in the Samples Table (except excluded samples). To cluster SNPs for all your non-excluded samples, make sure that no samples are selected in the Samples Table at the time of clustering (eg, by clicking an area in the SNP Graph).

What is the minimal reference genome?

The minimal reference genome is a compact human reference genome that allows novel fusion calling. Per RefSeq, the minimal reference genome contains all exons in the human genome with the introns removed. It is based on the hg19 genome.

What ports, domains, and encryption does BaseSpace Sequence Hub use?

BaseSpace Sequence Hub uses SSL/https port 443 and the domains *.basespace.illumina.com and *.s3.amazonaws.com. Data streaming to BaseSpace Sequence Hub is encrypted using the AES256 standard. SSL is used for protection. For more information on encryption, see BaseSpace Security.

If local security policies must be modified to allow access to BaseSpace Sequence Hub, contact your IT representative.
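
When reviewing a firewall allow-list, it can be useful to test whether a given destination falls under the domains and port named above. The following is a minimal sketch using wildcard matching. The host names in the example are hypothetical, and a real firewall may evaluate rules differently.

```python
from fnmatch import fnmatchcase

# Domain patterns and port taken from the text above.
ALLOWED_PATTERNS = ("*.basespace.illumina.com", "*.s3.amazonaws.com")
HTTPS_PORT = 443


def is_allowed(host: str, port: int) -> bool:
    """Return True if the host matches an allowed pattern on port 443."""
    host = host.lower()
    return port == HTTPS_PORT and any(
        fnmatchcase(host, pattern) for pattern in ALLOWED_PATTERNS
    )


# Hypothetical host names, for illustration only.
print(is_allowed("api.basespace.illumina.com", 443))
print(is_allowed("example.com", 443))
```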

My sample shows multiple fusion calls. How do I confirm a fusion call?

To confirm fusion calls, perform additional investigation or independent validation. Fusion transcripts between nearby genes on the same chromosome and strand can be a result of read-through transcription rather than genomic translocation. To reduce these calls, the fusion software is optimized to filter read-through transcripts, but this can result in filtering of biologically relevant fusions between adjacent genes (eg, STIL-TAL1). Also, fusions called between genes of high homology, such as a gene and its pseudogene, can be artifacts of multiple alignment instead of genomic rearrangements. Illumina recommends that the quality score and chromosomal location be assessed and confirmed using independent molecular biology approaches.

Are there any analysis workflow changes for TruSeq Nano DNA libraries in comparison to TruSeq DNA libraries?

The TruSeq Nano DNA Library Prep Kit uses the same workflow as TruSeq DNA Sample Prep Kit. No changes have been made in analysis.

Is there sample data I can view?

MiSeq datasets are available on the BaseSpace Sequence Hub Public Data page. For MiniSeq datasets, contact your account manager.

How do I use BaseSpace for run monitoring?

The Run Monitoring BaseSpace option allows you to remotely monitor a run in progress by logging in to your BaseSpace account. Select the Run Monitoring option during run setup. Then, log in to your BaseSpace account from anywhere and view your run in the BaseSpace version of Sequencing Analysis Viewer (SAV).

What are recurrent fusions?

Recurrent fusions are fusions identified in the Mitelman database as present in 2 or more cases with the same morphology and topography.

What type of files are output from data analysis?

Various data files and plots comprise the output from analyzing TruSeq RNA Access libraries. For details, see BaseSpace Core Apps for RNA Sequencing.

How do I filter and annotate my variants?

The Illumina VariantStudio filtering and annotation software is included with the purchase of TruSight DNA Amplicon Sequencing panels.

Can the MiSeq System generate FASTQ files?

FASTQ files are generated during secondary analysis by MiSeq Reporter for most analysis workflows. To generate only FASTQ files, specify the GenerateFASTQ workflow in the sample sheet, which generates FASTQ files and then exits secondary analysis.

What read lengths are recommended for sequencing Nextera Custom Enrichment libraries?

Based on performance data and range of library sizes, Illumina recommends 2 × 36 bp to 2 × 50 bp read lengths for runs on a HiSeq system. For MiSeq runs, 2 × 150 bp runs are recommended.

How do I analyze the data generated from sequencing TruSight Cardio libraries?

You can use the MiSeq Reporter Enrichment workflow or the Generate FASTQ workflow. FASTQ files can be uploaded to BaseSpace and analyzed using the BWA Enrichment app or the Isaac Enrichment app (v2.0 and v2.1 custom manifest workflow).

Does Assign require an internet connection?

An internet connection is not required to use Assign. However, Conexio Assign allows you to launch an NCBI BLAST search on sequence reads directly from the interface. This feature requires an internet connection.

Can I change settings on the RNA Fusion Module?

Yes, you can change three settings:

Minimum Breakpoint Distance—Excludes reads that are in close proximity and may represent read through events rather than fusions.

Confidence Score Filter—Set by Illumina based on internal testing. The scores are calculated as a weighted average of individual features (for example, split reads, fusion contig alignment length, etc). If the fusion does not meet the confidence score, it is not shown, which helps to eliminate false positives. For detailed information about scoring calculations and what is being evaluated as part of the confidence score, see the Local Run Manager RNA Fusion Analysis Module Workflow Guide (document # 1000000010786).

Confidence Score Threshold—Value established by Illumina. Fusion calls with a confidence level above this threshold are considered "High Confidence."

What does it mean if a fusion is called in one replicate but not in another?

It can indicate a lower confidence in the fusion call, which can be due to low expression levels. An orthogonal approach to confirm results is recommended.

What percent of passing filter reads aligned to rRNA is expected?

The percent of reads passing filter (PF) aligned to human ribosomal RNA is typically ≤ 8%.

Where can I find more detailed information about my sample results?

Follow the directory path: RNA_Fusion_Analysis/samples/[SampleName]/.

The sample name folder contains three additional folders: Align, GeneCounts, and MantaFusion. These folders contain the number of read counts with passing values and other detailed information.
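
As a rough sketch, the folder layout described above can be checked with a short script. Only the RNA_Fusion_Analysis/samples/[SampleName]/ path and the three folder names come from the answer; the function name and the run_root argument are illustrative assumptions.

```python
from pathlib import Path

# Folder names taken from the answer above.
EXPECTED_SUBFOLDERS = ("Align", "GeneCounts", "MantaFusion")


def list_sample_folders(run_root):
    """Map each sample name to the expected subfolders that exist on disk.

    run_root is a hypothetical path to the directory that contains
    RNA_Fusion_Analysis; adjust it to your own output location.
    """
    samples_dir = Path(run_root) / "RNA_Fusion_Analysis" / "samples"
    found = {}
    if not samples_dir.is_dir():
        return found
    for sample_dir in sorted(p for p in samples_dir.iterdir() if p.is_dir()):
        found[sample_dir.name] = [
            name for name in EXPECTED_SUBFOLDERS if (sample_dir / name).is_dir()
        ]
    return found
```

A sample whose list is shorter than three entries is missing one of the expected folders and may be worth a closer look.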

Does the TruSight Myeloid Sequencing Panel detect partial or tandem duplicates?

MiSeq Reporter does not detect duplication events. Reads from this type of rearrangement are not expected to align and will likely be placed in the Unaligned folder by MiSeq Reporter.

What are the main output files?

The genome.vcf file is the main output file for variant calling. For further variant filtering, use BaseSpace Variant Interpreter or the predefined PDF report.

What is paired-end analysis?

Paired-end analysis sequences both ends of a DNA fragment. If the fragments are of known size, this method can facilitate de novo sequencing of repetitive elements and help to identify structural variation.

What percent strandedness is expected?

For good quality samples, > 98.5% strandedness is expected.

What software does Illumina offer?

Illumina offers the BaseSpace apps for RNA. See BaseSpace Apps.

Does Assign run on a Mac or Linux system?

No. Assign is tested and supported on Windows. Virtualization software like Parallels or VMware lets you run a Windows application on a Mac or Linux system. Therefore, it is possible to run Assign through Parallels on a Mac. However, this configuration is not supported.

What actual assay performance can I expect from my design?

DesignStudio returns high-confidence amplicon designs that have delivered unprecedented amplicon multiplexing performance. You can expect to see specificity > 70% and uniformity > 80%. In practice, we have observed specificity and uniformity > 90% for hundreds of designs.

Will there be issues demultiplexing current TruSeq libraries if a dual index run is done?

No. To enable demultiplexing, specify the correct use-bases-mask and use the appropriate sample sheet.

What third-party software can I use to analyze TruSeq RNA Access libraries?

BaseSpace Sequence Hub core apps for RNA include the industry standard TopHat and Cufflinks analysis pipeline. Additional, third-party analysis tools are also available.

Why are fusions reported in the fusions.csv file but not reported in the High or Low Confidence Fusion Calls tables?

Fusions are reported as High Confidence fusions when they meet the minimum threshold scores defined by Illumina. Fusions that meet these criteria are reported regardless of whether they are listed in the Mitelman database.

Fusions are reported as Low Confidence when they do not pass all quality filters or meet the minimum threshold score but are listed in the Mitelman database. Potential low quality fusions that are not listed in the Mitelman database can be found in the fusions.csv file.
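
The reporting rules above can be summarized as a small decision function. This is only one reading of the two paragraphs, not the software's actual logic; the function and argument names are hypothetical.

```python
def fusion_report_table(meets_threshold, passes_filters, in_mitelman):
    """Return where a fusion call is reported, following the rules above.

    All three arguments are booleans. The names are illustrative and are
    not taken from the analysis software.
    """
    # High Confidence: meets the threshold score and passes all quality
    # filters, whether or not it is listed in the Mitelman database.
    if meets_threshold and passes_filters:
        return "High Confidence"
    # Low Confidence: fails a filter or the threshold, but is a known
    # fusion listed in the Mitelman database.
    if in_mitelman:
        return "Low Confidence"
    # Everything else appears only in the fusions.csv file.
    return "fusions.csv only"
```

For example, a fusion that misses the threshold score and is not listed in the Mitelman database maps to "fusions.csv only", which is the case the question asks about.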

What does TruSight DNA Amplicon data look like? Is demo data available?

For an example of data from a TruSight DNA Amplicon experiment, refer to the data set available via BaseSpace. This sample data set is from the TruSight Myeloid Sequencing Panel sequenced on the MiSeq with MiSeq v3 reagents.

How do I use BaseSpace for run monitoring only, such as SAV functionality in BaseSpace?

Run monitoring with BaseSpace is selected during run setup.

Which reference genome do I use?

Use the Homo sapiens (PAR-masked)/hg19 (RefSeq) or Homo sapiens (PAR-masked)/hg19 (Gencode) reference genome. Because of different annotations within these genomes, results can vary by reference.

Does Assign 2.0 require an internet connection?

An internet connection is not required to use Assign 2.0. However, Assign 2.0 provides links to NCBI Blast, IMGT Allele Database, and AlleleFrequencies.net, which require an internet connection.

How do I format my sample sheet?

When using BaseSpace, sample sheet format can follow either HiSeq Analysis format or CASAVA format. For runs that require demultiplexing with either bcl2fastq 1.8.4 or CASAVA, a CASAVA-formatted sample sheet is required. This format is described in the bcl2fastq 1.8.4 User Guide (part # 15038058) and the CASAVA User Guide (part # 15011196).

Sample sheets for rapid runs include information for two lanes, as compared to eight lanes included in a sample sheet for a high output run. Sample sheets for rapid runs can be generated manually, using Excel or a text editor.

If you are using BaseSpace for data storage and analysis, a sample sheet is required for both rapid runs and high output runs. If using BaseSpace only for run monitoring and you are not indexing, a sample sheet is not required.

How are counts normalized in the RNA Sequencing Module?

See the GenomeStudio RNA Sequencing Module User Guide in the GenomeStudio Portal for more information.

Why might a fusion not get called?

Several factors can cause a fusion to not get called:
- Low expression levels of the fusion gene. More sequencing read depth or a lower fusion score threshold may be required.
- Low quality of the sample. More sample input RNA may be required.
- Close proximity of 2 genes in the same orientation, on the same chromosome (eg, STIL-TAL1). Reducing the default breakpoint distance thresholds in the Local Run Manager module can help identify these fusions, but may also display more false positive calls.
- Differences in bioinformatics algorithms used.
- Differences in reference genomes. GENCODE has higher genomic coverage than RefSeq and fusions between exons that are not annotated are not called in RefSeq. For example, an EML4 transcript has an exon that is annotated in GENCODE but not RefSeq, meaning fusions at that exon would not be called when using the RefSeq reference.

Does Assign 2.0 run on Mac or Linux systems?

Assign 2.0 is tested and supported on Windows. Virtualization software like Parallels or VMware allows you to run a Windows OS on a Mac or Linux system. Therefore, it is possible to run Assign 2.0 through these virtualization technologies.

How can I create subpanels for use in VariantStudio?

To create subpanels, generate gene lists in HGNC nomenclature for the regions of interest. Upload this subpanel gene list when importing VCF files into VariantStudio, and then filter variants based on the shortened gene list.

For reference, see the complete TruSight Cardio gene list.

Is there a GenTrain and/or cluster file for Infinium Methylation Arrays?

Since Infinium Methylation arrays are designed to compare relative methylation levels between two samples or sample groups (such as normal versus tumor, or pancreas cells versus liver cells), there is no GenTrain and/or cluster file for this product. It is similar to doing a paired-sample analysis.

How do I analyze my data using Illumina software?

If the run is being uploaded to BaseSpace or BaseSpace Onsite, the data can be analyzed using the BaseSpace Core App MethylSeq and the BaseSpace Labs App MethylKit.

MethylSeq provides alignment analysis and MethylKit provides sample to sample comparison analysis.

For more information about MethylSeq, see the MethylSeq BaseSpace App Documentation.

For more information about MethylKit, see the MethylKit BaseSpace Labs page.

Is there an on-instrument analysis solution for the MiSeq?

At the time of kit launch, an off-instrument Local Run Manager is provided for analysis. The off-instrument software is an interim solution until Local Run Manager for MiSeq is released in 2017.

How do I merge data from 2 flow cells?

Using CASAVA: To merge data from different flow cells (different runs), use the configureBuild script in CASAVA v1.8.2. First, align the data (samples) from each flow cell separately using configureAlignment. Then, include each sample directory as an input directory in the configureBuild.pl command line. Input directories are specified by the -id option, as detailed in the CASAVA v1.8.2 User Guide.

If you are using CASAVA, note that Illumina is discontinuing distribution of CASAVA software to better support new products available on BaseSpace. BaseSpace features analysis options for a large array of NGS applications.

Using BaseSpace: BaseSpace includes a Sample Merge function that allows you to merge data from a single sample originating from different flow cells. This merging is performed before alignment analysis of the sample data.

What is a matrix file?

The matrix file is used for base calling and accounts for cross-talk between dyes.

If a network outage to the central file server storing our runs occurs, can the HiSeq 3000 System cache an entire 2 × 150 run?

Yes. The server storage size can store one run if there is network storage. Real-Time Analysis continues processing and resumes data transfer when the network is restored.

What is the Polyploidy Clustering (PC) Module?

The GenomeStudio Polyploidy Clustering Module (PC Module) can identify clusters for samples where the standard diploid clustering algorithm is inappropriate or not useful, such as for polyploid organisms like wheat and potato.

How do I make sure that my HiSeq is ready to send data to BaseSpace?

To upload data to BaseSpace from a HiSeq, a minimum upstream connection of 10 Mbit/second per instrument is needed. Network speed can be assessed by using free online tools such as www.speedtest.net.

How do I analyze my TruSight Lymphoma 40 data?

Analyze TruSight Lymphoma 40 data using either of the following options:

  • MiSeq Reporter Amplicon DS workflow
  • Amplicon DS BaseSpace App with configurable coverage settings

What are the recommended computing requirements to run TruSight HLA Assign 2.0?

Minimum Computing Requirements:

  • 1 GHz or faster 64-bit Intel core processor, or equivalent
  • 16 GB RAM, minimum
  • 16 GB available hard disk space
  • Windows OS (Windows Vista, Windows 7, Windows 8, Windows Server 2008, or Windows Server 2012)
  • Microsoft Excel 97, or later, for generating reports

Can I export sequences from Assign?

Yes, you can export the consensus FASTQ sequences for each sample and each locus.

What are the data storage requirements for data generated from TruSight HLA?

A typical MiSeq run produces FASTQ.gz files totaling 1–5 Gb of sequence data.

What data quality can I expect from a run on the HiSeq 3000?

You can expect greater than 75% of all bases above Q30 with a 2 x 150 bp run. For more information, see HiSeq 3000/HiSeq 4000 System Specifications.

Can multiple users look at the same data simultaneously and independently in Assign 2.0?

Assign 2.0 can be installed on a shared server or network drive, and multiple users can independently launch and simultaneously use the same instance of Assign 2.0. These instances share only the settings file and the data under review can be from the same sample. However, edits, comments, and user-defined settings are not available between users.

Is Assign 2.0 an upgrade or new installation?

Assign 2.0 is a new installation and can be run side-by-side with Assign 1.0 for TruSight HLA.

How will my TruSeq controls be shown for multiplexed samples?

You can filter by Index in SAV (Sequencing Analysis Viewer) for indexed runs.

Can I save .cif files in HiSeq v4 mode?

The option to save CIF files is available for all modes except HiSeq v4.

What are the computing requirements for the Polyploid Genotyping Module?

The Polyploid Genotyping Module has the same computing requirements as other GenomeStudio Microarray modules listed in the GenomeStudio computing requirements.

Can I analyze rapid run data with CASAVA?

Yes, you can analyze rapid run data with CASAVA. If the zip BCL files option was chosen during run setup, use the bcl2fastq converter in place of the configureBclToFastq component of CASAVA. For rapid runs, align data from each flow cell separately and then merge the data at the configureBuild step.

If you are not using CASAVA, note that Illumina is discontinuing distribution of CASAVA software to better support new products available on BaseSpace. BaseSpace features analysis options for a large array of NGS applications.

How can I create subpanels for use in VariantStudio?

Subpanels can be created by generating gene lists in HGNC nomenclature for the regions of interest. The complete TruSight One gene list is available in the product Downloads. To filter variants based on the shortened gene list, upload this subpanel gene list when importing *.vcf files into VariantStudio.

What criteria determine clusters passing filter on Illumina sequencing systems?

To remove the least reliable data from the analysis results, often derived from overlapping clusters, raw data are filtered to remove any reads that do not meet the overall quality as measured by the Illumina chastity filter. The chastity of a base call is calculated as the ratio of the brightest intensity divided by the sum of the brightest and second brightest intensities.

Clusters passing filter are represented by PF in analysis reports. Clusters pass filter if no more than one base call in the first 25 cycles has a chastity of < 0.6.
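
The rule above can be illustrated with a minimal sketch. It assumes each cycle is given as a list of non-negative channel intensities (for example, one value per base); that input layout and the function names are assumptions made for illustration, not the Real-Time Analysis implementation.

```python
def chastity(intensities):
    """Brightest intensity divided by the sum of the two brightest."""
    top, second = sorted(intensities, reverse=True)[:2]
    total = top + second
    return top / total if total > 0 else 0.0


def passes_filter(cycles, threshold=0.6, window=25, max_failures=1):
    """True if at most one of the first 25 cycles has chastity < 0.6.

    cycles is a list of per-cycle intensity lists, e.g. [A, C, G, T].
    Cycles after the first `window` cycles are ignored.
    """
    failures = sum(1 for c in cycles[:window] if chastity(c) < threshold)
    return failures <= max_failures
```

With two equally bright signals in a cycle, chastity is 0.5, which falls below the threshold. A cluster with one such cycle in the first 25 still passes; a cluster with two does not.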

Are there BED files available?

Yes. Download the BED files from the Product Files page for your product.

What analysis solutions are available in BaseSpace Sequence Hub?

RNA-Seq Alignment, TopHat Alignment, Cufflinks Assembly & DE, and RNA Express are some of the analysis options available on BaseSpace.

How long does it take to process (align, phase, and type) TruSight HLA v2 FASTQ files with TruSight HLA Assign 2.0?

Assign 2.0 does all the processing on the initial load of data. When the data are available for review, the power of the machine does not make a significant difference during review and even machines with minimum specifications perform quickly. The power of the computer has significance in the time it takes to perform the initial processing, which includes importing reads, aligning reads, phasing heterozygote positions, and assigning typing results.

Here are a few different configurations and the average processing time for each:

Processor                      | Memory (RAM) | Flow Cell | Number of Samples | Processing Time
Intel Core i7-5600 2.60 GHz    | 16 GB        | Nano      | 6                 | 12 minutes
Intel Core i7-5600 2.60 GHz    | 16 GB        | Micro     | 12                | 60 minutes
Intel Core i7-5600 2.60 GHz    | 16 GB        | Micro     | 24                | 60 minutes
Intel Xeon X5560 2.80 GHz (x2) | 96 GB        | Nano      | 6                 | 6 minutes
Intel Xeon X5560 2.80 GHz (x2) | 96 GB        | Micro     | 12                | 25 minutes
Intel Xeon X5560 2.80 GHz (x2) | 96 GB        | Micro     | 24                | 25 minutes

Does the NextSeq 550 System perform data analysis on the instrument?

The data set generated by the NextSeq 550 System is too large for on-instrument analysis. Data must be transferred to BaseSpace Sequence Hub or a local server for secondary analysis.

Is there an on-instrument analysis solution?

No, there is no on-instrument analysis solution.

What software is needed to support dual-indexing? Which versions?

  • HCS v1.5/RTA v1.13/SAV v1.8.4
  • SCS v2.10/RTA v1.13/SAV v1.8.4
  • MiSeq Reporter v1.1
  • CASAVA v1.8.2
  • Illumina Experiment Manager v1.0

Does Illumina recommend background subtraction for analysis?

Background subtraction is required when comparing data collected with different scanners or processing attributes (eg, date, reagent lot, instrument, operator, etc.). Background subtraction has a much smaller effect when you scan chips on the same scanner, and might not be necessary. Analyze a subset of data with and without subtraction, and choose the approach you prefer based on your results.

Is the Infinium MethylationEPIC BeadChip controls dashboard similar to the Infinium HumanMethylation450 BeadChip?

Yes, the same control types are present.

What is the Recurrent Fusions Not Called table?

Recurrent Fusions Not Called is a table that displays recurrent fusions, per the Mitelman database, that were not detected. The table provides information to assess whether genes are involved in recurrent fusions, where the fusion was not called, and whether it had sufficient read coverage.

When do quality scores appear during a sequencing run on the MiSeq System?

Quality scores appear after cycle 25. The software uses the first 25 cycles to determine the chastity of a base call, which in turn determines the quality filter.

What are the network requirements for data transfer from the HiSeq to a server?

You need a one gigabit connection per instrument between the instrument computer and the server. For more information, see the HiSeq System Site Prep Guide.

What is Local Run Manager?

Local Run Manager is on-instrument analysis software integrated with the system control software. Using Local Run Manager, you can record sample information and specify run parameters before starting the run. The name assigned to the run appears on the control software run setup screen for faster run setup.

When the sequencing run is complete, data analysis begins on-instrument automatically and performs analysis according to the analysis module specified for the run.

What can I use to analyze my data?

Perform analysis using the TruSeq Amplicon BaseSpace App or MiSeq Reporter TruSeq Amplicon Workflow.

Is a manifest required for data analysis? Where can I download a manifest?

Two manifest files (one per pool) are required to analyze sequencing runs. The manifest files are installed automatically on MiSeq (MCS/MSR 2.6 or later) or Local Run Manager.

Manifest files can be downloaded from the Downloads section of the support page.

How do I analyze my data using third party software?

Illumina sequence base call output files (*.bcl) can be demultiplexed and converted to FASTQ format using the bcl2fastq converter software. The files can then be used for analysis with other third party software packages, such as Bismark and MethylKit.

Illumina cannot provide support for the use of third party software. Contact the software resources directly with any questions regarding the analysis and use of the software.

Are there third-party software programs that you can recommend for ChIP-Seq analysis?

There are a number of solutions available, including Bioconductor and MACS, which are available through Galaxy.

Can I expect changes when using Sequence Analysis Viewer (SAV) with MCS v2.3 and MiSeq Reagent Kit v3?

Intensity (Data by cycle) plots appear different due to non-linear exposure ramping. Non-linear ramping prevents exposure damage early in the read, which provides a boost later in the read when it is more necessary.

What software should be used for analysis of TruSeq stranded mRNA libraries?

Demultiplexing of dual-indexed sequencing runs requires CASAVA 1.8.2. Alignment of single-read runs may be performed using CASAVA 1.8.2. For alignment of paired-end runs, Illumina recommends TopHat/Cufflinks. For more information, see the RNA Sequencing Analysis With TopHat guide. TopHat is not an Illumina-supported product, but an open source initiative. Additional third-party analysis tools are also available. Learn more about Illumina's RNA applications.

Does the install of the Polyploid Genotyping Module require a license key?

No, the Polyploid Genotyping module is a standalone software which does not require a license key.

What criteria does Assign 2.0 use to establish an allele assignment?

Assign 2.0 uses a perfect match approach to assigning alleles. To be assigned, the sample consensus sequence (consensus of all reads used at each base position) must match an IMGT/HLA exactly (both base call and phase). Even 1 mismatch in base call or phase results in no allele assignment. When there is a mismatch, the only way to make an allele assignment is to edit the sequence, if warranted. These edits are tracked and auditable.

What software can analyze TruSight RNA Fusion libraries?

MiniSeq System—The Local Run Manager RNA Fusion Analysis Module performs alignment with STAR and analyzes for gene fusions with Manta. The Local Run Manager RNA Fusion Module generates a summary report of fusions.

MiSeq System—The off-instrument version of Local Run Manager and the Local Run Manager RNA Fusion Analysis Module perform alignment with STAR and analyze for gene fusions with Manta. The RNA Fusion Analysis Module generates a summary report of fusions. Install the off-instrument Local Run Manager on a compatible PC first, and then install the RNA Fusion Module.

BaseSpace Sequence Hub—The BaseSpace RNA-Seq Alignment App performs alignment and analyzes for gene fusions with STAR/Manta or TopHat.

  • STAR/Manta is recommended for optimal fusion calling. When STAR is selected as the aligner in the BaseSpace RNA-Seq Alignment App, Manta is used as the fusion caller.
  • The BaseSpace TopHat Alignment App can also be used for alignment and fusion calling. The BaseSpace Cufflinks Assembly & DE App can be used to run differential expression analysis.

Third-party analysis tools are also available, including open source Manta and STAR.

Can I calculate PC and PPC errors in the Polyploid Genotyping Module?

No, this is not an option in the Polyploid Genotyping Module.

What are the highlighted cells in the Fusion Calls table?

The yellow highlighted cells are genes that the panel has targeted.

Do I need a dedicated server to analyze run data from the NextSeq System?

NextSeq 550 Systems running NCS v4 or later can use Local Run Manager software modules for on-instrument analysis.

Additional analysis tools are available on BaseSpace Sequence Hub. You can also configure the NextSeq System to transfer data to a local server and perform analysis using third-party software.

Do Illumina FastTrack Services provide analysis support?

The sequencing team performs downstream analysis for the cancer analysis service. For tertiary data analysis of standard whole-genome sequencing services, the Illumina Genome Network works with software partners who are experts in the field.

What percent rRNA is expected?

The expected percentage of passing filter reads aligned to human ribosomal RNA is typically ≤ 8%.

Is there sample data I can view?

Datasets are available on the BaseSpace Public Data page.

Can I use VariantStudio for variant annotation and filtering?

VariantStudio can be used for annotation and filtering.

Where can I find the manifest?

You can download the manifest file from the Product Files page for the kit.

How do I analyze my TruSeq Targeted RNA Expression data?

Use MiSeq Reporter for on-instrument analysis. This software provides analysis of replicates and allows for pair-wise comparison of targeted regions.

Are there any differences in analysis for Nextera XT samples?

A new PCR Amplicon analysis workflow is available in MiSeq Reporter v1.3 (MSR) for the MiSeq system. The PCR Amplicon workflow requires specifying a manifest which is created in the Illumina Experiment Manager. A manifest is a list of all the targeted regions and their chromosome start and end positions. The manifest specifies regions of interest (ROIs) for the aligner and variant caller, which results in faster analysis times and visualization of results specific for only the ROIs. Note that CASAVA is not compatible with Manifest files. Refer to the Illumina Experiment Manager User Guide (part # 15031335) for more information on sample sheet and manifest creation for Nextera XT libraries. The MiSeq software (MCS 1.2/MSR 1.3) can be downloaded from the Downloads tab of the Nextera XT Support page.

Why is there a higher error rate for the first few bases?

The first two or three bases in mRNA-Seq reads have slightly elevated error rates compared to genomic DNA samples. We believe that this is an effect of the random priming process. The bases at the beginning of each read were likely at the back end of the random primer, away from the extending polymerase, during the priming process. It appears that this observation is a measurement of the mismatch pairing that is tolerated on the other end of the primer during the extension process by the polymerase.

What steps are taken to prevent and identify PCR artifacts?

PCR artifacts are common when amplifying gene targets. Although artifacts can occur in the TruSight HLA v2 assay, the assay is optimized to amplify them at a very low rate.

Assign 2.0 provides visualization of the sequencing reads to assess read diversity. Assign 2.0 is also designed to seek read diversity and uses a broad range of reads, reducing the likelihood of PCR artifacts contributing to the allele assignment.

Do local proxies affect BaseSpace Sequence Hub?

No testing has been performed on the effects of local proxies on BaseSpace Sequence Hub access.

Do Illumina FastTrack Sequencing Services offer phasing sequence analysis?

Yes. Illumina FastTrack Sequencing Services offer a human phasing sequence analysis service. See the Phasing Analysis Service FAQ for more details.

Can I use MiSeq Reporter software to analyze data from a HiSeq System?

No. File directory structures from a HiSeq System are incompatible with MiSeq Reporter software.

However, the TruSeq Amplicon App is available in BaseSpace Sequence Hub and can be used to analyze data from this kit.

If I send my data to BaseSpace Sequence Hub, what analysis options do I have?

You can use BaseSpace Apps to analyze data in BaseSpace Sequence Hub. Select the Apps tab in BaseSpace Sequence Hub to see available apps and descriptions.

How is contamination identified and measured?

Contamination appears as additional alleles in the results. Assign 2.0 calls the two most frequent alleles in the sample. All other base calls are either flagged or considered noise if they fall within the acceptable range. Base calls are flagged because they are higher than the noise, but have not achieved the frequency of the second most frequent base in the locus. Review flagged bases to quickly check whether contamination is present and to measure the degree present.

Reads from these bases are available in the viewer. Right-click to blast these reads against the NCBI database to help determine the likely source of contamination.

Can GenomeStudio software display non-human sequence data?

Yes. GenomeStudio can display data from non-human genomes.

How do I send run data to BaseSpace?

Run data can only be uploaded to BaseSpace if the BaseSpace option is selected during run setup in the HiSeq Control Software. See the HiSeq 2500 System User Guide (part # 15035786) for information on setting up a run with a connection to BaseSpace.

For more information on BaseSpace, or to set up a free BaseSpace account, see http://www.illumina.com/products/by-type/informatics-products/basespace-sequence-hub.html.

What is adapter trimming?

Shorter inserts can lead to sequencing into the adapter. Adapter trimming helps filter out the adapter sequence from the final sequencing data.

Select the adapter trimming option when creating a sample sheet in Illumina Experiment Manager (IEM) for use with MiSeq. The MiSeq Reporter (MSR) analysis software automatically trims the adapter sequence. For all other run types, use the adapter trimming option with the appropriate commands.

Is the TruSight HLA Assign 2.0 software an additional charge?

TruSight HLA Assign 2.0 software is included in the TruSight HLA v2 kits. Upon order of TruSight HLA v2, you receive an email with the link to download the Assign 2.0 software and a license file attached to the email. If there is no email associated with the order or the email goes to the wrong person, contact Illumina Customer Service (customerservice@illumina.com) and provide the order number to receive a license.

Why is directional sequence information useful?

Directional sequence information is crucial in the identification of antisense transcripts for overlapping genes and it also increases the percentage of uniquely alignable reads in poorly-annotated species. Having directional information also eases the alignment and assembly processes for bioinformatics analyses.

What data quality can I expect from a run on the HiSeq 4000?

You can expect greater than 75% of all bases above Q30 with a 2 x 150 bp run. For more information, see HiSeq 3000/HiSeq 4000 System Specifications.

How many tiles are imaged on a HiSeq 3000/4000 flow cell?

Scanning and analysis of a HiSeq 3000/4000 flow cell is performed in two swaths per surface on two surfaces per lane. Each swath is divided into 28 tiles. Therefore, each flow cell contains 896 tiles.
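
The tile count follows from simple multiplication, shown here as a short worked example. The lane count of 8 is not stated in the answer above; it is assumed here, and it is the value consistent with the stated total of 896.

```python
SWATHS_PER_SURFACE = 2
SURFACES_PER_LANE = 2
TILES_PER_SWATH = 28
LANES_PER_FLOW_CELL = 8  # assumption: not stated in the answer above

# 2 swaths x 2 surfaces x 28 tiles = 112 tiles per lane
tiles_per_lane = SWATHS_PER_SURFACE * SURFACES_PER_LANE * TILES_PER_SWATH
# 112 tiles per lane x 8 lanes = 896 tiles per flow cell
tiles_per_flow_cell = tiles_per_lane * LANES_PER_FLOW_CELL

print(tiles_per_lane, tiles_per_flow_cell)  # 112 896
```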

How do I determine which version of GenomeStudio software I am using?

With GenomeStudio software open, go to the Help menu and select About. The About screen includes GenomeStudio software version information.

Are there any recommended third-party software tools?

Currently, there are no recommended third-party software tools.

Can CASAVA be used to analyze MiSeq data offline?

Yes, CASAVA can be used to analyze MiSeq data offline. The MiSeq System output folders are readable by CASAVA without a need to modify any configuration files.

What are the recommended computing requirements to run Conexio Assign?

Use the following minimum computing requirements:

  • 1 GHz, or faster, 64-bit Intel core processor, or equivalent
  • 16 GB RAM, minimum
  • 16 GB available hard disk space
  • Windows OS (Windows Vista, Windows 7, Windows 8, Windows Server 2008, or Windows Server 2012)
  • Microsoft Excel 97, or later, for generating reports

How often are IMGT references updated? Can I update references in Assign?

IMGT references are updated twice a year, in January and July, and are available from the TruSight HLA Support web page. Follow the instructions in the user guide to update the references in Assign. There are no automated updates.

What happens if my data connection is interrupted during a run on the NextSeq system?

If data transfer is interrupted during a run, data are stored temporarily on the instrument computer until the connection is restored. When the connection is restored, transferring of data resumes automatically.

If the connection is not restored before the end of the run, data must be removed from the instrument computer manually before a subsequent run can begin.

Is Illumina data compatible with Bioconductor?

Yes, Illumina data is compatible with Bioconductor, a collection of R packages developed by researchers around the world and distributed for free.

I can't see a reference sequence in the IGV (Illumina Genome Viewer) or ICB (Illumina Chromosome Browser). How can I display a reference sequence?

See Chapter 5 of the GenomeStudio 2008.1 Framework User Guide, available on iCom and in the GenomeStudio Portal.

Does BaseSpace require a sample sheet?

If you choose to use BaseSpace Sequence Hub for run monitoring only and your samples are not indexed, a sample sheet is not required. If you want to use BaseSpace Sequence Hub for data storage and analysis, a sample sheet is required. The sample sheet can be in either HiSeq Analysis Software format or CASAVA format. When using BaseSpace Sequence Hub, combining indexed and non-indexed samples on a flow cell is not possible.

If a network outage to the central file server storing runs occurs, can the NovaSeq 6000 System cache the entire 2 × 150 run?

Yes. The server can store as many runs as disk space permits. Real-Time Analysis continues processing and UCS resumes data transfer when the network is restored.

Can Sequencing Analysis Viewer (SAV) be used with the MiSeq System to view primary analysis results?

Yes. You can install SAV on another computer that is connected to the same network as the instrument. MiSeq data will appear when you select All or Lane 1 from the drop-down menu. For more information, see the Sequencing Analysis Viewer Software Guide.

Which metrics can I use to evaluate data in the SNP table to identify SNPs that require further editing and poor-performing SNPs?

You can evaluate and sort SNPs by Call Freq, # no calls, Poly 10%, and Poly 50%.

What do we need to align to the Minimal Reference Genome?

Calling novel fusion partners requires whole genome alignment. The RNA Fusion Module aligns reads to the Minimal Reference Genome using Spliced Transcripts Alignment to a Reference (STAR).

What are my options for analyzing HiSeq 3000 data?

BaseSpace is the preferred analysis solution for the HiSeq 3000. For third-party analysis packages, you can use the bcl2fastq converter. CASAVA and HAS are not supported.

What is bcl2fastq?

The bcl2fastq v1.8.4 conversion software is standalone software that runs on a Linux scientific computing system. The installer can be downloaded from the Illumina website. System requirements are outlined in the bcl2fastq User Guide (part # 15038058). If BCL files are zipped, bcl2fastq v1.8.4 is required.

How do I confirm that a fusion is real?

The best way to confirm that an identified fusion is 'real' is to use an orthogonal approach. Additionally, assess the quality score and the chromosomal locations of the fusions to help indicate confidence.

Are there restrictions to the Minimum Number of Points in Cluster?

This value must be non-zero. Illumina recommends setting the Minimum Number of Points in Cluster to match the biology of your samples and the size of your data set. The general guideline is to set the value to 1–4% of the number of samples that are performing well in the data set.
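
As an illustration of the 1–4% guideline above, the following minimal Python sketch computes a candidate range for the Minimum Number of Points in Cluster. The function name and the rounding choices are assumptions made for this example and are not part of the software.

```python
import math

def min_points_range(n_good_samples, low_pct=1.0, high_pct=4.0):
    """Return (low, high) candidate values, following the 1-4% guideline.

    n_good_samples is the number of samples performing well in the data set.
    Rounding down for the low end and up for the high end is an assumption.
    """
    if n_good_samples <= 0:
        raise ValueError("n_good_samples must be positive")
    # The value must be non-zero, so never return less than 1.
    low = max(1, math.floor(n_good_samples * low_pct / 100))
    high = max(low, math.ceil(n_good_samples * high_pct / 100))
    return low, high

print(min_points_range(384))  # candidate range for 384 well-performing samples
print(min_points_range(20))   # small data sets fall back to the minimum of 1
```

Treat the result as a starting range only; the appropriate value still depends on the biology of your samples.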

Can GenomeStudio be used to view MiSeq data?

No. MiSeq Reporter is used to view MiSeq data. For an overview of the software, see the MiSeq Reporter page.

What do a paired read and a split read mean in the Fusion Calls table from the BaseSpace RNA-Seq Alignment App results?

A paired read is a fusion where one read aligns to the left gene and the other read aligns to the right gene.

A split read is a fusion where one of the reads spans the fusion junction.

Do I need to begin clustering with one of the two default algorithms, OPTICS or DBSCAN?

No, you can also cluster by defining #clusters, if known based on the biology of your samples.

I cannot find a Report Wizard in the Polyploid Genotyping Module. How can I create reports from my data?

Data can be exported directly from the Samples Table, SNP Table, and Full Data Table for downstream analysis. Mark the columns and rows you wish to export and click the icon for "Export displayed data to file" to save selected table contents in *.txt or *.csv format.

Do I need to normalize my data for TruSeq Targeted RNA Expression?

Yes, you will need to select internal normalizer genes from RNA-Seq or Microarray data from samples similar to those that will be run on the assay. The genes should be invariant across the samples being tested. Assays specific to the normalizer genes need to be included in the final TOP panel.

What types of variants can TruSight Lymphoma 40 detect?

The TruSight Lymphoma 40 kit can detect single nucleotide variants (SNVs) and small indels. Use BaseSpace Variant Interpreter to filter and annotate variants.

How do I get access to VariantStudio?

Purchase of this kit includes access to a VariantStudio license. For more information, contact Customer Service.

Alternatively, upload VCF files to BaseSpace and launch VariantStudio via BaseSpace.

Does changing parameters in the Clustering Options dialog box impact the existing cluster data?

No. New settings are only applied to any SNPs clustered after applying changes.

Do I need to cluster all SNPs in the data set by the same algorithm and clustering options?

No, you can use different algorithms and clustering options for different SNPs. The goal is to find optimal parameters for each SNP matching the biology of your samples.

Which metrics can I use to evaluate data in the Samples table to identify poor-performing samples to exclude from the data?

You can evaluate and sort samples by Poly Call Rate, Poly 10%, and Poly 50%. Illumina recommends using the scatterplot function in the Samples table to plot Poly 50% against Poly Call Rate to graphically visualize sample outliers.

What is the major difference between the DBSCAN and OPTICS algorithms? How do I choose between them?

OPTICS is an acronym for Ordering Points to Identify Clustering Structure. As a subalgorithm of DBSCAN, it was developed to be more robust to changes in input parameters. This trait makes OPTICS more suited for initial clustering.

DBSCAN is an acronym for Density-Based Spatial Clustering of Applications with Noise. This algorithm is more sensitive to initial input parameters, such as cluster distance. It is more suited for differentiating clusters that are very close together, and is typically applied to SNPs for which OPTICS does not yield satisfactory results.

How do I set up my sample sheet for the MiSeq?

Follow the instructions in the IEM TruSight Tumor 15 Quick Reference Card.

Do I need both genes to be targeted to detect a fusion gene?

No, only one of the gene fusion partners needs to be detected. The enrichment approach allows you to pull down the target and the partner fusion gene with it.

How does changing Cluster Distance in the Clustering Options dialog box affect my results?

Cluster distance specifies the maximum distance that samples can be away from each other and still considered part of the same cluster. Increasing cluster distance will result in fewer clusters that are larger in size, while decreasing cluster distance will result in more clusters which are smaller in size. A cluster distance of 0.06 is typically a good starting point for initial clustering.
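
The effect of the distance setting can be seen in a toy example. The sketch below is not OPTICS or DBSCAN and is not the module's implementation; it is a simplified one-dimensional grouping, with made-up sample values, in which a gap larger than the cluster distance starts a new group.

```python
def group_by_distance(values, cluster_distance):
    """Group sorted 1-D values; a gap above cluster_distance starts a new group."""
    groups = []
    for v in sorted(values):
        if groups and v - groups[-1][-1] <= cluster_distance:
            groups[-1].append(v)   # close enough to the previous sample
        else:
            groups.append([v])     # too far away: start a new group
    return groups

# Hypothetical sample positions along one axis of a SNP plot.
theta = [0.02, 0.05, 0.09, 0.30, 0.34, 0.70, 0.74, 0.95]

for distance in (0.02, 0.06, 0.25):
    groups = group_by_distance(theta, distance)
    print(distance, len(groups), [len(g) for g in groups])
```

With these values, the smallest distance leaves every sample in its own group, and the largest distance merges the samples into two large groups.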

How do I analyze my data for bisulfite sequencing applications?

Illumina suggests using a third-party methylation analysis solution such as Bismark, BSMap, or BS Seeker. See the Methylation Software section for more details.

Which peaks do I use from the Bioanalyzer for TruSeq Targeted RNA Expression?

Perform a region analysis on the Bioanalyzer and define the region from 150–250 bases to get molarity for the entire sample.

What are some third party data analysis tools for TruSeq ChIP samples?

There are a number of third party solutions available for the analysis of ChIP data. These include, but are not limited to, MACS, Avadis, and Partek. Note that Illumina cannot provide support for the use of third party software; contact the software resources directly with any questions regarding the analysis and use of their software. Additional literature references can be found on the Illumina Epigenetics webpage.

How can I get GenomeStudio software?

The software is provided without charge and can be downloaded from the Downloads page.

Can genotyping projects that were opened and modified in the Polyploid Genotyping Module be imported back into the GenomeStudio Genotyping Module?

No, a polyploid project (.pcm) can only be opened using the Polyploid Genotyping Module.

Does the RNA Fusion Module provide information about gene expression or variant calls?

The RNA Fusion Module does not provide variant call data. The RNA Fusion Module provides read counts in the samples/GeneCounts folder. For detailed information on variants and gene expression, the BaseSpace RNA-Seq Alignment App is recommended.

Can I get variant information (e.g., SNP, indel) from the TruSeq Targeted RNA Expression sequence data?

This information can be found in the .bam files. However, MiSeq Reporter Software does not perform variant calling or generate .vcf files. cSNPs can be analyzed and visualized using the Illumina Genome Viewer in GenomeStudio.

Can Assign 1.0 Project Files (*.cgp) be imported into Assign 2.0?

Assign 1.0 Project Files (*.cgp) are not compatible with Assign 2.0. However, Assign 2.0 can import and analyze FASTQ files generated by TruSight HLA Sequencing Panel (version 1).

What analysis tools are available for TruSeq Genotype Ne?

Use the TruSeq Amplicon BaseSpace App or the Local Run Manager Amplicon DS analysis module to analyze TruSeq Genotype Ne libraries.

What are Illumina's minimum hardware recommendations to run GenomeStudio software?

The table below includes Illumina's minimum hardware recommendations to run GenomeStudio software.

CPU Speed: 2.0 GHz or greater
Processor: 64-bit, with 2 or more cores
Memory: 8 GB or more
Hard Drive: 100 GB or larger
Video Display: 1,280 x 1,024
Operating System: Windows 7 or higher
Specific OS Requirements: Microsoft .NET Framework 3.5
Network Connection: 1 GbE or faster

How do I set up my run on the MiniSeq?

Follow the instructions in the Local Run Manager Software Guide.

What tools are offered for data analysis?

The following software tools are available for this kit: Illumina Experiment Manager, MiSeq Reporter, and the Illumina Amplicon Viewer.

MiSeq Reporter is a simple, on-instrument tool that demultiplexes samples, performs alignment of reads, and creates a variant report immediately upon completion of each sequencing run. MiSeq Reporter also creates an assembly for each amplicon (on a per-sample basis), performs per-SNP calling, and generates graphs and reports, including the sequencing coverage per amplicon.

Illumina Amplicon Viewer performs data visualization from multiple runs and custom report generation off the instrument. Analysis workflows are available for detecting both germline and somatic variation in sequenced samples.

How many reads do I need per sample?

~3 million reads per sample is recommended. More or fewer reads may be required depending on the expression level of the fusion.

Can I install Assign TruSight HLA Analysis Software on a network drive, rather than on individual PCs?

Assign TruSight HLA Analysis Software can be installed on a network drive. For network assistance, consult your facility IT administrator.

Is the sample sheet/library sheet optional or mandatory?

For MiSeq runs, a sample sheet is required at the start of the run to enable analysis. For HiSeq runs, creating and loading this sample sheet at the start of the run is optional, but is highly recommended in order to view data in the indexing tab of SAV during the run. If you do not load a sample sheet at the start of a run in HCS, you will not be able to view indexing data in SAV. Illumina recommends creating the sample sheet in the Illumina Experiment Manager (IEM) to confirm appropriate index combinations prior to performing library prep.

Can the Polyploid Genotyping Module call genotypes?

No, the Polyploid Genotyping Module performs cluster assignment, but does not call genotypes. This is because the assignment of genotypes in polyploid species is highly dependent on the population and biology of the organism. Any downstream genotype assignment should be done with the biology and evolutionary history of the population taken into consideration.

What percent of sequencing reads is expected to be on target?

For good quality samples, > 90% of sequencing reads on target is expected for control samples.

Is GenomeStudio software supported on Windows Vista?

Yes, GenomeStudio is supported on Windows XP, Vista, and Windows 7.

Can I run differential expression analysis on my data?

Yes, use the RNA-Seq Differential Expression BaseSpace Sequence Hub app.

Are the Illumina adapter sequences available?

Illumina adapter sequences are provided in the Illumina Customer Sequence Letter. This kit uses the TruSeq adapters.

Can I use third-party analysis options with MiniSeq sequencing data?

Yes. Third-party analysis options require sequencing data in the FASTQ file format. Use the Local Run Manager Generate FASTQ analysis module to convert data for later use. The Generate FASTQ analysis module converts base calls to FASTQ files and then exits the workflow. No further analysis is performed.

Can I analyze .cif files with BaseSpace Sequence Hub?

No, .cif files cannot be analyzed with BaseSpace Sequence Hub. Additionally, it is not possible to output .cif files with HCS v2.2 on HiSeq v4 mode or Rapid Run mode with HiSeq v2 chemistry. The option to output .cif files is available in TruSeq v3 mode and Rapid Run mode with TruSeq chemistry.

What data quality can I expect from a run on the HiSeq X system?

You can expect greater than 75% of bases above Q30 with a 2 x 150 bp run using the following libraries:

  • A human library prepared with the TruSeq Nano DNA Sample Prep Kit (350 bp or 450 bp insert)
  • A human library prepared with the TruSeq DNA PCR-Free Sample Prep Kit (350 bp or 450 bp insert)
  • Illumina PhiX control library

What does the data look like? Is there demo data available?

For an example of data from an experiment, see the data set available on BaseSpace. The example data set is from the TruSeq Amplicon Cancer Panel sequenced with MiSeq on FFPE-derived DNA.

What is the overlap in targeted genomic content between the Nextera Rapid Capture Exome and Expanded Exome Enrichment products?

Eighty-nine percent of the megabases targeted in the Nextera Rapid Capture Exome Enrichment overlap with the 62.1 megabases of regions targeted in the Expanded Exome Enrichment pool.

Download BED files outlining both the overlapping regions and the content unique to each product from the Product Files page for your kit.

Can I analyze v4 data with CASAVA?

If you currently use CASAVA, you can analyze HiSeq v4 data with CASAVA. You need to use the bcl2fastq v1.8.4 conversion software in place of the configureBclToFastq component of CASAVA.

For HiSeq v4 runs, perform alignment of data from each separately, and then merge the data at the configureBuild step.

If you are not using CASAVA, note that Illumina is discontinuing distribution of CASAVA software to better support new products available on BaseSpace. BaseSpace features analysis options for a large array of NGS applications.

Do Illumina FastTrack Services offer a cancer analysis service?

Yes. Illumina FastTrack Services offer a cancer analysis service for tumor-normal data set comparison. The standard offering is for a 40x normal and 80x tumor pair. See the Sequencing Service Process for more information.

Which aligner do I use in the RNA-Seq Alignment App?

The STAR aligner in the RNA-Seq Alignment App has been optimized for fusion calling and is recommended.

You can also use the TopHat Alignment App. Running both analysis options and comparing the data can be useful.

What software do I use to analyze CIF files generated with HCS v2.x?

Where *.cif files can be generated, you can use OLB v1.9.4.

Can thumbnail images be reanalyzed on the HiSeq System?

No. Thumbnail images are for visual inspection only to help diagnose problems with a run. They are not suitable for reanalysis.

What bioinformatics processes does Assign 2.0 use to process sequence data?

Assign 2.0 first performs alignment of the sequencing reads. The alignment is performed against a locus consensus sequence generated from the alleles for each locus. The heterozygous positions of these aligned reads are then phased. The first pass of phasing phases heterozygous positions within the same read. The second pass of phasing phases heterozygous positions falling within the same read pair. The final phasing step layers paired reads to determine phase between heterozygous positions for which the first 2 passes were unable to elucidate phase. If all 3 of these passes fail to determine phase or if the data provides ambiguous phasing, phase is not assigned and a phase break is shown.

These phased alignments are then compared to the IMGT/HLA database within Assign 2.0 and can be assigned an allele, multiple alleles, or no alleles. If an unambiguous alignment to a single allele is made, then this result is displayed on the summary report. If the result is ambiguous (multiple perfect matches), all perfect matches appear in the report. If no perfect match is available, these receive a no call and require manual review and editing, if warranted.

What coverage level is expected?

More than 93.5% bases covered at 500x or higher.
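
If you want to check a comparable figure in your own pipeline, the percentage can be derived from per-base depths. The sketch below is a minimal example under the assumption that you already have a list of depths for the targeted bases; the function name and input layout are hypothetical.

```python
def percent_bases_at_depth(depths, min_depth=500):
    """Percent of positions whose depth is at least min_depth."""
    if not depths:
        raise ValueError("depths must not be empty")
    covered = sum(1 for d in depths if d >= min_depth)
    return 100.0 * covered / len(depths)

# Hypothetical per-base depths for four targeted positions.
print(percent_bases_at_depth([600, 500, 499, 1000]))
```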

How do I analyze my TruSeq Exome data?

TruSeq Exome Enrichment data can be analyzed using the HiSeq Analysis Software enrichment workflow for HiSeq data or MiSeq Reporter Enrichment workflow for MiSeq data. For more information, see the HiSeq Analysis Software or MiSeq Reporter support pages.

If the run is being uploaded to BaseSpace or BaseSpace Onsite, the samples can be analyzed with either the BWA Enrichment or Isaac Enrichment apps. Make sure that the correct manifest is selected before running the app.

Alternately, Illumina sequence base call output files (*.bcl) can be demultiplexed and converted to FASTQ format using the bcl2fastq converter software. The files can then be used for analysis with other third party software packages (eg, BWA and GATK).

How do data compression options in HCS v2.2 change data analysis or data handling?

Because run output has zipped BCL files, you must use the bcl2fastq v1.8.4 conversion software to perform BCL to FASTQ conversion on your local Linux analysis system. This tool is run on Linux and has the same syntax, options, and functions (including demultiplexing) as the configureBclToFastq.pl script of CASAVA. The only difference is that it can be used to analyze either zipped or non-zipped BCL files.

If you send your data to BaseSpace Sequence Hub, BCL to FASTQ conversion and demultiplexing are performed automatically following the completion of the data upload.

How is the disease association determined in the RNA Fusion analysis module tables?

The disease associations are defined by data collected from the Mitelman Database on August 26, 2015. The RNA Fusion analysis module does not retrieve updates to the database and users cannot update the database. The module reports only the most frequently observed disease association of the fusion from scientific literature recorded in the Mitelman database as of August 26, 2015. Disease associations for fusions with undefined disease associations are reported as "NA" (not available). The disease association that the RNA Fusion analysis module provides, via the Mitelman Database, is for research use only and must not be used for any clinical decisions. The TruSight RNA Fusion System is classified as Research Use Only.

Mitelman Database of Chromosome Aberrations and Gene Fusions in Cancer (2015/08/26). Mitelman F, Johansson B and Mertens F (Eds.), http://cgap.nci.nih.gov/Chromosomes/Mitelman

What is a manifest file and where do I get one for my project?

Manifest files list genomic regions and coordinates targeted for enrichment with the library prep kit. Data analysis requires manifest files for alignment and variant calling in targeted regions. Each product has a _targeted regions and _probe manifest file.

Download manifest files from the Product Files page for the library prep kit.

Which software do I use for downstream analysis of clusters and for generating genotype calls?

Illumina does not provide recommendations for downstream analysis outside of GenomeStudio.

Where are the content manifest file downloads for my Nextera Rapid Capture Custom Enrichment assay pool?

You can download target and probe manifest files from the following locations:

  • The Export Manifest function in Project Dashboard of your DesignStudio project.
  • The Product Files section of Custom Products in your MyIllumina account.

How many swaths and tiles are on an 8-lane high output flow cell?

Scanning and analysis of a high output flow cell is performed in three swaths per surface on two surfaces per lane. Each swath is divided into 16 tiles. An 8-lane flow cell contains 768 tiles per flow cell.
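
The tile count follows directly from the layout described above. A short Python check, using only the numbers given in the answer:

```python
def tiles_per_flow_cell(lanes=8, surfaces_per_lane=2,
                        swaths_per_surface=3, tiles_per_swath=16):
    """Multiply out the flow cell layout to get the total tile count."""
    return lanes * surfaces_per_lane * swaths_per_surface * tiles_per_swath

print(tiles_per_flow_cell())          # tiles on an 8-lane flow cell
print(tiles_per_flow_cell(lanes=1))   # tiles in a single lane
```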

How do I read the TruSeq Sample Prep controls?

RTA 1.9 (on GA) or 1.10 (on HiSeq) or higher processes the controls. To visualize them, SAV (Sequencing Analysis Viewer) 1.7 or higher must be used. RTA produces a new InterOp metrics file called ControlMetrics; this is the file SAV uses to get control counts.

Is the TruSight Cardio assay compatible with BaseSpace?

Yes, BaseSpace can be used to analyze data from TruSight Cardio runs. To set up the run, use the FASTQ workflow and stream the run into BaseSpace. Use the BWA Enrichment app or the Isaac Enrichment app (v2.0 or v2.1) for analysis. After analysis, download VCF files from BaseSpace and import them into the VariantStudio software to report and filter variants.

What is the specification for the corrected error rate?

Noise Allele Frequency (error rate) ≤ 0.007%

What coverage level is expected?

Coverage may vary based on sample quality and multiplexing level. You are encouraged to review the DNA_SampleMetricsReport.txt and RNA_SampleMetricsReport.txt files produced by the TruSight Tumor 170 app for each analysis. This report contains a metric indicating the % of bases with > 100x coverage. A sequencing run with eight high-quality DNA and eight high-quality RNA samples (16 libraries) has demonstrated ≥ 99% bases with 100x coverage.

How do I analyze my data?

In BaseSpace Sequence Hub, use the SureCell RNA Single-Cell App to analyze your data.

How do I analyze my data?

TruSeq Exome Enrichment data can be analyzed using the HiSeq Analysis Software enrichment workflow for HiSeq data or MiSeq Reporter Enrichment workflow for MiSeq data. For more information, see the HiSeq Analysis Software or MiSeq Reporter support pages.

If the run is being uploaded to BaseSpace or BaseSpace Onsite, the samples can be analyzed with either the BWA Enrichment or Isaac Enrichment apps. Make sure that the correct manifest is selected before running the app.

Alternately, Illumina sequence base call output files (*.bcl) can be demultiplexed and converted to FASTQ format using the bcl2fastq converter software. The files can then be used for analysis with other third party software packages (eg, BWA and GATK).

How long does the TruSight Oncology 500 analysis take on a server with the minimum recommended specs?

Approximately 16 hours.

What software can be used for the analysis and what information do they provide?

Use the RNA-seq Alignment and Cufflinks Assembly & DE apps on BaseSpace Sequence Hub to analyze data.

The RNA-seq Alignment app performs read alignment, variant calling, read counting, and RNA fusion detection. The Cufflinks Assembly & DE app performs novel transcript assembly, differential expression analysis, and FPKM abundance estimation.

On-instrument analyses are currently not supported.

How can I add Illumina DNA Prep to the MiniSeq Local Run Manager analysis software?

Download the Illumina DNA Prep definition file from the Product Files page of the Illumina DNA Prep support site to a known location such as the desktop. Open the Local Run Manager application and select the Gear (System Settings) icon. From here, select Library Prep Kits and select the Add Library Prep Kit option. Navigate to the location of the template file and add the file. The new kit will now appear under the Library Prep Kit dropdown menu when creating a run.

Can I analyze the data with my own software pipeline?

Customers can use their own pipeline. However, we do not provide resources to facilitate this.

Can I run the TruSight Oncology 500 Local App using Singularity?

Yes, the software is available as a Singularity image. See the TruSight Oncology 500 Local App User Guide for more information.

What analysis file types are generated by the SureCell RNA Single-Cell app?

Major output files include aligned reads in BAM format, gene counts per cell and per sample, and PDF reports containing single-cell RNA metrics. Consult the SureCell RNA Single-Cell app user guide for more detailed information.

How do I obtain the Local Docker or Singularity software for the TruSight Oncology 500 Local App?

Review the system requirements in the TruSight Oncology 500 Local App User Guide. When requirements are met, contact Illumina Technical Support to request the link to download the Docker or Singularity image.

Can I change the settings of the TruSight Oncology 500 analysis pipeline to detect larger indels?

No, the algorithm settings cannot be changed.

What are my options for analyzing iSeq 100 sequencing data?

Local Run Manager is the preferred analysis solution for the iSeq 100 System because it simplifies the end-to-end workflow. You enter all sample information into Local Run Manager and it is automatically sent to the control software. After the run is complete, you can analyze the data locally in Local Run Manager, copy files to BaseSpace Sequence Hub for analysis, or both.

Using BaseSpace Sequence Hub independently of Local Run Manager is also an analysis option. For third-party and user-developed analysis packages, you can use the bcl2fastq2 converter.

What is % Stranded in my analysis data?

The RNA-seq Alignment app reports the % Stranded reads, which is a calculation of the percent aligned Read 1 reads that are mapped to the antisense gDNA strand, and the percent aligned Read 2 reads that are mapped to the sense gDNA strand.
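
As a rough illustration of this calculation, the sketch below computes a percentage from strand counts. Pooling Read 1 and Read 2 into a single value is an assumption made for this example, as are the function and parameter names; the app may report the metric differently.

```python
def percent_stranded(r1_antisense, r1_aligned, r2_sense, r2_aligned):
    """Percent of aligned reads that match the expected strand orientation.

    Assumption: Read 1 (antisense) and Read 2 (sense) counts are pooled
    into one percentage over all aligned reads.
    """
    total = r1_aligned + r2_aligned
    if total == 0:
        raise ValueError("no aligned reads")
    return 100.0 * (r1_antisense + r2_sense) / total

# Hypothetical counts for one sample.
print(percent_stranded(r1_antisense=980, r1_aligned=1000,
                       r2_sense=990, r2_aligned=1000))
```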

Which Illumina library prep kits are compatible with the SureCell RNA Single-Cell app on BaseSpace Sequence Hub?

The app is designed specifically for analysis of samples prepared with the SureCell WTA 3' library prep kit. The app has not been validated for analysis of samples prepared with any other kit.

If a network outage occurs, can the iSeq 100 System cache an entire 2 x 150 run?

Yes. The local instrument storage can store data until the local storage limit is reached, so starting runs with ample free space is recommended. The analysis and data transfer software continue processing and resume data transfer when the network is restored.

What analysis does the SureCell RNA Single-Cell app perform?

The SureCell RNA Single-Cell app is designed to analyze samples prepared using the SureCell WTA 3' Library Preparation kit. This app performs read alignment, cell and transcript assignment, cell and gene counting, filtering, and calculates and reports single-cell metrics. The SureCell RNA Single-Cell workflow consists of the following major steps. Consult the SureCell RNA Single-Cell app user guide for a detailed breakdown of the workflow:

  1. Align read 2 against the full reference genome using STAR
  2. Generate an annotated BAM file with each alignment tagged with both the cell barcode and the UMI
  3. Count UMIs to generate gene expression counts
  4. Filter cells based on the number of UMIs per cell barcode
  5. Calculate alignment, cell and gene metrics
  6. Generate aggregate and per-sample PDF reports
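
Steps 3 and 4 above can be sketched in a few lines of Python. This is a simplified illustration and not the app's implementation: the (cell barcode, UMI, gene) tuple layout, the function names, and the filtering threshold are all assumptions made for this example.

```python
from collections import defaultdict

def count_umis(alignments):
    """Count distinct UMIs per gene for each cell barcode.

    alignments: iterable of (cell_barcode, umi, gene) tuples (hypothetical layout).
    """
    seen = defaultdict(set)  # (cell, gene) -> set of distinct UMIs
    for cell, umi, gene in alignments:
        seen[(cell, gene)].add(umi)  # duplicate UMIs are counted once
    counts = defaultdict(dict)
    for (cell, gene), umis in seen.items():
        counts[cell][gene] = len(umis)
    return dict(counts)

def filter_cells(counts, min_umis):
    """Keep cell barcodes whose total UMI count reaches min_umis."""
    return {cell: genes for cell, genes in counts.items()
            if sum(genes.values()) >= min_umis}

# Hypothetical tagged alignments; the second record repeats a UMI.
alignments = [
    ("AAA", "u1", "G1"), ("AAA", "u1", "G1"), ("AAA", "u2", "G1"),
    ("AAA", "u3", "G2"), ("CCC", "u1", "G1"),
]
counts = count_umis(alignments)
print(counts)
print(filter_cells(counts, min_umis=2))
```

Counting distinct UMIs rather than reads is what keeps amplification duplicates from inflating the expression counts.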

How can I set Local Run Manager to trim the adapter sequences for this kit during analysis?

During run creation, select the Show advanced module settings option. From here select the Add Custom Setting option. For read 1 adapter trimming, enter Adapter as the option name and the appropriate sequence. For Illumina DNA Prep the sequence is CTGTCTCTTATACACATCT.
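
To see what read 1 adapter trimming does to a read, consider the following minimal sketch. It uses the Illumina DNA Prep adapter sequence given above, but the exact-match search is an assumption for illustration only and does not represent how Local Run Manager performs trimming.

```python
ADAPTER = "CTGTCTCTTATACACATCT"  # Illumina DNA Prep sequence from the answer above

def trim_adapter(read, adapter=ADAPTER):
    """Cut the read at the first exact occurrence of the adapter, if any."""
    idx = read.find(adapter)
    return read if idx == -1 else read[:idx]

# Hypothetical reads: one that runs into the adapter and one that does not.
print(trim_adapter("ACGTACGT" + ADAPTER + "GGGG"))
print(trim_adapter("ACGTACGT"))
```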

What are the sizes of data files from the HiSeq?

For a dual flow cell 2x101 cycle run (200 Gb) on the HiSeq 2000 using HCS v1.3 and prior, you can expect 2 TB of intensity data (optionally transferred to a server), 250 GB of base call and quality score information, and 1.2 TB of space for alignment output not including 6 TB of disk space used for temporary files removed before completion of alignment. Using HCS v1.4 and Flow Cell v3, storage requirements for raw data are approximately 60% greater than current runs based on additional swath data and increased cluster density.

Why is my report in BaseSpace Variant Interpreter missing variants?

The coverage requirements for Amplicon DS are 1000x per amplicon and at least 300x per strand. Amplicons that do not meet these requirements are listed in the VCF file, but are removed during filtering and annotation in BaseSpace Variant Interpreter.
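
The two thresholds can be combined into a single check, as in the sketch below. It assumes that the per-amplicon depth is the sum of the two strand depths, which is an assumption for this example; the function and parameter names are hypothetical.

```python
def meets_coverage(forward_depth, reverse_depth,
                   min_total=1000, min_per_strand=300):
    """True if an amplicon meets both the total and the per-strand threshold."""
    total = forward_depth + reverse_depth  # assumed definition of amplicon depth
    return (total >= min_total
            and forward_depth >= min_per_strand
            and reverse_depth >= min_per_strand)

print(meets_coverage(700, 300))  # meets both thresholds
print(meets_coverage(800, 250))  # total is sufficient, one strand is too low
print(meets_coverage(400, 400))  # both strands pass, total is too low
```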

I have noticed double peaks in methylation value that seem to be related to Infinium I versus Infinium II probe designs. What is causing these double peaks?

What you are seeing are histograms of the beta values in bins of 0.02 steps and categorized by Infinium design type. The difference in beta value ranges between the Infinium I and Infinium II assay design types causes the double peaks. In general, the beta peaks at the extremes of Infinium I probes tend to be further out than the beta peaks for Infinium II.

The beta peak differences do not affect the final analysis of the project. Individual CpG assays are not intended to be compared directly with other CpG assays, as each probe (or probe set for Infinium I designs) has different binding characteristics and behaves differently than any other probe or probe set. Rather, each assay is compared between two samples or sample groups (ie, in determining a relative rather than an absolute methylation value).

How do I determine which genes are accurately detected?

Filter the genes using the Detection p-value. Setting detection at .99 (p value <0.01) means that there is a 1% false positive rate.
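
If you export the table and filter outside the software, the same cutoff can be applied as shown in this minimal sketch. The gene names and p-values are made up for illustration.

```python
def detected_genes(pvalues, threshold=0.01):
    """Return gene names whose Detection p-value is below the threshold."""
    return sorted(gene for gene, p in pvalues.items() if p < threshold)

# Hypothetical Detection p-values keyed by gene name.
pvalues = {"GAPDH": 0.0, "ACTB": 0.005, "GENE_X": 0.2, "GENE_Y": 0.01}
print(detected_genes(pvalues))
```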

Which version of bcl2fastq Conversion Software is supported for the NovaSeq 6000 System?

Use bcl2fastq2 Conversion Software v2.19 or later to process output from the NovaSeq 6000 System.

How is alignment performed?

The SureCell RNA Single-Cell app uses the STAR aligner to align read 2 against the whole reference genome. Each aligned read in the BAM file is tagged with the cell barcode and UMI sequences from read 1.

How do I analyze data?

Use the BaseSpace Sequence Hub cloud environment or a local Docker App to analyze TruSight Tumor 170 libraries.

How do I analyze data from a run on the MiSeq?

The system is designed to support multiple workflows inclusive of data analysis, which is performed on-instrument upon completion of the run. Output file formats are *.bcl, FASTQ, BAM, *.vcf, *.csv, and *.txt.

How do I visualize my TruSight Cardio targets in a genome browser?

TruSight Cardio regions, targets, and probes are provided in the *.bed file format. The *.bed file can be imported into either the UCSC Genome Browser (genome.ucsc.edu) or the Integrated Genome Viewer (IGV) (www.broadinstitute.org/igv/).
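
If you want to inspect a *.bed file before loading it into a browser, a few lines of Python are enough. The sketch below assumes a tab-delimited file with at least chromosome, start, and end columns, and relies on the standard BED convention of 0-based, half-open coordinates. The example lines are made up and are not TruSight Cardio content.

```python
def parse_bed(lines):
    """Parse tab-delimited BED lines into (chrom, start, end, name) tuples."""
    regions = []
    for line in lines:
        line = line.strip()
        # Skip blank lines, comments, and browser header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        name = fields[3] if len(fields) > 3 else ""
        regions.append((chrom, start, end, name))
    return regions

def total_bases(regions):
    """Sum region lengths; BED intervals are 0-based and half-open."""
    return sum(end - start for _, start, end, _ in regions)

# Hypothetical BED content.
bed_lines = ['track name="targets"', "chr1\t100\t200\tGENE_A", "chr2\t50\t80"]
regions = parse_bed(bed_lines)
print(regions)
print(total_bases(regions))
```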

How do I set up the sample sheet for MSR analysis?

Use the Enrichment workflow to create a sample sheet formatted for analysis of TruSight One data. For instructions, see the Illumina Experiment Manager User Guide.

What are the data storage requirements?

Labs typically store zipped FASTQ files (*.fastq.gz) and Assign project files (*.cgp). The total size of all FASTQ files for a run mostly depends on the sequencer and the sequencing flow cell that was used and does not depend on the number of samples.

Flow Cell | Samples | FASTQ File Size | FASTQ File Size Variance | Assign Project File Size | Assign Project File Size Variance
MiSeq v2 Nano (300 cycles) | 6 | 350 MB | +/- 200 MB | 100 MB | +/- 50 MB
MiSeq v2 Micro (300 cycles) | 12 | 1.5 GB | +/- 500 MB | 200 MB | +/- 50 MB
MiSeq v2 Micro (300 cycles) | 24 | 1.5 GB | +/- 500 MB | 350 MB | +/- 75 MB
MiSeq v2 (300 cycles) | 48 | 5 GB | +/- 2 GB | 750 MB | +/- 100 MB
MiSeq v2 (300 cycles) | 96 | 5 GB | +/- 2 GB | 1.5 GB | +/- 500 MB

What is the difference between the BaseSpace RNA-Seq Alignment and TopHat Alignment apps? Which app should I run?

The main difference between the apps is that the RNA-Seq Alignment App is specifically designed for optimal fusion calling using STAR.

You can try both analysis methods and compare results.

How do I deselect MiSeq Reporter when I start a run?

You can specify the GenerateFASTQ workflow in your sample sheet, which creates FASTQ files and then exits secondary analysis. For more information, see the MiSeq Sample Sheet Quick Reference Guide.

What are CBCL files?

CBCL files (*.cbcl) are concatenated BCL files that save space and optimize output. Instead of generating BCL files on a per tile basis, tiles from the same lane and surface are aggregated into 1 CBCL file for each lane and surface.

How do I analyze my data?

Analyze HiSeq data using the HiSeq Analysis Software enrichment workflow for HiSeq data. Analyze MiSeq data using MiSeq Reporter. See the HiSeq Analysis Software or MiSeq Reporter support pages for more information.

Alternatively, Illumina sequence base call output files (*.bcl) can be demultiplexed and converted to FASTQ format using the bcl2fastq converter software. The files can then be analyzed with third-party software packages (eg, BWA and GATK). If you analyze data in BaseSpace, Illumina recommends the BWA Enrichment or Isaac Enrichment apps. If any subsampling is required, use the FASTQ Toolkit.

What percent duplicates is expected?

The expected range of percent duplicates is ≤ 25% for controls (UHR) at 0.5M subsampled reads.

Does the analysis also work with longer or shorter read lengths than 2 x 101 bp?

No strict minimum read length has been determined, but the analysis is optimized for 2 x 101 bp, which is the recommended read length. Shorter read lengths down to 75 bp may work, but have not been fully verified.

What variant callers can be used?

You may use any variant caller that you are familiar with, such as Strelka and GATK. However, you may need to make modifications to allow for the very high Q scores (Q70) of the error-corrected BAM files.
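
One possible modification, shown here only as a sketch, is to cap base qualities before variant calling. The example assumes Phred+33 encoded quality strings (as in FASTQ or SAM text) and an arbitrary cap of Q41. Whether capping is appropriate, and at what value, depends on the variant caller and is not specified here. The function name is hypothetical.

```python
def cap_phred33(quality, max_q=41):
    """Cap each Phred+33 encoded quality character at max_q."""
    cap_char = chr(max_q + 33)
    # Characters compare by code point, so min() keeps the lower quality.
    return "".join(min(ch, cap_char) for ch in quality)


# 'g' encodes Q70 (70 + 33 = 103), 'I' encodes Q40, 'J' encodes Q41.
print(cap_phred33("ggIIg"))  # prints JJIIJ
```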

The UMI Error Correction App does not have variant calling capability but the Local TruSight Tumor 170 Analysis App accepts converted FASTQs from collapsed BAM files. Again, modifications may be needed.

Due to compatibility issues, the BaseSpace Sequence Hub TruSight Tumor 170 Analysis App does not take converted FASTQs from collapsed BAM files as input.

Why are SNPs included in the manifest of the methylation BeadChip?

SNPs were included on the BeadChip so investigators could generate a DNA fingerprint of their samples as an added level of quality control. You can find further information in the Infinium HD Methylation Assay Protocol Guide. SNP assays on the BeadChip are not mentioned in the assay guide and only briefly described in the GenomeStudio Methylation Module. Follow this method to confirm the identity of samples from the same individual:

  1. Highlight the SNP assays in the sample methylation profile, right-click, and select Show only selected rows.
  2. For any given pair of samples that are supposed to be from the same individual, plot the beta values in a scatter plot from the sample methylation profile.
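
The scatter plot comparison in step 2 can also be summarized numerically. The following Python sketch computes a Pearson correlation between the SNP beta values of two samples. The beta values are made-up illustrative numbers, the assumption that SNP beta values group near 0, 0.5, and 1 by genotype is a simplification, and no specific correlation cutoff is implied by the text above.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical SNP beta values exported from the sample methylation profile.
sample_a = [0.02, 0.51, 0.97, 0.49, 0.03, 0.98]
sample_b = [0.04, 0.48, 0.95, 0.52, 0.01, 0.99]  # same genotype pattern as sample_a
sample_c = [0.96, 0.03, 0.50, 0.98, 0.49, 0.02]  # different genotype pattern

print(round(pearson(sample_a, sample_b), 3))  # close to 1: consistent with one individual
print(round(pearson(sample_a, sample_c), 3))  # much lower: likely different individuals
```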

Is the output of bcl2fastq compatible with CASAVA v1.8.2?

If you are using CASAVA, it is compatible. However, bcl2fastq v1.8.4 must be used in place of the configureBcl2fastq step in CASAVA. The output of bcl2fastq v1.8.4 is in the fastq.gz file format organized into project and sample directories as specified in the sample sheet. This output is compatible with the configureAlignment and configureBuild components of CASAVA v1.8.2. The sample sheet format required for bcl2fastq v1.8.4 is equivalent to CASAVA v1.8.2 sample sheet format, and is described in the bcl2fastq v1.8.4 User Guide (part # 15038058).

If you are not using CASAVA, note that Illumina is discontinuing distribution of CASAVA software to better support new products available on BaseSpace. BaseSpace features analysis options for a large array of NGS applications.

Can I export sequences from Assign 2.0?

Sample consensus sequences can be exported from Assign 2.0 in MS Excel (*.xls, *.xlsx, or *.xlsm depending on MS Excel version), text (*.txt), or FASTA (*.fasta) formats.

Is there a manual way to include or exclude samples from clusters?

Yes. In the SNP graph, use the cursor to draw a box around the samples you want to edit manually, right-click, and choose the cluster the samples should be assigned to, or choose NC (no call) to remove the samples from all clusters.

I do not have GenomeStudio installed on my computer, but would like to work with the Polyploid Genotyping Module. Do I need to install GenomeStudio prior to installing the Polyploid Genotyping Module?

Yes. See the GenomeStudio downloads for the latest GenomeStudio installer. You can choose to install the GenomeStudio Framework, which does not require a license key, by selecting the respective box in the install wizard. GenomeStudio Genotyping Module does not need to be installed on the same computer on which the Polyploid Genotyping Module is installed. However, the polyploid workflow does require generating a genotyping project in the GenomeStudio Genotyping Module prior to taking the data to the Polyploid Genotyping Module for polyploidy clustering.

Do I need new license keys for modules that I do not already own?

Yes. If you do not already own the module, you need to purchase new license key(s).

Can secondary analysis run on MiSeq while a run is in progress?

If a new sequencing run is started on the MiSeq before secondary analysis of a previous run is complete, secondary analysis will be stopped automatically. MiSeq computing resources are dedicated to either sequencing or analysis, and the system is designed in such a way that a sequencing command overrides an analysis command. Secondary analysis can later be requeued from the MiSeq Reporter Analyses tab.

Do I need to use the UMI Error Correction App? Can I write my own script?

You must use the UMI Error Correction App to successfully collapse reads containing Illumina UMIs.

Which version of the bcl2fastq Conversion Software is supported for the iSeq 100 System?

Use bcl2fastq2 Conversion Software v2.20, or later, to process data from the iSeq 100 System.

What species are supported?

For analysis, the Single-Cell RNA app supports multiple species, including human (hg19), mouse (mm10), rat (rn5), zebrafish (danRer7), fly (dm3), and C. elegans (ce10). Human (hg19) and mouse (mm10) mixed-species experiments are also supported.

What are my options for analyzing NovaSeq sequencing data?

BaseSpace Sequence Hub is the preferred analysis solution for the NovaSeq Sequencing System. On-instrument analysis is not available. For third-party and user-developed analysis packages, you can use the bcl2fastq2 converter.

Is a manifest required for data analysis? Where can I download a manifest?

The manifest is fixed within the TruSight Tumor 170 app. You can download the manifest for viewing from the TruSight Tumor 170 support page.

Which Linux operating system is recommended? Does the TruSight Oncology 500 Local App run on both Ubuntu and CentOS?

The TruSight Oncology 500 Local App User Guide recommends CentOS 7.3 or higher. Ubuntu is expected to work but is not fully verified.

What is the largest indel size that was tested internally with the small variant calling algorithm?

During design verification testing, Illumina successfully tested insertions as long as 18 bp and deletions of 18 bp, 24 bp, and 27 bp.

Are there any guidelines for supporting analysis outside of BaseSpace Sequence Hub?

Yes. Contact Illumina Technical Support for more information.

Can I use MiSeq Reporter software to analyze data from a HiSeq system?

No. File directory structures from a HiSeq system are incompatible with MiSeq Reporter software.

However, the TruSeq Amplicon App is available in BaseSpace and can be used to analyze data from this kit.

Are MNVs (Multiple nucleotide variants) called?

Yes, MNVs up to 3 base pairs are called.

How do I set up MiSeq and HiSeq sample sheets for SureCell WTA 3′ samples?

Use Illumina Experiment Manager (IEM) v1.13 or later. Sample sheets generated using IEM will contain the correct UMI settings for BaseSpace or bcl2fastq2 v2.18 or later.

What if I cannot use the BaseSpace Sequence Hub cloud environment to analyze my data?

BaseSpace Sequence Hub is the only analysis option available at release. A local solution will be available in the future. Contact your Illumina sales representative for details.

What percent duplicates is expected?

The percent duplicates varies depending on sample quality and complexity. A sequencing run with eight high-quality DNA samples and eight high-quality RNA samples (16 libraries) has demonstrated ≤ 85% duplicates.

When interpreting the data, what conclusions can be drawn from the following scenarios? Are the data usable? What can I do to correct any observed issues?

Meets Median Target Coverage Threshold | Meets Mean Family Depth Threshold | Meets Noise Allele Frequency Threshold | Conclusion
Yes | Yes | Yes | Data are usable for ultralow AF detection (~0.4%).
Yes | Yes | No | Unlikely scenario. If observed, the sample may have significant DNA damage before library prep.
Yes | No | Yes | Slightly insufficient sequencing; however, data might still be usable for variant detection.
Yes | No | No | Insufficient sequencing for DNA input amount. Family size is too small to effectively reduce error rate.
No | No | Yes | Low DNA input or problem in library conversion efficiency. Slightly under sequencing. Not recommended for ultralow AF detection due to low MTC.
No | Yes | Yes | Low DNA input or problem in library conversion efficiency. Sufficient sequencing. Not recommended for ultralow AF detection due to low MTC.
No | Yes | No | Low DNA input or problem in library conversion efficiency. Sufficient sequencing but sample may have significant DNA damage before library prep.
No | No | No | Low DNA input or problem in library conversion efficiency. Under sequencing. Family size is too small to effectively reduce error rate.
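
For scripted QC reports, the scenarios above can be encoded as a simple lookup. This Python sketch is illustrative only: the function and variable names are hypothetical, and the conclusion strings are shortened summaries of the table entries rather than output from any Illumina software.

```python
# Keys: (meets median target coverage, meets mean family depth, meets noise AF).
CONCLUSIONS = {
    (True, True, True): "Usable for ultralow AF detection (~0.4%).",
    (True, True, False): "Unlikely scenario; possible DNA damage before library prep.",
    (True, False, True): "Slightly insufficient sequencing; data might still be usable.",
    (True, False, False): "Insufficient sequencing for DNA input; family size too small.",
    (False, False, True): "Low input or conversion problem; slightly under sequenced; low MTC.",
    (False, True, True): "Low input or conversion problem; sufficient sequencing; low MTC.",
    (False, True, False): "Low input or conversion problem; possible DNA damage.",
    (False, False, False): "Low input or conversion problem; under sequenced; family size too small.",
}


def interpret(mtc_ok, family_depth_ok, noise_af_ok):
    """Return the summarized conclusion for one sample's three threshold checks."""
    return CONCLUSIONS[(bool(mtc_ok), bool(family_depth_ok), bool(noise_af_ok))]


print(interpret(True, False, True))
```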

Does the UMI Error Correction software work with RNA?

No. The UMI Error Correction software only aligns to the whole human genome (hg19). Whole transcriptome alignment is not possible at this time.

What can I use to analyze the AmpliSeq for Illumina Sample ID Panel?

Use BaseSpace DNA Amplicon 2.0 or higher, or Local Run Manager DNA Amplicon Analysis Module v1.1 or higher, to analyze Sample ID.

Can I compare data from the TruSeq Stranded mRNA protocol to the TruSeq RNA Sample Prep v2?

Yes, TruSeq Stranded mRNA data is comparable to TruSeq RNA Sample Prep v2 data although TruSeq RNA v2 libraries do not contain stranded information.

What actual assay performance can I expect from my design?

DesignStudio returns high confidence amplicon designs that have delivered unprecedented amplicon multiplexing performance. Since each design is unique and sample input can vary, performance of the design will need to be tested empirically.

How much disk space is recommended for end-to-end analysis?

HiSeq 2500 and HiSeq 4000 – 1 TB

What tools are offered for data analysis?

Local Run Manager and BaseSpace Sequence Hub have apps available for analysis. The DNA Amplicon Analysis App and RNA Amplicon Analysis App are available on BaseSpace Sequence Hub. Further analysis can be performed on any variant calls using BaseSpace Variant Interpreter. Local Run Manager has a similar DNA Amplicon Analysis Module and RNA Amplicon Analysis Module which utilizes the same workflow and algorithm as the BaseSpace Sequence Hub Apps.

The DNA Amplicon analysis workflow can be used to perform alignment and variant calling, and the RNA Amplicon analysis workflow can be used for fusion calling. Additionally, the OncoCNV caller, a BaseSpace Labs App, is available for CNV analysis.

What tools are offered for data analysis?

Illumina recommends using the third-party MiXCR Labs app available in BaseSpace Sequence Hub.

What tools are offered for data analysis?

The DNA Amplicon analysis workflow can be used to perform alignment and variant calling. Local Run Manager and BaseSpace Sequence Hub have apps available for analysis. The DNA Amplicon Analysis App is available on BaseSpace Sequence Hub. Further analysis can be performed on any variant calls using BaseSpace Variant Interpreter. Local Run Manager has a similar DNA Amplicon Analysis Module which utilizes the same workflow and algorithm as the BaseSpace Sequence Hub Apps.

Can I detect indels and amplifications as in TruSight Tumor 170 workflow?

The UMI Reagents are not paired with a variant caller. A customer-supplied variant caller can be used to detect indels and amplifications.

How is the UMI Error Correction Local App installed?

The UMI Error Correction Local App User Guide describes the system requirements and App usage. To obtain the local application, contact your Sales Specialist or Field Bioinformatics Specialist.

Does having "hotspots" in my panel mean that only those variants can be detected?

No. The hotspots are variants that have been tested and can be detected using the panels and the Illumina DNA Amplicon analysis. Other variants within the targeted region may be detected as well.

Do I need a separate manifest file for the AmpliSeq Sample ID Panel for Illumina?

Sample ID manifest information is built into BaseSpace DNA Amplicon 2.0 and Local Run Manager DNA Amplicon Analysis Module v1.1. You only need to select the Sample ID option for analysis. Additionally, the Sample ID manifest file is available in the AmpliSeq for Illumina Panel Downloads section of the support website.

What is the allele frequency of the variants that can be detected?

The TruSight UMI Reagents reduce the inherent sequencing error rate below 0.007% which is intended to enable variant detection down to 0.4% with sufficient median target coverage.

Note: Since the UMI Reagents and error correction software do not include a variant caller, variant detection of 0.4% cannot be assured. For more information, see the Error Rate Reduction with the TruSight Oncology UMI Reagents for Variant Detection Tech Note.
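
To see why coverage matters for the 0.4% figure, the following Python sketch compares the expected number of reads supporting a true variant with the expected number of erroneous base observations at a single position. The depth of 2500 is a hypothetical value chosen for illustration, 0.007% is used as an upper bound on the residual error rate, and the calculation ignores sampling variation, so it is not a statement about detection limits.

```python
def expected_reads(depth, percent):
    """Expected number of reads at a position for a given percentage of depth."""
    return depth * percent / 100


depth = 2500  # hypothetical error-corrected depth at one position

variant_reads = expected_reads(depth, 0.4)    # true variant at 0.4% allele frequency
error_reads = expected_reads(depth, 0.007)    # residual error rate of 0.007%

print(f"variant-supporting reads: {variant_reads:.2f}")
print(f"expected erroneous reads: {error_reads:.3f}")
```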

Does the UMI Error Correction software work with the RNA content from the TruSight Tumor 170 probes?

The TruSight Tumor 170 RNA content has not been tested. However, this probe set would not be expected to pull down fusions from DNA.

What tools are offered for data analysis?

The RNA Amplicon analysis workflow can be used for differential expression analysis. The RNA Amplicon Analysis App is available on BaseSpace Sequence Hub. Local Run Manager has a similar RNA Amplicon Analysis Module which utilizes the same workflow and algorithm as the BaseSpace Sequence Hub Apps.

How will I need to change my bioinformatics pipeline analysis?

For more information, contact Illumina Technical Support.

How is the measure for on-target bases (specificity) determined?

On-target bases shows the percentage of total sequenced bases that map to target regions in the reference genome. This metric reflects the percentage of bases from amplicons that a) were designed, synthesized, and pooled and b) generated sequence data mapping to the target regions.
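
Expressed as a calculation, the metric is the ratio of bases mapping to target regions to all sequenced bases. The following Python sketch uses hypothetical base counts and a hypothetical function name; it only illustrates the definition given above.

```python
def on_target_percent(on_target_bases, total_sequenced_bases):
    """Percentage of total sequenced bases that map to target regions."""
    if total_sequenced_bases <= 0:
        raise ValueError("total_sequenced_bases must be positive")
    return 100.0 * on_target_bases / total_sequenced_bases


# Hypothetical counts: 450 Mb on target out of 500 Mb sequenced.
print(on_target_percent(450_000_000, 500_000_000))  # prints 90.0
```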

How do I change analysis parameters?

BlueFuse Algorithm settings can be modified by going to Tools | Array Configuration | Array Algorithm settings.

For more information on the customizable algorithm settings, see the Array Algorithm Settings section of the BlueFuse Multi Software Guide.

Can FFPE samples be analyzed with the CytoSNP-850K Array?

FFPE samples can be run on CytoSNP-850K with the Infinium FFPE QC Kit and the Infinium FFPE Restore Kit.

FFPE-specific manifest and cluster files can be downloaded on the CytoSNP-850K support page.

How do I batch import CytoSNP-850K data?

A Batch Import file can be created using the CytoSNP-850K Lab Planner tool. Once the batch import .txt file has been generated, go to File|Batch|Batch Import, locate the batch import file, then click Open.

More information can be found in the Import 24sure and BeadArray Experiments section of the BlueFuse Multi Software Guide.

How do I analyze data with a different cluster file in BlueFuse Multi?

To analyze data with a different cluster file in BlueFuse Multi, the .gtc files will have to be regenerated from the raw .idat scan files. This can be performed with the Beeline Software.

What do orange dots mean in the B Allele Frequency plot?

An orange SNP dot in the B Allele Frequency plot indicates that the genotype is a "No Call" due to not being called as AA, AB, or BB during genotype analysis. A SNP designated NC does not mean the data point is not informative for the analysis, as these SNPs are included in the calling algorithm.

How do I mask SNPs?

A mask file can be created in the standard BED file format to exclude genomic regions from region calling and visualization. For more information about genome masking in BlueFuse Multi, see the Genome Masking section in the BlueFuse Multi Software Guide.
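
As an illustration of the standard BED layout (tab-separated chromosome, 0-based start, and exclusive end), the following Python sketch formats a list of regions as BED lines. The regions, the chromosome naming style, and the function name are hypothetical; check the BlueFuse Multi Software Guide for the exact columns and naming it expects.

```python
def bed_lines(regions):
    """Format (chrom, start, end) tuples as tab-separated BED lines."""
    lines = []
    for chrom, start, end in regions:
        # BED uses a 0-based start and an exclusive end, so start must be < end.
        if not 0 <= start < end:
            raise ValueError(f"invalid interval for {chrom}: {start}-{end}")
        lines.append(f"{chrom}\t{start}\t{end}")
    return lines


# Hypothetical regions to exclude from region calling.
mask_regions = [("chr1", 0, 10000), ("chrX", 2699520, 2781479)]
print("\n".join(bed_lines(mask_regions)))
```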

How do I perform duo or trio genotype analysis?

Introduced in BlueFuse Multi v4.4, new views are available that allow users to perform Duo and Trio genotype analysis using Illumina BeadChip experiments. Detailed information can be found in the Genotype Analysis section in the BlueFuse Multi Software Guide.

I have questions related to the MiSeq Reporter software.

The MiSeq Reporter analysis software processes base calls generated on-instrument during the sequencing run by Real Time Analysis (RTA) software, and produces information about alignment, structural variants, and contig assemblies for each genome requested and each sample based on the analysis workflow specified in the sample sheet.

I have questions related to the BaseSpace Sequence Hub (BSSH).

BaseSpace Sequence Hub is the Illumina cloud computing environment for analyzing sequencing data. You can stream data directly from your sequencing run from HiSeq, NextSeq, MiSeq, and MiniSeq systems.

Where can I find my alignment files (eg, BAM files) from my analysis of RNA panels containing fusions?

Illumina software packages, including BaseSpace Sequence Hub Apps, do not provide alignment files as output from the analysis. At this time, only the final report of the analysis results is provided. For more details, consult the software's documentation.

Is there any information about potential false negatives or uncalled fusions from analysis of RNA panels containing fusions?

No. The software only reports detected fusion events. For information on which gene pairs are evaluated for your panel, see the panel's data sheet.

Are there non-encrypted manifest files available for my RNA panels (custom or fixed) containing fusions?

No. Manifest files for any RNA panel containing fusions are unavailable in a non-encrypted format. Only the encrypted manifest file is available.

Where can I find the breakpoint details for fusion panels (custom or fixed) included in the design?

Information about exact breakpoints contained in all RNA fusion panel designs is not provided. The result files produced by Illumina software analysis tools provide details of any RNA fusion events identified by the software. For information on which gene pairs are evaluated for your panel, see the panel's data sheet.

What Illumina analysis software can I use for TruSight Oncology 500 libraries?

For a list of compatible analysis software, see the TruSight Oncology 500 compatibility page.

Which server processors are best for analyzing my data with the local app?

Use SandyBridge, an equivalent, or better. Older processors do not have the required data compression instruction sets and might cause analysis to fail.

Can I run the local app for DNA analysis concurrently with the local app for RNA analysis?

No. Because of software constraints, run the TruSight Oncology 500 Local App for DNA separately from the TruSight Tumor 170 Local App for RNA.

What feature improvements are included in the TruSight Oncology 500 Local App v2?

Bioinformatic analyses for the following:

  • RNA libraries
  • FASTQ generation
  • Variant annotation
  • Read alignment and read stitching

Other improvements:

  • Increase in overall fusion sensitivity and specificity
  • Increase in microsatellite instability (MSI) algorithm accuracy
  • Decrease in analysis time

Can I run the TruSight Oncology 500 Local App v1 for DNA analysis concurrently with the TruSight Tumor 170 Local App for RNA analysis?

No. Because of software constraints, run them separately.

Are there any changes in the TruSight Oncology 500 ctDNA software algorithm used to call small variants (SNV, indel, MNV) that are different from TruSight Oncology 500?

The baseline used for background subtraction of noise was regenerated with cfDNA samples to account for differences in the intended sample types. Additionally, variant calling sensitivity thresholds were modified to enable detection/calling of lower allele frequency variants.

How many runs can a DRAGEN server analyze at one time?

Only one run can be analyzed at a time on a DRAGEN server.

What Illumina analysis software can I use for TruSight Oncology 500 ctDNA libraries?

The TruSight Oncology 500 ctDNA analysis workflow uses an off-instrument analysis software run on the DRAGEN server that was developed to analyze data produced by the TruSight Oncology 500 ctDNA assay. This analysis software generates sample outputs which include high level sample metrics, variants detected, TMB (tumor mutational burden) and MSI (microsatellite instability) scores.

A BaseSpace Sequence Hub evaluation app is also available. This application is intended for assessment use only and Illumina has no obligation to provide technical support for this App. Access will be limited to 30 days.

For a list of compatible analysis software, see the TruSight Oncology 500 ctDNA compatibility page.

How long does the TruSight Oncology 500 ctDNA analysis take on a DRAGEN server?

Approximately 18-22 hours for S4 runs with 24 libraries and approximately 9-12 hours for S2 runs with 8 libraries.

Can I change the settings of the TruSight Oncology 500 ctDNA analysis pipeline to detect larger indels?

No, the algorithm settings cannot be changed.

How are DNA fusions detected from DNA samples?

TruSight Oncology 500 ctDNA enrichment probes were designed to tile across both exonic and selective intronic regions to capture potential breakpoints. Some intronic regions are difficult to adequately cover as they can be very large and would require probe coverage that would dramatically increase panel size and reduce throughput. Therefore, only select gene fusions will be reported from the TruSight Oncology 500 ctDNA assay.

Can I analyze TruSight Oncology 500 High Throughput libraries on the NextSeq 500/550 with the Local Run Manager TruSight Oncology 500 Analysis Module?

No, the Local Run Manager TruSight Oncology 500 Analysis Module is not compatible with TruSight Oncology 500 High-Throughput libraries.

How long does the TruSight Oncology 500 High Throughput analysis take on a server with the minimum recommended specifications?

Analysis time depends on the number of nodes used for analysis. Refer to the TruSight Oncology 500 High Throughput Compatible Products page for an example of analysis times for an S2 run.

Can I change the settings of the TruSight Oncology 500 analysis pipeline to detect larger indels?

No, the algorithm settings cannot be changed.

Can I sequence TruSight Oncology 500 High Throughput libraries on the NextSeq 2000?

No, the TruSight Oncology 500 products are not compatible with sequencing on the NextSeq 2000.

How do I obtain the TruSight Oncology 500 Local App v2.1?

You can request a link to download the software from Illumina Technical Support or from your Illumina representative.

Can I analyze TruSight Oncology 500 High Throughput data with my own software pipeline?

Customers can use their own pipeline. However, Illumina does not provide resources to facilitate this.

What raw data files are included for long-term storage?

BCL, FASTQ, and VCF files are included for long-term storage.

Where can I find the raw sequencing data from a run (such as FASTQ or VCF files)?

When a sequencing run is complete, there will be an output folder containing all of the sequencing files. The path for this can be found in the "Sequencing Information" tab of Local Run Manager. In that path, there is an "Alignment_#" folder that contains a zipped file. After unzipping the file, the VCF and BAM files will be available. FASTQ files are available on the local drive (on the MiSeqDx computer). The default path for this is D:\Illumina\MiSeqAnalysis\{RunID}\Data\Intensities\BaseCalls.
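
When scripting around this layout, the default FASTQ location can be built from the run ID. The following Python sketch only assembles the default path quoted above; the run ID shown is a placeholder, the function name is hypothetical, and an instrument configured with a non-default output location would need a different root.

```python
from pathlib import PureWindowsPath

# Default root quoted in the answer above; change it if your system differs.
DEFAULT_ROOT = r"D:\Illumina\MiSeqAnalysis"


def fastq_dir(run_id, root=DEFAULT_ROOT):
    """Return the default BaseCalls folder that holds the FASTQ files for a run."""
    return PureWindowsPath(root) / run_id / "Data" / "Intensities" / "BaseCalls"


# "EXAMPLE_RUN_ID" is a placeholder, not a real run folder name.
print(fastq_dir("EXAMPLE_RUN_ID"))
```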

How do I use the Enrichment App v3.1 with a custom enrichment panel from a third-party vendor?

When using a custom panel, use the Enrichment Manifest File Template to edit the target content and upload into BaseSpace Sequence Hub for analysis. See Enrichment v3.1 Online Help for further instructions.

What is the maximum number of samples supported per analysis on Enrichment App v3.1?

Enrichment App v3.1 supports up to 96 samples per analysis.

How do I select the analysis parameters in the Enrichment App v3.1?

See the Set Analysis Parameters section in the Enrichment v3.1 Online Help for more information.

Which human genome assembly is used in the Enrichment App v3.1?

Human genome assembly hg19 is used for analysis in Enrichment App v3.1.

Does Enrichment App v3.1 support Copy Number Variant (CNV) analysis?

Yes, the Enrichment App v3.1 supports CNV analysis. However, CNV analysis was not tested with Illumina DNA Prep with Enrichment.

What is the recommended analysis workflow?

Illumina recommends the Enrichment App in BaseSpace Sequence Hub. For more information on how to use the Enrichment App, see the Enrichment App Online Help.

What analysis options are available on the NextSeq 1000/2000 System?

Analysis is available through the on-board DRAGEN Bio-IT Platform or on the cloud using BaseSpace Sequence Hub. The following DRAGEN pipelines are available:

  • DRAGEN BCL Convert
  • DRAGEN Enrichment
  • DRAGEN Germline
  • DRAGEN RNA
  • DRAGEN Single Cell RNA

What sequencing modes are available on the NextSeq 1000/2000 System?

  • Cloud mode—Plan your run with Instrument Run Setup on BaseSpace Sequence Hub. The run is selected from a list of planned runs in NextSeq 1000/2000 Control Software. During sequencing, cBCL data are uploaded to BaseSpace Sequence Hub. The selected analysis workflow is initiated automatically within the cloud. Run data and analysis results are also provided in the cloud.
  • Hybrid mode—Plan your run with Instrument Run Setup on BaseSpace Sequence Hub. The run is selected from a list of planned runs in NextSeq 1000/2000 Control Software. The selected analysis workflow is then initiated through the on-instrument DRAGEN.
  • Local mode—Plan your run with a sample sheet v2 file format locally. The selected analysis workflow is initiated automatically through the on-instrument DRAGEN or manually through BaseSpace Sequence Hub apps after run completion when Proactive, Run Monitoring and Storage is selected.
  • Standalone mode—Plan your run without a sample sheet, following the instructions in NextSeq 1000/2000 Control Software to generate cBCL data.

What are the available application name options when creating a Sample Sheet v2?

The options are based on the available DRAGEN pipelines:

  • [DRAGENEnrichment_Settings] and [DRAGENEnrichment_data]
  • [DRAGENGermline_Settings] and [DRAGENGermline_data]
  • [DRAGENRNA_Settings] and [DRAGENRNA_data]
  • [DragenSingleCellRNA_Settings] and [DragenSingleCellRNA_data]

What BaseSpace apps can be used for analysis?

  • DRAGEN RNA Pipeline - Performs fusion analysis and RNA quantification.
  • RNA-Seq Alignment - Aligns RNA-seq reads, quantifies gene expression, and calls small variants and gene fusions. Provides input for the differential expression app.
  • Cufflinks Assembly & DE - Quickly assesses novel transcript isoforms and gene expression levels from RNA-Seq Alignment results.

What is the analysis workflow recommended for use with Illumina DNA PCR-Free, tagmentation Library Prep?

Illumina DNA PCR-Free data can be evaluated with the DRAGEN Germline Pipeline. Other pipelines may be used but may provide divergent results.

What adapter sequences should be used with Illumina DNA PCR-Free libraries?

The adapter trimming sequence for both Read1 and Read2 is:

CTGTCTCTTATACACATCT+ATGTGTATAAGAGACA
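
To illustrate what adapter trimming does with this setting, the following Python sketch cuts a read at the earliest exact match of either sequence. It assumes that the "+" separates two adapter sequences that are both searched for, and it ignores mismatches and partial adapters at the read end, which production trimmers handle. The function name and the example read are hypothetical.

```python
ADAPTER_SETTING = "CTGTCTCTTATACACATCT+ATGTGTATAAGAGACA"


def trim_adapter(read, adapter_setting=ADAPTER_SETTING):
    """Cut the read at the earliest exact match of any '+'-separated adapter."""
    cut = len(read)
    for adapter in adapter_setting.split("+"):
        pos = read.find(adapter)
        if pos != -1:
            cut = min(cut, pos)
    return read[:cut]


# Hypothetical read: 10 bases of insert followed by adapter read-through.
read = "ACGTACGTAC" + "CTGTCTCTTATACACATCT" + "GGGG"
print(trim_adapter(read))  # prints ACGTACGTAC
```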

[RPIP only] Are there demo data reports available for this panel?

Yes. Demo data are available in the IDbyDNA Explify RPIP App on BSSH, including a sample .pdf report and screenshots of file outputs.

[RPIP only] Which pathogens and antimicrobial resistance genes will be reported by the Explify RPIP BaseSpace App?

Results list those targeted respiratory pathogens that have at least the minimum level of support that indicates they may be present in the sample. This decision is based on the pathogen sequence data generated and interpreted based on the number of supporting reads and machine learning.

For each of the listed pathogens, a Confidence Score is included based on this interpretation. Results also include normalized read and relative abundance information for detected pathogens. Lastly, antimicrobial resistance genes are listed together with the associated bacteria and antibiotics against which they can confer resistance.

How do I analyze the Infinium Mouse Methylation BeadChip data?

The GenomeStudio Methylation Module extracts beta values and provides clustering for Infinium Mouse Methylation BeadChip data. GenomeStudio controls normalization with the Mouse Methylation BeadChip is not supported at this time. Also, the BeadArray Controls Reporter software tool is not supported at this time.

Notably, there are many freeware applications in Bioconductor, such as SeSAMe, which provide enhanced normalization and visualization options. Illumina does not support these freeware solutions directly.

Why are SNPs included in the manifest of the Mouse Methylation BeadChip?

SNPs were included on the BeadChip so investigators could generate a DNA fingerprint of their samples as an added level of quality control. Specifically, for the Mouse Methylation BeadChip, SNP probes are included to help identify the strain origin of the DNA being analyzed. Refer to the Infinium HD Methylation Assay Reference Guide for further information. SNP assays on the BeadChip are not mentioned in the assay guide and only briefly described in the GenomeStudio Methylation Module.

What software do I use with Infinium Global Diversity with Enhanced PGx?

There are two software options available: Illumina Microarray Analytics - PGx Analysis for cloud-based analysis and Illumina Microarray Analytics - Array Analysis CLI for local analysis.

Can I use Assign 2.0 to analyze data not generated by TruSight HLA kits or sequencing platforms that are not from Illumina?

Assign 2.0 software was developed to work with data generated using TruSight HLA kits and Illumina instruments. Other sequencing kits and platforms have not been tested and are not supported. Assign 2.0 is compatible with FASTQ files generated from both the TruSight HLA Sequencing Panel (version 1) and TruSight HLA v2 Sequencing Panel.

What is the Mitelman Database?

The Mitelman Database is a reference for common fusions.

How do I analyze the Infinium MethylationEPIC array data?

The GenomeStudio Methylation Module extracts beta values and provides clustering and normalization for Infinium MethylationEPIC array data. In addition, there are many freeware applications in Bioconductor, such as chAMP and RnBeads, which provide enhanced normalization and visualization options. Illumina does not support these freeware solutions directly.

Is the TruSight One assay compatible with BaseSpace Sequence Hub?

Yes, BaseSpace Sequence Hub can be used for the analysis of TruSight One runs. Illumina recommends using the Illumina Experiment Manager, and selecting Targeted Sequencing and the subsequent Enrichment workflow. Download VCF files from BaseSpace Sequence Hub projects to interpret variants in BaseSpace Variant Interpreter.

How does licensing work and how long is my software license valid?

The license key provided when ordering the kit expires 3 months from the order date. The license can be used for an unlimited number of samples, unlimited number of systems, and unlimited number of users. After the key has expired, previously saved projects (*.cgp) can be analyzed without an active license but new FASTQ files cannot be imported. To request a new key, contact Illumina Customer Service (customerservice@illumina.com).

I noticed that specificity controls are not included for mouse DNA samples. Why is this the case?

Our technical support teams and customers have noted that samples showing abnormal data in these control probes almost always show abnormal behavior in other sample-independent control probes. Because these controls add no additional information, Illumina decided to remove these probes.

Can Illumina provide any demonstration data that can help me get started with analysis for the Mouse Methylation BeadChip?

The product files section of the Infinium Mouse Methylation BeadChip Support Site contains a number of IDAT files that can be downloaded and analyzed using GenomeStudio and other third-party analysis software.

The IDAT files include those for DNA input titrations for B16 and NIH3T3 cell lines, replicates for ApcMin and Mlh1 -/- strain derived tumor DNA, DNA derived from mouse spleen and liver tissue, samples with an increasing amount of DNA methylation, and samples where different proportions of human and mouse DNA have been mixed prior to application on the Mouse Methylation BeadChip.

For more information, see the Infinium Mouse Methylation Demo Dataset in Product Files.

It seems that there are human and mouse controls on the Mouse Methylation BeadChip. Why were human controls included on the array and how can I distinguish between human and mouse controls?

Specific studies, like work with patient-derived xenograft samples, usually result in mixtures of human and mouse DNA. We included human and mouse controls so that researchers could assess the performance integrity of DNA from both species on the same BeadChip. Human control probes have an addendum to their name, ie, HSA, while mouse control probes have a similar addendum noted as MUS.

Periodically, the genome build for a given organism is updated. How would such an update impact the Mouse Methylation BeadChip manifest?

Certain fields in the manifest may change if it is updated for a new genome build. For instance, the 'Strand' Forward/Reverse column is dependent on Genome Build and may change based on new positional assignments.

I noticed that the ILMN_ID for probes has changed. What do the numbers and letters on the ILMN_ID refer to?

The IlmnID is a composite of multiple information fields: the name of the probe (locus target identifier, followed by an 8-digit number), whether the probe targets the top or bottom strand (denoted as "T" or "B", respectively), whether the probe targets the bisulfite converted strand or complementary strand after amplification (denoted as "C" or "O", respectively), the Infinium probe design type (Type I probes are denoted as "1", while Type II probes are denoted as "2"), and the number of times the probe was synthesized for array representation (denoted using a number that is greater than zero).

Examples of locus target identifiers are shown below:

  • CG probe = cg
  • CHG probe = ch
  • SNP probe = rs

An example of an IlmnID is cg12345678_TC13.

This probe is a CG probe with an eight-digit code that relates to the 122mer probe sequence. The eight-digit code and suffixes are independent of genome build revisions. The following "T" indicates that the probe targets the top strand. The adjacent "C" indicates that the probe targets the strand that is initially bisulfite converted. The "1" indicates that the probe has a Type I Infinium design. The "3" indicates that the probe was synthesized three times for representation on the microarray.
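The field layout described above can be sketched as a small parser. This is an illustrative sketch only, not an Illumina tool: the function and field names are hypothetical, and the pattern assumes that every identifier uses one of the three prefixes listed above followed by exactly eight digits, which may not hold for every probe in a real manifest.

```python
import re
from typing import NamedTuple

# Assumed layout, taken from the description above:
# <prefix><8 digits>_<T|B><C|O><1|2><synthesis count greater than zero>
ILMN_ID_PATTERN = re.compile(r"^(cg|ch|rs)(\d{8})_([TB])([CO])([12])([1-9]\d*)$")

PROBE_CLASS = {"cg": "CG", "ch": "CHG", "rs": "SNP"}
STRAND = {"T": "top", "B": "bottom"}
CONVERSION = {"C": "converted", "O": "opposite"}


class IlmnId(NamedTuple):
    probe_class: str      # CG, CHG, or SNP
    code: str             # eight-digit code tied to the probe sequence
    strand: str           # top or bottom
    conversion: str       # bisulfite converted strand or its complement
    design_type: int      # 1 = Infinium Type I, 2 = Infinium Type II
    synthesis_count: int  # times the probe was synthesized for the array


def parse_ilmn_id(ilmn_id: str) -> IlmnId:
    """Split an IlmnID into its fields; raise ValueError if it does not fit."""
    match = ILMN_ID_PATTERN.match(ilmn_id)
    if match is None:
        raise ValueError(f"Unrecognized IlmnID layout: {ilmn_id!r}")
    prefix, code, strand, conversion, design, count = match.groups()
    return IlmnId(
        PROBE_CLASS[prefix],
        code,
        STRAND[strand],
        CONVERSION[conversion],
        int(design),
        int(count),
    )


example = parse_ilmn_id("cg12345678_TC13")
```

For the example ID, the parser reports a CG probe on the top strand, targeting the converted strand, with a Type I design and three synthesized replicates.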

For more information, see the Infinium Mouse Methylation BeadChip Manifest File Release Notes in Product Files.

What controls are included on the Infinium HTS iSelect Methyl Custom BeadChip?

The standard GenomeStudio and SNP controls that are available on the EPIC BeadChip are included.

How do I analyze the Infinium HTS iSelect Methyl Custom BeadChip data?

The GenomeStudio Methylation Module extracts beta values and provides sample clustering analysis for Infinium HTS iSelect Methyl Custom BeadChip data. The BeadArray Controls Reporter software tool is not supported at this time.

Notably, there are many freeware applications in Bioconductor, such as SeSAMe, which provide enhanced normalization and visualization options. Illumina does not support these freeware solutions directly.

How long does it take from the start of the run until I have cluster density metrics?

Depending on cluster density, metrics appear at the beginning of cycle 20. For MCS v2.2 or earlier versions, metrics appear at the beginning of cycle 7.

Why is GC high in the first few bases?

It is normal to observe both a slight GC bias and a distinctly nonrandom base composition over the first 12 bases of the data. For example, you might see it in the IVC (intensity versus cycle number) plots that are part of the output of Pipeline analysis software. In genomic DNA sequencing, the base composition is usually quite uniform across all bases, but in mRNA-Seq, the base composition is noticeably uneven across the first 10 to 12 bases. We believe this effect is caused by the "not so random" nature of the random priming process used in the protocol. The first 12 bases (the length of two hexamers) probably represent the sites primed by the hexamers, and they are biased toward sequences that prime more efficiently. This might also explain the slight overall G/C bias in the starting positions of each read. IVC plots from the random priming full-length cDNA sequencing protocol (mRNA-Seq) always show this pattern. This is normal and expected.

What Illumina software is available for TruSeq ChIP data analysis?

Use CASAVA and MSR for demultiplexing. However, they are not intended for TruSeq ChIP analysis.

On the HiSeq, when does analysis of a sequencing run start?

Image analysis occurs in real time, phasing estimates and base calling start after cycle 12, and base calling and quality scoring start after cycle 25.

What data compression options are available for high output runs using HCS v2.2?

Two data compression options are zipping of BCL files and binning of Q-scores. Other run folder files are unchanged. These options are available during run setup in HCS v2.2. If you are using BaseSpace for data storage and analysis, BCL files are zipped automatically. Due to the size of the run folder with the extra cycles and shorter run durations, zipped BCL files are required for HiSeq v4 runs. This setting cannot be turned off. You can select or deselect Q-score binning depending on your preference.

What does it mean if you have a fusion detected but there are 0 reference reads for Gene 1 or 2?

Reference reads refer to reads/read-pairs that support structurally normal genes at that fusion breakpoint. The 0 means that no evidence of structurally normal genes is found in the RNA-Seq data; this is distinct from whole gene read counts, which are shown in the last table of the report.

Additional rearrangements and duplications that occur in concert with the fusion could also affect this number.

Is the Assign software validated?

Assign has undergone standard software development and testing requirements applied to all Illumina software tools developed for Research Use Only.

The Assign software has not been submitted for regulatory approval. The software has not undergone the analytical validation or clinical validation required for software tools that are regulated as a medical device. Therefore, it is important that you follow procedures for validation of the software according to your institutional, local, state, and federal guidelines.

Automation

Which automated liquid handlers can be used?

In general, Illumina DNA Prep is designed to be compatible with most automated liquid-handling systems.

All Illumina automation partners are developing methods for at least one existing automation platform. Contact the individual partners for more information. To view a list of partners, see High-Throughput Library Prep Automation.

Where can I find the available Illumina Qualified methods?

Methods designated "Illumina Qualified" are developed, distributed, and supported by our automation partners. Visit High-Throughput Library Prep Automation to learn more about the partnership program and partners.

What kit configurations are supported for automation?

Methods are written and tested by our automation partners to support both 24-sample and 96-sample kit configurations, including a full plate run with up to 96 reactions. The 96-sample kit is designed to be automation-friendly, including additional volumes necessary for automated processing.

Are the reagent fill volumes suitable for automation?

All reagents provide sufficient overage volume to support automation requirements in the 96-sample kit configuration. Additionally, the 96-sample kit configuration should support at least two runs of 48 samples each (2 runs x 48 samples).

We recommend using Illumina Qualified* methods from our partners. Benefits include:

  • Reduced development and implementation costs.
  • Tested and standardized methods.
  • A choice of solutions from leading vendors in laboratory automation.

*Illumina Qualified indicates that libraries prepared with this method have been shown to perform comparably to those prepared manually.

Does the tubed index configuration support automation?

No, automation is only supported by the 24-plex and 96-plex index plates provided by Illumina DNA Prep.

Can Illumina DNA PCR-Free library preparation be automated?

Yes, the kit is designed to be automation compatible. See the Illumina Introduction to Sequencing Library Prep Automation for more details.

Compatibility

Is the GenomeStudio ChIP-Seq Module compatible with CASAVA 1.8?

No. CASAVA 1.8 generates BAM files which are not compatible with the current version of the GenomeStudio ChIP-Seq Module. Investigators using the ChIP-Seq Module should run CASAVA 1.7.

Can I have both VariantStudio v2.1 and v2.2 installed on the same computer?

Yes. VariantStudio v2.1 and v2.2 can be installed concurrently on the same computer. The two versions of VariantStudio operate independently; information is saved in different application spaces. VariantStudio v2.1 installs as "Illumina VariantStudio 2.1", while v2.2 installs as "Illumina VariantStudio 2.2" and does not overwrite the v2.1 installation.

Using the default installation folder for both versions guarantees concurrent usability. If you decide to define a custom installation path, do not use the same installation folder for both versions.

Which version of OLB do I need to process my TruSeq data?

Use Off-Line Base Caller v1.9, or later. Earlier versions cannot process controls.

Will I need to re-validate HCS 1.5/RTA 1.13 if upgrading from HCS 1.4/RTA 1.12?

Validation of HCS 1.5/RTA 1.13 should not be required if you are upgrading from HCS 1.4/RTA 1.12. Changes in RTA do not affect data quality, and changes in HCS updated the user interface to enable dual indexing. Refer to the HCS 1.5 Release Notes for additional information about new features in this software package.

Which version of Off-Line Basecaller do I use to reprocess dual-indexed data from CIF files?

Use Off-Line Basecaller (OLB) v1.9.3 or later to analyze data generated from a HiSeq 2000, HiSeq 1000, or HiScanSQ System.

If I am running a software version older than HCS 1.4.8, do I need to serially upgrade to 1.4.8 prior to installing HCS 1.5?

You can upgrade directly to HCS 1.5/RTA 1.13 from HCS 1.4/RTA 1.12. If you are running an older version of HCS, contact Illumina Technical Support for assistance in upgrading.

Does VariantStudio support TruSight Tumor data?

Yes. The Amplicon-DS workflow in MiSeq Reporter software is used for the alignment and variant calling of TruSight Tumor data. The resulting VCF files can be imported into VariantStudio. Illumina recommends that you use the merged VCF file from the Amplicon-DS workflow. In these files, the variant call information from both the forward pool and reverse pool has been merged.

Can I install Windows 7 updates on the MiSeq?

Updates to Windows are disabled. Illumina cannot guarantee system performance with future Windows 7 updates.

What is the support policy for VariantStudio?

Illumina supports the previous version of VariantStudio software for 12 months after a new version is released.

The content of the Annotation Database is aggregated from a broad range of external sources, including public and private sources. At the time of a release for the Annotation Database, the content is locked down and versioned.

Due to the nature of private annotation sources, it might not always be possible to maintain annotation content for the full 12-month period. In these rare occurrences, Illumina communicates the changes as soon as possible to reduce any potential impact on your pipeline.

What browsers are supported for MiSeq Reporter?

MiSeq Reporter can be viewed with the following web browsers: Firefox 13.0.1+, IE 11+, and Safari 5.1.7+.

Can I import VCF files produced by non-Illumina software products into VariantStudio?

Yes. However, if analysis software other than Illumina analysis software is used to generate data, the VCF file might not contain the columns required by VariantStudio. See the VariantStudio User Guide (document # 15040890) for information about VCF file requirements.

What software is required to sequence dual-indexed libraries?

Sequencing dual-indexed libraries requires upgrading to HCS v1.5/RTA v1.13, or later.

Category | Software | Instrument | Required
Instrument Software and Sequencing Run QC | HCS v2.0/RTA v1.17.20/SAV 1.8.20/Recipe Fragment v1.3.54, or later | HiSeq or HiScanSQ | Yes
Analysis | CASAVA v1.8.2, or later | HiSeq or HiScanSQ | Yes
Sample Sheet Creation | Illumina Experiment Manager v1.3, or later | HiSeq or HiScanSQ | Recommended

Will CASAVA 1.8.2 be able to process both single-index and dual-index TruSeq runs?

Yes. Specific input commands to CASAVA 1.8.2 determine whether the data are demultiplexed as single-index or dual-index. However, single and dual indexes cannot be combined in the same CASAVA command. To process both, run CASAVA twice with two different sample sheets, demultiplexing the TruSeq HT lanes separately from the TruSeq LT/v2 lanes. Different use base masks are required to demultiplex different lanes. See the CASAVA User Guide for more information.

If I reannotate my samples from VariantStudio v2.1 projects in VariantStudio v2.2, can I keep both sets of annotations?

No. If samples are reannotated in VariantStudio v2.2, all annotations for variants are replaced with the annotations from the newer version of the Illumina Annotation Service. If you do not want to lose previous annotations for these samples, create a project for the samples that you want to preserve with the old annotations.

What versions of instruments, reagents, and software are compatible?

Go to the sequencing Version Compatibility Reference at www.illumina.com/VersionCompatibility.

Can my sample report templates be used in other VariantStudio projects?

Yes. Report templates created and saved in VariantStudio are available to other VariantStudio projects.

Does VariantStudio run on a Mac or Linux system?

No. VariantStudio is tested and supported on Windows 7 or later. It is possible to use virtualization software such as Parallels or VMware to run a Windows application on a Mac or Linux system. However, this method is not supported. If you choose to run VariantStudio using a virtual system, allocate at least 2 GB of RAM to the system and close any other programs.

Can I use VariantStudio with data from other sequencing platforms?

No. The Illumina VariantStudio End User License Agreement (EULA) states that VariantStudio must be used solely to analyze data generated from an Illumina sequencing instrument.

If I continue to use VariantStudio v2.1, do I get the updated Illumina Annotation Service?

No. VariantStudio v2.1 is programmed to use annotations from the previous version of the Illumina Annotation Service. It does not connect to the new version that is available with VariantStudio v2.2. If you want to use the new version of the Illumina Annotation Service, use VariantStudio v2.2 and reannotate any samples in projects from VariantStudio v2.1.

Can I import VCF files produced by non-Illumina software products into BaseSpace Variant Interpreter?

Yes, the software accepts VCF files produced by the Genome Analysis Toolkit (GATK) v1.6.

Are my saved projects from VariantStudio v2.1 compatible with VariantStudio v2.2?

Yes, VariantStudio v2.1 projects are compatible with VariantStudio v2.2. Samples, annotations (including any imported custom annotations), and filter favorites are all carried over to VariantStudio v2.2.

Design

How do I visualize my Nextera Rapid Capture Enrichment designed targets in a genome browser?

Use the DesignStudio table Export function to save designed regions, targets, and probes in *.bed file format. Exported *.bed files can be imported into a genome browser, such as the Broad Institute Integrative Genomics Viewer (IGV), which can be downloaded and run on your local computer, or the online UCSC Genome Browser.

Note:

  • The UCSC Genome Browser may not accept very large *.bed files with many annotations.
  • Illumina recommends adding the track name annotation to your *.bed files when viewing multiple custom tracks, such as probes and gaps, in the UCSC Genome Browser. The *.bed files exported from DesignStudio do not contain a header with a defined track name. Therefore, when loaded into the UCSC Genome Browser, they appear with the generic track name of 'User Track' and description of 'User Supplied Track' and will overwrite any other previously uploaded unnamed tracks.
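As a minimal sketch of the track name recommendation, the helper below prepends a UCSC-style track line to exported BED text so that each upload keeps its own name. The function name and the example track name and description are hypothetical; only the `track name=... description=...` header follows the UCSC custom track convention.

```python
def add_track_line(bed_text: str, name: str, description: str) -> str:
    """Return BED text with a UCSC track line prepended.

    If the text already starts with a track line, it is returned unchanged.
    """
    lines = bed_text.splitlines()
    if lines and lines[0].startswith("track "):
        return bed_text
    header = f'track name="{name}" description="{description}"'
    return "\n".join([header, *lines]) + "\n"


# Hypothetical single-probe export used only for illustration.
exported = "chr1\t100\t180\tprobe1\n"
named = add_track_line(exported, "probes", "Designed probes")
```

Giving the probes file and the gaps file different names in this way avoids one unnamed upload replacing the other in the browser.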

Which species are available in DesignStudio to create a Nextera Rapid Capture Custom Enrichment project?

The human genome (hg19 build) is available for custom enrichment projects.

How do I order probes for TruSeq Targeted RNA Expression?

Use DesignStudio to order probes. You will need a MyIllumina account.

Is there a limit to the amount of contiguous sequence that can be designed against in DesignStudio?

There is a 24 kb limit of contiguous sequence that can be entered for a given target region so as not to overwhelm the design server memory resources. For designs spanning large stretches of DNA, Illumina recommends Nextera Rapid Capture Custom Enrichment, a hybridization-based enrichment for targeted resequencing.

What is the success rate for TruSeq Targeted RNA Expression designs?

Illumina internal testing found that > 90% of nonvalidated assays correlate with RNA-Seq data. Assays marked validated have demonstrated results consistent with fold changes observed by RNA-Seq in testing across human tissues.

Which reference genome was used to generate probes?

UMD 3.1 of Bos taurus.

What is the estimated genomic pull-down region from a probe designed in DesignStudio for Nextera Rapid Capture Custom Enrichment?

The estimated region is 280 bp total with the probe at the center.
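To make the estimate concrete, the sketch below converts a probe interval into the estimated pull-down interval by splitting the remaining bases evenly on both sides. The function name is hypothetical, coordinates are assumed to be 0-based and half-open, and the 280 bp total is the only value taken from the answer above.

```python
def pulldown_region(probe_start: int, probe_end: int, total_bp: int = 280):
    """Estimate the pull-down interval centered on a probe.

    Coordinates are half-open [start, end). The start is clamped at 0.
    """
    probe_len = probe_end - probe_start
    if probe_len <= 0 or probe_len > total_bp:
        raise ValueError("probe length must be between 1 and total_bp")
    extra = total_bp - probe_len
    left = extra // 2
    right = extra - left
    return max(0, probe_start - left), probe_end + right


# An 80 bp probe leaves 100 bp of estimated flank on each side.
region = pulldown_region(1000, 1080)
```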

Can I reorder the same TruSeq Targeted RNA Expression oligo pool multiple times?

Yes. Your orders are saved in DesignStudio and can be reordered or used for an add-on project. If you need a large number of samples (> 10,000), contact Inside Sales.

What is the amplicon size in the TruSight Acute Lymphoblastic Leukemia 54 assay?

The TruSight Acute Lymphoblastic Leukemia 54 amplicons are 250 bp in length.

Why do my designed amplicons include SNPs in the probe regions when Avoid SNPs is turned on?

When Avoid SNPs is turned on, DesignStudio considers the location of polymorphisms and avoids them when possible. If this is not possible, it will place the probes in regions that do not interfere with their binding.

Is the TruSeq Targeted RNA Expression oligo design biased to the 3′ end of mRNA?

No, DesignStudio assays target splice junction sites across the entire transcript. Additional designs are available for cSNPs or gene fusions. The following databases were used for assay designs:

Species | Database
Human junction | hg19 + RefSeq transcript model
Human cSNP | dbSNP build 137, common SNPs ≥1% MAF
Mouse junction | mm9, mm10 + RefSeq transcript model
Mouse cSNP | dbSNP build 128 for mm9, dbSNP build 137 (common SNPs) for mm10
Rat junction | rn5, rn4
Rat cSNP | dbSNP build 125 for rn4

Can I remove targets from Illumina predesigned Nextera Rapid Capture Enrichment panels?

The add-on workflow does not allow removal of targets from fixed content panels or previously ordered custom designed panels.

Is it possible to load a custom genome in DesignStudio for Nextera Rapid Capture Custom Enrichment?

No, it is not.

Which MiSeq reagent kits are recommended for use with TruSight HLA libraries?

The following three MiSeq reagent kits, which support ≥500 cycles (2 x 250 bp), are recommended:

Sequencing Kit | Catalog Number | TruSight HLA Samples | Run Time (Hours)
MiSeq Reagent Kit Nano v2 (500 cycles) | MS-103-1003 | 6 | 28
MiSeq Reagent Kit v2 (500 cycles) | MS-102-2003 | 24 | 39
MiSeq Reagent Kit v3 (600 cycles) | MS-102-3003 | Untested | 60

How many SNPs are targeted?

265 SNPs are targeted.

Are designs equivalent between TruSeq Custom Enrichment, Nextera Custom Enrichment, and Nextera Rapid Capture Custom Enrichment projects?

Probe design algorithms have been optimized in DesignStudio for Nextera Rapid Capture Enrichment for better coverage across target regions, leading to fewer design gaps. Additional target probe spacing options have also been added for Nextera Rapid Capture Enrichment designed content.

How does DesignStudio select probes?

Optimal probes are chosen using an algorithm that considers melting temperature (Tm), % GC, length, secondary structure, uniqueness in the genome, and the presence of underlying SNPs (based on dbSNP). For more information, see the DesignStudio online help.

Will variants be called across the entire length of the sequenced amplicon?

Variant calling is only performed in the amplicon region lying between the upstream and downstream probe locations. The probe regions are not included in variant calling. MiSeq Reporter uses the manifest file for each design to avoid calling variants in probe regions that are synthetic oligonucleotides and not biologically relevant.
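The rule above amounts to trimming the probe footprints from both ends of the amplicon. The following sketch shows that arithmetic only; it is not how MiSeq Reporter reads a manifest. The function name is hypothetical and coordinates are assumed to be 0-based and half-open.

```python
def callable_region(amplicon_start: int, amplicon_end: int,
                    upstream_probe_len: int, downstream_probe_len: int):
    """Return the interval between the two probe footprints of an amplicon.

    Coordinates are half-open [start, end). The probe footprints sit at the
    two ends of the amplicon and are excluded from the result.
    """
    start = amplicon_start + upstream_probe_len
    end = amplicon_end - downstream_probe_len
    if start >= end:
        raise ValueError("probes leave no sequence between them")
    return start, end


# Hypothetical 250 bp amplicon with 25 bp and 27 bp probes.
region = callable_region(1000, 1250, 25, 27)
```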

What do the different target Desired Probe Spacing options represent, when designing probes across my Nextera Rapid Capture Custom Enrichment targeted regions?

The Desired Probe Spacing setting allows you to adjust the way DesignStudio places enrichment probes within the region of interest. Increasing this setting allows DesignStudio to place probes more closely to one another within a given region. The target center-to-center spacing for neighboring 80 mer probes at each setting is:

  • Standard—230 bp
  • Intermediate—180 bp
  • Dense—120 bp
  • Adjacent—80 bp
  • Overlapping—60 bp
Note that altering the Desired Probe Spacing setting allows the DesignStudio algorithms to place probes according to the spacing listed, but does not guarantee this spacing. DesignStudio does not place probes that are incompatible with other design rules just to achieve desired spacing.
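For a rough sense of how the setting changes probe density, the sketch below divides a region length by the center-to-center spacing listed above. It is an approximation under a stated assumption (probes tile the region evenly), and the function name is hypothetical. As noted, DesignStudio does not guarantee the spacing, so actual probe counts can differ.

```python
import math

# Center-to-center spacing in bp for each Desired Probe Spacing setting,
# copied from the list above.
SPACING_BP = {
    "Standard": 230,
    "Intermediate": 180,
    "Dense": 120,
    "Adjacent": 80,
    "Overlapping": 60,
}


def estimate_probe_count(region_length_bp: int, setting: str) -> int:
    """Approximate the number of probes needed to tile a region evenly."""
    if region_length_bp <= 0:
        raise ValueError("region length must be positive")
    return max(1, math.ceil(region_length_bp / SPACING_BP[setting]))


# Compare settings for a hypothetical 2300 bp region.
counts = {name: estimate_probe_count(2300, name) for name in SPACING_BP}
```

Comparing the estimates across settings before submitting a design helps anticipate how quickly a denser setting consumes the probe budget.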

How can I improve the designability and coverage in my DesignStudio Project?

  • Increase the size of the target. Increasing the size of the target to design against can rescue regions that appeared to be undesignable. The increased size of a target gives DesignStudio a little more flexibility to fit a higher scoring amplicon over the desired target bases.
  • Use the Avoid SNP feature. With this option, DesignStudio is more aggressive when placing amplicons by putting less importance on the location of SNPs in the design process. This can help rescue undesignable regions if DesignStudio avoids regions with low frequency or poorly annotated SNPs in the database, leading to low designability. DesignStudio uses dbSNP 138 or 1000 Genomes (human hg19), SNP 128 (mouse mm9), SNP 125 (rat rn4), and Ensembl UMD3.1 SNPs (bovine) to position probes that avoid SNP and indel locations to improve accuracy and performance of the design for that region.
  • Change the context of the panel. For example, putting a highly homologous or highly GC-rich target sequence into the same multiplex design can be problematic for designing probes to amplify each target discretely. Moving problematic regions into a separate design can frequently improve the designability.

I see a design gap in my Nextera Rapid Capture Custom Enrichment target region. Is there anything that I can do about this?

Design gaps can occur due to complex genomic regions, low probe specificity, skewed GC content, and other technical factors inherent to probe-based enrichment. The gap can be resubmitted for design as a New Target Region using the gap start and stop chromosome coordinates.

How does oligo design work for TruSeq Targeted RNA Expression? Can I design custom oligos to any gene of my choosing?

The TruSeq Targeted RNA Expression assay consists of predesigned probes. Once a gene is selected, DesignStudio provides a list of predesigned probes. Due to the predesigned nature of these probes, DesignStudio design time is negligible. Options for probes include:

  • Transcript specific
  • 5' end
  • 3' end
  • cSNP

How does Illumina define a validated assay?

Assays marked validated in DesignStudio have been shown to give results consistent with fold changes observed by RNA-Seq in testing across human tissues.

How long are the upstream and downstream probe sequences?

The length of the upstream and downstream probe design can vary between 22 bp and 30 bp, regardless of the size of the amplicons being designed.

Is there a reason my favorite gene is not included in the TruSeq Targeted RNA Expression predesigned probe list? Will it be added in the future?

Every effort was made to include as many transcripts as possible. For human and mouse transcripts, RefSeq transcript models were used together with the hg19 and mm9 + mm10 genomes, respectively.

How is Target Region Coverage calculated in DesignStudio?

Target Region Coverage is calculated as the proportion of bases in the requested Target Region that are covered by the designed amplicon sequences. The region covered by a designed amplicon includes the nonvariable flanking probes and the sequence in between.
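The calculation can be illustrated with a short interval computation. This is a sketch of the proportion described above, not DesignStudio's implementation. The function name is hypothetical, all intervals are assumed to lie on the same chromosome with 0-based half-open coordinates, and each amplicon interval is assumed to include its flanking probes.

```python
def target_region_coverage(target, amplicons):
    """Fraction of target bases covered by at least one amplicon.

    target: (start, end); amplicons: iterable of (start, end).
    Overlapping amplicons are counted once.
    """
    t_start, t_end = target
    if t_end <= t_start:
        raise ValueError("target must have positive length")
    # Clip each amplicon to the target and drop those that miss it entirely.
    clipped = sorted(
        (max(s, t_start), min(e, t_end))
        for s, e in amplicons
        if min(e, t_end) > max(s, t_start)
    )
    covered = 0
    cursor = t_start  # end of the stretch already counted
    for s, e in clipped:
        s = max(s, cursor)
        if e > s:
            covered += e - s
            cursor = e
    return covered / (t_end - t_start)


# Hypothetical 500 bp target with a 100 bp gap between amplicons.
coverage = target_region_coverage((100, 600), [(50, 300), (250, 400), (500, 700)])
```

In this example, 400 of the 500 target bases are covered, so the coverage is 0.8.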

Can I add custom genomic content to Illumina predesigned enrichment panels with Nextera Rapid Capture Custom Enrichment?

Yes, the add-on workflow in DesignStudio for Nextera Rapid Capture Custom Enrichment allows you to add 350–67,000 custom probes to Illumina predesigned enrichment panels (approximately 80 kb–15 Mb of custom genomic content). Get started designing your project in DesignStudio.

Is there an option to avoid indels in the design process?

Yes. When possible, the Avoid SNPs design option also avoids indels listed in the variant database being used (eg, dbSNP for human designs).

For important regions with known indels, Illumina recommends inspecting the probes using the linked UCSC tracks per item in the grid. In certain cases, probes overlapping indels can be the best or only option to provide coverage of that region.

How do I designate dual indices on a TruSeq sample sheet?

Designating dual indices on the sample sheet depends on whether a MiSeq or CASAVA sample sheet will be used. For MiSeq sample sheets, each index is entered into its own column. For CASAVA sample sheets, the indices are input in the format of "Index1-Index2". See the MiSeq Sample Sheet Quick Reference Guide or CASAVA User Guide for more information.

Illumina recommends that you use the Illumina Experiment Manager (IEM) to generate your sample sheet. The appropriate index format is automatically entered based on the selected sample sheet type.
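The two index layouts can be contrasted with a small helper. This sketch is illustrative only: the function name, the column names ("index", "index2", and "Index"), and the example sequences are assumptions rather than a specification of either sample sheet format, so check the guides named above or generate the sheet with IEM.

```python
def format_dual_index(index1: str, index2: str, sheet_type: str) -> dict:
    """Return the index fields for one sample as a column-to-value mapping.

    sheet_type "miseq": each index goes in its own column.
    sheet_type "casava": both indexes share one field as Index1-Index2.
    Column names here are assumed for illustration.
    """
    if sheet_type == "miseq":
        return {"index": index1, "index2": index2}
    if sheet_type == "casava":
        return {"Index": f"{index1}-{index2}"}
    raise ValueError(f"unknown sample sheet type: {sheet_type!r}")


# Placeholder index sequences, not taken from any kit.
miseq_fields = format_dual_index("AAAACCCC", "GGGGTTTT", "miseq")
casava_fields = format_dual_index("AAAACCCC", "GGGGTTTT", "casava")
```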

What is the estimated success for sequencing custom targets with Nextera Rapid Capture Custom Enrichment?

While coverage depth of individual target regions can vary, all targeted regions are represented in the final enriched library.

A very low percentage of targets might not be represented in the final enriched library due to complex genomic regions, low probe specificity, skewed GC content, and other technical factors inherent to probe based enrichment and SBS. DesignStudio includes design warnings to highlight probes with an elevated risk of underperformance.

What criteria were used for the design of human cSNP assays and what database was used for TruSeq Targeted RNA Expression?

Human dbSNP 135 was used for the designs and the following criteria were considered for cSNP assays:

  • cSNP must be common (ie, those with ≥ 1% MAF have been included)
  • There should be no other cSNP in the probe design area
  • Only 1 SNP per target is allowable
  • cSNP must be covered in all transcripts to a gene

What genes are covered in the TruSight Myeloid assay?

The TruSight Myeloid Sequencing Panel targets 54 genes (either full coding sequence or targeted exons) that are frequently mutated in myeloid malignancies. Refer to the TruSight Myeloid Sequencing Panel Data Sheet for a full list of genes.

If I choose to design my project with SNP avoidance turned off, what effect will it have on my data?

If an upstream or downstream amplicon primer region overlaps an actual variant in the input genomic DNA, the assay can experience reduced specificity and/or uniformity relative to other amplicons in the multiplexed reaction.

What is a TruSeq Targeted RNA Expression fixed content project?

Fixed content projects are presynthesized oligo pools that you can order. The pools have been wet-bench tested and designed against various cellular and disease-causing pathways.

How do I know if I should be concerned about the Nextera Rapid Capture Custom Enrichment design in my targeted regions?

DesignStudio identifies Design Warnings to alert you to designs that are targeting areas of the genome that carry a higher performance risk. These warnings (e.g., low specificity, poor GC content) are displayed in the Regions, Targets, and Probes tables in DesignStudio.

What is the amplicon size in the TruSight Lymphoma 40 assay?

The TruSight Lymphoma 40 amplicons are 150 bp in length.

How many transcripts can I target with TruSeq Targeted RNA Expression?

Your order can contain 12–1000 probes.

What is the amplicon size in the TruSight Myeloid assay?

TruSight Myeloid amplicons are 225–275 bp in length.

How long does it take to ship Nextera Rapid Capture Custom Enrichment reagents after placing the order in DesignStudio?

Typically, it takes 3–4 weeks to manufacture and ship a Nextera Rapid Capture Custom Enrichment Kit order.

How can I share my Nextera Rapid Capture Custom Enrichment design content with my collaborator?

Custom target designs can be shared by saving Target Regions from your DesignStudio project to a *.csv file using the Export function. The exported *.csv file can be sent to your collaborator and imported into a new DesignStudio project, using the file upload method, to view the Target Regions of interest. Probe sequences can also be shared using the Export function of the Probes table.

Can I change the design strand for individual targets or probes in Nextera Rapid Capture Custom Enrichment?

No. DesignStudio uses only the forward strand for designing Nextera Rapid Capture Custom Enrichment assays for optimal performance across the probe panel.

How long are the amplicons?

The amplicons are 120–170 bp.

What DNA strand is used to design my targets in DesignStudio for Nextera Rapid Capture Custom Enrichment?

The forward strand is used.

How much genomic content can I target in my Nextera Rapid Capture Custom Enrichment project?

Nextera Rapid Capture Custom Enrichment is available for designs that include 2000–67,000 total probes. Depending on probe density within the designed regions, this represents about 0.5–15 Mb of genomic content.

What is the benefit of increasing the Desired Probe Spacing in targeted regions of my Nextera Rapid Capture Custom Enrichment DesignStudio project?

Increasing the Desired Probe Spacing setting (from Standard toward Overlapping) increases the density of probes designed across the submitted target region. For most designs, this means there are more total probes within the region of interest and the likelihood of enriching that region is increased.

What is the minimum gap size required between adjacent target regions in DesignStudio?

A gap equivalent to the maximum amplicon size is required between target regions. Regions separated by a smaller gap are merged to improve design performance and prevent unfavorable probe-to-probe interactions.

The following table shows the maximum and minimum amplicon lengths for a given amplicon size setting.

Setting (bp) | Minimum (bp) | Maximum (bp)
150 | 125 | 190
175 | 170 | 190
250 | 225 | 275
425 | 400 | 450

What is the amplicon size in the TruSight Multiple Myeloma 42 assay?

The TruSight Multiple Myeloma 42 amplicons are 250 bp in length.

What is a TruSeq Targeted RNA Expression add-on project?

An add-on project uses a fixed content pool, which is either a fixed or custom panel, to which you can add 12–1000 custom targets.

How many times can I add additional content to a previously ordered Nextera Rapid Capture Custom Enrichment panel?

Additional target regions can be added to existing content panels as many times as you like.

Why are some targets difficult to design in DesignStudio?

Homologs: Having homologs in the same design can lead to low designability. Split homologs into separate CAT pools.

GC Content: Regions with greater than 80% GC content can be difficult to design against, particularly when these regions are greater than 500 bp in length.

Homopolymer Sequences and Repetitive Elements: DesignStudio avoids these regions to make sure that probes have better specificity in the genome.

Poor Specificity: DesignStudio will assess the specificity of probes and exclude those which will not provide satisfactory on-target coverage.

Why is the number of primer pairs per pool indicated on the tube and box labels different from the number of amplicons per pool indicated in DesignStudio?

The number of amplicons per pool in DesignStudio reflects the number of unique amplicons in each pool. The number of primer pairs per pool on the tube and box labels reflects the total number of oligos per pool. Either value can be used when preparing libraries according to the AmpliSeq for Illumina On-Demand, Custom and Community Panels Reference Guide (Table 4. X cycles and X minutes). If the values fall into different cycle categories, the higher PCR cycle number is recommended.

What is the maximum number of genes I can order in an On-Demand panel?

We have set an ordering maximum of 500 genes or 15,000 amplicons per panel due to manufacturing restrictions. We are always making improvements, so this limit is likely to increase. You may be able to order larger designs in the future.

What is the minimum number of genes I can order in an On-Demand panel?

We've set an ordering minimum of 1 gene or 24 amplicons per panel. Designs must also have at least 2 pools and 12 amplicons per pool.

Do On-Demand panels support UTR-only genes? What about pseudogenes?

No. On-Demand panels only support genes containing CDS regions. Pseudogenes are not supported.

What annotation source and version is used to recognize gene symbols when creating an On-Demand Panel?

Illumina uses RefGene v74 as the source of annotations.

Have all possible gene combinations been tested for primer-primer interactions?

No. The number of possible combinations is astronomical. It is not feasible to test for all possible combinations in the lab. However, through computer-based searches, we have reduced the occurrence of primer-primer interactions as much as possible. In addition, when synthesizing many genes simultaneously in large batches, we have observed less than 1% amplicon drop-out due to suspected primer-primer interactions.

How do I set stringency settings in DesignStudio for my Spike-In panel?

AmpliSeq for Illumina On-Demand panels are selected from a pretested catalog of genes. The sample type, maximum amplicon length, and stringency are optimized and preset for your convenience.

What is "Gene Amplicon Uniformity"?

Gene amplicon uniformity is the percentage of amplicons for a gene with greater than 0.2 times the mean coverage of all amplicons targeting that gene. It represents the observed wet-lab uniformity calculated from NextSeq data with the Illumina DNA Amplicon workflow.
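
The definition above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, assuming you already have one coverage value per amplicon for a gene; the function name and the example depths are hypothetical and are not part of any Illumina software.

```python
def gene_amplicon_uniformity(coverages, threshold=0.2):
    """Percentage of amplicons whose coverage is greater than
    threshold x the mean coverage of all amplicons for the gene."""
    if not coverages:
        raise ValueError("at least one amplicon coverage value is required")
    mean_cov = sum(coverages) / len(coverages)
    passing = sum(1 for c in coverages if c > threshold * mean_cov)
    return 100.0 * passing / len(coverages)


# Hypothetical per-amplicon read depths for a single gene.
# Mean is 80, so the cutoff is 16; only the amplicon at depth 10 fails.
example = [100, 120, 80, 10, 90]
print(gene_amplicon_uniformity(example))  # 80.0
```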

Are untranslated regions (UTRs) included in an On-Demand gene's design?

No, only the coding DNA sequence (CDS) region of a gene is included as part of an On-Demand gene design.

What is the padding used for On-Demand gene designs?

The padding for every On-Demand gene design is 5 bp on the 5′ and 3′ ends of the exon.

What is the best library prep kit for my application?

Determine the best kit for your needs based on your project type, starting material, and method of interest using the Library Prep and Array Kit Selector tool.

My coverage was insufficient per sample.

Illumina provides an online Sequencing Coverage Calculator tool that calculates the reagents and sequencing runs needed to arrive at the desired coverage for your experiment, based on the Lander/Waterman equation.

For more information about calculating coverage estimates, see Estimating Sequencing Coverage.
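
As a rough illustration of the Lander/Waterman relationship, coverage C equals read length L times number of reads N divided by haploid genome length G. The sketch below is not the Sequencing Coverage Calculator; the function names and the 5 Mb example genome are assumptions chosen for illustration, and it ignores duplicates, filtered reads, and on-target rate.

```python
import math


def coverage(read_length_bp, num_reads, genome_size_bp):
    """Lander/Waterman estimate: C = L * N / G."""
    return read_length_bp * num_reads / genome_size_bp


def reads_needed(target_coverage, read_length_bp, genome_size_bp):
    """Rearranged form: N = C * G / L, rounded up to a whole read."""
    return math.ceil(target_coverage * genome_size_bp / read_length_bp)


# Hypothetical example: 1,000,000 reads of 150 bp on a 5 Mb genome.
print(coverage(150, 1_000_000, 5_000_000))   # 30.0
print(reads_needed(30, 150, 5_000_000))      # 1000000
```

For paired-end runs, count each mate as one read when supplying N.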

My custom libraries did not cluster.

There are several possible reasons why your custom libraries did not cluster. For more information on the indexing strategies for each instrument, refer to the Indexed Sequencing Guide.

  1. Incompatible adapter sequences. You can research appropriate sequences with the Illumina Adapter Sequences document. This document provides the nucleotide sequences that comprise Illumina oligonucleotides used in Illumina sequencing technologies. These sequences are provided for the sole purpose of understanding and publishing the results of your sequencing experiments.
  2. The primers were not in the appropriate wells. Refer to the MiSeq System Custom Primers Guide. You may also want to spike in your custom primers. For more information, see the bulletin Spiking custom primers into the Illumina sequencing primers.
  3. All libraries prepared with the current Illumina library preparation kits are compatible with all Illumina sequencing platforms. Sequencing libraries prepared with non-Illumina library preparation methods may require additional optimization on different sequencing platforms. For more information, see the bulletin Considerations when migrating non-Illumina libraries between sequencing platforms.

What is the typical MiSeq output run folder size?

For more information, see the bulletin Approximate sizes of sequencing run output folders.

Can I use custom primers on the MiSeq?

Refer to the MiSeq System Custom Primers Guide.

You may also want to spike in your custom primers. For more information, see the bulletin Spiking custom primers into the Illumina sequencing primers.

How many rows does the target region file input method in DesignStudio allow?

The maximum number of rows allowed is 1000. You can submit multiple files per project.

[RPIP only] Is whole genome information provided for the microorganisms detected by this panel?

The panel provides whole genome information for SARS-CoV-2 and Influenza A viruses only. It targets partial genomes for the other respiratory pathogens.

[RPIP only] What is the recommended read length?

Read lengths of 2x76, 1x75, and 1x101 have been tested extensively.

File Format

Does MiSeq Reporter recognize *.fasta or *.fa?

Yes, MiSeq Reporter recognizes both extensions.

What export formats are available from Illumina VariantStudio?

In VariantStudio, you can export information displayed in the Variants table, which includes variants and associated annotations in TSV file format. Sample reports are generated in either RTF or PDF file formats.

What genomes and databases are used for alignment and variant detection?

The MiSeq comes with pre-installed databases, which include miRBase, dbSNP, and refGene. Also included are eight genomes: Arabidopsis, cow, DH10b, hg19, mouse, rat, yeast, and S. aureus. You can upload your own references in FASTA format (*.fasta or *.fa).

What are the requirements for importing whole genome data or large gVCF files?

Batch uploads from a network or local directory are limited to 100 files or 10 GB, whichever is greater. There are no limits on importing files from BaseSpace Sequence Hub.

Does BaseSpace Variant Interpreter support the genome VCF (gVCF) file format?

Yes. The software supports VCF and gVCF file formats. For more information, see the BaseSpace Variant Interpreter Online Help.

How should I specify the splice junction set?

As of CASAVA v1.7, eland_rna uses the refFlat.txt.gz or seq_gene.md.gz file to generate the splice junction set automatically.

Which export formats are available from BaseSpace Variant Interpreter?

You can export reports in PDF file format.

Is the version of MiSeq Reporter software used for analysis recorded in the run folder?

The MiSeq Reporter software version can be found in the following files located at the root level of the run folder: the log file AnalysisLog.txt, the CompletedJobInfo.xml file, and the workflow-specific results file (e.g. ResequencingRunStatistics.xml).

What are the requirements for importing whole genome data or large gVCF files?

Importing large gVCF files requires more memory and takes longer to import. Use the following information as a guideline to estimate required RAM. This information is based on a Quad-Core Xeon processor with 16 GB RAM.

  • Exome: 500-800 MB RAM per annotated exome
  • Whole genome: 3.5 GB RAM per annotated whole genome, 4 GB RAM recommended in addition to system memory. For example, if you are importing three annotated whole genome files, you need at least 12 GB but 16 GB is recommended to account for system memory.
  • Whole genome gVCF without hom-ref positions: 4.0 GB RAM per annotated whole genome in addition to system memory. VariantStudio imports non-hom-ref positions but it takes longer than 6 hours to go through all 300-400 million lines in the gVCF file. If you are not using the hom-ref import option, Illumina recommends using standard VCF files.
  • Whole genome gVCF with hom-ref positions: VariantStudio populates a maximum of 10 million rows in the Variants table. Therefore, it is not possible to import all positions. When all 10 million lines are populated, > 5 GB RAM without annotation is required. With annotation, 6-7 GB RAM is required. Alternatively, you can pre-process gVCF files for regions of interest before importing to VariantStudio.
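
The whole genome guideline above can be turned into a quick estimate. This sketch simply reproduces the worked example in the list (three annotated whole genomes: at least 12 GB, 16 GB recommended). The 4 GB system overhead is inferred from that example and is an assumption, as is the function name.

```python
def estimate_whole_genome_ram_gb(num_genomes, per_genome_gb=4.0, system_overhead_gb=4.0):
    """Return (minimum_gb, recommended_gb) for importing annotated whole genomes.

    per_genome_gb: the 4 GB per genome recommended in the guideline.
    system_overhead_gb: assumed allowance for system memory (inferred, not stated).
    """
    if num_genomes < 0:
        raise ValueError("num_genomes must be non-negative")
    minimum = num_genomes * per_genome_gb
    recommended = minimum + system_overhead_gb
    return minimum, recommended


# Three annotated whole genome files, as in the example above.
print(estimate_whole_genome_ram_gb(3))  # (12.0, 16.0)
```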

Why can't I find my ChIP-Seq data when I start a ChIP-Seq project in GenomeStudio?

The project creation wizard initially looks in the data repository directory, not the directories containing individual runs or projects, so you must point it at the correct directory. If there are sorted.txt files in multiple locations, such as \chip-data\run1\GERALD\sorted.txt and \chip-data\run2\GERALD\sorted.txt, then you should direct the wizard to \chip-data. It will search through any subfolders and display available runs.

What is SRF?

SRF (Sequence Read Format) is a generic format for DNA sequence data. The format is defined at http://srf.sf.net.

Does VariantStudio support the genome VCF (gVCF) format?

Yes. VariantStudio v2.1 and v2.2 support VCF and gVCF file formats. For more information, see the VariantStudio User Guide.

Can the MiSeq generate FASTQ files?

FASTQ files are generated during secondary analysis by MiSeq Reporter for most analysis workflows. To generate only FASTQ files, specify the GenerateFASTQ workflow in the sample sheet, which generates FASTQ files and then exits secondary analysis.

General

Are UTR or promoter regions covered in the TruSight One Sequencing Panel?

TruSight One kits target the coding region only and do not include UTR or promoter regions.

Do the TruSeq Stranded Total RNA with Ribo-Zero Globin or Plant kits work on degraded RNA?

All Ribo-Zero kits are compatible with degraded RNA. For more information, see the TruSeq Stranded Total RNA Sample Preparation Guide.

Does TruSeq RNA Sample Prep provide information about the originating strand from which the DNA was transcribed?

In the random priming process to generate cDNA with the best coverage, this information is not retained. If originating strand information is required, refer to the Directional mRNA-Seq Sample Preparation Guide or other published applications.

Is there LIMS support for Infinium Methylation BeadChips?

LIMS support is available for the Infinium MethylationEPIC BeadChip and the Infinium Mouse Methylation BeadChip.

Is requantification of DNA samples after bisulfite conversion recommended?

If you use at least 250–1000 ng DNA for the bisulfite conversion, requantification is not necessary. It is critical to quantify the input DNA concentration with PicoGreen to make sure that you add sufficient DNA to the bisulfite conversion reaction. Bisulfite conversion renders DNA less complementary. Therefore, much of the DNA is denatured and more difficult to quantitate accurately.

Where are the content manifest file downloads for my TruSeq Targeted RNA Expression assay pool?

Use the Export Manifest function in the Project Dashboard of your DesignStudio project to download manifest files. The manifest file is available after your order ships, not when the order is placed or designed in DesignStudio.

How many samples can be processed per TruSeq Nano DNA LT and HT Library Prep Kit?

The TruSeq Nano DNA LT Library Prep kits contain sufficient reagents for 24 samples and the TruSeq Nano DNA HT Library Prep Kit contains reagents for 96 samples. Illumina recommends using the LT kit if processing fewer than 24 samples at a time and the HT kit if processing more than 24 samples. Both the LT kit and HT kit can be used with either the Low-Sample (LS) or High-Sample (HS) protocols.

Does VariantStudio perform variant classification?

No, you perform variant classification. With VariantStudio, you can provide annotations that can be used to classify variants within VariantStudio. After you assign a classification category to a variant, the information is saved to the Classification Database so that it can be applied to the same variant observed in other samples. Classifications saved in the Classification Database can be applied automatically to other samples through the Apply Classifications from Database menu.

Can I view data from more than one sample at a time in the ICB?

Yes, data from a second or third sample can be plotted concurrently by selecting Settings | Trio View from the menu.

How many indexes are available for the TruSeq ChIP Library Preparation Kit?

This kit is available in a Set A and a Set B, each containing 12 indexes. When used together, sets A and B provide a total of 24 unique indexes.

Is the BaseSpace Variant Interpreter software validated?

BaseSpace Variant Interpreter has undergone the standard software development and testing requirements that are applied to all Illumina software tools developed for Research Use Only. Illumina has not submitted the software for regulatory approval and it is not a medical device. It is important that you follow procedures for validating the software per institution, local, state, and federal guidelines.

How can I combine two regions in close proximity into one larger region?

When you create a merged table, you can specify how close two regions must be in number of base pairs.
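
To illustrate what merging by proximity means, the following sketch joins regions on one chromosome whenever the gap between them is no larger than a chosen number of base pairs. It is a generic interval-merging example, not the software's own implementation; the function name, the tuple layout, and the coordinates are hypothetical.

```python
def merge_regions(regions, max_gap_bp):
    """Merge (start, end) regions whose gap is <= max_gap_bp.

    All regions are assumed to lie on the same chromosome.
    """
    merged = []
    for start, end in sorted(regions):
        if merged and start - merged[-1][1] <= max_gap_bp:
            # Close enough to the previous region: extend it.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


# The first two regions are 50 bp apart and merge; the third is 700 bp away.
print(merge_regions([(100, 200), (250, 300), (1000, 1100)], max_gap_bp=100))
# [(100, 300), (1000, 1100)]
```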

What cluster generation kits are compatible with TruSeq RNA Access libraries?

For kit compatibility information, see the Illumina Version Compatibility Reference.

What differences can I expect if I am analyzing a tumor sample versus a typical congenital sample?

Depending upon the number of division cycles, a tumor sample typically has a large number of aberrations across the length of the whole genome. You will see many regions of loss and gain in copy number, and cnvPartition will identify many regions in these samples. Genotyping call rates in these samples may be as low as 60%. The percentage of defects in these samples can be as high as 50% (or even higher). Congenital samples tend to have fewer aberrations and may have fewer than 10 large deletions and duplications. Genotyping call rates in these samples should be above 95% and are more likely to be 98% or greater. The percentage of defects in these samples should be less than 1%.

How do I make sure that I have an accurate amount of input DNA?

The Nextera XT protocol is optimized for 1 ng of input DNA total. Illumina strongly recommends quantifying the starting genomic material. Nextera XT library prep kits use an enzymatic DNA fragmentation step and thus can be more sensitive to DNA input compared to mechanical fragmentation methods. The ultimate success of the assay strongly depends on using an accurately quantified amount of input DNA library. Therefore, the correct quantitation of the DNA library is essential.

To obtain an accurate quantification of the DNA library, quantify the starting DNA library using a fluorometric-based method specific for duplex DNA, such as the Qubit dsDNA BR Assay system. Use 2 µl of each DNA sample with 198 µl of the Qubit working solution for sample quantification. Avoid methods that measure total nucleic acid content (e.g., NanoDrop or other UV absorbance methods) because common contaminants such as ssDNA, RNA, and oligos are not substrates for the Nextera XT assay.

Which regions does the panel include?

The following table lists the genes and exons included in the panel.

Gene | Target or Region | Potential Disease States
AKT1 | Exon 3 (partial) | Breast
BRAF | Exon 15 (partial) | Melanoma, Colon, Lung
EGFR | Focal Amplification, Exon 12 (partial), 18, 19, 20, 21 (partial) | Lung
ERBB2 | Focal Amplification, Exons 14 (partial), 17, 18, 19, 20 (partial), 21 (partial), 24, 26 | Breast, Lung
FOXL2 | Exon 1 (partial) | Ovary
GNA11 | Exon 5 (partial) | Melanoma
GNAQ | Exon 5 (partial) | Melanoma
KIT | Exons 8, 9, 10, 11, 13, 14, 17, 18 | Gastric, Melanoma
KRAS | Exon 2 (partial), 3 (partial), 4 | Colon, Gastric, Lung
MET | Focal Amplification | Lung, Colon, Gastric
NRAS | Exon 2 (partial), 3 (partial), 4 | Colon
PDGFRA | Exon 12, 14, 18 | Gastric, Melanoma
PIK3CA | Exon 10, 21 | Lung, Breast, Prostate
RET | Exon 16 | Lung
TP53 | Full CDS | Lung, Melanoma, Ovary, Colon

Does Illumina supply automated protocols for use with their gene expression products?

Automation of the DASL Assay is available. However, Illumina currently does not offer automation for the Direct Hyb protocol. Contact Ambion for information about automation of the TotalPrep protocol.

Which organisms does TruSeq Targeted RNA Expression support?

TruSeq Targeted RNA Expression kits currently support only human, mouse, and rat organisms.

What are typical output concentrations from the pre-enriched Nextera Rapid Capture Enrichment library and the fully enriched final library?

Typical output concentrations from the pre-enriched library are 40–125 ng/µl.

Typical output concentrations from the final enriched library are (nM calculation assumes a 400 bp library size):

  • 1-plex: 5–50 nM (1.5–13 ng/µl)
  • 6-plex: 45–130 nM (12–34 ng/µl)
  • 12-plex: 100–250 nM (26–66 ng/µl)
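
The nM and ng/µl figures above are related by the usual approximation of about 660 g/mol per base pair of double-stranded DNA. The sketch below shows that conversion for the 400 bp library size stated above; the function name is hypothetical and the 660 g/mol value is a general approximation, not a figure taken from the kit documentation.

```python
def ng_per_ul_to_nm(conc_ng_per_ul, library_size_bp=400, g_per_mol_per_bp=660.0):
    """Convert a dsDNA concentration in ng/ul to nM.

    nM = (ng/ul * 1e6) / (g/mol per bp * library size in bp)
    """
    if library_size_bp <= 0:
        raise ValueError("library_size_bp must be positive")
    return conc_ng_per_ul * 1e6 / (g_per_mol_per_bp * library_size_bp)


# Low end of the 1-plex range: 1.5 ng/ul is roughly 5 nM at 400 bp.
print(round(ng_per_ul_to_nm(1.5), 1))  # 5.7
```

The listed ranges are rounded, so small differences from the calculated values are expected.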

Do you recommend an RNA purification protocol?

Illumina does not recommend a specific RNA purification product. However, any product that yields pure, intact RNA of good quality that retains (at least) most of the small RNAs should work well with our miRNA assay. We have generated good data with RNAs extracted with the Ambion kit, Qiagen kit, Trizol, etc. For any given study, it is ideal to isolate the RNAs using a single method.

Are the probes directed to a single strand?

For the majority of target regions, a single strand is captured and sequencing data are highly stranded except in a subset of regions.

What is the difference between the TruSeq HS and LS protocols?

The TruSeq high sample (HS) protocol requires additional ancillary equipment, but reduces touch points and is ideally suited for projects that include more than 48 samples prepared at one time.

The low sample (LS) protocol requires minimal ancillary equipment, but requires more hands-on manipulation and is best suited for projects that include 48 or fewer samples.

What is the median statistically significant, detectable fold change of the GEX BeadChips?

Expect 1.35 fold.

Can we use total-RNA extracted from blood?

Yes, total-RNA extracted from blood has been successfully used with this product.

What is the throughput of the HiSeq X system?

Each system can generate 1.6–1.8 Tb in less than 3 days with greater than 75% of bases above Q30 from a 2 x 150 bp run. This throughput enables 16 genomes covered at 30x per run per system. For more information, see HiSeq X Series Specifications.
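
The genomes-per-run figure follows from simple arithmetic on run yield, target coverage, and genome size. The sketch below assumes a human genome size of about 3.2 Gb, which is not stated above, and it works on raw yield only, so it does not account for duplicates or filtered reads. The function name is hypothetical.

```python
def genomes_per_run(run_yield_bases, coverage=30, genome_size_bases=3.2e9):
    """Whole number of genomes that fit in one run at the given coverage."""
    bases_per_genome = coverage * genome_size_bases
    return int(run_yield_bases // bases_per_genome)


# Lower end of the stated yield range: 1.6 Tb per run.
print(genomes_per_run(1.6e12))  # 16
```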

Do we need to upgrade our servers for analysis of HiSeq data?

You may need to upgrade your computing infrastructure to accommodate the new data output rate (Gigabases/hr) of the HiSeq 1500/2500 in Rapid Run mode, which is roughly twice the data output rate of a High Output run. However, Illumina offers two new methods to reduce the data output rate by more than 50% without affecting data quality, allowing you to leverage your existing infrastructure to store either twice the amount of data with the same amount of storage (High Output mode) or to keep the Rapid Run mode data output rate the same as your HiSeq 1000/2000 data output rate. The recommended configuration for one HiSeq is an IlluminaCompute Standard appliance. More information is available here or from your account manager.

How many samples are there per index pair in one TruSeq HT sample prep kit?

Because each well of the 96-well HT adapter plate is single-use only, only one sample per index pair can be generated.

Where can I get the adapter, primer, and index adapter sequences for TruSeq library prep kits?

The Illumina Adapter Sequences Document lists all adapter sequences. Note that the oligos are not sold separately and primer sequences are proprietary.

Where can I find a marker list of all the CpG sites on the array?

Download the manifest file for the array from the Product Files page of the BeadChip support pages.

Which normalization method does Illumina recommend for miRNA data?

If technical replicates are used in a SAM or across multiple SAMs, Illumina recommends sample scaling normalization followed by quantile normalization. If there are no technical replicates, we recommend quantile normalization.
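
For readers unfamiliar with quantile normalization, here is a minimal pure-Python sketch of the standard procedure: sort each sample, average the values at each rank across samples, then give every value the mean for its rank. It is a generic illustration rather than the algorithm as implemented in Illumina software, it does not cover sample scaling, and ties are broken by input order, which is a simplifying assumption.

```python
def quantile_normalize(samples):
    """Quantile-normalize a list of equal-length samples (lists of numbers)."""
    if not samples:
        return []
    n = len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("all samples must have the same number of probes")

    # Mean of the values found at each rank across all samples.
    sorted_samples = [sorted(s) for s in samples]
    rank_means = [sum(s[i] for s in sorted_samples) / len(samples) for i in range(n)]

    normalized = []
    for s in samples:
        order = sorted(range(n), key=lambda i: s[i])  # probe indices by rank
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Two hypothetical samples with three probes each.
print(quantile_normalize([[5, 2, 3], [4, 1, 6]]))
# [[5.5, 1.5, 3.5], [3.5, 1.5, 5.5]]
```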

What quantitation methods does Illumina recommend for final library quantitation?

Use a dye specific to double-stranded DNA, such as Qubit or PicoGreen.

Can I perform comparisons across large groups of samples for cohort or population analysis?

No. Comparisons of two or more groups of samples cannot be performed using the application.

Why doesn't the Infinium HumanMethylation450 BeadChip contain specific distances from the CpG site to the transcriptional start site (TSS) of the listed genes?

The array density is much higher in the Infinium HumanMethylation450 BeadChip. There are a large number of CpGs for which multiple transcripts are listed and for which the CpG site can fall into different annotation categories. Consequently, the task of calculating TSS distances would be a large bioinformatic undertaking, and the numbers would have to be modified every time the genome was updated. Additionally, the column "UCSC_RefGene_Group" has content about the location of the CpG relative to specific regions and features of the associated genes, which is in many ways richer than the simple distance relative to transcriptional start site.

Can degraded DNA samples be used?

No, this kit requires high quality gDNA as starting material. Use of degraded DNA may result in low yields and loss of sample during bead clean-up steps.

How large is the output file for an Infinium MethylationEPIC array?

On average, ~2.75 Gb with JPEG files, which is the default setting.

Should QC be performed after the in vitro transcription reaction (IVT)?

Illumina recommends quantitation of amplified RNA by fluorometry using RiboGreen reagent and additional qualitative analysis with the Agilent Bioanalyzer or by electrophoresis through agarose gel. Measuring A260 absorbance with a spectrophotometer offers a less precise RNA quantitation method.

If I use mismatches=1 for CASAVA when dual indexing, does the software account for one mismatch per index or one mismatch total for both indexes?

The software accounts for one mismatch per index.

The argument in CASAVA is a comma-delimited list of the number of mismatches allowed for each read (for example: 1,1). If one value is provided, all index reads allow the same number of mismatches (the default is 0). The index reads are treated separately.
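
The per-index behavior can be illustrated with a short sketch. This is not CASAVA code; it only mimics the rule described above (a comma-delimited list, a single value applied to every index read, and each index read checked separately). The function names and index sequences are hypothetical.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def parse_mismatches(arg, num_index_reads):
    """Parse a value such as '1' or '1,1' into one allowance per index read."""
    values = [int(v) for v in arg.split(",")]
    if len(values) == 1:
        values = values * num_index_reads  # one value applies to all index reads
    if len(values) != num_index_reads:
        raise ValueError("expected one value or one value per index read")
    return values


def index_matches(observed, expected, mismatches="0"):
    """True if every index read is within its own mismatch allowance."""
    allowed = parse_mismatches(mismatches, len(expected))
    return all(hamming(o, e) <= m for o, e, m in zip(observed, expected, allowed))


expected = ("ATCACGTT", "TAGATCGA")
# One mismatch in each index read: accepted with mismatches=1.
print(index_matches(("ATCACGAT", "TAGATCGC"), expected, "1"))  # True
# Two mismatches in the first index read, none in the second: rejected.
print(index_matches(("ATCACGAA", "TAGATCGA"), expected, "1"))  # False
```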

Is this product robust against degraded RNA?

Yes. Good data has been generated using RNA extracted from FFPE tissue samples. Technical reproducibility is highly similar to intact RNA. In addition, Illumina has profiled artificially-degraded RNA samples (95°C heat for 30 minutes), with good reproducibility. The profiles generated with artificially degraded samples are comparable to those generated with corresponding intact RNA samples.

Where can I find probe coordinates for the probes in the human and mouse MAPs?

Illumina has the probe coordinates (i.e., chromosomal location information) for most of the miRNAs, including those not in the Sanger miRBase. You can find this information in the *.bgx file.

For custom Infinium, can I assay SNPs that are next to each other (i.e., < 60 bp apart)?

Yes. In the Infinium Assay, the effect of underlying polymorphisms is not critical to overall performance.

How much time do typical rapid runs take?

Read Length Estimated Rapid Run Time (Hrs)*
1 × 50 bp no index 9
1 × 50 bp dual index 11
2 × 100 bp no index 27
2 × 100 bp dual index 30
2 × 150 bp no index 40
2 × 150 bp dual index 43
*Systems with SN < 7000895 will require additional time

What do the column headers GeneSymbol, GID, and Accession reference on the gene list for Illumina's standard BeadChips, and where do the numbers come from?

Descriptions for all of the column headers can be found in the document Bead Manifest Field Descriptors located on the documentation CD included in the startup kit.

Must the Ethnicity field comply with any specific requirements?

No. This field is optional and can be completed with any user-defined categories. Our intent is to provide the most comprehensive and varied dataset; therefore, we do not currently limit ethnicities to certain entries.

What is in the tube labeled "Index adapter" in the TruSeq Targeted RNA Index Kit?

This tube contains the PCR primers used to amplify the cDNA amplicons. The primer sequences include sequencing primer binding sites and the indices/barcodes.

What technologies does Illumina use for the Genotyping Services business?

Illumina has created a highly multiplexed SNP genotyping system using proprietary BeadArray technology. We have also developed a fail-safe sample-tracking LIMS system to ensure error-free processing.

How does the Isaac human WGS workflow compare to BWA / GATK in terms of sensitivity and specificity?

Isaac has slightly lower sensitivity and specificity, but is much faster than BWA/GATK.

Workflow | Conflicts | Conflict Rate | Sensitivity
Isaac | 6318 | 0.139% | 94.5%
BWA+GATK | 5315 | 0.126% | 95.8%

Does Illumina recommend using the TruSeq Stranded Total RNA with Ribo-Zero Globin kit on any sample other than blood?

The TruSeq Stranded Total RNA with Ribo-Zero Globin kit has been specifically tailored for blood samples from human, mouse, or rat. In addition to removing cytoplasmic and mitochondrial rRNA, it also depletes globin mRNA. Illumina recommends using the TruSeq Stranded Total RNA with Ribo-Zero Gold kit for samples other than blood.

What correlation is expected from technical replicates?

At 100–200 ng total RNA input, R² = 0.97 is expected.

Can I use the Zymo EZ DNA Gold kit for bisulfite conversion?

Yes, you can, but you must use this kit for your entire project. That means you cannot use a mixture of samples that have been converted with different kits on the same arrays, or within the same project.

Is two-round amplification an option?

Illumina has not tried any two-round amplification kits. However, some customers have reported successful use of two-round IVT kits with Illumina gene expression arrays.

Can I perform sample-to-sample comparisons?

Yes. Sample-to-sample comparison is done through the Cross Sample Subtraction Filter. This process filters variants that are present in one sample but not the other.
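
Conceptually, this kind of subtraction is a set difference. The sketch below keeps the variants found in one sample that are absent from another. It assumes variants are identified by chromosome, position, reference allele, and alternate allele, and it is not the filter's actual implementation; the function name and variants are hypothetical.

```python
def cross_sample_subtraction(sample, other):
    """Return variants present in `sample` but not in `other`.

    Each variant is a (chrom, pos, ref, alt) tuple.
    """
    return sorted(set(sample) - set(other))


tumor = [("chr1", 1000, "A", "G"), ("chr2", 2000, "C", "T"), ("chr7", 3000, "G", "A")]
normal = [("chr1", 1000, "A", "G")]
print(cross_sample_subtraction(tumor, normal))
# [('chr2', 2000, 'C', 'T'), ('chr7', 3000, 'G', 'A')]
```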

What molecules are removed with the TruSeq Stranded Total RNA kit with Ribo-Zero Plant?

The Ribo-Zero removal mix targets cytoplasmic rRNA (25S, 18S, 5.8S, and 5S), chloroplast rRNA (23S, 16S, 5S, and 4.5S), and mitochondrial rRNA (18S and 5S).

Why are both Infinium I and Infinium II probes used for this BeadChip? Does the design affect data output and quality?

Does BaseSpace Variant Interpreter automatically assign variant classification?

No. For germline variants, the software uses a simple rule set to predict the classification:

  • Variants are automatically ranked and prioritized based on their annotations, predicted consequence (loss of function, missense, or noncoding) and population allele frequency (using the highest frequency from ExAC, 1000 Genomes, and EVS).
  • Variants are typically assigned the ClinVar pathogenicity unless the variant is greater than 5% population frequency, in which case they are assigned Benign per ACMG guidelines.
  • Variants without ClinVar entries that are common (over 1% population frequency) are assigned Likely Benign.
  • Variants without ClinVar entries that are uncommon (less than 1% population allele frequency) are assigned Likely Pathogenic if loss-of-function (frameshift, stop-gained or essential splice), VUS if a missense or near-splice, and Likely Benign if noncoding.

This pathogenicity autoscoring is only a suggestion. Review these predictions, the provided annotations, and all evidence for the variant before assigning your final interpretation.
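
The rule set above can be read as a small decision procedure. The sketch below is one interpretation of those bullets, not the software's code. The function names and consequence labels are hypothetical, the 5% Benign override is applied only to variants with a ClinVar entry (a literal reading of the second bullet), and a frequency of exactly 1% is treated as uncommon because the text does not specify that boundary.

```python
LOSS_OF_FUNCTION = {"frameshift", "stop_gained", "essential_splice"}
VUS_CONSEQUENCES = {"missense", "near_splice"}


def highest_af(*frequencies):
    """Highest available population frequency (e.g., ExAC, 1000 Genomes, EVS)."""
    known = [f for f in frequencies if f is not None]
    return max(known) if known else 0.0


def suggest_classification(consequence, population_af, clinvar=None):
    """Suggested germline classification following the rules listed above."""
    if clinvar is not None:
        # ClinVar pathogenicity applies unless the variant is above 5% frequency.
        return "Benign" if population_af > 0.05 else clinvar
    if population_af > 0.01:
        return "Likely Benign"  # common, no ClinVar entry
    if consequence in LOSS_OF_FUNCTION:
        return "Likely Pathogenic"
    if consequence in VUS_CONSEQUENCES:
        return "VUS"
    if consequence == "noncoding":
        return "Likely Benign"
    raise ValueError(f"unrecognized consequence: {consequence}")


af = highest_af(0.0004, None, 0.0009)
print(suggest_classification("frameshift", af))  # Likely Pathogenic
```

As the text notes, the result is only a suggestion to review alongside the annotations and evidence.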

What are the advantages of the Infinium Linkage-12 panel?

Infinium Linkage-12 uses the powerful PCR-free Infinium Assay chemistry and protocol at a very attractive price point (the most cost-effective on the market for linkage analysis).

Can I study copy-neutral LOH with KaryoStudio? Can cnvPartition do this?

Yes, it is possible to study copy-neutral LOH with KaryoStudio and cnvPartition. To ensure that you have the latest version of the cnvPartition algorithm, contact Illumina Technical Support. Note that the amount of copy-neutral LOH present across a typical genome can be quite large. Illumina recommends setting the filter to a large size to limit the number of regions found by the algorithm.

What is the minimum shelf life of TruSight HLA v2 reagents?

TruSight HLA v2 reagents are shipped with a minimum of three months of shelf life.

What do the status bar colors indicate on the front of the MiSeq?

Green indicates the instrument is ready to run, blue indicates the instrument is running, and orange indicates the instrument needs attention.

What is the typical library size distribution?

The size of the final product is ~250–300 bp. A larger fragment size is expected for good FFPE RNA (> 350 nt), while a smaller fragment size is expected for poor FFPE RNA.

Are original cBot manifolds compatible with cBot 2?

No. Original manifolds do not align correctly on the new instrument.

Which regions are targeted with the oligos in this kit?

Download the target regions files from the Product Files page.

What is the size of the TruSeq Exome Enrichment capture probes?

TruSeq Exome Enrichment capture probes are ~95 bases.

What level of sample plexity is supported for TruSeq Exome Enrichment?

TruSeq Exome Enrichment supports pre-enrichment pooling of up to six samples. Refer to the TruSeq Exome Enrichment Data Sheet for additional information.

Can the Infinium HumanMethylation450K BeadChip or the Infinium MethylationEPIC BeadChip distinguish 5-hydroxymethylcytosine from 5-methylcytosine?

Illumina has not validated the array for 5-hmC. However, publications have used the Infinium HumanMethylation450K array for 5-hmC analysis, and it is possible that this protocol will work on the Infinium MethylationEPIC microarrays.

For more information, see Nazor, Kristopher L., et al. "Application of a low cost array-based technique—TAB-Array—for quantifying and mapping both 5mC and 5hmC at single base resolution in human pluripotent stem cells." Genomics 104.5 (2014): 358-367.

Can I perform family-based genetic disease analysis in VariantStudio?

VariantStudio supports family-based filtering of father, mother, affected child, and affected or unaffected siblings. In VariantStudio v2.2, the analysis requires input of the proband and one other sample, either a parent or a sibling. This process filters variants that are consistent with a particular inheritance mode, including autosomal dominant, autosomal recessive, X-linked recessive, and de novo mutation.
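To illustrate what filtering by inheritance mode means, here is a minimal sketch for two of the modes listed above. It is not VariantStudio's algorithm: it assumes a full trio (proband, father, mother), unaffected parents, and genotypes encoded as alternate-allele counts, and all function and field names are hypothetical.

```python
# Genotypes are alternate-allele counts: 0 = hom-ref, 1 = het, 2 = hom-alt.

def is_de_novo(proband, father, mother):
    """Variant present in the proband and absent from both parents."""
    return proband > 0 and father == 0 and mother == 0


def is_autosomal_recessive(proband, father, mother):
    """Proband homozygous for the variant; both parents are carriers."""
    return proband == 2 and father == 1 and mother == 1


def filter_by_mode(variants, mode):
    """Keep only the variants whose trio genotypes fit the chosen mode."""
    checks = {
        "de_novo": is_de_novo,
        "autosomal_recessive": is_autosomal_recessive,
    }
    check = checks[mode]
    return [v for v in variants
            if check(v["proband"], v["father"], v["mother"])]


variants = [
    {"id": "var1", "proband": 1, "father": 0, "mother": 0},
    {"id": "var2", "proband": 2, "father": 1, "mother": 1},
    {"id": "var3", "proband": 1, "father": 1, "mother": 0},
]
print([v["id"] for v in filter_by_mode(variants, "de_novo")])
print([v["id"] for v in filter_by_mode(variants, "autosomal_recessive")])
```

Dominant and X-linked modes need the affected status and sex of each family member, which this sketch leaves out.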

Are there any performance differences in sequencing coverage for GC-rich amplicons?

Although most amplicons of interest are not likely to be high GC-content, coverage of high GC-content amplicons might have more variability compared to other amplicons.

Should I use miRNA-enriched samples?

Illumina recommends using total RNA. The enrichment process reduces the precision of the assay, increases the noise due to technical variation, and requires more starting material.

Can I import custom annotations?

Yes. You can import custom annotations in tab-delimited text (*.txt) file format into the software from the Custom Annotations tab in Settings. For more information, see the BaseSpace Variant Interpreter Online Help.

Can a USB device be plugged into the MiSeq System during the run?

Illumina recommends that you wait until the completion of a run before inserting a USB device into the MiSeq System.

What is new in the GenomeStudio microarray modules (GT, GX, M, and PT v1.0)?

For details, see the GenomeStudio Software 2009.2 Release Notes, available in iCom and the GenomeStudio Portal.

Can VariantStudio be used in a clinical lab, such as a CLIA lab?

Yes, but you must first validate the Illumina VariantStudio software according to institutional, local, state, and federal guidelines.

How can I obtain KaryoStudio?

Contact your account manager for information about getting new versions of KaryoStudio software.

What is the difference between TruSeq Methyl Capture EPIC and TruSeq DNA Methylation?

TruSeq DNA Methylation is used for whole genome bisulfite sequencing and uses different chemistry than TruSeq Methyl Capture EPIC.

Are there any safety considerations with the laser system on the HiSeq?

The HiSeq is a Class 1 laser instrument as evaluated by IEC 60825-1 Edition 1.2. Under normal operation, the operator is not exposed to laser light.

Is performance impacted based on the RNA isolation technique used?

Illumina has tested only RNA isolated using RNeasy from Qiagen, one of the most frequently used methods. We do not anticipate a major impact on performance with other appropriate, well-established isolation techniques.

What is a beadpool manifest?

Also called a SNP manifest, a beadpool manifest is a file containing the SNP-to-beadtype mapping and all SNP annotations.

For the Infinium assay, the beadpool manifest is a BPM file in binary format.

What applications can I run on a MiSeq System?

The MiSeq System is ideal for amplicon sequencing, targeted resequencing, small genome sequencing, and clone checking. It is capable of performing 16S ribosomal RNA gene sequencing, ChIP-Seq (TF Binding), and small RNA sequencing.

How many samples should I run for clustering on custom Infinium content?

Illumina recommends that you run at least 100 samples including both replicates and trios.

Can I assay all SNP classes ([A/T], [C/G], [A/C], [A/G], [T/C], [T/G]) with custom Infinium content?

Yes, all biallelic SNPs can be assayed using a combination of Infinium I and II probe designs.

How do I find the largest aberration in my samples?

Click the Size column header to sort the data in the Found Regions table.

I am getting an error when running the Upload Test Application. What should I do now?

Save a screenshot of the error, along with any logs that were created from the Test Application, and send the information to techsupport@illumina.com. If the Test Application has completed, these files are saved in the same folder on your computer that the program is in.

What are the advantages of using the Decode File Download Utility instead of the CDs shipped with my BeadChips?

This software allows you to download an unlimited number of files at one time automatically without physically handling the CDs. This can be done overnight, over a weekend, or one BeadChip at a time, and you can easily select only those files that have not previously been downloaded. You can also select BeadChips based on purchase order (PO) numbers, or you can barcode-scan one or many BeadChips as you receive them. Using this software will likely save you many hours of hands-on time moving and copying files from CDs.

What is the typical library size distribution for final libraries?

The expected size range for final libraries is ~250 bp to ~1 kb. The expected median insert size is 180–200 bp. The reference guide provides an example Bioanalyzer trace.

Do I need to remove the ribosomal RNA prior to labeling?

It should not be necessary to remove rRNA before labeling. The ribosomal RNA does not get amplified in the protocol and should represent a very small percentage of the final product.

Does this kit use dual-index adapters and how long are the indexes?

Yes, this kit contains dual index adapters - 24 i7 indexes and 2 i5 indexes.

The i7 index is 6 bp and i5 index is 8 bp. Although numbering of indexes seems similar to other kits, the index sequence is different and can be found in the reference guide.

How do I download the Decode File Client?

The Decode File Client installer can be found on Illumina's support website.

Does the TruSeq Stranded Total RNA with Ribo-Zero Plant kit work on all plant species?

The TruSeq Stranded Total RNA with Ribo-Zero Plant kit removes rRNA from a broad variety of plant RNA samples. It has been laboratory tested against Arabidopsis, maize, wheat, rice, corn, and soybean. In silico testing has also demonstrated compatibility with Japanese honeysuckle (Lonicera japonica), mirror bush (Coprosma repens), moss (Physcomitrella), tomato (Solanum lycopersicum), Vietnamese coriander (Persicaria odorata), and water clover (Marsilea vistita).

Use RNAMatchMaker to find which Ribo-Zero kit is compatible with your organism of choice. The analysis will be done in silico and does not guarantee rRNA removal.

What is the difference between TruSeq DNA and RNA Sample Prep v1 and v2/LT Kits?

  • No changes to workflow, increase in index capability
  • Fill volumes and new consumables to support automation
  • Each kit contains 12 of 24 unique indexes, and each index reaction is sufficient for eight individual samples

Can I order through MyIllumina FastTrack Services?

Yes. Use the following catalog numbers: FT-260-1002 for Infinium OncoArray FastTrack Service and FT-260-1012 for Infinium OncoArray+ FastTrack Service (for the OncoArray plus add-on).

Can I get bead-level data?

Yes, bead-level data is available. Contact Technical Support for assistance with this feature.

What quality control method is recommended for the final libraries?

Assess the final library quality with either an Advanced Analytical Technologies Fragment Analyzer using a NGS Fragment Analysis Kit or Agilent Technologies 2100 Bioanalyzer using a DNA 1000 chip.

How was the content for this panel chosen?

The content for this panel was selected through collaboration with experts and key opinion leaders and by referencing publicly available databases such as the Mitelman database and The Cancer Genome Atlas (TCGA).

What are .idats?

An *.idat file is an intensity data file. It contains statistics for every bead type on your BeadChip. The statistics in an *.idat file include the number of beads, the mean, and the standard deviation for each color sample. There is one *.idat file per sample per channel.

Where can I find Safety Data Sheet (formerly MSDS) information for a TruSight One kit?

Visit the Safety Data Sheets (SDS) page and enter the applicable search terms.

Are there known ambiguities resulting from TruSight HLA v2 sequencing?

There are three amplicon ambiguities found using the IMGT/HLA 3.23 database with TruSight HLA v2.

DRB1*12:01:01 and DRB1*12:10 are ambiguous due to the forward primer location.

  • DRB1*12:01:01 and DRB1*12:10 are both part of the DRB1*12:01:01 G group.
  • They are distinguished from each other in a single base position in Exon 1 (IMGT Codon -16, exon 1 base position 40).
  • The DRB1*12:10 codon -16 is ATT (Ile) and the DRB1*12:01 codon is GTT (Val).
  • The TruSight HLA v2 DRB1 amplicon begins in intron 1 and does not cover exon 1.
  • Coverage of exon 1 was excluded from the assay because it would require a 14 kb amplicon that would be challenging to amplify reliably.

DPB1*13:01:01 and DPB1*107:01 are ambiguous for similar reasons to the ambiguity previously described.

  • DPB1*13:01:01 and DPB1*107:01 are both part of the DPB1*13:01:01 G group.
  • The TruSight HLA v2 amplicon does not cover exon 1.
  • These alleles are distinguished from each other at 2 base positions in exon 1 (IMGT Codon -22, base position 24 and Codon -14, base position 47).
  • Codon -22 is GCG (Ala) in DPB1*13:01:01 and GCA (Ala) in DPB1*107:01.
  • Codon -14 is ACG (Thr) in DPB1*13:01:01 and ATG (Met) in DPB1*107:01.
  • Coverage of exon 1 was excluded from the assay because it would require an amplicon that would be challenging to amplify reliably.
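The codon differences listed for these two allele pairs can be checked with a short sketch. It uses only the six codons quoted above together with their standard genetic code translations; the function name is hypothetical, and positions are counted within the codon (1 to 3), not along exon 1.

```python
# Standard genetic code entries for the codons quoted in the text.
CODON_TO_AA = {
    "ATT": "Ile", "GTT": "Val",
    "GCG": "Ala", "GCA": "Ala",
    "ACG": "Thr", "ATG": "Met",
}


def compare_codons(codon_a, codon_b):
    """Return the differing positions within the codon and the change type."""
    positions = [i + 1 for i, (a, b) in enumerate(zip(codon_a, codon_b))
                 if a != b]
    same_residue = CODON_TO_AA[codon_a] == CODON_TO_AA[codon_b]
    return positions, "synonymous" if same_residue else "nonsynonymous"


# DRB1*12:10 (ATT) vs DRB1*12:01 (GTT), codon -16
print(compare_codons("ATT", "GTT"))
# DPB1*13:01:01 vs DPB1*107:01, codon -22 and codon -14
print(compare_codons("GCG", "GCA"))
print(compare_codons("ACG", "ATG"))
```

Each pair differs at a single base, and all of those bases lie in exon 1, which the amplicons do not cover.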

DRB1*08:01:01 and DRB1*08:01:03 are ambiguous in IMGT/HLA database version 3.23.

  • As of March 2016 and the release of IMGT/HLA 3.24, the DRB1*08:01:03 allele has been removed from the database because "sequence shown to contain errors and be identical to DRB1*08:01:01."

In addition to these amplicon ambiguities, conditional ambiguities arise in DPB1 and DQB1. A conditional ambiguity is an ambiguity that is present only when two particular alleles are paired, and that might not be present when one or both alleles are paired with other alleles. These conditional ambiguities appear in DPB1 due to a lack of polymorphic sites in intron 2. For example, DPB1*04:01 paired with DPB1*04:02 is ambiguous with DPB1*105:01 and DPB1*126:01 due to loss of phase across intron 2, as the gap between heterozygous positions is too great to phase. However, DPB1*04:01 paired with DPB1*16:01 is unambiguous for both alleles.

What quantification method is recommended for the final libraries?

Use qPCR to quantify libraries. For more information, see the Sequencing Library qPCR Quantification Guide.

Can other thermal cyclers be used with this protocol?

Coverage of GC regions can be impacted by the model, settings, and performance of the thermal cycler used. Illumina has validated the Bio-Rad DNA Engine Tetrad 2, the Bio-Rad S1000, and the MJ Research PTC-225 DNA Engine Tetrad thermal cyclers. Other thermal cyclers can differ in their performance across the genome.

What is the typical quantity of the final libraries?

The expected quantity is 3–12 ng/µL, resulting in 10–40 nM.
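The relationship between the mass concentration and the molarity can be sketched with the usual double-stranded DNA conversion. This assumes the concentration is expressed in ng/µL, an average mass of 660 g/mol per base pair, and a mean library length of about 450 bp; the fragment length is an assumption chosen for illustration and is not stated here.

```python
def ng_per_ul_to_nm(conc_ng_per_ul, mean_length_bp, g_per_mol_per_bp=660.0):
    """Convert a dsDNA library concentration from ng/uL to nM."""
    # ng/uL -> g/L is a factor of 1e-3; mol/L -> nmol/L is a factor of 1e9,
    # which leaves a combined factor of 1e6.
    return conc_ng_per_ul * 1e6 / (g_per_mol_per_bp * mean_length_bp)


ASSUMED_MEAN_LENGTH_BP = 450  # illustrative value only

for conc in (3, 12):
    nm = ng_per_ul_to_nm(conc, ASSUMED_MEAN_LENGTH_BP)
    print(f"{conc} ng/uL is about {nm:.1f} nM")
```

Under these assumptions, the two ends of the mass range land close to the two ends of the molar range. Use the measured mean fragment length of your own library for a real calculation.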

What are the advantages of TruSeq DNA Nano?

The key advantage to using TruSeq DNA Nano is that only 100 ng of starting genomic DNA is required. This is increasingly important in studies that have limited amounts of DNA or where DNA needs to be split into multiple applications. In addition, TruSeq DNA Nano is a comprehensive solution containing all of the necessary reagents, barcodes, and size selection beads for convenient processing. The improved workflow includes bead-based size selection, therefore eliminating the time of gel-based methods. TruSeq DNA Nano also produces premier library quality, offering an excellent solution for studies requiring the highest coverage.

Are the controls the same in the TruSeq DNA and RNA sample prep kits?

The same controls are used, but at different concentrations.

Will my data be kept confidential?

Yes. All data generated by Illumina will be kept strictly confidential. The data belongs solely to the submitting investigator or company. Illumina will only accept anonymized samples with no patient association.

How long are the DMAP (decode map) files available for download?

Files appear on the server approximately 24 hours after they are shipped. Illumina guarantees that files remain on the server until the BeadChips are expired. Expiration dates can be found on the BeadChip packaging label.

In VariantStudio v2.2, can I choose to continue to use the previous annotation database version?

No. VariantStudio v2.2 is programmed to use annotations from the version of the Illumina Annotation Service that was updated for VariantStudio v2.2.

What kind of support is provided for ChIP-Seq?

Illumina provides the same support for ChIP-Seq as for standard genomic DNA resequencing. However, we do not provide support on antibodies or the ChIP portion.

Where can I find source code for the Isaac algorithms? Are they open source?

Source code for the Isaac algorithms is available here: github.com/sequencing.

The software is released under the Illumina Open Source Software License, available on github.com/sequencing/licenses/. The license terms are intended to make the software accessible to a variety of bioinformatics developers, computational biologists, and other users in the research community, and to encourage community development of the software. For questions about the license, contact Isaac-admin@illumina.com.

A peer-reviewed application note is available on the Bioinformatics website:

Raczy C, Petrovski R, Saunders CT, Chorny I, Kruglyak S, et al. (2013) Isaac: Ultra-fast whole genome secondary analysis on Illumina sequencing platforms. Bioinformatics 10.1093/bioinformatics/btt314

What command line changes are required in CASAVA for processing the dual-index reads?

In the demultiplexing workflow (configureBclToFastq.pl), include the --use-bases-mask parameter if dual indexing is indicated in the sample sheet. For example: --use-bases-mask=Y*,I*,I* (for single-end runs) or --use-bases-mask=Y*,I*,I*,Y* (for paired-end runs). See the CASAVA User Guide and release notes for additional user-input commands that are required in CASAVA.
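If you script your demultiplexing setup, the mask argument for the two dual-index cases above can be assembled as follows. This is a small sketch: the helper name is hypothetical, and it produces only the --use-bases-mask argument, not a complete CASAVA command line.

```python
def dual_index_bases_mask(paired_end):
    """Build the --use-bases-mask argument for a dual-index run."""
    # Read 1, then the two index reads.
    reads = ["Y*", "I*", "I*"]
    if paired_end:
        # Paired-end runs add a second sequencing read after the indexes.
        reads.append("Y*")
    return "--use-bases-mask=" + ",".join(reads)


print(dual_index_bases_mask(paired_end=False))
print(dual_index_bases_mask(paired_end=True))
```

When passing the argument through a shell, quote it so that the asterisks are not expanded as file globs.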

Are UTRs targeted?

Yes. This kit targets 160 bp of the 5′ and 3′ UTR of every targeted gene, which ensures that the full length of every targeted gene is covered.

Does Illumina provide classified variants?

No, Illumina does not provide classified variants. Using BaseSpace Variant Interpreter, you can upload classified variants from an external source or manually classify variants in samples that are being analyzed.

Can I modify the classification categories in BaseSpace Variant Interpreter?

No. The software uses the following default classification schemes for tumor and germline samples:

  • Somatic analysis—FDA guidance available, general guidance available, inclusion criteria for clinical trial, and other reportable variant.
  • Germline analysis—Pathogenic, Likely Pathogenic, Variant of Unknown Significance (VUS), Likely Benign, and Benign. These categories follow ACMG guidelines.

What sequencing primers are compatible?

All sequencing primers included in all TruSeq cluster kits are compatible.

I uploaded my data and it has been reviewed and released by Illumina. Why are some of my samples missing?

We review the data for many criteria before releasing to the database. Any samples that do not pass our call rate criteria are not included in the released data. Additionally, any user-entered information (eg, ethnicity or positive phenotype) that is unclear or not compliant with the Health Insurance Portability and Accountability Act (HIPAA) is not included in the released data.

What is the throughput of the HiSeq 3000 system?

Each system can generate up to 750 Gb in 3.5 days with greater than 75% of bases above Q30 from a 2 x 150 bp run. This throughput enables up to 6 genomes at 30x per run per system. For more information, see HiSeq 3000/4000 System Specifications.

How many bases constitute a region under a peak?

See the GenomeStudio ChIP Sequencing Module User Guide available in the GenomeStudio Portal for information.

What is the difference between v1 and v2 kits? Can I use them together?

The different index kit versions (v1 and v2) are not chemically different. The v2 kits were introduced to provide more index combinations. When Sets A–D are combined, up to 384 unique index combinations are possible. If you plan to multiplex more than 96 samples, the v2 index kits are recommended.

The v1 and v2 index kits provide some of the same indexes. For information about which indexes are included with each kit, see the Illumina Adapter Sequences Document.

Is miRNA compatible with LIMS?

Illumina does not currently offer LIMS support for miRNA.

Can I download data from different sample types such as the HumanHap550 and the HumanHap300 in the same report?

Illumina strongly advises against this, as each product contains a different SNP list. Instead, download samples from a single array type and version number in each session.

What is the typical library size distribution for TruSeq Exome libraries?

TruSeq Enrichment Exome libraries show a tight peak of 200–400 bp.

Is the MiSeq System scalable?

MiSeq Systems offer scalable throughput based on read length. Illumina continues to increase read lengths, imaging area, and cluster density with improved detection and resolution. For more information, see the MiSeq Product Information Sheet.

How many samples can be processed per TruSeq Small RNA Library Prep Kit?

TruSeq Small RNA Library Prep kits are ordered as a core box and an index box. Each core box contains enough reagents to process 24 DNA samples and each index box contains 12 unique indexes, with sufficient index for two individual samples.

What is the smallest batch size I can prepare with this kit?

The smallest supported batch size is 16 samples (15 samples plus one control). The 16-sample minimum accounts for up to six freeze/thaw cycles of the 96-sample kit.

What are Illumina FastTrack Services?

Illumina FastTrack Services combine high-performance Illumina platforms with expert Illumina scientists. Illumina FastTrack Services deliver high-quality genotyping and sequencing data to support your research projects. You reap the benefits of Illumina technology with a personalized service that delivers your data quickly and at a reasonable cost.

Illumina FastTrack Services offer a broad range of services, from whole-genome genotyping to custom content genotyping, human whole-genome sequencing, and human phasing analysis services.

Does Illumina offer kits containing only the oligos?

Oligo-only kits are not available for this library prep.

What is the difference between the TruSeq DNA and TruSeq DNA PCR-Free protocol?

The TruSeq DNA PCR-Free protocol does not include gel size-selection or PCR amplification. Instead, it uses bead-based size selection.

The TruSeq DNA PCR-Free protocol takes about one day to complete, while the standard TruSeq DNA protocol takes two days to complete.

Are AMPure XP beads supplied with the kit?

All beads required for the protocol are supplied in the kit.

Do I need to specify a genome folder in my sample sheet for the metagenomics workflow?

No reference genome is necessary for the MiSeq metagenomics workflow.

What is the typical library size distribution for Nextera Rapid Capture Enrichment libraries?

Nextera Rapid Capture Enrichment libraries generally range from ~150–1000 bp, with the main peak at ~300–350 bp.

The kit does not include inline controls. What positive control can be used?

Illumina recommends using one or more of the following recommended positive control samples. These positive control samples can be used for methylation status control.

Normal samples: HCC1187 normal (BL) (ATCC, catalog # CRL2323-D), NA12878 (Coriell Institute, catalog # NA12878)

Cancer samples: HCC1187 breast cancer tumor (ATCC, catalog # CRL2322), HeLa (Biochain, catalog # D1255811), Jurkat (Biochain, catalog # D1255815)

What is the throughput of the NovaSeq System?

Sequencing an S2 flow cell on the NovaSeq 6000 System generates 2 TB of data in under two days. For detailed performance parameters, see NovaSeq 6000 Sequencing System.

What are the Illumina FastTrack Sequencing Services deliverables for whole-genome sequencing?

The WGS analysis pipeline v3.0 uses Isaac Aligner and Isaac Variant Caller to generate several outputs. These outputs include sequencing reads with reduced-resolution Q-scores in BAM format, and variant data in both VCF and genome VCF (gVCF) [1] file format. The somatic small-variant calling component of the cancer analysis pipeline uses Isaac Aligner and Strelka [2] to generate somatic small-variant data in VCF format. These informatics pipelines enable significantly increased alignment efficiencies and reduced data footprints, without compromising the quality of the data and variant calls.

A WGS sequencing analysis training package, a cancer analysis training package, and a human phasing training package are available in BaseSpace. A services user guideline and a quick video describing the deliverables are also available in BaseSpace.

  1. sites.google.com/site/gvcftools/home
  2. www.ncbi.nlm.nih.gov/pubmed/22581179

What coverage will I get with TruSeq Targeted RNA Expression?

The amount of coverage is dependent on the level of expression of your targets in the samples you are analyzing. TruSeq Targeted RNA Expression assays target specific regions of each transcript and the level of expression depends on whether the region targeted is common between all isoforms of a gene or is transcript specific. Therefore, the amount of coverage obtained for each target is highly variable.

What version USB is on the MiSeq?

The MiSeq System is equipped with USB 2.0.

How are TruSight HLA v2 reagents shipped?

TruSight HLA v2 Sequencing Panel (24 Samples), catalog # 20000215, includes three boxes of reagents. Box 1 includes all the pre-PCR reagents including PCR mix, polymerase, buffers, and PCR primers. Box 1 is shipped on dry ice and is stored frozen at -25°C to -15°C. Box 2 contains purification and normalization beads used post-PCR and is refrigerated at 2°C to 8°C. Box 3 contains post-PCR buffers and tagmentation reagents. Box 3 is shipped on dry ice and is stored frozen at -25°C to -15°C.

TruSight HLA v2 Sequencing Panel (24 Samples Automated), catalog # 20005170, includes four boxes of reagents. Boxes 1 through 3 are identical to the ones previously described. The fourth box of auxiliary reagents includes additional purification beads required for dead volume minimums on many liquid handlers. The auxiliary reagents are refrigerated at 2°C to 8°C.

What level of sample plexity is supported?

The number of samples pooled pre-enrichment depends on the kit:

Kit                                Enrichment Reactions    Plexity    Catalog Number
TruSeq Methyl Capture EPIC - LT    3                       4          FC-151-1002
TruSeq Methyl Capture EPIC - HT    12                      4          FC-150-1003

For more information, see the TruSeq Methyl Capture EPIC Library Prep Reference Guide.
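As a quick worked example, the number of samples each kit can enrich follows from multiplying the enrichment reactions by the plexity. The sketch below simply restates the table values; the dictionary layout and function name are illustrative.

```python
# Values copied from the table above.
KITS = {
    "TruSeq Methyl Capture EPIC - LT": {"enrichment_reactions": 3, "plexity": 4},
    "TruSeq Methyl Capture EPIC - HT": {"enrichment_reactions": 12, "plexity": 4},
}


def samples_per_kit(kit_name):
    """Samples covered when every enrichment reaction is pooled at full plexity."""
    kit = KITS[kit_name]
    return kit["enrichment_reactions"] * kit["plexity"]


for name in KITS:
    print(name, "->", samples_per_kit(name), "samples")
```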

Can samples generated with the TruSeq Stranded Total RNA Sample Prep Kits with Ribo-Zero Globin or Plant be multiplexed in the same lane as human TruSeq RNA samples?

Yes, but the indexes are the same. Make sure that all samples have different index combinations.

How reproducible is the data?

We performed experiments that compared TruSight RNA Pan-Cancer to RNA Access, TruSeq mRNA, and TruSeq Total RNA. TruSight RNA Pan-Cancer was found to be concordant with the other applications (R² > 0.95). High, medium, and low quality FFPE samples were also run and were found to be robust in performance (R² = 0.99). In addition, comparisons between data from UHR (reference RNA) generated in different laboratories were highly concordant (R² ≥ 0.97).

Will TruSeq DNA PCR-Free work with degraded DNA samples?

No, this kit requires high quality gDNA as starting material. Use of degraded DNA may result in low yields and loss of sample during bead clean-up steps.

What are the annotation sources?

BaseSpace Variant Interpreter uses the following annotation sources: dbSNP, Catalogue of Somatic Mutations in Cancer (COSMIC), ClinVar, 1000 Genomes, Exome Variant Server (EVS), Exome Aggregation Consortium (ExAC), PolyPhen, and SIFT.

I created a query using the 'Edit Query' button, and then clicked 'OK'. Why didn't my list of samples change?

The Edit Query function only sets up the query to be performed. After the query is set, click Perform Query to update your results.

Which index adapters are methylated?

TruSeq DNA Single Index Set A and B are methylated and suitable for bisulfite sequencing applications.

Can I run GenomeStudio software on a VMware virtual machine?

Yes. However, this is not officially supported by Illumina due to very slow performance.

How can I zoom into a specific region of interest?

You can zoom into a specific region of interest by clicking on the specific aberration that is shown within the Found Regions table. The chromosome viewer then displays a closer view of the aberration of interest.

Alternately, use the zoom functions available in the toolbar or drag and stretch the red box on the ideogram to zoom in for a closer view of your data.

What are the Illumina FastTrack Microarray Services deliverables?

Illumina provides all samples and markers in a GenomeStudio project workspace. The Project Scientist zeros poorly-performing samples and markers. The samples and markers remain available, enabling you to make your own determination.

Illumina provides genotyping data files, which indicate the bi-allelic genotyping call with each genotype on a separate row. The exported data files are customizable to your preference; optional data fields include intensity values, allelic strand formats, etc.

The deliverables also include intensity files and locus and DNA summary files that show project statistics. Illumina delivers all data via a secure FTP site. The project scientist provides ongoing support for questions about the data.

Is BaseSpace Variant Interpreter available as a command-line tool?

No. BaseSpace Variant Interpreter is a Software as a Service (SaaS) solution with a graphical user interface that allows variant exploration, annotation, filtering, and reporting without bioinformatics expertise.

How are ChIP-Seq runs analyzed with Illumina software?

ChIP-Seq runs should be aligned using the eland_extended module of GERALD in CASAVA 1.7. CASAVA 1.8 is not compatible with ChIP-Seq analysis. In CASAVA 1.7, the GERALD config.txt file should include the line "WITH_SORTED true" to generate the sorted.txt files for each lane. CASAVA itself does not need to be run for ChIP-Seq runs. The GenomeStudio ChIP-Seq module requires the sorted.txt files and Summary.htm from GERALD.

Does the MiSeq System require an uninterruptible power supply (UPS)?

The use of a UPS is optional. However, a UPS is highly recommended to protect the instrument in the case of a power surge. For more information, see the MiSeq System Site Prep Guide.

cnvPartition did not find the exact breakpoint of my found region. Is there a way to adjust this?

Yes, you can adjust the start and stop positions of a found region by right-clicking the found region and adjusting the genomic positions. You may need to use the zooming and panning functions to determine the exact position to which you would like to adjust the found region.

What are the lab temperature requirements for the MiSeq System?

Maintain a lab temperature of 22°C ± 3°C.

Which variants are included in the output predefined report?

The output predefined report includes the following genes. These genes are also listed on the TruSight Tumor 15 product page.

Gene      Target or Region                              Potential Disease States
AKT1      E17K                                          Breast
BRAF      V600E/K/R/M/D/G                               Melanoma, Colon, Lung
EGFR      Exon 19 and Exon 20 insertions, deletions,    Lung
          and indels; G719A/C/S, L858R, L861Q,
          S768I, T790M
ERBB2     p.E770_A771insAYVM                            Breast, Lung
FOXL2     C134W                                         Ovary
GNA11     Q209L                                         Melanoma
GNAQ      Q209L                                         Melanoma
KIT       Exons 9, 11, 13, 14, 17                       Gastric, Melanoma
KRAS      Codons 12, 13, 19, 59, 61, 117, 146           Colon, Gastric, Lung
MET       N/A                                           Lung, Colon, Gastric
NRAS      Codons 12, 13, 59, 61, 117, 146               Colon
PDGFRA    Exons 12, 14, 18                              Gastric, Melanoma
PIK3CA    Exons 9, 20                                   Lung, Breast, Prostate
RET       M918T                                         Lung
TP53      Full CDS                                      Lung, Melanoma, Ovary, Colon

Can custom primers be used on the HiSeq 3000 System?

Custom primers have not been tested for use on the HiSeq 3000 System.

How do I access BaseSpace Variant Interpreter through my company firewall?

BaseSpace Variant Interpreter is a web-based application. To make sure that you can access it through your company firewall, open port 443.
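To confirm that outbound connections on port 443 are allowed from a given workstation, a basic TCP check such as the following can help. It is a generic sketch using the Python standard library; the function name is hypothetical, and you supply the hostname you use to reach the service, because no specific address is given here.

```python
import socket


def can_connect(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, call can_connect("your-service-hostname") from the affected computer. A False result suggests that the firewall, a proxy, or DNS is blocking the connection. A True result confirms only that the port is reachable, not that a proxy permits the full HTTPS session.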

Is requantification necessary?

Yes, it is good practice to requant after the dilution is made to confirm the amount before starting the assay.

Where are data stored after import into VariantStudio?

The desktop version of Illumina VariantStudio stores data locally in the VariantStudio project file. During the import of VCF files, data are loaded to the desktop memory. To save the data, designate a location to save the project. The project can be reopened later.

Is training required/recommended for TruSight HLA v2? What training options are available?

Training is highly recommended but not required. TR-204-0024 is on-site customer training for TruSight HLA v2 library preparation. Customers receiving on-site library prep training have had a much better experience and have implemented the solution faster than labs not receiving training.

What should I do if I get an error indicating that the program can't be found? For example: -bash-3.1$ GAPipeline-1.3.2/bin/illumina2srf -o lane_1.srf / /Data/IPAR_1.3/Bustard1.3.2_01-03-2009/s_1_*_qseq.txt -bash: /GAPipeline-1.3.2/bin/illumina2srf: No such file or directory

If you get this message, explicitly install io_lib. To do this, run the following command from the Pipeline install directory (/GAPipeline_1.3.2): make WITH_IO_LIB=1 install. Then retry the command that failed.

What are the minimum system requirements to run VariantStudio?

The minimum system requirements to run VariantStudio are Windows 7 or later, 64-bit CPU, 2 GB RAM minimum (4 GB recommended), and 25 MB hard drive space for installation.

Importing whole genome data or large gVCF files requires more RAM. For more information, see the entry in the File Formats section on this page, What are the requirements for importing whole genome data or large gVCF files?.

Is there a clear minimum coverage (%) that an alternate allele has to reach to be called heterozygous (minimum minor allele frequency of 20% or 30%)?

The Isaac variant caller is based on a Bayesian model and assigns probabilities to different possible variant calls, so no specific minimum exists.
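A toy example shows why a probabilistic caller has no single allele-fraction cutoff. The sketch below is not the Isaac model: it is a textbook three-genotype Bayesian calculation with an assumed per-read error rate and assumed genotype priors, and every name and number in it is illustrative.

```python
def genotype_posteriors(alt_reads, depth, error_rate=0.01, priors=None):
    """Posterior probability of each diploid genotype at one site."""
    if priors is None:
        # Assumed priors: most sites match the reference.
        priors = {"hom_ref": 0.998, "het": 0.001, "hom_alt": 0.001}
    # Probability that a single read shows the alternate allele.
    alt_prob = {"hom_ref": error_rate, "het": 0.5, "hom_alt": 1.0 - error_rate}
    ref_reads = depth - alt_reads
    # Binomial likelihood times prior; the binomial coefficient is the same
    # for every genotype, so it cancels when normalizing.
    weighted = {
        g: priors[g] * p ** alt_reads * (1.0 - p) ** ref_reads
        for g, p in alt_prob.items()
    }
    total = sum(weighted.values())
    return {g: w / total for g, w in weighted.items()}


# Same 20% alternate-allele fraction at two different depths.
low_depth = genotype_posteriors(alt_reads=2, depth=10)
high_depth = genotype_posteriors(alt_reads=20, depth=100)
print(f"2/10 reads:   P(het) = {low_depth['het']:.3f}")
print(f"20/100 reads: P(het) = {high_depth['het']:.3f}")
```

With these assumed parameters, the same 20% allele fraction gives a low heterozygous probability at 10x depth and a probability near 1 at 100x depth, so the evidence depends on depth as well as on the fraction.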

What molecules are removed with the TruSeq Stranded Total RNA kit with Ribo-Zero Globin?

The Ribo-Zero removal mix targets cytoplasmic rRNA, mitochondrial rRNA, and globin mRNA.

What cnvPartition confidence value cutoffs are recommended?

Set confidence value thresholds based on the type of analysis you are doing and your goals for the results. In general, confidence values of 35 or higher represent the most reliable results from this algorithm. Values lower than 35 may represent false positives and should be validated by eye in the Chromosome Browser. To be more conservative in your found regions, begin by looking at regions with confidence scores of 50-100, or even greater.
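
If you export found regions for downstream scripting, the cutoffs above can be applied with a simple filter. The following is a minimal sketch; the `FoundRegion` record and its field names are hypothetical and do not reflect any cnvPartition or GenomeStudio file format.

```python
from dataclasses import dataclass


@dataclass
class FoundRegion:
    # Hypothetical record layout, for illustration only.
    chrom: str
    start: int
    end: int
    confidence: float


def filter_by_confidence(regions, min_confidence=35.0):
    """Keep regions whose confidence value is at or above the cutoff."""
    return [region for region in regions if region.confidence >= min_confidence]


# Made-up example regions.
regions = [
    FoundRegion("1", 1_000_000, 1_250_000, 20.0),
    FoundRegion("7", 5_000_000, 5_400_000, 42.5),
    FoundRegion("X", 2_000_000, 2_900_000, 130.0),
]

reliable = filter_by_confidence(regions)            # cutoff of 35
conservative = filter_by_confidence(regions, 50.0)  # stricter cutoff of 50
```

Regions that fall below the chosen cutoff are not necessarily wrong; they are the ones to inspect by eye in the Chromosome Browser.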

Are there recommended filter settings for analyzing data in VariantStudio?

There are no fixed guidelines for variant filtering in VariantStudio. Appropriate filter settings depend on the intended use of the assay and must be determined for each project. See the VariantStudio User Guide for more information on the filtering options.

Why is the message "Sequence Not Found" sent after I submit SNPs chosen using ADT?

This message means that the SNP sequence cannot be found in our internal database, which is a filtered version of dbSNP. We filter out any SNPs that are not appropriate for the Illumina platform, such as insertions/deletions, multiple nucleotide SNPs, and SNPs with ambiguous or multiple localizations.

What is the expected level of resolution?

This sequencing panel provides high-resolution HLA typing results to at least two fields of resolution. It usually achieves three or four fields of resolution, depending on the allele and available HLA nomenclature.

Do I need to purchase AMPure beads?

No, Sample Purification Beads (SPB) that are used for size-selection and clean-up steps are included.

Can custom primers be used on the MiSeq System?

Yes. The MiSeq reagent cartridge includes three empty reservoirs for custom primers. You have the option of using a custom primer for Read 1, the Index 1 Read, and Read 2. For more information, see the MiSeq System Custom Primers Guide (document # 15041638).

Because only a single strand is captured, is information on the opposite strand complementary to the CpG sites lost?

Lister, Ecker, and colleagues (Nature 2009) found that 99% of CpG sites on the complementary strand share the same methylation state. Therefore, methylation of the complementary strand can be inferred from single strand methylation data in most cases.

Does GenomeStudio software currently offer features for small RNA Sequencing and Tag Sequencing data analysis and visualization?

Not at this time.

Can qPCR be used to quantitate small RNA libraries?

Yes. For the most accurate measurements, the control library used to generate a standard curve for qPCR should be as similar as possible to the library being measured, which makes it necessary to use a small RNA library as a control library.

How many boxes come in a kit?

For both the LT and HT configurations, a kit contains 5 boxes.

What is a TruSeq Targeted RNA Expression manifest file?

Analyzing TruSeq Targeted RNA Expression data requires a manifest file (*.txt). The manifest file contains the probe and target sequences for your order. MiSeq Reporter uses it for on-instrument alignment.

What do the designators "A," "S," "I," and "M" in the manifest files mean?

A = All isoforms. The probe is designed to hit all splice isoforms of a gene.

I = Isoform specific. The probe is designed to hit a specific splice isoform of a gene, for which multiple isoforms are known to exist.

S = Single isoform. The gene has only one known splice isoform and our probe hits it.

M = Multiple isoforms. This gene has multiple isoforms. The probe targets more than one and fewer than all of them.

Is there LIMS support for FFPE samples on MethylationEPIC BeadChips?

LIMS support is not currently available for FFPE samples on the Infinium MethylationEPIC BeadChip.

Can I expect comparable library preparation performance between the TruSeq DNA PCR-Free LT and HT kits?

Yes, both the LT and HT kits will give comparable results.

How many samples can be generated per TruSeq HT library prep kit?

TruSeq HT library prep kits support 96 samples. When fewer than the full set of 96 libraries are pooled and sequenced, it is important that libraries with compatible index combinations are used in the index pool. See the user guide for your TruSeq HT library prep kit for more information.

Where is BaseSpace Variant Interpreter hosted? Where are data stored?

BaseSpace Variant Interpreter is hosted on Amazon Web Services (AWS). Storage is account-specific and is also hosted on AWS.

BaseSpace Variant Interpreter is aligned to the core requirements of HIPAA.

Does BaseSpace Knowledge Network provide clinical diagnostic variant interpretations?

No, BaseSpace Knowledge Network provides Research Use Only content that is focused on expediting interpretation of variants.

What quality control measures are in place before, during, and after genotyping services?

Throughout the process, Illumina has quality control procedures in place to provide you with the highest quality data. These procedures include:

  • Quality control and quantification of incoming DNA
  • Multiple internal controls built into each genotyping assay
  • Barcoded labeling of sample plates
  • Sample tracking under active database control to provide error-free handling of samples, assays, and data (PosiTrack)
  • Statistical measures of success for assay development and genotyping confidence scores (GenCall)

How do I check the quality of my library?

Use an Agilent Technologies 2100 Bioanalyzer to check the quality and intended size distribution of the Covaris sheared sample and the final library. For examples of Bioanalyzer traces and library size distributions, see the reference guide.

Which genes are targeted in this panel?

TruSight RNA Fusion targets 507 genes involved in fusions in cancer. Download the gene list from TruSight RNA Fusion Product Files.

How long are the oligos?

The oligos are 80 mers.

What happens if some of my samples do not genotype well?

The Illumina FastTrack Microarray Services lab repeats all samples that have adequate concentrations but low genotyping quality. This step makes sure that each sample is given at least two genotyping attempts to optimize data quality. There is no charge for second attempts, but you are charged for all samples attempted.

What is the turnaround time for TruSight HLA?

The turnaround for TruSight HLA from DNA to reporting is less than 4 days.

  • Day 1 starts late with long-range PCR taking about 1 hour to set up and then overnight extension.
  • Day 2 is 8 hours, including 6 hours of hands-on time. The MiSeq System is loaded at the end of day 2 and takes 24–39 hours to complete the run, depending on the reagent kit used.
  • Data analysis and reporting on day 4 requires about 2 hours per 24-sample sequencing run.

How do I find the smallest aberration in my samples?

In the Found Regions table, click the Clear filters button to allow aberrations as small as 1 kb to appear in your data. Then sort the data by clicking the Size column header.

Does BaseSpace Variant Interpreter require an internet connection?

Yes, the software requires an internet connection. All software operations are conducted in the cloud.

How does the Illumina FastTrack Sequencing Services team ensure the quality of each genome?

Customer samples go through multiple quality control steps throughout the Illumina FastTrack Sequencing Services process. The sequencing team compares quality results to internal controls for accuracy verification and performs quality checks at each of these steps:

  • DNA quantification
  • Post-library prep
  • On-instrument sequencing
  • Post-genome build

Can I expect comparable library preparation performance between the TruSeq Nano DNA LT and HT kits?

Yes, both the LT and HT kits will give comparable results; however, the LT kit is recommended for preparing fewer than 24 samples at a time.

Can all 96 indexes support use of demultiplexing with mismatch=1?

Yes. The nucleotide distance of all the indexes is such that mismatch of 1 still makes a unique index.
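
To see what "nucleotide distance" means here, the sketch below computes the minimum pairwise Hamming distance of an index set. A read with up to m mismatches can be assigned to only one index when every pair of indexes differs in at least 2m + 1 positions, so a mismatch setting of 1 needs a minimum distance of 3. The function names and the example sequences are made up for illustration; they are not Illumina index sequences.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length indexes differ."""
    if len(a) != len(b):
        raise ValueError("indexes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(indexes):
    """Smallest Hamming distance between any two indexes in the set."""
    return min(hamming(a, b) for a, b in combinations(indexes, 2))


def supports_mismatches(indexes, mismatches=1):
    """True if reads with up to `mismatches` errors still map to a unique index."""
    return min_pairwise_distance(indexes) >= 2 * mismatches + 1


# Made-up 6-base indexes, not real kit sequences.
example_indexes = ["AAAAAA", "AACCGG", "CCAATT", "GGTTAA"]
```

For the made-up set above, the minimum distance is 4, so one mismatch is tolerated but two are not.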

Which products are available for online downloads using the DMAP download utility?

Decode files for all BeadChip products are available with this software except for: HumanHap300 v.1.0 Genotyping BeadChip, Expression BeadChips in one- or two-packs, iSelect Custom Genotyping BeadChips, and Sentrix Array Matrices (SAMs).

Can secondary analysis by MiSeq Reporter be delayed after a sequencing run completes?

No, secondary analysis cannot be delayed automatically. Secondary analysis will automatically stop if you begin the run setup steps for a subsequent sequencing run on the MiSeq System.

Do customers have to call Conexio for bioinformatics support?

Illumina supports all components of the workflow – library prep, sequencing, and software. Illumina bioinformatics support is well-trained on the Conexio software. In the event that Illumina support is unable to fully resolve a software question, we have escalation paths to internal Illumina experts and to experts at Conexio. If escalation is required, Illumina Technical Support will contact you directly with a resolution.

How do I check the quality of my library?

Use an Agilent Technologies 2100 Bioanalyzer to check the quality and intended size distribution of a tagmented sample, the pre-enriched library, and the post-enriched library. For examples of bioanalyzer traces and library size distributions, see the library prep reference guide. Variation in the Bioanalyzer profiles is expected because it is dependent on the input DNA type.

How can I calculate required coverage?

For more information, see the Optimizing Coverage for Targeted Resequencing tech note.

With the release of the TruSeq Nano DNA Library Prep Kit, will the TruSeq DNA Sample Prep Kit still be available for purchase?

Illumina has discontinued TruSeq DNA Sample Prep kits. Customers are encouraged to switch to either TruSeq DNA PCR-Free or TruSeq Nano DNA.

Are the Bisulfite Conversion reagents supplied with the kit?

The bisulfite conversion reagents required for the protocol are supplied in the kit.

What types of assays can be developed with the VeraCode Beads?

The glass surface of the VeraCode beads makes them ideal for a number of bioassays, including genotyping, gene expression, methylation, and protein-based assays. Solution-based assays, in conjunction with microarrays, can also be developed.

Is the Infinium Methylation Assay a one-color or two-color assay?

This is a two-color assay.

For the Infinium HumanMethylation27 BeadChip assay, which is based on Infinium I Assay designs, the color incorporated depends upon the base preceding the CpG locus being queried. This can be either green or red.

The Infinium HumanMethylation450 BeadChip assay includes Infinium I and Infinium II study designs. In the latter case, a single base extension from the 3' end of the probe sequence (which is one base upstream of the query base) will result in either a red or green signal depending on whether the query site was unmethylated or methylated.

How does KaryoStudio determine which aberrations go into a report?

After you set how many aberrations should go into the report, KaryoStudio shows the aberrations based upon descending size. Only found regions with a checkmark in the Found Regions Table are displayed in the report.
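
The selection rule described above can be sketched as follows. This is an illustration of the logic only, not KaryoStudio code; the dictionary keys `checked` and `size_bp` are hypothetical names.

```python
def select_for_report(found_regions, max_aberrations):
    """Return checked regions, largest first, limited to the requested count."""
    checked = [region for region in found_regions if region["checked"]]
    checked.sort(key=lambda region: region["size_bp"], reverse=True)
    return checked[:max_aberrations]


# Made-up example rows from a Found Regions table.
found_regions = [
    {"name": "r1", "size_bp": 120_000, "checked": True},
    {"name": "r2", "size_bp": 2_500_000, "checked": False},
    {"name": "r3", "size_bp": 900_000, "checked": True},
    {"name": "r4", "size_bp": 45_000, "checked": True},
]

report_rows = select_for_report(found_regions, max_aberrations=2)
```

In this example the largest region (r2) is left out because it is not checked, and the report contains r3 followed by r1.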

What additional equipment is required to run a HiSeq?

The HiSeq requires a cBot cluster generation system, a network file system, and a mobile lab bench with locking casters (recommended).

Can I adjust the settings for cnvPartition?

Some settings for cnvPartition can be adjusted with a supplied configuration file. For information on adjusting the configuration file, refer to the algorithm release notes. For more information on this algorithm, see the CNV Algorithms Technical Note.

Is Globin reduction recommended?

There is no need to perform Globin reduction. Doing so could introduce additional experimental variation.

Is the miRNA product a one-color or two-color assay?

Illumina's miRNA product is a single-color assay.

Does Illumina provide classified variants in the Classification Database?

No, Illumina does not provide any classified variants. VariantStudio provides an empty Classification Database that you can populate by uploading classified variants from an external source or by manually classifying variants in samples that are being analyzed in VariantStudio.

What is the turnaround time for TruSight HLA v2?

The turnaround for TruSight HLA v2 from DNA to reporting is less than 48 hours.

  • Day 1 starts late with long-range PCR taking about 30 minutes to set up for overnight amplification.
  • Day 2 is about 5 hours with about 3.5 hours hands-on time. The instrument is loaded at the end of Day 2 and takes 17–19 hours to complete, depending on the instrument and reagent kit.
  • Data analysis and reporting on Day 3 requires about 2 hours per 24-sample sequencing run.

What is the shelf life of the kits?

One year from the date of manufacture. The kit label provides an exact expiration date. Illumina guarantees at least three months from the date of receipt.

What is information content and why is it important for a linkage study?

Information content measures how informative a marker or map of markers is in a collection of pedigrees to extract the maximum amount of inheritance information for a linkage analysis. Information content is a function of marker heterozygosity and the number of meioses in the genetic study.

For multi-point linkage analysis, information content is also a function of marker density and spacing. It is important to have high information content throughout the genome for genome-wide searches for disease susceptibility loci or other traits so that regions of no linkage can be excluded, regions of significant linkage can be detected, and the linkage interval can be accurately defined.

How is the study number determined for the samples I have uploaded?

The study number assigned to your samples is the next study number available in sequence.

Can BaseSpace Variant Interpreter be used in a clinical lab, such as a CLIA lab?

Yes, but only after validating the BaseSpace Variant Interpreter software per institution, local, state, and federal guidelines.

Can TruSeq DNA PCR-Free libraries be used for TruSeq Enrichment?

No, the yield from TruSeq DNA PCR-Free libraries is not sufficient as input into TruSeq Exome or Custom Enrichment assays.

What are the major differences between the miRNA and DASL protocols?

There are four new reagents, Polyadenylation Single (PAS), cDNA Synthesis Single (CSS), miRNA Assay Pool (MAP) and Single-Color Master Mix (SCM). There are also two new steps in the protocol: Make PAP for the polyadenylation, and Make CSP for cDNA synthesis. In addition, the Make ASE and Cycle PCR steps are modified.

How do I allow BaseSpace Variant Interpreter to access my BaseSpace Sequence Hub account?

BaseSpace Interpreter and BaseSpace Hub use the same logon credentials for authentication, so logging on to one application logs you on to the other. After logging on to BaseSpace Interpreter, your account automatically links to your BaseSpace Hub account to allow import and viewing of variant call files stored there.

What can cause my Nextera Rapid Capture Enrichment library to look different than the example shown in the Nextera Rapid Capture Enrichment Guide?

The quality and quantity of genomic DNA input into the Nextera tagmentation reaction will affect the library size distribution. A larger peak distribution (> 350 bp) can be indicative of > 50 ng genomic DNA input going into the Nextera tagmentation reaction. Conversely, a smaller sample peak distribution (< 225 bp) can be indicative of < 50 ng genomic DNA or fragmented, low quality genomic DNA.

It is critical to accurately quantify the concentration of input genomic DNA. Illumina recommends quantifying the starting genomic DNA using a fluorometric-based method specific to double-stranded DNA, such as QuantiFluor or PicoGreen. For more information, see the DNA Input Recommendations section of the Nextera Rapid Capture Enrichment Guide.

How can I visualize my results and compare them to GEX data?

miRNA data can be combined with mRNA gene expression BeadStudio projects (Whole genome or DASL) in BeadStudio Gene Expression v3.2 or later. One tool to help visualize positive and negative correlations is hierarchical clustering with absolute correlation.

How can I obtain new versions of cnvPartition?

Updated versions are available from GenomeStudio downloads.

Can I adjust what is in the Known Regions list?

Yes, this can be adjusted using Microsoft Excel. Information in any of the rows can be edited or deleted, or new rows can be added; however, no columns can be deleted. You can also create an entirely new known regions file and load it using the Filter Table interface. For more information about adjusting or creating new Known Regions lists, see the user guide.

Does the HiSeq System require an uninterruptible power supply (UPS)?

Illumina provides a region-specific UPS with the HiSeq System.

How up-to-date is the variant interpretation content in BaseSpace Knowledge Network?

Each content entry in BaseSpace Knowledge Network is stamped with an entry date parameter, indicating the creation date of the curated entry.

Can I generate a call report?

A call report can only be generated from the GenomeStudio Genotyping Module.

What is the per-genome price for whole-genome sequencing using Illumina FastTrack Sequencing Services?

Illumina FastTrack Sequencing Services offer several services. Price per genome varies depending on the service and number of genomes. For current pricing information, contact your local account manager or sales representative, or get a quote.

Are components from my other Nextera kits compatible with the Nextera Rapid Capture Custom Enrichment kit?

No. The Nextera Rapid Capture Custom Enrichment kits contain unique reagents that are not provided in other Nextera kits, including Index 2 primers with an "E" prefix. Index 2 primers from other Nextera library prep kits should not be used.

What if I have a magnet other than the magnet recommended for this library prep?

Cleanup procedures have been optimized and validated using the magnetic stand specified in the library prep reference guide. Comparable performance is not guaranteed when using other magnets.

If you choose to use a different magnet, test how long samples must sit on the magnet. Times can vary from the protocol.

How is the TruSeq ChIP Sample Prep kit different from the legacy ChIP-Seq Sample Prep kit?

The TruSeq ChIP Sample Prep kit generates Paired End (PE), indexed libraries that are compatible with all Illumina sequencing platforms and can be multiplexed. The legacy ChIP-Seq Sample Prep kit generates Single Read (SR) and non-indexed libraries that are compatible with Genome Analyzer, HiSeq and HiScanSQ, but not MiSeq.

Can RiboZero be used before sample prep and enrichment in the TruSeq RNA Exome protocol?

The use of RiboZero before sample prep and enrichment is not necessary and has not been tested by Illumina. The initial RNA input is very low for total RNA (10 ng) and FFPE RNA (20 ng).

Can I obtain a SNP list?

Yes, the SNP list is available on the Downloads page.

What is the best way to view the genes in a specific region of interest?

Gene information is preloaded into KaryoStudio and displayed in the IGV or in the Genes column of the Found Regions table. In addition, you can use the links to the UCSC Genome Browser, the Database of Genomic Variants, and DECIPHER, all of which provide additional gene information.

How can I be sure we are targeting only miRNA in the total-RNA?

Targeting only miRNA is achieved with two-step discrimination:

  1. Sequence hybridization—The specificity of the miRNA-specific probe, which targets the pre and mature miRNA species.
  2. Enzymatic primer extension—Enhanced discrimination between members of miRNA families and between miRNA and other similar sequences in the total-RNA (eg, mRNA targets).

Illumina has obtained very similar expression profiles with total-RNA and enriched small RNA species, suggesting that cross-hybridization (if any) from the total-RNA is minimal.

What do the failure codes indicate in my ADT results file?

Critical Failures (undesignable):

  • 101 = Flanking sequence is too short.
  • 102 = Polymorphism or sequence formatting error. Possible causes:
    • Check polymorphism format: SNP => [X/Y], INDEL => [-/XYZ], CpG => [CG]
    • More than one set of brackets in sequence
    • Missing brackets around polymorphism
    • SNP alleles not separated by a "/"
    • Spaces found in submitted sequence
  • 103 = Top/Bot strand cannot be determined.
    • Low sequence complexity
  • 104 = Polymorphism is not appropriate for Illumina platform. Possible causes:
    • Tri- or quad-allelic SNP
    • Contains characters other than A, G, C, or T
  • 105 = Polymorphism is located on the mitochondrial genome. Mitochondrial polymorphisms are not recommended for GoldenGate oligo pools.
  • 106 = Degenerate nucleotides in assay design region.
    • W, R, S, N, etc
  • 107 = SNP sequence not found.
  • 108 = Final score falls below assay limit.

Warnings:

  • 301 = Polymorphism in duplicated/repetitive region.
  • 302 = Tm outside assay limits.
  • 304 = There are known SNPs within the probe region. See Underlying_SNP column for details.
  • 340 = Another polymorphism in this list is equal to or less than 60 nucleotides away.
  • 360 = Low score warning.
  • 399 = Multiple contributing issues.
  • 601 = Potentially non-specific against the genome.
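
Before submitting sequences, you can screen for the formatting problems listed under code 102 with a short script. The following is a minimal sketch under stated assumptions: it covers only the bracket, "/" separator, and whitespace checks, the function name is hypothetical, and it does not reproduce ADT's own validation.

```python
import re


def check_submission(sequence):
    """Return a list of code-102-style formatting problems (empty if none found)."""
    problems = []
    if any(ch.isspace() for ch in sequence):
        problems.append("spaces found in submitted sequence")

    n_open, n_close = sequence.count("["), sequence.count("]")
    if n_open == 0 and n_close == 0:
        problems.append("missing brackets around polymorphism")
    elif n_open > 1 or n_close > 1:
        problems.append("more than one set of brackets in sequence")
    elif n_open != n_close:
        problems.append("missing brackets around polymorphism")
    else:
        match = re.search(r"\[([^\[\]]*)\]", sequence)
        if match is None:
            # Brackets present but in the wrong order, for example "]A/G[".
            problems.append("missing brackets around polymorphism")
        elif match.group(1) != "CG" and "/" not in match.group(1):
            # SNP and indel alleles need a "/" separator; a CpG is written as CG.
            problems.append('alleles not separated by "/"')
    return problems
```

An empty list means only that none of these formatting checks failed; the other failure codes and warnings depend on information that a simple text check cannot evaluate.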

How did Illumina determine the chromosomal location for the potential/putative miRNAs?

To determine the chromosomal location for the potential/putative miRNAs, Illumina uses the following procedure:

  1. Select 100% hits of genome BLAST for each mature miRNA sequence. If there is no 100% match, a value of zero is shown in the bgx file.
  2. For each hit, select the upstream and downstream pre-miRNA sequence (i.e., 5′-nnn-miRNA-nnnnnnnnnn-3′ and 5′-nnnnnnnnnn-miRNA-nnn-3′).
  3. Generate mfold structures for both upstream and downstream pre-miRNA sequences.
  4. Analyze each structure where the pre-miRNA is folded into the stem-loop structure.
  5. Filter out structures that are below the score threshold. The score threshold is estimated from the Sanger training set.
  6. Select the structure with the best score for each blast hit site.

When multiple 100% hits are found in the genome for one particular mature miRNA sequence, the chromosomal coordinates are listed by the score, starting with the best score.
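
Steps 5 and 6 and the final ordering can be sketched as below. This illustrates the selection logic only: the BLAST and mfold steps are not reproduced, the field names are hypothetical, and the sketch assumes that a higher score is better, which the procedure above does not state.

```python
def rank_hit_sites(candidates, score_threshold):
    """Keep the best-scoring structure per hit site, then order sites by score.

    Each candidate is a dict with hypothetical keys:
    "hit_site" (genomic location), "flank" ("upstream" or "downstream"), "score".
    Assumes a higher score indicates a better stem-loop structure.
    """
    best_per_site = {}
    for candidate in candidates:
        if candidate["score"] < score_threshold:
            continue  # step 5: drop structures below the threshold
        current = best_per_site.get(candidate["hit_site"])
        if current is None or candidate["score"] > current["score"]:
            best_per_site[candidate["hit_site"]] = candidate  # step 6
    # Multiple hits are listed starting with the best score.
    return sorted(best_per_site.values(), key=lambda c: c["score"], reverse=True)


# Made-up candidates for one mature miRNA sequence with two 100% hits.
candidates = [
    {"hit_site": "chr1:1000", "flank": "upstream", "score": 0.62},
    {"hit_site": "chr1:1000", "flank": "downstream", "score": 0.81},
    {"hit_site": "chr9:5000", "flank": "upstream", "score": 0.93},
    {"hit_site": "chr9:5000", "flank": "downstream", "score": 0.40},
]

ranked = rank_hit_sites(candidates, score_threshold=0.5)
```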

What quantitation methods does Illumina recommend for starting material?

Dyes that are specific to double-stranded DNA, such as Qubit or PicoGreen.

What is the best way to ensure that my samples have been processed and analyzed without issues?

KaryoStudio provides several metrics for you to examine when checking data quality. The LogRDev metric provides a measure of noise in the intensity data, and is essentially the standard deviation of Log R values across the autosomes. The "percent aberration" metric is the sum of the sizes of all found regions in a sample divided by the entire length of the genome. In a blood sample, where you expect few or no aberrations, the percent aberration is very small (<1%). A higher number may indicate a sample processing issue. Both metrics can be affected by real biological variation in samples, so examine them holistically, taking into account the data viewed in the IGV.
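
The two metrics can be approximated outside the software as shown below. This is a rough sketch, not KaryoStudio's implementation: the function names are hypothetical, and using the population standard deviation (rather than the sample standard deviation) is an assumption.

```python
from statistics import pstdev


def log_r_dev(autosomal_log_r):
    """Standard deviation of Log R values across autosomal probes (assumed population SD)."""
    return pstdev(autosomal_log_r)


def percent_aberration(found_regions, genome_length_bp):
    """Total length of found regions as a percentage of genome length.

    `found_regions` is a list of (start, end) coordinate pairs in base pairs.
    """
    total_bp = sum(end - start for start, end in found_regions)
    return 100.0 * total_bp / genome_length_bp


# Made-up values: two 1 Mb regions in a 100 Mb "genome" give 2% aberration.
example_percent = percent_aberration(
    [(0, 1_000_000), (5_000_000, 6_000_000)], 100_000_000
)
example_noise = log_r_dev([0.1, -0.1, 0.1, -0.1])
```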

For specific troubleshooting issues and access to the controls dashboard, load your data into GenomeStudio, if you have access to that software. Otherwise, contact technical support with any additional questions on troubleshooting your data.

Can pre-TruSeq libraries be run on TruSeq cluster kits?

Yes, TruSeq cluster kits are backwards-compatible.

Do I need to purchase AMPure beads if using the TruSeq DNA PCR-Free Library Prep kit?

No, TruSeq DNA PCR-Free library prep kits contain Sample Purification Beads (SPB) that are used for size-selection and clean-up steps. They do not require the separate purchase of AMPure beads.

What is the maximum recommended genome size for de novo assembly?

The maximum recommended genome size for de novo assembly on the MiSeq is 20 Mb.

Does VariantStudio require an internet connection?

Illumina VariantStudio v2.1 requires an internet connection for annotating variants because the annotation database, known as the Illumina Annotation Service, resides in Amazon Web Services. After variants are annotated and saved in a project, an internet connection is not necessary.

How long does it take to scan an Infinium MethylationEPIC array on the iScan System?

The scan takes ~18-37 minutes per BeadChip.

What is the difference between the open source version of the Isaac algorithms and the HiSeq Analysis Software? Which one should I use?

The open source version of Isaac includes the component algorithms for the Isaac aligner and variant caller. It is intended for developers, and is not commercially supported. Instead, it is provided as is under Illumina Open Source Software License.

For most Illumina customers, we recommend using Isaac as part of the HiSeq Analysis Software (HAS) package. HAS provides rapid and easy alignment and variant calling for Whole Human Genomes (using the Isaac component algorithms) or Nextera Rapid Capture Exome Enrichment libraries (using the BWA/GATK component algorithms). The software is available through a command line interface or a graphical user interface package called Analysis Visual Controller Software (AVC).

HAS is freely available, easy to install (rpm), and commercially supported. For more information, see HiSeq Analysis Software.

What RNA amplification kit does Illumina recommend?

Two kits have been tested for use with Illumina gene expression products:

  • Ambion Total Prep Kit (catalog # IL1791)
  • Epicenter TargetAmp-Nano Labeling Kit for Illumina Expression BeadChip (catalog # TAN07924)

The Ovation Amplification Kits from NuGen have also demonstrated acceptable results with the use of a modified protocol. (Illumina does not provide technical support for these kits.)

What is the lowest detectable fold change?

During product development experiments, the lowest detectable statistically significant fold change was 1.2 fold.

What are the requirements to run Infinium MethylationEPIC arrays?

Illumina Automation Control v5.3.0, Illumina LIMS v4.8.1, Tecan Tip Guide-E, Standard Teflow glass back plates and spacers, iScan Control Software v3.3.29, GenomeStudio 2011.1, BeadArray Controls Reporter.

Is the controls dashboard different for the Infinium HumanMethylation27 BeadChip versus the HumanMethylation450 BeadChip?

Yes, there are several differences in the control dashboard due to the inclusion of Infinium II Assay designs for the Infinium HumanMethylation450 BeadChip. See the Infinium HumanMethylation450 BeadChip User Guide for more details.

What is the turnaround time for Illumina FastTrack Sequencing Services projects?

Turnaround time depends on the selected product and the requested number of samples. Turnaround time also depends on when Illumina receives samples. See the Sequencing Service Process page for estimated turnaround time. Contact your local account manager or sales representative or submit questions through the Illumina website.

Is the TruSeq Nano DNA Library Prep Kit compatible with TruSeq Exome Enrichment or Custom Enrichment?

The TruSeq Nano DNA kit is not compatible with the TruSeq Exome Enrichment or Custom Enrichment kits. The Nextera Rapid Capture product supports a variety of enrichment applications. For more information, see Nextera Rapid Capture.

How much DNA is used for the different library prep assays?

Refer to the DNA Input Recommendations section of the respective library prep reference guides.

I clicked the 'Download Checked Samples' option, and it completed the download. Why can't I find the genotyping results?

The Download Checked Samples button only generates the genotype data. You must also click the Generate Report button to generate the report and save it to your computer.

The Ambion TotalPrep kit is biased for eukaryotic genomes, whereas the microbial mRNAs lack poly-A tails. Does Illumina support any sample prep/labeling kit for microbial genomes?

Illumina has not tested kits designed to label microbial RNA. Reagent vendors such as Ambion do sell kits for this application. These kits may work, but there is no particular kit that we recommend. It is important that a single source of biotin-16-UTP (i.e., same vendor) is used for all labeling reactions (e.g., Ambion #8452 or #8453).

What if my SNP is not in your database?

You can submit the SNP in a SequenceList.

Can I quantify my cRNA by A260 instead of RiboGreen?

Illumina has found quantification by RiboGreen to be more accurate than spectrophotometer readings. Contaminants from purification columns can give artificially high readings. If you must use A260, sample measurements should be > 0.5 to minimize the impact of contaminants on amplification.

Can I import sample information (metadata) instead of entering it manually?

Yes. Metadata can be added to a sample at any point, with options to add it when importing the VCF files (recommended). After importing VCF files, sample metadata can still be added or edited using the software interface or a sample metadata sheet. However it is added, metadata autopopulates the appropriate section of the report.

Because only a single strand is captured, are there a large number of incorrect methylation calls due to high frequency SNPs?

Because bisulfite sequencing converts unmethylated cytosines to uracil, which eventually become thymines in the final sequence, C>T SNPs could be mistaken for "unmethylated" cytosines. For regions containing CpGs with common SNPs (MAF >5% in EUR superpopulation 1000 genomes data), probes are designed to capture both strands. As a result, high frequency SNPs at CpGs that would be obscured if only a single strand was sequenced can now be called by looking at the second strand in these regions. If low frequency SNP calls (<5% MAF) at unmethylated CpG sites are desired, other suitable technologies should be used (eg, sequencing an unconverted sample).
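
The ambiguity described above can be demonstrated with a toy model of bisulfite conversion. This sketch assumes complete conversion and an uppercase, single-strand sequence; the function name, the sequences, and the positions are made up for illustration.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Read out a strand after conversion: unmethylated C appears as T, methylated C stays C."""
    methylated = set(methylated_positions)
    return "".join(
        "T" if base == "C" and index not in methylated else base
        for index, base in enumerate(sequence)
    )


reference = "ACGTCGA"   # CpG cytosines at positions 1 and 4 (0-based)
snp_allele = "ATGTCGA"  # same strand carrying a C>T SNP at position 1

# Reference allele with both CpG cytosines unmethylated.
unmethylated_read = bisulfite_convert(reference, set())
# SNP allele, with the remaining CpG cytosine (position 4) methylated.
snp_read = bisulfite_convert(snp_allele, {4})
```

Position 1 reads as T in both cases, so this strand alone cannot distinguish an unmethylated cytosine from a C>T SNP. That is why the second strand is captured for regions with common SNPs.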

What is the expected size of a MiSeq analysis folder?

The size of the analysis folder output with each sequencing run depends on the number of cycles in your run. Typically, a 150-cycle paired-end run (2 x 151 cycles) generates ~3 GB in output.

When is RTA v2.7.7 available to support lower PhiX spike-ins on the HiSeq 4000 and HiSeq 3000?

Real-Time Analysis (RTA) v2.7.7 is available to TruSeq Methyl Capture EPIC customers in Q4 of 2016. It will then be released to all Illumina customers by 2017.

What are the sizes of the TruSeq QC probes?

There are three artificial dsDNA targets in the TruSeq Sample Prep Kit (CTE, CTA, CTL) that measure the enzymatic activities of each of the ERP, ATL, and LIG reagents. Each type of control probe ranges in size from 150 bp to 850 bp, in 100 bp increments. Illumina recommends selecting a library in the 400-500 bp range for cluster generation. However, for TruSeq Enrichment, Illumina recommends selecting a library in the 300-400 bp range.

How should I send DNA samples to Illumina for genotyping?

We provide barcoded 96-well plates for you to ship samples to us, along with a step-by-step preparation protocol.

There are samples from the HapMap Project in the database. What HapMap sample IDs do these Illumina barcodes correspond to?

A list of the Illumina barcodes and their corresponding HapMap IDs is available upon request from techsupport@illumina.com.

Can TruSeq DNA and RNA libraries be run on older cluster kits?

Yes, as long as the index read primer and multiplex read 2 primer are used. TruSeq DNA and RNA libraries have the same architecture and sequencing primer attachment sites as v2 multiplexed libraries.

Are there in-line controls included in the TruSeq Small RNA Library Prep Kit?

There are no in-line controls for the TruSeq Small RNA Library Prep Kit.

Why do some antibodies work for ChIP and others do not?

ChIP assays require that the epitope recognized by the antibody be available/exposed after cross-linking and not obscured in the middle of a protein complex. Antibodies have to be of very high avidity (strength of multiple interactions) so that the interaction with the protein will survive the washing steps. It also requires that the protein you are trying to immunoprecipitate cross-links efficiently to the chromatin.

Can TruSeq DNA HT (dual-indexed) libraries be used for downstream enrichment using the TruSeq Exome Enrichment or TruSeq Custom Enrichment kits? Are there any differences if the HT dual-indexed libraries are used compared to the TruSeq DNA v2/LT kits?

Yes, TruSeq DNA HT libraries can be used for downstream enrichment using either the TruSeq Exome Enrichment or TruSeq Custom Enrichment kits. There are slight differences in the design of the v2/LT and HT adapters; therefore, use of the HT adapters may result in slightly lower percent enrichment compared to the v2/LT adapters.

What is the difference between the TruSeq Nano DNA LT and HT kits?

The TruSeq Nano DNA LT Library Prep Kit comes with single-index adapter tubes recommended for preparing 24 or fewer samples at a time. The LT kits come in two sets, A and B. Each set contains 12 unique single-index adapter tubes for a total of 24 unique single-index adapters when both kits are combined.

The TruSeq Nano DNA HT Library Prep Kit comes in a 96-well adapter plate with 96 dual-indexed adapters and enough reagents for 96 samples.

What is the distribution of content types available in BaseSpace Knowledge Network?

The first release of BaseSpace Knowledge Network contains content curated by the Illumina Scientific Research Curation team. For more information on the content and the process that generated it, see BaseSpace Knowledge Network.

Which kits are available and how many samples can I process?

You can purchase standalone library prep reagents or a kit that includes both the library prep and sequencing reagents. The following kits are available:

  • Catalog # OP-101-1001–TruSight Tumor 15 MiSeq Kit (includes library prep and sequencing reagents) for 24 samples (48 libraries) with 3x MiSeq v3 reagents (600 cycle)
  • Catalog # 20005610–TruSight Tumor 15 MiniSeq Kit (includes library prep and sequencing reagents) for 24 samples (48 libraries) with 3x MiniSeq High Output reagents (300 Cycle)
  • Catalog # OP-101-1002–TruSight Tumor 15 (library prep only), for 24 samples (48 libraries)

You can order an additional MiSeq reagent kit v3 (600 cycle) (MS-102-3003) or MiniSeq High Output reagents (300 cycle) (FC-420-1003).

How are TruSight HLA reagents shipped?

TruSight HLA includes four boxes: two for pre-PCR (boxes 1 and 2) and two for post-PCR (boxes 3 and 4). Of each set, one box is shipped chilled and the other box is shipped frozen, as follows.

  • Boxes 1 and 3 are kept cool with gel packs.
  • Boxes 2 and 4 are frozen and shipped on dry ice.

Where can I find the Safety Data Sheets for TruSight HLA v2?

Safety Data Sheets (SDS) are at support.illumina.com/sds. Search for one of the following numbers to obtain the most current SDS:

  • 20000215 for TruSight HLA v2 Sequencing Panel (24 Samples)
  • 20005170 for TruSight HLA v2 Sequencing Panel (24 Samples Automated)

How do I quantify the final libraries?

A qPCR or fluorometric quantification assay using dsDNA binding dyes (such as Qubit or PicoGreen) can be used to quantify the final libraries.

How many components can I expect with my order?

Orders will receive five items: four reagent boxes including the MAP, and one package containing arrays.

Which systems are eligible for the HiSeq v4 upgrade?

System                       Serial Number
HiSeq 2500                   SN# D00101 or higher
HiSeq 1500                   SN# C00101 or higher
HiSeq 2000                   SN# 7001403 or higher
Field-upgraded HiSeq 2500    SN# 7001403 or higher
Field-upgraded HiSeq 1500    SN# L179 or higher

For more information, see the upgrade resources on the Documentation page for your instrument.

How are the VeraCode Universal Beads used?

Each uniquely coded VeraCode Universal Bead has a unique oligonucleotide capture sequence attached and can be used to design nucleic-acid based assays. For example, to develop a 3-plex reaction using a single-color detection assay, such as Allele Specific Primer Extension (ASPE), pool together six different tubes of unique VeraCode Universal Oligo Beads (one bead type per allele).

How often is the annotation updated?

This depends on updates to the Sanger miRNA database, http://microrna.sanger.ac.uk/sequences/. We plan to update the annotation as needed.

What can cause my library to look different than the example shown in the reference guide?

The profile of the pre-enrichment library product can look different from the example shown depending on the type and quality of input DNA. Sometimes a larger molecular weight peak is present. This peak can be variable in size. However, it has minimal to no effect on the final exome metric output and you can proceed with the protocol. This peak is most often a result of the amplification step of the protocol.

Can I run FFPE samples on the Infinium Methylation Assay?

FFPE samples are NOT recommended for the standard protocol. FFPE samples are already highly degraded, with a high level of crosslinking, so conversion does not occur effectively. However, you can run FFPE samples on the Infinium HumanMethylation450 BeadChip using the FFPE automated or FFPE manual protocol along with the Infinium FFPE DNA Restoration Solution kit.

Can I save my filter settings as a template for use with other data sets?

Yes. Any filter combination can be saved and applied to other data sets. For more information, see the BaseSpace Variant Interpreter Online Help.

How can I take data from samples that have been run in the past and see if those same found regions appear in samples that are run in the future?

You can adjust the information contained within the known regions report. Refer to the user guide, which describes how to customize the known regions table so you can track your favorite regions in the additional samples.

What is the HiSeq X Ten?

The HiSeq X Ten is a collection of 10 ultra-high throughput HiSeq X Sequencing Systems. HiSeq X Ten can sequence more than 18,000 genomes per year at a price point of less than $1000 USD per genome.

Can secondary analysis run on MiSeq while a run is in progress?

If a new sequencing run is started on the MiSeq System before secondary analysis of a previous run is complete, secondary analysis will be stopped automatically. MiSeq computing resources are dedicated to either sequencing or analysis, and the system is designed in such a way that a sequencing command overrides an analysis command. Secondary analysis can later be requeued from the MiSeq Reporter Analyses tab.

How frequently do you update your annotation database?

Illumina does not have a specific schedule for updating the annotation database. The update process largely depends on the cadence of updates from the different sources aggregated to create the annotation database.

What is the correlation between the Infinium Human Methylation EPIC BeadChip and TruSeq Methyl Capture EPIC?

A Pearson correlation value of 0.96 was obtained. For more information, see the TruSeq Methyl Capture EPIC data sheet.

Can the MiSeq Reporter software be run on a separate PC?

MiSeq Reporter can be run on a 64-bit PC with at least 8 GB RAM (16–32 GB RAM for optimal performance), Windows Vista or Windows 7, and at least 1 TB of available hard disk space. This allows secondary analysis to be performed offline.

What is the throughput of the BeadXpress Reader?

Throughput depends on the level of multiplexing and whether you are running a single- or dual-color detection scanner. Typical throughputs are:

Multiplex    Single-Color Detection    Dual-Color Detection
10           140 samples/hour          120 samples/hour
96           90 samples/hour           68 samples/hour
384          44 samples/hour           30 samples/hour

How does Illumina select SNPs and develop the SNP genotyping assays?

For custom genotyping projects, we will validate and develop the SNP genotyping assays for you. Prior to development, we will work with you to select SNPs, screening those SNPs informatically. Assay development success depends upon the source of the SNP, its frequency in the population, and the assay system. For instance, many SNPs derived from databases are sequencing errors, or exist in too low a frequency to be useful in most genotyping studies.

How does Illumina FastTrack Microarray Services ensure the data quality of each sample?

Customer samples go through multiple quality control steps throughout the Illumina FastTrack Microarray Services workflow. The microarray team compares quality results to internal controls for accuracy verification. The microarray team takes these steps to ensure quality:

  • For DNA samples, the microarray team graphs the call rates and the GenCall score. Outlying samples typically have poor quality.
  • For human samples on standard arrays, the call rate cut-off for sample quality is above 99%. Quality cut-offs can vary by sample species, product used, etc.
  • In some cases, samples have apparently poor quality for genetic reasons. Your project manager works with you to understand the study and downstream ramifications.
  • For marker quality, the project manager looks at several metrics to investigate clustering quality. Some of those metrics include call rate, reproducibility, cluster separation, Hardy-Weinberg when applicable, intensity levels, etc. The goal is to identify poorly performing samples as well as markers that have not clustered well and either 'zero' or edit them.
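The call rate check described in the list above can be sketched as follows. This is a minimal illustration, not the FastTrack pipeline: the function names are hypothetical, a no-call is represented as `None` by assumption, and the 0.99 cut-off is the human standard-array figure quoted above.

```python
def call_rate(genotypes):
    """Fraction of markers with a genotype call; None marks a no-call."""
    if not genotypes:
        raise ValueError("no genotypes supplied")
    called = sum(g is not None for g in genotypes)
    return called / len(genotypes)


def flag_low_quality(samples, cutoff=0.99):
    """Return names of samples whose call rate is not above the cut-off.

    samples -- dict mapping sample name to its list of genotype calls
    cutoff  -- assumed threshold; it can vary by species and product
    """
    return sorted(name for name, calls in samples.items()
                  if call_rate(calls) <= cutoff)


samples = {
    "sample_A": ["AA"] * 100,                # every marker called
    "sample_B": ["AB"] * 98 + [None] * 2,    # two no-calls in 100 markers
}
print(flag_low_quality(samples))
```

A flagged sample is a candidate for review rather than automatic exclusion, since some samples show apparently poor quality for genetic reasons.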

Can I import custom annotations?

Yes. You can import custom annotations to the software using the Custom Annotation function in VariantStudio. The custom annotation file must be in tab-delimited text format (TSV) and must contain a header row that specifies the column names. VariantStudio allows both custom variant annotations and custom gene annotations.
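Before importing, it can help to confirm that a file really is tab-delimited with a header row. The sketch below does only that; the column names `Chr`, `Position`, and `Note` are placeholders chosen for illustration and are not the columns VariantStudio requires, so consult the software documentation for the actual layout.

```python
import csv
import io


def read_annotation_tsv(text):
    """Parse tab-delimited text whose first row holds the column names."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    if not rows:
        raise ValueError("file is empty")
    header = rows[0]
    if any(not name.strip() for name in header):
        raise ValueError("header row contains a blank column name")
    records = []
    for line_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            raise ValueError(f"line {line_no}: expected {len(header)} fields")
        records.append(dict(zip(header, row)))
    return header, records


# Hypothetical file content; the columns are placeholders.
example = "Chr\tPosition\tNote\nchr1\t12345\treviewed\n"
header, records = read_annotation_tsv(example)
print(header, records)
```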

Can TruSeq Exome Enrichment Kits be used for methylation applications?

The TruSeq Exome Enrichment kit protocol is not currently compatible with bisulfite sequencing.

Can I perform sample-to-sample comparisons?

No, BaseSpace Variant Interpreter does not currently support direct comparison of analysis results from two different samples.

Can I transfer called SNPs from CASAVA to use in custom BeadChip design?

Called SNPs from CASAVA can be parsed to ADT via two reports: a dbSNP Report for the start and stop positions of the region flanking the SNP, and a Sequence Report which provides the actual sequence of the region flanking the SNP.

How do I check the quality of my TruSeq Exome library?

Use an Agilent Technologies 2100 Bioanalyzer to check the quality and intended size distribution of a Covaris sheared sample, the pre-enriched library, and the post-enriched library. For examples of bioanalyzer traces and library size distributions, see the TruSeq Exome Reference Guide.

How can I tell if my genes of interest are present in the TruSight Cardio Sequencing Kit?

The TruSight Cardio *.bed file lists all targets of enrichment for this assay. Filter by gene name (HGNC nomenclature) or reference the coordinates of your loci of interest in the human genome reference (hg19 build). See the product files for the gene list.
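Filtering a BED file by gene name can be sketched as below. The sketch assumes that the fourth (name) column contains the gene symbol as a separate token, for example `MYH7.exon1`; the actual naming scheme in the TruSight Cardio file may differ, and the coordinates in the example are made up.

```python
import re


def regions_for_gene(bed_lines, gene):
    """Return (chrom, start, end, name) for targets whose name field
    contains the gene symbol as a whole token."""
    wanted = gene.upper()
    hits = []
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines and optional BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # no name column to search
        chrom, start, end, name = fields[:4]
        tokens = re.split(r"[._,;:|\s]+", name.upper())
        if wanted in tokens:
            hits.append((chrom, int(start), int(end), name))
    return hits


# Made-up example lines, for illustration only.
lines = [
    "track name=example",
    "chr14\t1000\t1200\tMYH7.exon1",
    "chr20\t5000\t5300\tMYH7B.exon2",
]
print(regions_for_gene(lines, "MYH7"))
```

Matching whole tokens rather than substrings keeps a query for one gene from also returning targets of a gene whose symbol merely starts with the same letters.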

MiSeq Reporter indicates that I have a sample numbered 0. What is sample 0?

Sample 0 is not designated in the samplesheet. Reads that were not successfully assigned to a sample are written to a FASTQ file for sample number 0, and excluded from downstream analysis.
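The idea can be illustrated with a toy demultiplexer. This is not MiSeq Reporter's algorithm: it assumes exact index matching and that samples are numbered from 1 in sample sheet order, and the function name is hypothetical.

```python
def assign_sample_numbers(read_indexes, sheet_indexes):
    """Map each read's index sequence to a sample number.

    Samples are numbered from 1 in sample sheet order (assumption);
    reads whose index matches no sample are assigned to sample 0.
    """
    lookup = {index: number
              for number, index in enumerate(sheet_indexes, start=1)}
    return [lookup.get(index, 0) for index in read_indexes]


sheet = ["ATCACG", "CGATGT"]                 # indexes listed in the sample sheet
reads = ["ATCACG", "NNNNNN", "CGATGT"]       # index read of each cluster
print(assign_sample_numbers(reads, sheet))
```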

In what format does Illumina provide the final data at the end of the project?

Illumina provides compact discs containing the final data in standard comma-delimited text. For large genotyping studies, contact us to discuss custom format options.

Is TruSeq DNA PCR-Free suitable for GC-rich samples?

Yes, TruSeq DNA PCR-Free gives better coverage for regions that are difficult to sequence. See the TruSeq DNA PCR-Free Datasheet and example datasets for more detailed information.

In addition to the MiSeq system, what other equipment is required for TruSight HLA?

You need 3 thermal cyclers with heated lids per 192 libraries, a microplate shaker up to 1800 rpm, and 96-well plate magnets for the bead normalization steps. A microplate heater is optional (thermal cyclers can be used instead).

Can I add previously generated content entries to the BaseSpace Knowledge Network?

No. Currently, the only method to add new content to your private knowledge base is through the curate as you review workflow.

What is the biggest genome I can sequence on the MiSeq?

For best results, sequence small genomes (up to 20 Mb) on the MiSeq System.

Does custom Infinium support Tri/Tetra-allelic SNP assays?

Not at this time.

How was the default known region list generated?

The known region list covers 244 known regions and was generated by compiling information from multiple sources including publications, databases, and public information available on which regions are typically examined in cytogenetics labs.

What is the Decode File Client?

The Illumina Decode File Client is a Windows-based software application used to download DMAP (decode map) files. DMAP files are unique per BeadChip and required for BeadChip scanning.

How do I requeue a run for analysis in MiSeq Reporter?

MiSeq Reporter needs to have access to the repository, which is the location of the folder containing data for analysis. You can set this location in the settings window on the MiSeq Reporter main screen. When MiSeq Reporter has access to the repository, your runs appear in the Analyses tab in MiSeq Reporter. Select the Requeue checkbox next to the run you want to analyze, and then click Requeue to start analysis.

What is the difference between the v1.5 small RNA and TruSeq small RNA sample preparation kits?

  1. The TruSeq kits allow indexing of up to 48 samples per lane. The v1.5 kits do not support indexing.
  2. The TruSeq kits have been optimized to simplify the workflow and reduce adapter dimer formation.
  3. The TruSeq kits allow paired-end sequencing, useful for directional RNA sequencing.

Is there an optimal cut-off value (i.e., number of tags per peak) to use to eliminate the majority of false positives?

See the GenomeStudio ChIP Sequencing Module User Guide, available in iCom and the GenomeStudio Portal, for information.

What are the system requirements for the Decode File Download Utility?

The software will work with any PC that has internet access (port 80) and for which you have rights to install new programs. You may need help from your IT department if you do not have sufficient rights to install new software on your computer. There are no other firewall or security restrictions.

Is a uracil-tolerant PCR polymerase supplied with the kit?

Uracil-tolerant PCR polymerase is not supplied. However, purchasing Kapa HiFi HotStart Uracil + ReadyMix (2x) from Kapa Biosystems is recommended. Illumina has tested seven different commercially available uracil-tolerant polymerases, and only Kapa HiFi gave results within Illumina specifications.

How does Isaac improve the speed of analysis 4–6 times over existing methods?

The Isaac aligner aligns reads by first identifying a small but complete set of relevant candidate mapping positions. The Isaac aligner begins with a seed-based search, using 32-mers as seeds. The initial single-seed search is followed by a multi-seed search only for the reads that could not be placed unambiguously with a single seed. Speed-up is achieved by sorting the reference index by the 32-mers. Accuracy is improved by flagging all ambiguous reference positions in the index.

Following the seed-based search, selection of the best mapping among all the candidates is performed. For paired-end data sets, all mappings where only one end is aligned (orphan mappings) trigger a local search to find additional mapping candidates (shadow mappings) in the neighborhood defined by the expected minimum and maximum insert size. After optional trimming of low quality 3' ends and adaptor sequences, the possible mapping positions of each fragment are compared, taking into account paired-end information when available, possible gaps (using a banded Smith-Waterman gap aligner), and possible shadows. The selection is based on the Smith-Waterman score (using BWA, ELAND, or user-defined scores) and on the log-probability of each mapping. The main speed-up comes from a parallel implementation of the gap aligner (using the SSE2 instruction set) and a shadow aligner optimized for short inserts. Further improvements could be achieved with AVX. The gapped alignment could be delegated to a coprocessor (e.g., Xeon Phi or GPU); however, it is unclear whether the benefit of large-scale parallelization would outweigh the cost of transferring the data between host and coprocessor.

Following alignment, the fragments are sorted. The major speed-up in sorting comes from efficient binning of the selected mappings, which greatly simplifies the sort. Further analysis is performed to identify duplicates and, optionally, to realign indels.
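The single-seed-then-multi-seed idea can be illustrated with a toy example. This is a sketch under several assumptions and is not Isaac's implementation: seeds are exact-match k-mers (k = 4 here rather than 32), a Python dictionary stands in for the sorted reference index, and ambiguous reads are resolved by a simple vote among non-overlapping seeds. All function names are hypothetical.

```python
from collections import Counter, defaultdict


def build_seed_index(reference, k):
    """Map every k-mer in the reference to the positions where it occurs."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return dict(index)


def candidate_positions(read, index, k):
    """Return candidate start positions for a read.

    The first seed is tried alone; further seeds are consulted only
    when that seed does not place the read unambiguously.
    """
    first_hits = index.get(read[:k], [])
    if len(first_hits) == 1:
        return list(first_hits)
    # Multi-seed step: each seed hit votes for the read start it implies.
    votes = Counter()
    for offset in range(0, len(read) - k + 1, k):
        for pos in index.get(read[offset:offset + k], []):
            votes[pos - offset] += 1
    if not votes:
        return []
    best = max(votes.values())
    return sorted(start for start, count in votes.items() if count == best)


reference = "ACGTACGTTTGACCA"
index = build_seed_index(reference, 4)
print(candidate_positions("GACCA", index, 4))     # first seed is unique
print(candidate_positions("ACGTTTGA", index, 4))  # first seed occurs twice
```

In the second call the first seed `ACGT` occurs at two reference positions, so the second seed is needed to settle on a single start position.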

What is the overlap between SNPs on the Infinium OncoArray-500K and other Illumina arrays?

The following table lists arrays with overlapping SNPs.

HumanCore v1           246,557 SNPs
HumanCoreExome v1-1    260,852 SNPs
OmniExpress v1-1       275,691 SNPs
iCOGs                  40,569 SNPs

What types of inputs does ADT support?

Acceptable inputs include RSList, GeneList, SequenceList, and RegionList.

What are the longest indels that Isaac has been able to detect reliably?

Isaac has reliably detected 10 bp continuous indels over the length of the read.

What additional functionality does cBot 2 System offer?

The cBot 2 System offers positive sample tracking for the cluster generation step of the sequencing workflow. The instrument records the barcode ID of the reagents, flow cell, library template, and any custom or additional primers used for a run. You can configure your cBot 2 System to share those IDs with LIMS, or work in standalone mode.

My computer has only a CD drive, not a DVD drive. How should I install GenomeStudio software?

There are three options:
  • Download GenomeStudio software from iCom.
  • Install GenomeStudio software over a network that has a shared DVD drive or a copy of the GenomeStudio image.
  • Purchase a portable DVD drive with a USB port, then install GenomeStudio software from the DVD.

What if UCSC does not have a reference genome for the species I am interested in?

The genome files for nonhuman species conform to the UCSC format. The GenomeStudio 2008.1 Framework User Guide, available in iCom and the GenomeStudio Portal, describes how to create genome files for non-UCSC genomes.

Illumina is developing tools that convert NCBI formats to UCSC format. Check the GenomeStudio Portal for updates.

To which genome build are the current probe coordinates mapped?

The probe coordinates of v1 and v2 MAPS are mapped to Genome build 36.2.

CASAVA calls SNPs in areas where the coverage dips. Is this expected?

These calls may not be real SNPs, but small indels. A small indel causes a short run of SNP calls (~indel+4) with a concomitant dip in coverage. Check whether the apparent SNP can be explained by a short indel.
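One way to shortlist positions for this check is to look for tight runs of SNP calls. The sketch below is a simple heuristic, not a CASAVA feature; the function name and the thresholds `max_gap` and `min_calls` are assumptions to tune for your data, and flagged runs still need to be inspected alongside the coverage.

```python
def suspicious_snp_runs(positions, max_gap=2, min_calls=3):
    """Return (first, last) coordinates of tightly clustered SNP calls.

    A run is a series of calls in which neighbouring positions are at
    most max_gap bases apart; runs with at least min_calls calls are
    reported as possible indel artifacts. Both thresholds are assumed.
    """
    runs, current = [], []
    for pos in sorted(positions):
        if current and pos - current[-1] > max_gap:
            if len(current) >= min_calls:
                runs.append((current[0], current[-1]))
            current = []
        current.append(pos)
    if len(current) >= min_calls:
        runs.append((current[0], current[-1]))
    return runs


snp_calls = [100, 101, 103, 104, 500, 900, 901]
print(suspicious_snp_runs(snp_calls))
```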

In what format does MiSeq Reporter output aligned data?

MiSeq Reporter outputs aligned data in the BAM file format.

What is a cluster file?

The cluster file contains the mean and standard deviation of the cluster positions (R and theta) in normalized coordinates for every genotype of every SNP. The cluster file also includes cluster score information and the allele frequencies from the training set used to generate the cluster file.

KaryoStudio requires a cluster file. Illumina provides a standard cluster file for each product. Alternatively, you can generate your own cluster file.

Which total-RNA quantification method does Illumina recommend?

Any routine method is fine. Illumina recommends using RiboGreen to quantify total-RNA concentration.

What is the difference between the library prep HS and LS protocols?

The library prep reference guide contains both a Low Sample (LS) and a High Sample (HS) protocol. These protocols differ in the types of plates used and in the methods of incubation and mixing. The LS protocol uses 0.3 ml PCR plates, performs the incubation steps on a thermal cycler, and mixes by pipetting. The HS protocol uses MIDI plates, requires additional equipment such as a microheating system for all incubation steps, and mixes on a microplate shaker. The LS protocol is optimized for processing 24 samples at a time, and the HS protocol for more than 24 samples. Both the LT and HT kits can be used with either the LS or HS protocol.

What additional equipment do I need with a MiSeq System?

The MiSeq System includes all the hardware needed for cluster generation, sequencing, and data analysis. More advanced analysis requires additional computing infrastructure. Other equipment requirements vary with the application and sample prep method, and are outlined in the library prep documentation.

How am I notified when there is a new version of the Illumina Annotation Engine?

If a new version of the Illumina Annotation Engine is released at the same time as a new software version release, the updates to the annotation database are specified in the software release notes. Otherwise, you receive an email notification informing you that a new database version is available. The email outlines the changes made relative to the previous annotation database version.

How many samples can I load into KaryoStudio?

There currently is no limit on the number of samples that can be loaded into KaryoStudio.

Refer to Appendix A of the KaryoStudio Software User Guide or see the KaryoStudio Benchmark Performance Technical Note for more information about KaryoStudio software performance.

Does the Library QC workflow in MiSeq Reporter perform alignment?

Yes, alignment is performed in the Library QC workflow; however, it is performed using a faster and less sensitive setting, which provides a much faster turnaround time. Variant calling is not employed for this workflow.

How do I know that MiSeq Reporter has completed its analysis?

When MiSeq Reporter analysis is finished, a checkmark appears in the State column of the Analyses tab and the CompletedJobInfo.xml file is written to the root level of the analysis folder.
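If you script downstream steps, the presence of this file can serve as a completion check. The sketch below assumes only what is stated above, namely that CompletedJobInfo.xml is written to the root level of the analysis folder; the function name is hypothetical, and the demonstration uses a temporary directory rather than a real run.

```python
import tempfile
from pathlib import Path


def analysis_complete(analysis_folder):
    """True if CompletedJobInfo.xml exists at the root of the folder."""
    return (Path(analysis_folder) / "CompletedJobInfo.xml").is_file()


# Demonstration with a temporary folder standing in for an analysis folder.
with tempfile.TemporaryDirectory() as folder:
    before = analysis_complete(folder)
    (Path(folder) / "CompletedJobInfo.xml").write_text("<placeholder/>")
    after = analysis_complete(folder)

print(before, after)
```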

How many samples can be processed?

We offer two kits to support the following sample numbers.

Library Prep Component          Catalog #
TruSeq PCR-Free (24 samples)    20015962
TruSeq PCR-Free (96 samples)    20015963

Where can I find the Safety Data Sheets for TruSight HLA?

Safety Data Sheets are available at support.illumina.com/sds.html. Enter FC-142-1001 in the search line to access the most current SDS for TruSight HLA.

Can I use cRNA prepared according to the Affymetrix protocol on BeadChips?

Illumina has had poor results when using fragmented sample on our arrays. The Affymetrix protocol requires fragmentation during sample preparation due to the short probes used on their arrays as it reduces the secondary structure of RNA. Illumina's longer probes allow for more stringent hybridization conditions which preclude the need for RNA fragmentation. If you have labeled sample which has not been fragmented, you might obtain satisfactory results, depending on the age and quality of the sample.

How do I start a project with Illumina FastTrack Microarray Services?

If you are interested in a project with Illumina FastTrack Microarray Services, contact your sales representative or get a quote through the Illumina website. An Illumina sales representative discusses your genotyping project, helps you select the product that best fits your needs, and provides a quote. After you receive the quote, a project manager contacts you to discuss your microarray project and the sample submission process. For both test and production samples, the Illumina project manager ships you a thermal cooler that contains barcoded plates, sealing mats, and return labels. If necessary, a courier picks up your samples. There is no charge for shipping.

The installation instructions state that I need Microsoft .NET 3.5 in order to run GenomeStudio software. Where can I get .NET 3.5?

You can download .NET 3.5 from the Microsoft web site.

Is there additional information on the uracil-tolerant polymerase comparison?

See the white paper on the kit Documentation & Literature page for more information.

What level of sample plexity is supported for TruSeq Exome?

The number of samples pooled pre-enrichment depends on the kit being used. See the following for sample plexity in each kit. The input amount of each library changes as plexity changes. Refer to the TruSeq Exome Library Prep Reference Guide for detailed guidance.

Kit Name                                           Catalog #
TruSeq Exome Library Prep Kit (8 rxn x 3 plex)     FC-150-1001
TruSeq Exome Library Prep Kit (8 rxn x 6 plex)     FC-150-1002
TruSeq Exome Library Prep Kit (8 rxn x 9 plex)     FC-150-1003
TruSeq Exome Library Prep Kit (8 rxn x 12 plex)    FC-150-1004

Can I upgrade my HiSeq 2500 System or earlier model to a HiSeq 4000 System?

No. An upgrade package (catalog # SY-401-4002) is available for HiSeq 3000 Systems only.

What is the minimum shelf-life of TruSight HLA reagents?

TruSight HLA reagents are shipped with a minimum shelf life of six months.

What has changed with the NextSeq 500/550 Kit v2?

The NextSeq 500/550 Kit v2 provides an improved workflow during run setup. You no longer have to add sodium hypochlorite (NaOCl) and dual-index sequencing primers (BP13) manually before the run. All required reagents are included in the prefilled reagent cartridge v2.

Reagents provided in the NextSeq 500/550 Kit v2 in combination with NCS v1.4 generate improved data quality and a higher yield of error-free reads, and enable the use of custom sequencing primers for dual-indexed runs.

What regions are targeted?

The TruSeq Methyl Capture EPIC Sequencing Panel contains content that spans the full human methylome, including CpG islands, shores, shelves, enhancers, promoter regions, sites in open chromatin, and gene bodies.

This panel builds on content included in the Infinium Human Methylation EPIC BeadChip with additional regions of importance identified by ENCODE, FANTOM5, the Epigenomics RoadMap Consortium, and customer requests.

Why is a study number used for uploaded samples?

A study number allows you to keep track of the samples you have uploaded, and provides an easy reference to identify your control samples for any publications you are preparing. It also allows others who are referring to your publications and studies to replicate the analysis by downloading the same samples.

Which IEM workflow do I use for TruSeq Targeted RNA Expression?

Select the RNA Resequencing category, and then select the Targeted RNA application.

How long does it take for MiSeq secondary analysis to complete?

For a 2 × 250 bp run, analysis takes about 3 hours when using the latest PC RAM configuration on the MiSeq. Genome size for resequencing also affects analysis time. If analysis is taking longer than two hours, consider mapping to a more appropriate reference for your sample, or perform analysis offline by installing MiSeq Reporter on another computer.

If an alignment is performed against the whole genome, then the analysis time will be significantly longer than two hours. Also, bioinformatics analysis for metagenomics may take as long as 12 hours.

For custom Infinium content, what is the expected overall conversion to functional assays?

The conversion rate depends on many factors including our stringent QC criteria during manufacturing, the sequence nature of the chosen SNPs, and criteria used for Gentraining. To maximize chances of success, we recommend selecting validated SNPs and/or SNPs with high design scores. In general, we expect the final Design Conversion Rate to average 80%.

What are some guidelines for improving data quality?

With Infinium products, the two main parameters for copy number are the B Allele Frequency (based on genotypes) and the Log R Ratio (based on intensities). The Log R Ratio is the log (base two) of the "observed intensity" divided by the "expected intensity". The "expected" intensity is generated from the cluster file.
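As a worked illustration of the definition above, the Log R Ratio is log2(observed / expected). The sketch below shows only this arithmetic; in practice the expected intensity comes from the cluster file, and the function name and input values here are illustrative.

```python
import math


def log_r_ratio(observed_intensity, expected_intensity):
    """Log (base two) of observed intensity over expected intensity."""
    if observed_intensity <= 0 or expected_intensity <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(observed_intensity / expected_intensity)


print(log_r_ratio(1.0, 1.0))   # observed equals expected
print(log_r_ratio(2.0, 1.0))   # observed is twice the expected intensity
print(log_r_ratio(0.5, 1.0))   # observed is half the expected intensity
```

A value near zero means the observed intensity matches the expectation; positive and negative values indicate intensity above or below it.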

Because of this direct comparison, accurately measuring the sample input amount is vital. Essentially, your input amount should match the recommended value (400 ng for Infinium HD Duo; 200 ng for Infinium HD Quad and Infinium HD 12-sample products). If this is the case, your Log R Ratio signal will tend to have low noise.

When the DNA samples are inaccurately quantified, you may see an "undulation" pattern (this looks like waves) in the log R ratio. This tends to be in GC-rich regions of the genome. This wavy pattern makes it difficult to do CNV analysis as the waves themselves look like copy number changes. This tends to confuse algorithms and confound analysis. On the other hand, call rates tend to be only slightly affected by this change (but this varies).

Is the VariantStudio software validated?

Illumina VariantStudio has undergone standard software development and testing requirements applied to all Illumina software tools developed for Research Use Only.

Illumina has not submitted the Illumina VariantStudio software for regulatory approval. The software has not undergone analytical validation or clinical validation required for software tools that are regulated as a medical device. Therefore, it is important that you follow procedures for validation of the software according to your Institution, Local, State, and Federal guidelines.

Is it possible to know who submitted the individual samples?

Samples are listed as being submitted either by Illumina, or by Other. The User Agreement that is necessary to comply with the Health Insurance Portability and Accountability Act (HIPAA) does not allow the identification of the source of the data from the download tool.

Is bisulfite sequencing compatible?

The polymerases used in the kits are not appropriate for bisulfite applications.

How does BaseSpace Variant Interpreter handle tri-allelic sites?

When both alleles of a heterozygous position are different from the reference, as in a tri-allelic position, the variants are split into 2 lines and both variants are annotated.
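The splitting step can be pictured with a small sketch. It assumes the alternate alleles arrive as a comma-separated string, as in the ALT column of a VCF file; the function name and the tuple layout are illustrative and are not the software's internal representation.

```python
def split_multiallelic(chrom, pos, ref, alts):
    """Return one (chrom, pos, ref, alt) record per alternate allele.

    alts -- comma-separated alternate alleles, VCF style (assumption)
    """
    return [(chrom, pos, ref, alt) for alt in alts.split(",")]


# A heterozygous site where both alleles differ from the reference G.
for record in split_multiallelic("chr1", 1000, "G", "A,T"):
    print(record)
```

Each resulting record can then be annotated independently, which matches the two-line display described above.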

Over what range of intensities can I detect the median significant detectable fold change?

The intensity range over which a fold change of ~1.35 is significantly distinguishable is greater than 3 logs.

Is the SNP genotyping array included in the WGS service?

The sequencing team runs each of your samples in parallel on the BeadChip. You receive a file of the genotyped SNPs in VCF format. Illumina GenomeStudio software directly outputs the VCF file. The calls are obtained by applying the product standard cluster file to your data. For projects larger than 100 samples, Illumina recommends that you recluster on the project samples, check SNP performance, and re-export the genotype calls.

Illumina also provides intensity data files (IDATs) and a sample sheet. You can combine these files with the product definition files (beadpool manifest .bpm, and standard cluster file .egt, available on the Illumina support site) to recreate a GenomeStudio project from the source data. This step allows you to visualize and assess SNP and sample performance.

Can variant classification information be versioned in the Classification Database?

No, variant classification and associated notes saved in the Classification Database are not versioned. If the classification category or notes are modified for a given variant in the Classification Database, previous information is overwritten with the current modification.

What data analysis solutions are available for NextSeq 500?

The NextSeq 500 streams data to BaseSpace, where demultiplexing and conversion to industry-standard FASTQ file formats occur automatically as the final step in data transfer. BaseSpace provides a number of analysis tools, including the Core BaseSpace Apps for whole genome sequencing, enrichment, and RNA-Seq.

Illumina provides the bcl2fastq 2.0 conversion software for demultiplexing and conversion of NextSeq 500 output to standard FASTQ file formats. This conversion software enables analysis of NextSeq 500 data in any third-party NGS analysis solutions.

Can I perform family-based genetic disease analysis in BaseSpace Variant Interpreter?

BaseSpace Variant Interpreter supports family-based analysis for singletons (proband only), duo (proband plus one parent), trios (proband plus both parents), and extended pedigree (proband, both parents, and up to 5 siblings). For more information, see the BaseSpace Variant Interpreter Online Help.

What information is stored as part of the DMAP download session?

Illumina records the login ID and the date, time, and serial numbers of any downloads. No cookies or information other than the *.sdf and *.dmap files are written to your computer.

Does the HiSeq Analysis Software replace CASAVA?

No, the HiSeq Analysis Software provides analysis of libraries prepared with the Nextera Rapid Capture exome enrichment kit and human Whole Genome Sequencing using only the hg19 genome as a reference. CASAVA provides analysis of additional applications such as RNA sequencing, Exome sequencing, targeted resequencing, and Whole Genome Sequencing using a more extended set of reference genomes available on the Illumina iGenomes page.

Is VariantStudio available as a command-line tool?

No. Illumina VariantStudio is a point-and-click software application that enables variant data exploration, annotation, and filtering without requiring bioinformatics expertise.

Can custom primers be used on the HiSeq 4000 System?

Custom primers have not been tested for use on the HiSeq 4000 System.

Are AMPure XP beads supplied with TruSeq Library Prep kits?

For some library preparations, AMPure XP beads are user-supplied from Beckman-Coulter Genomics. See the appropriate TruSeq DNA or RNA library prep guide for more information.

What do I need to know to allow VariantStudio access to BaseSpace via my company firewall?

VariantStudio needs external connections to annotate. For annotation, VariantStudio needs access to the annotation URL via port 80. This URL can be found in the installation folder, in the file VariantStudio.exe.config, next to the key annotationServiceUri. This link should be an Amazon Web Services or Illumina link. In addition to the annotation connection, VariantStudio must make a one-time connection to basespace.illumina.com and icom.illumina.com, also through port 80. This connection serves to make sure that the user has access to the Illumina Annotation Service. This connection is needed again if the Forget BaseSpace Logon option is used, or if the BaseSpace account information is not saved.
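
A minimal sketch for checking these connections from a workstation is shown below. It assumes that VariantStudio.exe.config follows the usual .NET appSettings layout (an add element with key and value attributes); the sample configuration and the URL in it are placeholders, not the real annotation service address.

```python
import socket
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def annotation_host(config_xml: str) -> str:
    """Extract the host name stored under the annotationServiceUri key."""
    root = ET.fromstring(config_xml)
    for node in root.iter("add"):
        if node.get("key") == "annotationServiceUri":
            return urlparse(node.get("value")).hostname
    raise KeyError("annotationServiceUri not found")


def port_80_reachable(host: str, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:80 can be opened."""
    try:
        with socket.create_connection((host, 80), timeout=timeout):
            return True
    except OSError:
        return False


# Placeholder configuration (assumed layout, made-up URL).
SAMPLE_CONFIG = """<configuration><appSettings>
  <add key="annotationServiceUri" value="http://annotation.example.com/service" />
</appSettings></configuration>"""

if __name__ == "__main__":
    hosts = [annotation_host(SAMPLE_CONFIG),
             "basespace.illumina.com", "icom.illumina.com"]
    for host in hosts:
        status = "reachable" if port_80_reachable(host) else "blocked or unreachable"
        print(host, status)
```

To use it against a real installation, read the contents of VariantStudio.exe.config from the installation folder and pass them to annotation_host in place of SAMPLE_CONFIG.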

What is the maximum number of reads, lanes, or flow cells that can be loaded into GenomeStudio software?

As of January 2009, approximately two billion rows can be loaded, and 128 million rows can be displayed in the Sequence Table.

Where can I find information about library input for the various HiSeq chemistries?

For information on library input for the various HiSeq chemistries, see the HiSeq Systems Denature and Dilute Libraries Guide.

Why can no custom reference be used? Is the reference generation itself proprietary? How easily can the process of creating a reference be implemented?

The Novel Indexing Scheme was pre-computed for the hg19 human reference and must be downloaded separately (67 GB on the hard drive). The reference is not proprietary. The open source version provides the tools to index any reference. However, indexing requires significant computational resources and is not currently supported by Illumina.

Does Isaac utilize quality information during SNP-calling?

Base qualities are considered during mapping to calculate the alignment score and during SNP and indel calling to calculate the variant quality scores.

What is the difference between TruSight Tumor 15 and TruSight Tumor 26?

TruSight Tumor 15 targets 15 solid tumor genes, while TruSight Tumor 26 targets 26 genes. TruSight Tumor 15 is a multiplex, PCR amplicon-based assay and the TruSight Tumor 26 uses the TruSeq Custom Amplicon extension-ligation chemistry.

How does TruSight HLA v2 help with keeping pre- and post-PCR workflows physically separate?

TruSight HLA v2 kits include 3 boxes and all pre-PCR reagents are included in a physically separate box (Box 1) from the post-PCR reagents (Boxes 2 and 3).

What concentration does my RNA need to be?

Illumina specifies 40 ng/µl in the assay protocol.

How does VariantStudio handle tri-allelic sites?

When both alleles of a heterozygous position are different from the reference, as in a tri-allelic position, the variants are split into two lines and both variants are annotated.

Does Illumina offer a custom targeted resequencing product?

Yes. Illumina offers a TruSeq Custom Enrichment kit. Reference the TruSeq Custom Enrichment Data Sheet for more information.

What is the HiSeq Analysis Software?

HiSeq Analysis Software (HAS) provides rapid, easy alignment and variant calling for whole human genomes or libraries prepared with the Nextera Rapid Capture Exome Enrichment kit.

  • For human whole-genome sequencing (WGS), HAS features the Isaac analysis workflow. Isaac provides a 4–6x speed increase over existing methods.
  • For Nextera Rapid Capture analysis, the BWA alignment and GATK variant calling methods are used.

The software is command-line, or you can use a graphical user interface package called Analysis Visual Controller Software (AVC).

Does the TruSeq Stranded Total RNA with Ribo-Zero Globin kit subtract Globin from non-human RNA?

TruSeq Stranded Total RNA with Ribo-Zero Globin kit supports human, mouse, and rat. Use of this kit with other organisms has not been tested and is not supported.

To determine if your organism is compatible with the Ribo-Zero Globin kit, contact Illumina technical support with the rRNA sequence for your organism of choice. The analysis is performed in silico and does not guarantee rRNA removal.

Can TruSight One libraries be run on the MiSeq System?

Yes, the TruSight One Sequencing Panel is designed to allow trio sequencing and data analysis using the MiSeq System. The 9-sample TruSight One kit is intended for use with the MiSeq and contains three MiSeq v3 reagent cartridges. The 36-sample kit does not contain sequencing reagents and can be sequenced on any Illumina sequencer.

What is the minimum order I can place?

The minimum order is 48 samples, which is the smallest kit size.

If I have my own HGMD Professional license or if I want to use the public HGMD version, can I import HGMD annotations into VariantStudio?

You can import external sources of annotations into VariantStudio using the custom annotation feature. Check your HGMD Professional or HGMD public version license terms to make sure that such use is permitted.

Why is there an option for 'Tissue Source' in the query tool, but there is no Tissue Source for the samples?

At this time, uploading samples does not include this information. Future releases of GenomeStudio and BeadAccess will incorporate this ability, and at that time you will be able to query on this field.

What is the difference between the TruSeq Methyl Capture EPIC LT and HT kits?

TruSeq Methyl Capture EPIC LT and HT kits are kitted for different sample numbers.

  • TruSeq Methyl Capture EPIC - LT: 12 samples, 3 enrichment reactions, 4 indexes
  • TruSeq Methyl Capture EPIC - HT: 48 samples, 12 enrichment reactions, 12 indexes

What are the lab temperature requirements for HiSeq systems?

The lab should maintain a temperature of 19–25°C (22°C ± 3°C). This is the operating temperature of the instrument. During a run, do not allow the ambient temperature to vary by more than 2°C. Maintain a relative non-condensing humidity between 20–80%.
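
If the lab logs ambient readings, the limits above can be checked with a short script. This is only a sketch: the function name is hypothetical, and it interprets "vary by more than 2°C" as the difference between the highest and lowest reading during the run, which is an assumption.

```python
def lab_conditions_ok(temps_c, humidity_pct):
    """Check logged readings against the stated HiSeq lab limits.

    temps_c      -- ambient temperatures (degrees C) recorded during a run
    humidity_pct -- relative, non-condensing humidity in percent
    """
    in_range = all(19 <= t <= 25 for t in temps_c)    # 19-25 degrees C
    stable = max(temps_c) - min(temps_c) <= 2         # assumed: max-min spread
    humidity_ok = 20 <= humidity_pct <= 80            # 20-80% relative humidity
    return in_range and stable and humidity_ok


print(lab_conditions_ok([21.0, 22.0, 22.5], 45))
print(lab_conditions_ok([19.5, 23.0], 45))
```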

What is the difference between the TruSight One, Nextera Rapid Capture Exome, and Nextera Rapid Capture Expanded Exome products?

  • TruSight One is a focused sequencing panel that enriches samples for a selection of coding exons associated with human disease, capturing ~12 Mb of genomic content.
  • The Nextera Rapid Capture Exome kit targets > 98% of human coding exon content and ~37 Mb of genomic content.
  • The Nextera Rapid Capture Expanded Exome panel includes exome content and additional UTR, promoter, and miRNA targets with ~62 Mb genomic content.

Is there a maximum distance from the 3′ end of the RNA transcript to which probes hybridize?

No, there is no set maximum distance. However, strong preference is given to probes closer to the 3′ end.

How much time does a standard HiSeq 2500 run take?

See system specifications on the HiSeq 2500 Specifications page.

What power connections are required for the NextSeq system?

The NextSeq System comes with a region-specific power cord. For more information, see the NextSeq System Site Prep Guide.

What is the difference between TruSeq Exome and TruSeq Rapid Exome?

TruSeq Rapid Exome uses transposon-based fragmentation while TruSeq Exome uses an alternate enrichment protocol with mechanical shearing.

TruSeq Exome is suitable for DNA that responds poorly to Nextera tagmentation. The TruSeq Exome protocol combines the proven TruSeq Nano Library Prep and Rapid Capture Enrichment kits.

What type of support can I expect for my Illumina FastTrack Microarray Services project?

After the Illumina FastTrack Microarray Services team receives your signed contract, a project manager contacts you. The project manager is assigned to your project and guides you throughout the process to prepare and submit your samples to our lab.

The project manager also helps design your SNP panel for custom products. The project manager is responsible for tracking your samples through the lab and is available throughout your project to answer questions about processes, samples, or deliverables.

What is the sensitivity for Illumina Gene Expression BeadChips?

Sensitivity is 1 in 250,000.

Do we have any information on how TruSight HLA performs on FFPE samples?

TruSight HLA uses long-range PCR for isolation and amplification of the HLA genes. These long-range amplicons range in size from 2.6 kb to 10.3 kb. Use a DNA sample in which at least 50% of the DNA is larger than the largest amplicon of interest. For example, if you are only interested in Class I genes (A, B, and C), then at least 50% of the DNA sample has to be larger than 4.2 kb.

TruSight HLA Amplicon Sizes:

  • HLA-A 4.1 kb
  • HLA-B 2.6 kb
  • HLA-C 4.2 kb
  • HLA-DRB1 4.1 kb
  • HLA-DRB3 4.1 kb
  • HLA-DRB4 4.6 kb
  • HLA-DRB5 4.1 kb
  • HLA-DQB1 9.7 kb
  • HLA-DPB1 6.9 kb
  • HLA-DQA1 7.3 kb
  • HLA-DPA1 10.3 kb
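
The size requirement can be worked out for any subset of loci from the list above. The following is a small sketch; the function name is hypothetical and the sizes are copied from the list.

```python
# Amplicon sizes in kb, taken from the list above.
AMPLICON_KB = {
    "HLA-A": 4.1, "HLA-B": 2.6, "HLA-C": 4.2,
    "HLA-DRB1": 4.1, "HLA-DRB3": 4.1, "HLA-DRB4": 4.6, "HLA-DRB5": 4.1,
    "HLA-DQB1": 9.7, "HLA-DPB1": 6.9, "HLA-DQA1": 7.3, "HLA-DPA1": 10.3,
}


def required_fragment_kb(loci):
    """Size (kb) that at least 50% of the DNA sample must exceed."""
    return max(AMPLICON_KB[locus] for locus in loci)


print(required_fragment_kb(["HLA-A", "HLA-B", "HLA-C"]))  # Class I only
print(required_fragment_kb(AMPLICON_KB))                   # all loci
```

For Class I genes only, the threshold is the HLA-C amplicon; for the full panel, it is the HLA-DPA1 amplicon.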

What is the shelf life of the TruSeq HT sample prep kits?

One year from the date of manufacture. The kit label provides the exact expiration date.

Illumina guarantees at least three months shelf life from the date of receipt, which is the same as for the TruSeq DNA LT and TruSeq RNA v2 kits.

Why do I have to log into my BaseSpace account to annotate variants in VariantStudio?

Illumina uses BaseSpace authentication to make sure that you are authorized to use the application. After you have authenticated for the first time, VariantStudio saves your credentials so that you are not asked to log in again later. You can delete your credentials using Annotation & Classification | Annotation Options | Forget BaseSpace Logon.

Can I submit custom designs for species other than human or mouse?

Illumina does not currently have custom design support for this product.

I am currently using the TruSeq DNA Sample Prep Kit. Should I switch to the TruSeq DNA PCR-Free Library Prep Kit or the TruSeq Nano DNA Library Prep Kit?

Illumina has discontinued the TruSeq DNA Sample Prep kits. The Library Prep and Array Kit Selector can assist in choosing whether the TruSeq DNA PCR-Free Library Prep or TruSeq Nano DNA Library Prep kit best fits your needs. In summary, the kit selected depends on available sample input amounts and the quality of data.

  • The TruSeq DNA PCR-Free Library Prep Kit requires 1 µg of gDNA, while the TruSeq Nano DNA Library Prep Kit requires 100 ng of gDNA.
  • The TruSeq DNA PCR-Free kit delivers the utmost data quality by eliminating PCR-induced biases, specifically, greater coverage of hard-to-reach regions, such as fosmid difficult promoters.
  • The TruSeq Nano DNA kit generates premier data quality, superior to TruSeq DNA. However, it does have some PCR-induced bias and PCR duplicates, and cannot cover difficult promoters as well as TruSeq DNA PCR-Free.
  • TruSeq Nano DNA final libraries can be quantified using either qPCR or a fluorometric method using dsDNA binding dyes such as Qubit or Picogreen, while TruSeq DNA PCR-Free final libraries can only be quantified using qPCR.

Does VariantStudio support analysis of structural variations?

No. VariantStudio does not currently support copy number variations (CNVs) or structural variations (SVs).

How do you ensure you have enough markers for linkage mapping?

This is an optimal set of markers for linkage mapping. By simulation studies, it has been suggested that a 1 to 2 cM bi-allelic map of polymorphic markers (minor allele frequency 20–50%) will extract most of the inheritance information and that for common linkage study designs, adding more markers provides diminishing returns (Kruglyak, 1999). In a study of 188 meioses, the average information content over all chromosomes was over 97% and never dropped below 83%. This high information content throughout the genome can be attributed to both the appropriate level of marker density and high heterozygosity of SNPs used in the panel and will ensure maximal power for detecting linkage to a disease or trait and defining the linkage interval.

What is the reference database for ADT (source and version for sequence, MAF, and validation)?

ADT retrieves data based on dbSNP126 for sequence, position, and Minor Allele Frequency (MAF).

Does the TruSeq Stranded Total RNA with Ribo-Zero Plant kit work on plant leaves only, or is it also compatible with seeds and roots?

The TruSeq Stranded Total RNA with Ribo-Zero Plant kit contains oligos that will remove cytoplasmic and chloroplast rRNA from leaves, seeds, and roots.

Which VCF versions does VariantStudio support?

VariantStudio imports, annotates, and analyzes SNPs, insertions, and deletions expressed in VCF version 4.0, or later. For more information, see VCF version 4.0 on the 1000 Genomes website.
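
To check a file before import, you can read the version from the ##fileformat header line that opens a VCF file. The sketch below assumes the standard header form ##fileformat=VCFvX.Y; the function names are hypothetical.

```python
import re


def vcf_version(first_line):
    """Parse the (major, minor) version from a ##fileformat header line."""
    match = re.match(r"##fileformat=VCFv(\d+)\.(\d+)", first_line.strip())
    if match is None:
        raise ValueError("not a VCF fileformat header: %r" % first_line)
    return int(match.group(1)), int(match.group(2))


def is_supported(first_line):
    """True when the declared VCF version is 4.0 or later."""
    return vcf_version(first_line) >= (4, 0)


print(is_supported("##fileformat=VCFv4.1"))
print(is_supported("##fileformat=VCFv3.3"))
```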

Is the VariantStudio software a clinical software tool?

No. The VariantStudio software was developed as a Research Use Only tool.

How can I calculate a p-value from the DiffScore?

p = 10^(DiffScore × sgn(cond − ref) / 10)
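
As a worked sketch, the expression above can be read as a power of ten, p = 10^(DiffScore × sgn(cond − ref) / 10). Because a p-value cannot exceed 1, the exponent must be zero or negative. The helper below therefore uses the magnitude form p = 10^(−|DiffScore| / 10), which is an assumption about the sign convention rather than a documented formula; the function name is hypothetical.

```python
def p_from_diffscore(diff_score):
    """Convert a DiffScore to a p-value, assuming p = 10 ** (-|DiffScore| / 10)."""
    return 10 ** (-abs(diff_score) / 10)


for score in (13, -20, 30):
    print(score, p_from_diffscore(score))
```

Under this reading, a DiffScore of ±20 corresponds to p = 0.01 and ±30 to p = 0.001.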

Is it possible to get the data for each feature on the chip?

Yes, this is known as bead-level data. Contact your Field Application Scientist (FAS) or Technical Support Representative for assistance.

Are somatic variants being called by the Isaac Variant Caller?

Only diploid variant calls are detected.

Can I use a TruSeq DNA sample prep kit with a TruSeq DNA PCR-Free library prep protocol? Can I use a TruSeq DNA PCR-Free Library Prep Kit with a TruSeq DNA sample prep protocol?

No, kits are not interchangeable.

What type of data can I export from KaryoStudio?

You can export data displayed in the Found Regions table as either a single row or the entire table.

  • To export a single row: right-click in the Found Regions table and select Copy Row to Clipboard.
  • To export the whole table: right-click in the Found Regions table and select Copy All to Clipboard.
  • Paste the data into an Excel file or import it into a third-party application. You can also export data as a cytogenetics report.

How often is the content updated?

We update content as often as warranted by Sanger miRBase. Our current content covers Sanger v12.

How many samples and loci does the TruSight HLA v2 Sequencing Panel support?

The kit has enough reagents to process 24 samples. TruSight HLA v2 sequences and analyzes HLA-A, -B, -C, -DRB1/3/4/5, -DQB1, -DPB1, -DQA1, and -DPA1.

Which genes are targeted in this panel?

This panel targets 1385 cancer-associated genes, including 507 genes involved in fusions and > 850 genes either mutated or deregulated in cancers.

Where do I find adapter and index sequences?

The Illumina Adapter Sequences Document lists index adapter sequences and all other adapter sequences.

How many reactions can this kit support?

This kit is suitable for approximately 320 reactions when used for a TruSeq Low Input application.

Are the index sequences for the first seven cycles of index 1 the same as current TruSeq kits?

No. For Nextera or TruSeq index sequences, see Illumina Adapter Sequences.

Can I upgrade my cBot to a cBot 2?

No. The cBot 2 was specifically designed to provide positive sample tracking.

Does Illumina provide nebulizers in the TruSeq DNA Sample Prep kits?

Nebulizers are not provided, but can be purchased separately through iCom.

What applications have supported workflows on the MiSeq System?

The MiSeq supports a large portfolio of sequencing applications. See the MiSeq Applications page for more information.

How can I tell if my genes of interest are present in the TruSight One Sequencing Panel kit?

The TruSight One *.bed file details all regions of interest (ROI) and lists all targets of enrichment for this assay. Filter by gene name (HGNC nomenclature), or reference the coordinates of your loci of interest in the human genome reference (hg19 build). See the product files for the gene list.
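
Filtering by gene name can be scripted. The sketch below assumes a tab-separated BED file whose fourth (name) column contains the HGNC gene symbol as one of its delimiter-separated tokens; check the actual product file, because its naming scheme may differ. The gene names and coordinates in the example are made up.

```python
import re


def regions_for_gene(bed_lines, gene):
    """Return (chrom, start, end) for BED rows whose name column mentions gene."""
    hits = []
    for line in bed_lines:
        # Skip blank lines and BED header or comment lines.
        if not line.strip() or line.startswith(("track", "browser", "#")):
            continue
        fields = line.rstrip("\n").split("\t")
        name = fields[3] if len(fields) > 3 else ""
        # Assumed: the gene symbol is one token of the name column.
        tokens = re.split(r"[._:|;,\s]+", name.upper())
        if gene.upper() in tokens:
            hits.append((fields[0], int(fields[1]), int(fields[2])))
    return hits


# Made-up rows for illustration only; not actual panel content.
EXAMPLE = [
    "chr17\t100\t200\tGENE1.exon1",
    "chr17\t300\t400\tGENE1.exon2",
    "chr13\t500\t650\tGENE2.exon1",
]
print(regions_for_gene(EXAMPLE, "GENE1"))
```

An empty result suggests that the gene is not targeted, or that the name column uses a different convention than the one assumed here.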

What type of customer support can I expect for my Illumina FastTrack Sequencing Services project?

After the Illumina FastTrack Sequencing Services team receives your signed contract, a project manager contacts you. The project manager is assigned to your project and guides you throughout the process to prepare and submit samples to the Illumina lab.

The project manager is responsible for tracking your samples through the lab and is available throughout your project to answer questions about processes, samples, or deliverables.

We also have a team of scientific experts with the main priority of making sure that every genome shipped is of the highest quality. These experts are available to discuss data integrity and provide technical expertise.

What do Positive Phenotype and Negative Phenotype mean?

These are supplied by the submitter of the data. Samples may have been used as part of the case group in an experimental design (eg, a sample from a diabetes patient). In this instance, diabetes may be reflected in the Positive Phenotype for this sample. These samples are still valuable to other studies that are not looking at this Positive Phenotype, and it may be valuable to obtain accurate population representation in control sets. Similarly, Negative Phenotype is supplied by the submitter and is intended to indicate which phenotypes the sample has been screened for and are negative. Note that some submitters have chosen also to use the Phenotype field to note the sources of the samples (eg, blood, lymphocytes). In addition, the entry HapMap indicates HapMap samples from Illumina.

How am I notified when a new version of BaseSpace Variant Interpreter is released?

All registered users receive an email notification when a new software version is available. Details of changes between versions are provided in the release notes.