CASAVA 1.8.2 Sample Sheet Generation

Compatibility and Package Version: Support for CASAVA v1.8.2 sample sheet generation is available in multiple Illumina sequencer integration packages. Refer to the release notes for your installed package for information about the features available.

Overview

This article discusses the Illumina sequencer integration packages that generate a sample sheet for use with CASAVA v1.8.2 analysis software.

Sample sheet generation is configured on the step immediately before the sequencing run. This is the step in which samples are placed on the flow cells or reagent cartridges that will be loaded into the instrument.

The sample sheet is generated by a script, which the user initiates by clicking a button on the Record Details screen of the step. The script generates a CASAVA 1.8.2 format sample sheet file for the container loaded during the step and names the file <container name>.csv.

Note: Bcl conversion with CASAVA requires input sample sheets to have the default name SampleSheet.csv.

While the sequencing run is in progress on the instrument, the sequencing service that monitors instrument activity copies the sample sheet for the run to the /Data/Intensities/BaseCalls/ subdirectory of the sequencing run data directory and names the copy SampleSheet.csv.
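As a rough illustration of where the copied file should end up, the following sketch builds the expected destination path from a run data directory. The function name and the example run directory are hypothetical; only the subdirectory and the file name come from the text above.

```python
from pathlib import PurePosixPath

# Subdirectory and file name as stated in this article.
BASECALLS_SUBDIR = PurePosixPath("Data/Intensities/BaseCalls")
SAMPLE_SHEET_NAME = "SampleSheet.csv"


def sample_sheet_destination(run_data_dir: str) -> PurePosixPath:
    """Return the path where the sample sheet is expected for a given run.

    run_data_dir is the sequencing run data directory (hypothetical input).
    """
    return PurePosixPath(run_data_dir) / BASECALLS_SUBDIR / SAMPLE_SHEET_NAME


# Example with a made-up run directory name.
print(sample_sheet_destination("/data/runs/example_run"))
```

A path like this can be useful when checking, during validation, that the file was copied for a given run.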

Script parameters and usage

The following table lists the parameters used by the script.

| Parameter | Description |
| --- | --- |
| u, username | LIMS username (Required) |
| p, password | LIMS password (Required) |
| i, processURI | LIMS process URI (Required). The short option is a lowercase i. |
| c, csvFileLimsIds | Sample sheet CSV file LIMS ID (Required; may be provided multiple times) |
| e, errorLogFileName | Log file name (Required) |
| l, useProjectLimsID | The project LIMS ID will be used instead of the project name in the SampleProject column of the sample sheet (Optional). Accepted values: true or false. Provide with quotes, e.g. -l 'true' (the short option is a lowercase L). |
| s, useSampleLimsID | The sample LIMS ID will be used instead of the sample name in the SampleID column of the sample sheet (Optional). See Enabling unique FASTQ file names. Accepted values: true or false. Provide with quotes, e.g. -s 'true'. |
| a, appendLimsID | The protocol step LIMS ID will be appended to sample names in the SampleID column of the sample sheet (Optional). Use this option to guarantee unique FASTQ file names per run. See Enabling unique FASTQ file names. Accepted values: true or false. Provide with quotes, e.g. -a 'true'. |

Usage

Below is an example automation command showing the script in use. The sample sheet generation portion of the parameter string starts at script:generate_casava_sample_sheet and ends with the -e parameter.

bash -c "/opt/gls/clarity/bin/java -jar /opt/gls/clarity/extensions/<package_directory>/<package_version>/EPP/<extensions.jar> \
  -u {username} \
  -p {password} \
  -i {processURI:v2} \
  script:generate_casava_sample_sheet \
  -c {compoundOutputFileLuid1} \
  -c {compoundOutputFileLuid2} \
  -c {compoundOutputFileLuid3} \
  -c {compoundOutputFileLuid4} \
  -c {compoundOutputFileLuid5} \
  -e {compoundOutputFileLuid6} \
  script:labelNonLabeledOutputs \
  -l 'NoIndex' \
  script:initArtifactUDFs"

Note that this example includes calls to labelNonLabeledOutputs and initArtifactUDFs, which are important for use with Bcl conversion.

See the following documentation for details:

Label Non-Labeled Outputs
Initialize Artifact UDFs
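When the number of sample sheet placeholders or the optional flags vary between configurations, it can help to assemble the sample sheet portion of the parameter string programmatically. The following is a minimal sketch, not part of the integration package. The function name is hypothetical, and it assumes that the optional -l, -s, and -a flags are placed after script:generate_casava_sample_sheet and before the next script: segment.

```python
def build_sample_sheet_args(csv_file_tokens, log_file_token,
                            use_project_lims_id=None,
                            use_sample_lims_id=None,
                            append_lims_id=None):
    """Build the generate_casava_sample_sheet portion of the parameter string.

    csv_file_tokens: one token per sample sheet placeholder (each becomes a -c).
    log_file_token:  token for the log file placeholder (-e).
    The three optional flags are omitted when left as None.
    """
    args = ["script:generate_casava_sample_sheet"]
    for token in csv_file_tokens:
        args += ["-c", token]
    args += ["-e", log_file_token]

    optional = (("-l", use_project_lims_id),
                ("-s", use_sample_lims_id),
                ("-a", append_lims_id))
    for flag, value in optional:
        if value is not None:
            # The article asks for quoted values, e.g. -s 'true'.
            args += [flag, "'true'" if value else "'false'"]
    return " ".join(args)


print(build_sample_sheet_args(
    ["{compoundOutputFileLuid1}", "{compoundOutputFileLuid2}"],
    "{compoundOutputFileLuid3}",
    use_sample_lims_id=True,
))
```

The returned string would be spliced into the full automation command between the -i parameter and the script:labelNonLabeledOutputs segment.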

Support for container types

Single-well containers and all one-dimensional container types with both numeric rows and numeric columns are supported.

Sample sheet data

The following table lists the fields that display in the sample sheet. All columns are always present.

Note that if upstream pooling has been performed, BaseSpace Clarity LIMS will populate the sample sheet with the first upstream pooled inputs found – not with the current samples in the step.

For information about ordering of sample sheet data, illegal characters, and other rules and constraints, see Rules and constraints.

| Column Header | Description |
| --- | --- |
| FCID | The name of the destination container (flow cell or reagent cartridge) for the current step |
| Lane | The flow cell or reagent cartridge lane number corresponding to the placement of the input sample |
| SampleID | Either the sample name or the LIMS ID of the input's submitted sample, depending on the -s (useSampleLimsID) option. The additional -a (appendLimsID) option appends the protocol step LIMS ID to the end of this value, e.g. "Sample1-1234" (see Script parameters and usage). |
| SampleRef | The value of the Reference Genome UDF on the input's submitted sample |
| Index | The Sequence attribute of the sample's reagent label. Note: BaseSpace Clarity LIMS does not check for duplicate indexes when generating the sample sheet. See Rules and constraints. |
| Description | The LIMS ID of the sample |
| Control | Indicates whether the sample is a control ('Y' or 'N'). In BaseSpace Clarity LIMS, a sample is considered to be a control if either of the following is true: the sample was added to the LIMS as a control sample, or the sample was added to the LIMS as a regular sample with the name PhiX (case-sensitive). |
| Recipe | This column is always blank |
| Operator | The name of the technician who initiated the step in BaseSpace Clarity LIMS |
| SampleProject | The name of the project to which the input sample belongs. If the -l parameter is set to true, the project LIMS ID is used instead of the project name (see Script parameters and usage). |
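To make the column definitions concrete, the following sketch builds one sample sheet row and renders it as CSV. It is an illustration only, not the script's actual implementation. The dictionary keys, the function names, and the example values are hypothetical, and the sketch assumes that the file starts with a header row containing the ten column names.

```python
import csv
import io

# The ten column headers listed in the table above.
COLUMNS = ["FCID", "Lane", "SampleID", "SampleRef", "Index",
           "Description", "Control", "Recipe", "Operator", "SampleProject"]


def is_control(sample_name, added_as_control):
    """Control rule from the table: a control sample, or a sample named PhiX."""
    return added_as_control or sample_name == "PhiX"  # case-sensitive match


def build_row(sample, container_name, lane, operator):
    """Map a sample (hypothetical dict layout) to one sample sheet row."""
    control = is_control(sample["name"], sample.get("added_as_control", False))
    return {
        "FCID": container_name,
        "Lane": lane,
        "SampleID": sample["name"],
        "SampleRef": sample.get("reference_genome", ""),
        "Index": sample.get("index_sequence", ""),
        "Description": sample["lims_id"],
        "Control": "Y" if control else "N",
        "Recipe": "",  # always blank
        "Operator": operator,
        "SampleProject": sample["project_name"],
    }


def render(rows):
    """Render rows as comma-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


example = {"name": "PhiX", "lims_id": "ABC1A1", "project_name": "Proj",
           "index_sequence": "ATCACG", "reference_genome": "phix"}
print(render([build_row(example, "FC1", 1, "Jane")]))
```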

File format and contents

This section outlines the format and contents of the generated CASAVA sample sheet and associated log file.

When validating the installation of your integration, refer to this information to ensure that the sample sheet and log files are correctly generated.

CASAVA sample sheet

The file is a comma-separated file containing ten columns.
The file is populated with data from the samples in the step. If pooled, each sample in the pool is represented as a separate, demultiplexed entry.
The entries are sorted by Lane and then by Description.
Sample placement appears in the file as a numeric lane value (e.g. well A:1 is represented as 1).
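When validating an installation, the points above can be checked with a short script. The following is a sketch under a few assumptions: the file has a header row naming the columns, lanes are plain integers, and the Description ordering is a simple string comparison. The function name is hypothetical.

```python
import csv
import io

EXPECTED_COLUMNS = 10


def check_sample_sheet(text):
    """Return a list of problems found in sample sheet text (empty if none)."""
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    if not rows:
        return ["file is empty"]

    header, data = rows[0], rows[1:]
    problems = []
    if len(header) != EXPECTED_COLUMNS:
        problems.append(f"header has {len(header)} columns, expected {EXPECTED_COLUMNS}")
    if "Lane" not in header or "Description" not in header:
        problems.append("header is missing Lane or Description")
        return problems

    lane_i, desc_i = header.index("Lane"), header.index("Description")
    keys = []
    for n, row in enumerate(data, start=1):
        if len(row) != EXPECTED_COLUMNS:
            problems.append(f"data row {n} has {len(row)} columns")
            continue
        if not row[lane_i].isdigit():
            problems.append(f"data row {n} has non-numeric lane {row[lane_i]!r}")
            continue
        keys.append((int(row[lane_i]), row[desc_i]))

    # Rows should already be ordered by Lane, then by Description.
    if keys != sorted(keys):
        problems.append("rows are not sorted by Lane, then Description")
    return problems
```

An empty list means that none of these checks failed. It does not confirm that the field values themselves are correct.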

CASAVA sample sheet log file

The file is in HTML format.
The file contains logging information and a success message.

Configuration options

Support for multiple containers

To enable sample sheet generation for multiple containers, you must modify the process type on which the sample sheet generation automation is configured, creating a placeholder for each file.

The process must produce one shared result file per sample sheet that will be created.
The EPP command line must specify the -c parameter with the LIMS ID of the shared result file that will store each sample sheet in BaseSpace Clarity LIMS. The sample sheets will be attached to these result files and named as per the corresponding container.

Enabling unique FASTQ file names

To enable unique FASTQ file names per sequencing run, the EPP command on the process type must be configured to use the following parameter options:

-useSampleLimsID – ensures unique entries in the SampleID column by using the sample LIMS ID instead of its name
-appendLimsID – ensures unique names per run by appending the LIMS ID of the current step

For more information, see Script parameters and usage.
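The effect of the two options on the SampleID value can be sketched as follows. This is an illustration of the behavior described in this article, not the script's code. The function name is hypothetical, the hyphen separator is taken from the "Sample1-1234" example, and the step LIMS ID is treated as an opaque string.

```python
def sample_id(sample_name, sample_lims_id, step_lims_id,
              use_sample_lims_id=False, append_lims_id=False):
    """Compute the SampleID value for one sample sheet entry."""
    # useSampleLimsID: start from the sample LIMS ID instead of the name.
    base = sample_lims_id if use_sample_lims_id else sample_name
    # appendLimsID: add the LIMS ID of the current step.
    return f"{base}-{step_lims_id}" if append_lims_id else base


# The same sample sequenced in two different steps gets two different IDs.
print(sample_id("Sample1", "ABC1A1", "1234", append_lims_id=True))
print(sample_id("Sample1", "ABC1A1", "5678", append_lims_id=True))
```

Because the step LIMS ID differs from one step to the next, the appended value keeps FASTQ file names from colliding when a sample is sequenced more than once.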

Rules and constraints

The step on which the sample sheet generation script runs must be the step in which samples are placed on the flow cell(s) or reagent cartridge(s).
The naming pattern for the placeholder(s) to which the sample sheet(s) will be attached must not be changed from SampleSheet.csv. This is used to locate the sample sheet when copying it to the BaseCalls directory for use in Bcl conversion.
The contents of the sample sheet are ordered by Lane and then each Lane is ordered by Description (which contains the sample LIMS ID).
Project and sample names in the sample sheet cannot contain illegal characters. Characters not allowed are the space character and the following: ? ( ) [ ] / \ = + < > : ; " ' , * ^ | and
Illegal characters will be replaced with an underscore ("_").
The destination container type (flow cell or reagent cartridge) must be either single well or a one-dimensional container type with both numeric rows and numeric columns.
BaseSpace Clarity LIMS does not check for duplicate indexes when generating the sample sheet. If this is a requirement in your lab, you must perform this check manually.
The script supports generating multiple sample sheets, one per destination container, for a single step. When using the script in this way:
The step configuration must be updated to add shared file placeholders for each expected file.
The parameter string must be updated to include the LIMS ID of each shared output (a simple way to provide this is with -c {compoundOutputFileLuids}).
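Two of the rules above lend themselves to small helper functions: replacing illegal characters, and the manual duplicate index check. The following sketch makes several assumptions. The illegal character set contains only the characters explicitly listed in this article, and that list appears to be cut off in the source, so it may be incomplete. Duplicates are checked per lane, which is an assumption about what matters in your lab. The function names and row layout are hypothetical.

```python
from collections import defaultdict

# Space plus the characters listed in this article (the list may be incomplete).
ILLEGAL_CHARACTERS = set(" ?()[]/\\=+<>:;\"',*^|")


def sanitize_name(name):
    """Replace each illegal character in a project or sample name with '_'."""
    return "".join("_" if ch in ILLEGAL_CHARACTERS else ch for ch in name)


def find_duplicate_indexes(rows):
    """Group SampleIDs that share the same Index within the same Lane.

    rows: dicts with at least the Lane, Index, and SampleID keys.
    Returns {(lane, index): [sample ids]} for groups with more than one entry.
    """
    seen = defaultdict(list)
    for row in rows:
        seen[(row["Lane"], row["Index"])].append(row["SampleID"])
    return {key: ids for key, ids in seen.items() if len(ids) > 1}


print(sanitize_name("My Sample (1)"))
print(find_duplicate_indexes([
    {"Lane": "1", "Index": "ATCACG", "SampleID": "A"},
    {"Lane": "1", "Index": "ATCACG", "SampleID": "B"},
    {"Lane": "2", "Index": "ATCACG", "SampleID": "C"},
]))
```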