Creating Template Files

Script Parameters

Upgrade Note: process vs step URIs

The driver_file_generator script now uses steps instead of processes for fetching information. When a process URI is supplied, the script detects it and automatically switches it to a step URI. (The PROCESS.TECHNICIAN token, which is only available on 'process' in the API, is still supported.)

The behavior of the script has not changed, except that the long form -processURI parameter must be replaced by -stepURI in configuration. The -i version of this parameter remains supported and now accepts both process and step URI values.

If your configuration is using -processURI or --processURI, replace each instance with -i (or -stepURI/--stepURI).
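
Where many automation command strings need updating, the replacement described above can be scripted. The following is a minimal sketch, assuming the command strings are available as plain text. The function name migrate_process_uri_flag and the placeholder PROCESS_URI_VALUE are hypothetical and are not part of the LIMS tooling.

```python
import re

# Matches -processURI or --processURI only when it stands alone as an argument.
_PROCESS_URI_FLAG = re.compile(r"(?<!\S)--?processURI(?=\s|$)")


def migrate_process_uri_flag(command: str) -> str:
    """Replace the retired long-form flag with -i, leaving its value untouched."""
    return _PROCESS_URI_FLAG.sub("-i", command)


old = "script:driver_file_generator -processURI PROCESS_URI_VALUE -u {username}"
print(migrate_process_uri_flag(old))
```

The value that follows the flag is left as is, because -i accepts both process and step URI values.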

The following table defines the parameters used by the driver_file_generator script. For details on the metadata and tokens referenced in the table, see Template File Contents, Metadata section.

Script Parameters

Option	Name	Description
-i {stepURI:v2}, -stepURI {stepURI:v2}	Step URI	(Required) LIMS step URI. Provides context to resolve all token values. See Upgrade note.
-u {username}, -username {username}	Username	(Required) LIMS login username.
-p {password}, -password {password}	Password	(Required) LIMS login password.
-t <templateFile>, -templatePath <templateFile>	Template file	(Required) Template file path.
-o <outputFile>, -outputPath <outputFile>	Output file	(Required) Output file path. If the folder structure specified in the path does not exist, it is created. This output file parameter value is overwritten by OUTPUT.FILE.NAME. To output multiple files, use GROUP.FILES.BY.INPUT.CONTAINERS and GROUP.FILES.BY.OUTPUT.CONTAINERS. Files generated are in CSV format by default; other value-separated formats are available (see OUTPUT.SEPARATOR).
-l <logFile>, -logFileName <logFile>	Log file	(Required) Log file name.
-q [true|false], -quickAttach [true|false]	Quick Attach	Default is 'false'. Provide as 'true' to attach the file on script completion. To attach manually or with AI/Automation Worker, name the file starting with the placeholder LIMSID. If multiple files are generated, they are zipped into one archive. Main use cases: (1) multiple files are generated (see GROUP.FILES.BY) and must be attached to the LIMS (in addition to, or in place of, writing them to disk); (2) when chaining multiple scripts together, this ensures that the file has already been attached before the next script runs. See Renaming generated files and Generating Multiple Files examples.
-destLIMSID <LIMSID>	Destination LIMS ID	LIMSID of the output to attach the generated file to. Use with quickAttach. See Renaming generated files and Generating Multiple Files examples.

Command-line example:

bash -l -c "/opt/gls/clarity/bin/java -jar /opt/gls/clarity/extensions/ngs-common/v5/EPP/DriverFileGenerator.jar script:driver_file_generator -i {stepURI:v2} -u {username} -p {password} -t /opt/gls/clarity/customextensions/InfiniumHT/driverfiletemplates/NextSeq.csv -o {compoundOutputFileLuid0}.csv -l {compoundOutputFileLuid1}"

Command-line example using -quickAttach and -destLIMSID:

bash -l -c "/opt/gls/clarity/bin/java -jar /opt/gls/clarity/extensions/ngs-common/v5/EPP/DriverFileGenerator.jar script:driver_file_generator -i {stepURI:v2} -u {username} -p {password} -t /opt/gls/clarity/customextensions/Robot.csv -quickAttach true -destLIMSID {compoundOutputFileLuid0} -o extended_driver_x384.csv -l {compoundOutputFileLuid2}"

See also the Template File Examples section.
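
To make the relationship between the required and optional parameters easier to see, the command strings above can be assembled programmatically. This is only a sketch: the helper build_command is hypothetical, the paths are copied from the examples above, and the {…} placeholders are passed through unchanged for the LIMS to substitute.

```python
from typing import List, Optional

JAVA = "/opt/gls/clarity/bin/java"
JAR = "/opt/gls/clarity/extensions/ngs-common/v5/EPP/DriverFileGenerator.jar"


def build_command(template: str, output: str, log: str,
                  quick_attach: bool = False,
                  dest_limsid: Optional[str] = None) -> str:
    """Assemble a driver_file_generator command line as a single string."""
    args: List[str] = [
        JAVA, "-jar", JAR, "script:driver_file_generator",
        "-i", "{stepURI:v2}", "-u", "{username}", "-p", "{password}",
        "-t", template, "-o", output, "-l", log,
    ]
    # Optional parameters are appended only when requested.
    if quick_attach:
        args += ["-quickAttach", "true"]
    if dest_limsid is not None:
        args += ["-destLIMSID", dest_limsid]
    return 'bash -l -c "' + " ".join(args) + '"'


print(build_command("/opt/gls/clarity/customextensions/Robot.csv",
                    "extended_driver_x384.csv", "{compoundOutputFileLuid2}",
                    quick_attach=True, dest_limsid="{compoundOutputFileLuid0}"))
```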

Data Source

The input-output-maps of the step (defined by the -stepURI parameter) are used as the data source for the content of the generated file.

If they are present, input-output-maps with the attribute output-generation-type=PerInput are used. Otherwise, all input-output-map items are used.

By default, the data source entries are sorted alphanumerically by LIMS ID. You can modify the sort order by using the SORT.BY and SORT.VERTICAL metadata elements.

The output generation type specifies how the step outputs were generated in relation to the inputs. PerInput entries are available for the following step types: Standard, Standard QC, Add Labels, and Analysis.
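
The selection rule described above (prefer PerInput entries, otherwise use everything) can be modelled in a few lines. This is an illustrative sketch, not the script's implementation: each input-output-map is represented as a dictionary, the "limsid" key is a hypothetical field name, and the default ordering is approximated by a plain string sort.

```python
from typing import Dict, List

Entry = Dict[str, str]


def select_data_source(io_maps: List[Entry]) -> List[Entry]:
    """Prefer PerInput entries; fall back to all entries when none are present."""
    per_input = [m for m in io_maps
                 if m.get("output-generation-type") == "PerInput"]
    chosen = per_input if per_input else list(io_maps)
    # Assumption: the default order is modelled as a string sort on "limsid".
    return sorted(chosen, key=lambda m: m["limsid"])


io_maps = [
    {"limsid": "2-1003", "output-generation-type": "PerAllInputs"},
    {"limsid": "2-1002", "output-generation-type": "PerInput"},
    {"limsid": "2-1001", "output-generation-type": "PerInput"},
]
print([m["limsid"] for m in select_data_source(io_maps)])
```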

Template Sections

The content of the generated file is determined by the sections defined in the template. Content for each section is contained within XML-like opening and closing tags that are structured as follows:

<SECTION>
section content
</SECTION>

Most template files follow the same basic structure and include some or all of the following sections (by convention, section names are written in capital letters, but this is not required):

<HEADER_BLOCK>
<HEADER>
<DATA>
<FOOTER>

The order of the section blocks in the template does not affect the output. In the output file, blocks will always be in the order shown.

The area outside of the sections can contain metadata elements. Anything else outside of the section tags is ignored.

The <PLACEMENT> and <TOKEN_FORMAT> sections are not part of this list and do not create distinct sections in the generated file. Instead, they alter the formatting of the generated output.
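
The fixed output order can be illustrated with a small model of section extraction. The sketch below is not the script's parser. It assumes upper-case section names, tags on their own lines, and no nesting, and the function names are hypothetical.

```python
import re
from typing import Dict, List, Optional

OUTPUT_ORDER = ["HEADER_BLOCK", "HEADER", "DATA", "FOOTER"]


def extract_sections(template: str) -> Dict[str, List[str]]:
    """Collect the lines found between <NAME> and </NAME> tag lines."""
    sections: Dict[str, List[str]] = {}
    current: Optional[str] = None
    buffer: List[str] = []
    for line in template.splitlines():
        stripped = line.strip()
        if current is None:
            opening = re.fullmatch(r"<([A-Z_]+)>", stripped)
            if opening:
                current, buffer = opening.group(1), []
            # Anything else outside a section (metadata, stray text) is skipped.
        elif stripped == f"</{current}>":
            sections[current] = buffer  # stored only once the tag is closed
            current = None
        else:
            buffer.append(line)
    return sections


def ordered_sections(template: str) -> List[str]:
    """Return the section names in the order they appear in the output."""
    found = extract_sections(template)
    return [name for name in OUTPUT_ORDER if name in found]


template = "\n".join([
    "<FOOTER>", "end of file", "</FOOTER>",
    "<DATA>", "${SAMPLE.NAME}", "</DATA>",
    "<HEADER>", "Sample", "</HEADER>",
])
print(ordered_sections(template))
```

Although the template lists FOOTER first, the model reports HEADER, DATA, FOOTER, matching the rule that template order does not affect output order.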

HEADER_BLOCK

Only a subset of the tokens is available for use in the header block section. (For details, see Tokens table).

If an unsupported token is included, file generation will complete with a warning message and a warning will appear in the log file.

The header block section may include both plain text and data from the LIMS. It consists of information that does not appear multiple times in the generated file; that is, information that is not included in the data rows (see DATA section).

Tokens in the header block always resolve in the context of the first input and first output available. For example, suppose the INPUT.CONTAINER.TYPE token is used in the header block:

• If there is only one type of input container present in the data source, that container type will be present in the output file.

• If multiple input container types are present in the data source, only the first one encountered while processing the data will be present in the output file.

For this reason, we recommend against using tokens that resolve to different values for different samples, such as SAMPLE.NAME. If one of these tokens is encountered, a warning is logged and the first value retrieved from the API is used. (Note that you may use .ALL tokens, where available.)

To include a header block section in a template, enclose it within the <HEADER_BLOCK> and </HEADER_BLOCK> tags.

HIDE feature: If one of the tokens of a line is empty and is part of a HIDE statement, that line will be removed entirely. See Using HIDE to Exclude Empty Columns and Using HIDE to Exclude Empty HEADER rows examples.

HEADER

The header section describes the header line of the data section (see DATA section). A simple example might be "Sample ID, Placement".

The content of this section can only include plain text and is output as is. Tokens are not supported.

To include a header section in a template, enclose it within the <HEADER> and </HEADER> tags.

HIDE feature: See 'HIDE feature' in DATA section. Also note:

• If multiple <HEADER> lines are present, at least one must have the same number of columns as the <DATA> template line.

• <HEADER> lines that do not match the number of columns are unaffected by the HIDE feature.

DATA

Each data source entry creates a data row for each template line in the section. All entries are output for the first template line, then the next template line runs, and so on.

The data section allows tokens and text entries. All tokens are supported.

Note the following:

• Duplicated rows are eliminated, if present. A row is considered duplicated if its content (after all variables and placeholders have been replaced with their corresponding values) is identical to a previous row. Tokens must therefore provide distinctive enough data (ie, something more than just CONTAINER.NAME) if all of the input-output entry pairs are desired in the generated file.

• By default, the script processes only sample entries. However, there are metadata options that allow inclusion of result files/measurements and exclusion of samples.

• Metadata sorting options are applied to this section of the template file only.

• By default, pooled artifacts are treated as a single input artifact. They can be demultiplexed using the PROCESS.POOLED.ARTIFACTS metadata element.

• If there is at least one token relevant to the step inputs or outputs, this section will produce a row for each PerInput entry in the step input-output-map. If no PerInput entries are present in the step input-output-map, the script will attempt to add data rows for PerAllInputs entries.

• Input and output artifacts are always loaded if a <DATA> section is present in the template file, due to the need to determine what type of artifacts the script is dealing with.
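
The row ordering and duplicate elimination described above can be sketched as follows. This is a simplified model under stated assumptions: tokens are written as ${TOKEN}, each data source entry is a dictionary of already-resolved token values, and unknown tokens resolve to an empty string. The function name render_data_rows is hypothetical.

```python
import re
from typing import Dict, List

TOKEN = re.compile(r"\$\{([^}]+)\}")


def render_data_rows(template_lines: List[str],
                     entries: List[Dict[str, str]]) -> List[str]:
    """Render all entries for the first template line, then the next line,
    dropping any row identical to one already emitted."""
    rows: List[str] = []
    seen = set()
    for line in template_lines:        # outer loop: template lines
        for entry in entries:          # inner loop: data source entries
            row = TOKEN.sub(lambda m: entry.get(m.group(1), ""), line)
            if row not in seen:        # identical resolved rows appear once
                seen.add(row)
                rows.append(row)
    return rows


entries = [
    {"CONTAINER.NAME": "Plate1", "SAMPLE.NAME": "S1"},
    {"CONTAINER.NAME": "Plate1", "SAMPLE.NAME": "S2"},
]
print(render_data_rows(["${CONTAINER.NAME}"], entries))
print(render_data_rows(["${CONTAINER.NAME},${SAMPLE.NAME}"], entries))
```

With only CONTAINER.NAME in the template line, both entries resolve to the same row and one is dropped. Adding SAMPLE.NAME makes the rows distinct, so both are kept.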

To include a data section in a template, enclose it within the <DATA> and </DATA> tags.

HIDE feature: If the token in a given column is empty for all lines and that token is part of a HIDE statement, that column (including the matching <HEADER> columns) will be removed entirely. There can only be one <DATA> template line present when using the HIDE feature.

See Using HIDE to Exclude Empty Columns and Using HIDE to Exclude Empty HEADER rows examples.
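
As a rough illustration of the column-level HIDE behavior, the sketch below removes a column from both the header and the data rows when it is marked as hidden and is empty in every row. It assumes the rows have already been split into columns and identifies hidden columns by index rather than by token, which is a simplification. The function name is hypothetical.

```python
from typing import List, Set, Tuple


def hide_empty_columns(header: List[str], rows: List[List[str]],
                       hidden: Set[int]) -> Tuple[List[str], List[List[str]]]:
    """Drop columns listed in `hidden` whose value is empty in every data row."""
    drop = {i for i in hidden if all(row[i] == "" for row in rows)}
    keep = [i for i in range(len(header)) if i not in drop]
    return [header[i] for i in keep], [[row[i] for i in keep] for row in rows]


header = ["Sample", "Comment", "Volume"]
rows = [["S1", "", "5"], ["S2", "", "7"]]
print(hide_empty_columns(header, rows, {1}))
```

A column that is empty but not part of a HIDE statement, or one that has a value in at least one row, is left in place.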

FOOTER

The content of this section can only include plain text and is output as is. Tokens are not supported.

To include a footer section in a template, enclose it within the <FOOTER> and </FOOTER> tags.

PLACEMENT

This section contains Groovy code that controls the formatting of PLACEMENT tokens (see the PLACEMENT tokens in the Tokens table).

Within the Groovy code, the following variables are available:

Variable Name	Description
containerTypeNode	The container type holding the derived sample
row	The row part of the derived sample's location
column	The column part of the derived sample's location

Note the following:

• The script must return a string, which replaces the corresponding <PLACEMENT> tag in the template.

• Logic within the placement tags can be as complex as needed, provided it can be compiled by a Groovy compiler.

• If an error occurs while running formatting code, the original location value is used.

To include a placement section in a template, enclose it within the <PLACEMENT> and </PLACEMENT> tags.

TOKEN FORMAT

This section defines logic to be applied to specific tokens to change the format in which they appear in the generated file.

Special formatting rules can be defined per token using the following Groovy syntax:

${token.identifier}… groovy code… // or ${token.identifier##Name}… groovy code…

Within the Groovy code, the variable 'token' refers to the original value being transformed by the formatting code. The logic replaces all instances of that token with the result.

${token.identifier} marks the beginning of the token formatting code and the end of the previous token formatting code (if any).

• You can define multiple formatting rules for a given token by assigning a name to each formatting section (named formatters are called 'variations'). To do this, append "##" and the formatter name to the token name (eg "${token.identifier##formatterName}").

• Using the named formatter syntax without giving a name ("${token.identifier##}") will abort the file generation.

• If an error occurs while running formatting code, the resulting value will be blank.

• If a named formatter is used but not defined, the value is used as is.

To include a token format section in a template, enclose it within the <TOKEN_FORMAT> and </TOKEN_FORMAT> tags.
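
The fallback rules for token formatters can be summarized in a short model. The real formatting code is Groovy inside the template; the Python below only mirrors the documented outcomes (abort on an empty name, value unchanged when the formatter is undefined, blank value on error). The function apply_format and the formatter registry are hypothetical.

```python
from typing import Callable, Dict, Optional

Formatter = Callable[[str], str]


def apply_format(value: str, token: str, formatters: Dict[str, Formatter],
                 variation: Optional[str] = None) -> str:
    """Apply the formatter registered for a token, following the rules above."""
    if variation == "":
        # Models "${token##}" with no name, which aborts file generation.
        raise ValueError(f"Formatter name missing for token {token}")
    key = token if variation is None else f"{token}##{variation}"
    formatter = formatters.get(key)
    if formatter is None:
        return value        # formatter not defined: value is used as is
    try:
        return formatter(value)
    except Exception:
        return ""           # error in formatting code: blank value


formatters = {
    "SAMPLE.NAME": str.upper,
    "SAMPLE.NAME##short": lambda v: v[:3],
    "SAMPLE.NAME##bad": lambda v: v[10],   # fails for short values
}
print(apply_format("sample42", "SAMPLE.NAME", formatters))
print(apply_format("sample42", "SAMPLE.NAME", formatters, "short"))
```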

Metadata

Metadata provides information about the template file that is not retrieved from the API—such as the file output directory to use, and how the data contents should be grouped and sorted.

Metadata is not strictly confined to a section, and is not designated by opening and closing tags. However, each metadata entry must be on a separate line.

Metadata entries can be anywhere in the template, but the recommended best practice is to group them either at the top or the bottom of the file.

For a list of supported metadata elements, rules for using them, and examples, see Template File Contents, Metadata section.

Sorting Logic

Sorting in the generated file is done either alphanumerically or by vertical placement information, using the SORT.BY. and SORT.VERTICAL metadata elements.

Sort keys are provided to SORT.BY. as one or more ${token} values. The combination of keys must always produce a unique value for each row in the file. For example, sorting by just OUTPUT.CONTAINER.NAME would work for samples placed in tubes, but would not work for samples in 96-well plates, where many samples share one container name. Sorting behavior on nonunique combinations is not guaranteed to be predictable.

To sort vertically:

Include the SORT.VERTICAL metadata element in the template file. In addition, the SORT.BY.${token}, ${token} metadata must also be included, as follows:

SORT.BY.${OUTPUT.CONTAINER.ROW}${OUTPUT.CONTAINER.COLUMN}

Any SORT.BY. tokens will be sorted using the vertical sorter instead of the alphanumeric sort.

To apply sorting to samples in 96 well plates:

You could narrow the sort key to a unique combination such as: SORT.BY.${OUTPUT.CONTAINER.NAME}${OUTPUT.CONTAINER.ROW}${OUTPUT.CONTAINER.COLUMN}

See also SORT.VERTICAL and SORT.BY in Template File Contents, Metadata section.
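
Before relying on a sort key, it can help to check that the chosen token combination is unique for every row. The sketch below shows that check together with a plain string sort on the combined key. Rows are represented as dictionaries of resolved token values, which is an assumption, and the vertical sorter is not modelled here.

```python
from typing import Dict, List, Sequence

Row = Dict[str, str]


def has_unique_keys(rows: List[Row], key_tokens: Sequence[str]) -> bool:
    """Return True when the combined sort key differs for every row."""
    keys = [tuple(row[t] for t in key_tokens) for row in rows]
    return len(set(keys)) == len(keys)


def sort_rows(rows: List[Row], key_tokens: Sequence[str]) -> List[Row]:
    """Sort on the combined key (modelled as a plain string comparison)."""
    return sorted(rows, key=lambda row: tuple(row[t] for t in key_tokens))


rows = [
    {"OUTPUT.CONTAINER.NAME": "Plate1", "OUTPUT.CONTAINER.ROW": "B",
     "OUTPUT.CONTAINER.COLUMN": "1"},
    {"OUTPUT.CONTAINER.NAME": "Plate1", "OUTPUT.CONTAINER.ROW": "A",
     "OUTPUT.CONTAINER.COLUMN": "1"},
]
full_key = ["OUTPUT.CONTAINER.NAME", "OUTPUT.CONTAINER.ROW",
            "OUTPUT.CONTAINER.COLUMN"]
print(has_unique_keys(rows, ["OUTPUT.CONTAINER.NAME"]))
print(has_unique_keys(rows, full_key))
```

The container name alone is shared by both wells, so it fails the check. Adding the row and column tokens makes each key unique.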

Rules and Constraints

The template must adhere to the following rules:

• Metadata entries must each appear on a new line and be the only entry on that line.

• Metadata entries must not appear inside tags.

• Opening and closing section tags must appear on a new line and as the only entry on that line.

• Each opened tag must be closed; otherwise, it is skipped by the script.

• Any section (opening tag + closing tag combination) can be omitted from the template file.

• Entries that are separated by commas in the template will be delimited by the metadata-specified separator (default: COMMA) in the generated file.

• White space is allowed in the template. However, if there is a blank line inside a tag, it will also be present in the generated file.

• If an entry in the template is enclosed in double quotes, it is imported as a single entry and written to the generated file as such, even if it contains commas.

• To include double quotes or single quotes in the template file, use the escape character. Example: \" or \'

• To include an escape character in the template file, use two escape characters inside double quotes. For example, if you want to see \\Share\Folder\Filename.txt, use "\\\\Share\\Folder\\Filename.txt" as the token.
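
The quoting and escaping rules in the list above resemble standard CSV parsing with a backslash escape character. As an illustration only, Python's csv module can be configured to behave this way; whether the script parses template lines in exactly this manner is an assumption, and the function name is hypothetical.

```python
import csv
from typing import List


def parse_template_line(line: str) -> List[str]:
    """Split one template line into entries.

    Double quotes keep embedded commas in a single entry, and a backslash
    escapes the character that follows it.
    """
    reader = csv.reader([line], quotechar='"', escapechar="\\",
                        doublequote=False)
    return next(reader)


entries = parse_template_line('Sample ID,"Well, Plate",Volume')
# Join with a tab to model a non-default separator in the generated file.
print("\t".join(entries))
print(parse_template_line(r'"\\\\Share\\Folder\\Filename.txt"')[0])
```

The quoted entry keeps its comma and stays a single entry, and the doubled backslashes collapse to the path shown in the last rule.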

If any of the following conditions is not met, the tag and everything inside it are ignored by the script, and a warning appears in the log file:

• Except for the metadata, all template sections must be enclosed inside tags.

• Each tag must have its own line, and must be the only tag present on that line.

• No other entries, even empty ones, are allowed.

• All opened tags must be closed.

• Custom field names must not contain periods.
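
A template can be checked against the tag rules before it is deployed. The following sketch is a hypothetical pre-flight check, not part of the script. It assumes sections are not nested and treats any <NAME> or </NAME> pattern as a tag, so Groovy code containing angle-bracketed names would be flagged incorrectly.

```python
import re
from typing import List, Optional

TAG = re.compile(r"</?[A-Za-z_]+>")


def check_tags(template: str) -> List[str]:
    """Return warnings for tag lines that break the rules listed above."""
    warnings: List[str] = []
    open_tag: Optional[str] = None
    for number, line in enumerate(template.splitlines(), start=1):
        tags = TAG.findall(line)
        if not tags:
            continue
        if len(tags) > 1 or line.strip() != tags[0]:
            warnings.append(f"line {number}: a tag must be the only entry on its line")
            continue
        tag = tags[0]
        if tag.startswith("</"):
            if open_tag == tag[2:-1]:
                open_tag = None
            else:
                warnings.append(f"line {number}: {tag} has no matching opening tag")
        else:
            if open_tag is not None:
                warnings.append(f"line {number}: <{open_tag}> was not closed")
            open_tag = tag[1:-1]
    if open_tag is not None:
        warnings.append(f"<{open_tag}> was never closed")
    return warnings


print(check_tags("<HEADER>Sample ID</HEADER>"))
```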