Parsing Sequencing Meta-Data into Clarity LIMS

After a sequencing run completes, you often need to store the locations of the resulting FASTQ / BAM files in BaseSpace Clarity LIMS.

For paired-end sequencing, the meta-data file that describes the locations of these files will likely contain two rows for each sample sequenced: one for the first read and one for the second read.

Such a file is illustrated here:

Column 2 of the file, Sample ID, contains the LIMS IDs of the artifacts for which we want to store the FASTQ file values listed in column 3 (Fastq File).

This example discusses a strategy for parsing data and storing it against process inputs when the data for a single input spans multiple lines in a data file.
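
The grouping step of this strategy can be sketched as follows. This is a minimal illustration, not the implementation used by the example itself. It assumes the meta-data file is comma-delimited with a header row; the function name group_fastq_by_sample, the first column (Lane), the LIMS IDs, and the file paths are all hypothetical. Only the Sample ID and Fastq File column names come from the description above.

```python
import csv
import io
from collections import defaultdict


def group_fastq_by_sample(handle, id_column="Sample ID", file_column="Fastq File"):
    """Collect every FASTQ path listed for each artifact LIMS ID.

    Assumes a comma-delimited file with a header row (an assumption;
    adjust the csv dialect if the real file is tab-delimited).
    """
    grouped = defaultdict(list)
    for row in csv.DictReader(handle):
        sample_id = (row.get(id_column) or "").strip()
        fastq = (row.get(file_column) or "").strip()
        # Skip rows that are missing either value rather than guessing.
        if sample_id and fastq:
            grouped[sample_id].append(fastq)
    return dict(grouped)


# Hypothetical file contents: two rows per sample, one per read.
EXAMPLE = """\
Lane,Sample ID,Fastq File
1,2-1001,/data/run1/sampleA_R1.fastq.gz
1,2-1001,/data/run1/sampleA_R2.fastq.gz
1,2-1002,/data/run1/sampleB_R1.fastq.gz
1,2-1002,/data/run1/sampleB_R2.fastq.gz
"""

grouped = group_fastq_by_sample(io.StringIO(EXAMPLE))
for lims_id, paths in grouped.items():
    # Each LIMS ID now maps to all of its FASTQ paths, in file order.
    print(lims_id, paths)
```

Grouping the rows first means each process input is visited once, with both read files available together, instead of being updated once per line of the file. How the grouped values are then written back to the artifacts depends on the fields configured in the LIMS and is not shown here.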