Output Files

As converted versions of BCL files, FASTQ files are the primary output of BCL Convert. Like BCL files, FASTQ files contain base calls with associated Q-scores. Unlike BCL files, which contain per‑cycle data, FASTQ files contain the per-read data that most analysis applications require.
The software generates one FASTQ file for every sample, read, and lane. For example, for each sample in a paired-end run, the software generates two FASTQ files: one for Read 1 and one for Read 2. In addition to these sample FASTQ files, the software generates two FASTQ files per lane containing all unknown samples. FASTQ files for Index Read 1 and Index Read 2 are not generated because the sequence is included in the header of each FASTQ entry.
• If Sample_Name and Sample_Project are both present, and both the --sample-name-column-enabled true and --bcl-sampleproject-subdirectories true command-line options are used, the software writes the output FASTQ files to subdirectories based on Sample_Project and Sample_ID, and names the FASTQ files by Sample_Name. The same project directory contains the files for multiple samples.
• If the Sample_ID and Sample_Name columns are specified but do not match, the FASTQ files reside in a subdirectory where files use the Sample_Name value.
• Reads with unidentified index adapters are recorded in one file named Undetermined_S0_. If a sample sheet includes multiple samples without specified index adapters, the software displays a missing barcode error and ends the analysis.
The software allows one unindexed sample since identification is not necessary to sequence one sample. Sequencing multiple samples requires multiplexing so the samples can be identified for analysis.

The file name format is constructed from fields specified in the sample sheet, using the format: <Sample_ID>_S#_L00#_R#_001.fastq.gz.
Example: <Sample_ID>_S1_L001_R1_001.fastq.gz
• <Sample_ID>: The ID of the sample provided in the sample sheet.
• S1: The number of the sample based on the order that samples are listed in the sample sheet, starting with 1. In the example, S1 indicates that the sample is the first sample listed for the run.
Reads that cannot be assigned to any sample are written to a FASTQ file as sample number 0 and excluded from downstream analysis.
• L001: The lane number of the flow cell, from lane 1 up to the number of lanes supported.
• R1: The read. R1 indicates Read 1. R2 would indicate Read 2 of a paired-end run.
• 001: The last portion of the file name is always 001.
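The naming scheme above can be sketched as a small parser. This is an illustration only: the function name parse_fastq_name and the regular expression are assumptions made for this example, not part of the software, and the pattern assumes a three-digit lane field and a single-digit read number.

```python
import re

# Pattern for <Sample_ID>_S#_L00#_R#_001.fastq.gz as described above.
# Assumptions: three-digit lane field, single-digit read number.
FASTQ_NAME = re.compile(
    r"^(?P<sample_id>.+)_S(?P<sample_number>\d+)"
    r"_L(?P<lane>\d{3})_R(?P<read>\d)_001\.fastq\.gz$"
)


def parse_fastq_name(file_name):
    """Return the file name fields as a dict, or None if the name does not match."""
    match = FASTQ_NAME.match(file_name)
    if match is None:
        return None
    return {
        "sample_id": match.group("sample_id"),
        "sample_number": int(match.group("sample_number")),
        "lane": int(match.group("lane")),
        "read": int(match.group("read")),
    }


print(parse_fastq_name("1_S1_L001_R1_001.fastq.gz"))
```

Because the Sample_ID group is greedy, a Sample_ID that itself contains underscores is still captured whole, as long as the trailing fields follow the documented format.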

FASTQ files are text-based files that contain base calls with corresponding Q-scores for each read. Each read has one 4-line entry:
• A sequence identifier with information about the run and cluster, formatted as:
@Instrument:RunID:FlowCellID:Lane:Tile:X:Y:UMI Read:Filter:0:IndexSequence or SampleNumber
If a UMI is specified in an index read and isReverseComplement is present in RunInfo.xml, the character r is added to the beginning of the UMI sequence written in the read name of the FASTQ file.
• The sequence (base calls A, G, C, T, and N, for unknown bases).
• A plus sign (+) that functions as a separator.
• The Q-score using ASCII 33 encoding. See Quality Output File for more information.
Sequence Identifier Fields
Field | Description |
---|---|
@ | Each sequence identifier line starts with @. |
instrument | The instrument ID. |
run ID | The run number on the system. |
flow cell ID | The flow cell ID. |
lane | The flow cell lane number. |
tile | The flow cell tile number. |
x_pos | The X coordinate of the cluster. |
y_pos | The Y coordinate of the cluster. |
UMI | Optional. The UMI sequence (A, G, C, T, and N). When the sample sheet specifies UMIs, a plus sign separates the Read 1 and Read 2 sequences. |
read | 1 - Read 1, which is the first read of a paired-end run or the only read of a single-read run. 2 - Read 2, which is the second read of a paired-end run. |
is filtered | N - No failed reads are included. |
control number | 0 - Control bits are not turned on. |
index sequence or sample number | The Index Read sequence (A, G, C, T, and N). If the sample sheet indicates indexing, the index adapter sequence is appended to the end of the read identifier. If indexing is not indicated (one sample per lane), the sample number is appended to the read identifier. |
A complete FASTQ file entry resembles the following example:
@SIM:1:FCX:1:2106:15337:1063:GATCTGTACGTC 1:N:0:ATCACG
GATCTGTACGTCTCTGCNTCACCTCCACCGTGCAACTCATCACGCAGCTCATGCCCTTCGGCTGCCTCCTGGACTA
+
CCCCCGGGGGGGGGGGG#:CFFGFGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGEGGFGGG
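As a rough illustration of the entry layout, the following sketch splits a sequence identifier into the fields from the table and decodes an ASCII 33 quality string. The function names are hypothetical, and the sample header assumes that the index sequence in the example entry is ATCACG.

```python
def parse_identifier(line):
    """Split a sequence identifier line into the fields listed in the table."""
    left, right = line.lstrip("@").split(" ", 1)
    left_fields = left.split(":")
    read, is_filtered, control, index_or_sample = right.split(":", 3)
    return {
        "instrument": left_fields[0],
        "run_id": left_fields[1],
        "flow_cell_id": left_fields[2],
        "lane": int(left_fields[3]),
        "tile": int(left_fields[4]),
        "x_pos": int(left_fields[5]),
        "y_pos": int(left_fields[6]),
        # The UMI is optional, so it is only present as an eighth field.
        "umi": left_fields[7] if len(left_fields) > 7 else None,
        "read": int(read),
        "is_filtered": is_filtered,
        "control_number": int(control),
        "index_or_sample": index_or_sample,
    }


def decode_qscores(quality_line):
    """Convert an ASCII 33 encoded quality string to integer Q-scores."""
    return [ord(char) - 33 for char in quality_line]


# Assumption: header taken from the example entry, with ATCACG as the index.
header = "@SIM:1:FCX:1:2106:15337:1063:GATCTGTACGTC 1:N:0:ATCACG"
fields = parse_identifier(header)
print(fields["tile"], fields["umi"], fields["index_or_sample"])
print(decode_qscores("C#:"))
```

In ASCII 33 encoding, the character # corresponds to Q2, which is why it lines up with the N base call in the example entry.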

BCL conversion outputs log files to the Logs/ output subfolder. These include three separate files, Info.log, Warnings.log, and Errors.log, for three increasing levels of severity. All output to these files is also written to the terminal console: Info is written to standard-out, while Errors and Warnings are written to standard-error.
In addition, the file "FastqComplete.txt" is created in the Logs/ subfolder when conversion is complete. This can be used to trigger subsequent action if desired.
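One way to use the completion file as a trigger is to poll for it, as in the sketch below. The function name, the polling interval, and the timeout handling are assumptions for illustration; only the Logs/ subfolder and the FastqComplete.txt file name come from the description above.

```python
import time
from pathlib import Path


def wait_for_fastq_complete(output_dir, poll_seconds=30.0, timeout_seconds=None):
    """Poll until Logs/FastqComplete.txt exists under output_dir.

    Returns True when the file is found, or False if timeout_seconds
    elapses first. With timeout_seconds=None the function waits indefinitely.
    """
    marker = Path(output_dir) / "Logs" / "FastqComplete.txt"
    deadline = None
    if timeout_seconds is not None:
        deadline = time.monotonic() + timeout_seconds
    while not marker.exists():
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(poll_seconds)
    return True
```

A caller could start the next pipeline step only when wait_for_fastq_complete returns True for the conversion output folder.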

DRAGEN BCL Convert writes the following metrics in CSV format to the Reports/ output subfolder. In addition, the sample sheet and RunInfo.xml file used during conversion are copied into the Reports/ output subfolder.

The following metrics are included in the Demultiplex_Stats.csv output file.
Column | Description |
---|---|
Index | The contents of index in the sample sheet for this sample. For dual-index samples, the value is concatenated with index2. |
# Reads | The total number of pass-filter reads mapping to this sample for the lane. |
# Perfect Index Reads | The number of mapped reads with barcodes that match the indexes provided in the sample sheet exactly. |
# One Mismatch Index Reads | The number of mapped reads with barcodes matched with exactly one base mismatched. |
# Two Mismatch Index Reads | The number of mapped reads with barcodes matched with exactly two bases mismatched. |
% Reads | The percentage of pass-filter reads mapping to this sample for the lane. |
% Perfect Index Reads | The percentage of mapped reads with barcodes that match the indexes provided in the sample sheet exactly. |
% One Index Reads | The percentage of mapped reads with barcodes matched with exactly one base mismatched. |
% Two Index Reads | The percentage of mapped reads with barcodes matched with exactly two bases mismatched. |
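To make the perfect, one-mismatch, and two-mismatch categories concrete, the sketch below counts base mismatches between an observed barcode and a sample sheet index. It is a simplified illustration with hypothetical function names and made-up barcodes, and it is not meant to represent how the software itself performs demultiplexing.

```python
def count_mismatches(observed, expected):
    """Count positions where two equal-length index sequences differ."""
    if len(observed) != len(expected):
        raise ValueError("index sequences must have the same length")
    return sum(1 for a, b in zip(observed, expected) if a != b)


def tally_index_matches(observed_barcodes, expected):
    """Tally barcodes with exactly 0, 1, or 2 mismatches to the expected index."""
    counts = {0: 0, 1: 0, 2: 0}
    for barcode in observed_barcodes:
        mismatches = count_mismatches(barcode, expected)
        if mismatches in counts:
            counts[mismatches] += 1
    return counts


# Made-up barcodes compared against a made-up sample sheet index.
observed = ["ATCACG", "ATCACT", "TTCACT", "GGGGGG"]
print(tally_index_matches(observed, "ATCACG"))
```

In this toy input, the last barcode differs at five positions and therefore falls into none of the three categories.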

The following metrics are included in the Quality_Metrics.csv output file.
Column | Description |
---|---|
Lane | The lane number that this metric line refers to. |
Sample_ID | The contents of Sample_ID in the sample sheet for this sample. |
index | The contents of Index in the sample sheet for this sample. |
index2 | The contents of Index 2 in the sample sheet for this sample. |
ReadNumber | The read number this metric line refers to. |
Yield | The total number of bases mapping to the sample in this read. |
YieldQ30 | The total number of bases with quality score ≥ 30 mapping to the sample in this read. |
QualityScoreSum | The sum of quality scores of bases mapping to the sample in this read. |
Mean Quality Score (PF) | The mean quality score of bases mapping to the sample in this read. |
% Q30 | The percentage of bases with quality score ≥ 30 mapping to the sample in this read. |
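The derived columns in this table follow from the raw ones. The sketch below assumes that the mean quality score is QualityScoreSum divided by Yield and that % Q30 is YieldQ30 divided by Yield, which matches the column descriptions; whether the file stores % Q30 as a fraction or as a percentage is not stated here, so the scaling to 0-100 is an assumption. The function name and the input values are made up.

```python
def summarize_quality(yield_bases, yield_q30, quality_score_sum):
    """Derive mean quality and % Q30 from the raw Quality_Metrics.csv columns."""
    if yield_bases == 0:
        # Avoid division by zero for samples with no bases in this read.
        return {"mean_quality": 0.0, "pct_q30": 0.0}
    return {
        "mean_quality": quality_score_sum / yield_bases,
        "pct_q30": 100.0 * yield_q30 / yield_bases,
    }


# Made-up values: 1,000 bases, 900 of them at Q30 or above.
print(summarize_quality(1000, 900, 35000))
```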

The following information is included in the Adapter_Metrics.csv output file.
Column | Description |
---|---|
Lane | The lane number this metric line refers to. |
Sample_ID | The contents of Sample_ID in the sample sheet for this sample. |
index | The contents of Index 1 (i7) in the sample sheet for this sample. |
index2 | The contents of Index 2 (i5) in the sample sheet for this sample. |
ReadNumber | The read number this metric line refers to. |
AdapterBases | The total number of bases trimmed as adapter from the read in the sample. |
SampleBases | The total number of bases not trimmed from the read in the sample. |
% Adapter Bases | The percentage of bases trimmed as adapter from the read in the sample. |

For unique dual index inputs, the Index_Hopping_Counts.csv file provides the number of reads mapping to every combination of provided index and index2 values, including via mismatch tolerance. The metrics provide visibility into any index-hopping behavior that has occurred. Samples with both index and index2 values present in the sample sheet are included in the index hopping file. The following information is included in the Index_Hopping_Counts.csv output file.
Column | Description |
---|---|
Lane | The lane for each metric. |
SampleID | If the index combination corresponds to a sample, the contents of Sample_ID in the sample sheet for this sample. |
index | The contents of index in the sample sheet for the sample. |
index2 | The contents of index 2 in the sample sheet for the sample. |
# Reads | The total number of pass-filter reads mapping to the index and index2 combination. |
% of Hopped Reads | The percentage of hopped pass-filter reads mapping to the index and index2 combination. |
% of All Reads | The percentage of all pass-filter reads mapping to the index and index2 combination. |
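The relationship between these columns can be sketched as follows. The example assumes that a combination counts as hopped when its index and index2 pair does not belong to any sample in the sample sheet; that reading, the function name, and the counts are assumptions for illustration only.

```python
def hopping_summary(pair_counts, sample_pairs):
    """Compute per-combination percentages from read counts.

    pair_counts maps (index, index2) to a pass-filter read count.
    sample_pairs is the set of (index, index2) pairs assigned to samples.
    """
    total = sum(pair_counts.values())
    hopped_total = sum(
        count for pair, count in pair_counts.items() if pair not in sample_pairs
    )
    rows = []
    for pair, count in sorted(pair_counts.items()):
        is_hopped = pair not in sample_pairs
        rows.append({
            "index": pair[0],
            "index2": pair[1],
            "reads": count,
            "hopped": is_hopped,
            # Only hopped combinations get a share of the hopped total.
            "pct_of_hopped": (
                100.0 * count / hopped_total if is_hopped and hopped_total else None
            ),
            "pct_of_all": 100.0 * count / total if total else 0.0,
        })
    return rows


# Made-up counts for two samples with unique dual indexes.
counts = {
    ("AAAA", "CCCC"): 900,
    ("GGGG", "TTTT"): 80,
    ("AAAA", "TTTT"): 15,
    ("GGGG", "CCCC"): 5,
}
samples = {("AAAA", "CCCC"), ("GGGG", "TTTT")}
for row in hopping_summary(counts, samples):
    print(row)
```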

The Top_Unknown_Barcodes.csv file lists the most commonly encountered barcode sequences in the flow cell input that are not listed in the sample sheet. The 1,000 most common unlisted sequences are listed, along with any other sequences with a frequency equivalent to the 1,000th most commonly encountered sequence. The following information is included in the Top_Unknown_Barcodes.csv output file.
Column | Description |
---|---|
Lane | The lane for each metric. |
index | The first index value of this unlisted sequence. |
index2 | The second index value of this unlisted sequence. |
# Reads | The total number of pass-filter reads mapping to the index and index2 combination. |
% of Unknown Barcodes | The percentage of unknown pass-filter reads mapping to the index and index2 combination. |
% of All Reads | The percentage of all pass-filter reads mapping to the index and index2 combination. |
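The selection rule for this file (the 1,000 most common sequences plus any sequences tied with the 1,000th) can be sketched as a top-N-with-ties filter. The function name and the counts are hypothetical, and the example uses a limit of 2 to keep the input small.

```python
def top_with_ties(barcode_counts, limit=1000):
    """Return the `limit` most common barcodes, plus any tied with the last one."""
    ranked = sorted(barcode_counts.items(), key=lambda item: (-item[1], item[0]))
    if len(ranked) <= limit:
        return ranked
    # Frequency of the barcode in the last ranked position that still qualifies.
    cutoff = ranked[limit - 1][1]
    return [item for item in ranked if item[1] >= cutoff]


# Made-up counts: CCCC and GGGG are tied for second place.
counts = {"AAAA": 5, "CCCC": 3, "GGGG": 3, "TTTT": 1}
print(top_with_ties(counts, limit=2))
```

With a limit of 2, three barcodes are returned because the third has the same frequency as the second.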

The following information is included in the Adapter_Cycle_Metrics.csv output file.
Column | Description |
---|---|
Lane | The lane number this metric line refers to. |
Sample_ID | The contents of Sample_ID in the sample sheet for this sample. |
index | The contents of index in the sample sheet for this sample. |
index2 | The contents of index2 in the sample sheet for this sample. |
ReadNumber | The read number this metric line refers to. |
Cycle | The cycle number this metric line refers to. |
NumClustersWithAdapterAtCycle | The number of clusters where the adapter was detected to begin precisely at this cycle. |
% At Cycle | The percentage of all clusters where the adapter was detected to begin precisely at this cycle. |

The format of Demultiplex_Tile_Stats.csv and Quality_Tile_Metrics.csv matches that of Demultiplex_Stats.csv and Quality_Metrics.csv, respectively, save that an additional column is added:
Column | Description |
---|---|
Tile | The tile number this metric line refers to. |
These files provide per-tile data rather than data aggregated across the lane and read.

For the metrics files listed above (apart from Top_Unknown_Barcodes.csv), up to two additional columns may be added to each line if 'bcl-sampleproject-subdirectories' and/or 'sample-name-column-enabled' options are enabled:
Column | Description |
---|---|
Sample_Project | The Sample_Project value for the sample this metric line refers to. |
Sample_Name | The Sample_Name value for the sample this metric line refers to. |

The fastq_list.csv output file is located in the output folder with the FASTQ files, and provides the associations between the sample indexes, lane, and the output FASTQ file names. The columns of each row are shown, along with example entries from a test run. For more information on running DRAGEN using fastq_list.csv, see FASTQ CSV File Format.
The following columns are provided per unique sample_ID and lane combination:
Column | Description |
---|---|
RGID | Read Group |
RGSM | Sample ID |
RGLB | Library |
Lane | Flow cell lane |
Read1File | Full path to a valid FASTQ input file. |
Read2File | Full path to a valid FASTQ input file. Required for paired-end input. If not using paired-end input, leave empty. |
The following is an example fastq_list.csv output file.
RGID,RGSM,RGLB,Lane,Read1File,Read2File
AACAACCA.ACTGCATA.1,1,UnknownLibrary,1,/home/user/dragen_bcl_out/1_S1_L001_R1_001.fastq.gz,/home/user/dragen_bcl_out/1_S1_L001_R2_001.fastq.gz
AATCCGTC.ACTGCATA.1,2,UnknownLibrary,1,/home/user/dragen_bcl_out/2_S2_L001_R1_001.fastq.gz,/home/user/dragen_bcl_out/2_S2_L001_R2_001.fastq.gz
CGAACTTA.GCGTAAGA.1,3,UnknownLibrary,1,/home/user/dragen_bcl_out/3_S3_L001_R1_001.fastq.gz,/home/user/dragen_bcl_out/3_S3_L001_R2_001.fastq.gz
GATAGACA.GCGTAAGA.1,4,UnknownLibrary,1,/home/user/dragen_bcl_out/4_S4_L001_R1_001.fastq.gz,/home/user/dragen_bcl_out/4_S4_L001_R2_001.fastq.gz
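A file in this layout can be read with a standard CSV parser. The sketch below groups the FASTQ paths by RGSM; the function name is hypothetical, and the sample text reuses the first two rows of the example above.

```python
import csv
import io


def read_fastq_list(csv_text):
    """Group fastq_list.csv rows by sample (RGSM)."""
    by_sample = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        files = [row["Read1File"]]
        # Read2File is empty for single-read input.
        if row.get("Read2File"):
            files.append(row["Read2File"])
        by_sample.setdefault(row["RGSM"], []).append(
            {"rgid": row["RGID"], "lane": int(row["Lane"]), "files": files}
        )
    return by_sample


example = """RGID,RGSM,RGLB,Lane,Read1File,Read2File
AACAACCA.ACTGCATA.1,1,UnknownLibrary,1,/home/user/dragen_bcl_out/1_S1_L001_R1_001.fastq.gz,/home/user/dragen_bcl_out/1_S1_L001_R2_001.fastq.gz
AATCCGTC.ACTGCATA.1,2,UnknownLibrary,1,/home/user/dragen_bcl_out/2_S2_L001_R1_001.fastq.gz,/home/user/dragen_bcl_out/2_S2_L001_R2_001.fastq.gz
"""
for sample, entries in read_fastq_list(example).items():
    print(sample, entries)
```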

When the output-legacy-stats command-line option is enabled, DRAGEN BCL Convert writes the following metrics to the Reports/legacy output subfolder. The files are identical to the bcl2fastq2 v2.20 report files except in instances where bcl2fastq2 v2.20 produces decreased accuracy, non-deterministic output, or incorrect output.

The ConversionStats.xml file contains the lane number for each lane and the following information for each tile:
• Raw Cluster Count
• Read Number
• YieldQ30
• Yield
• QualityScore Sum

The DemultiplexingStats.xml contains the flow cell ID and project name. For each sample, index, and lane, the file lists the BarcodeCount, PerfectBarcodeCount, and OneMismatchBarcodeCount (if applicable).

The adapter trimming file is a text-based file that contains a statistics summary of adapter trimming for a FASTQ file. The file contains the fraction of reads with untrimmed bases for each sample, lane, and read number plus the following information:
• Lane
• Read
• Project
• Sample ID
• Sample Name
• Sample Number
• TrimmedBases
• PercentageOfBases (being trimmed)

A FastqSummaryF1L#.txt file contains the number of raw and passed filter reads for each sample and tile in a lane. The number sign (#) indicates the lane number.

DemuxSummaryF1L#.txt files, where # indicates the lane number, are generated when the sample sheet contains at least one indexed sample. A file contains the percentage of each tile that each sample occupies. It also lists the 1000 most common unknown index adapter sequences and the total number of reads with each index adapter identified.
To improve processing speed, the total for each index adapter is based on an estimate from a sampling algorithm.

HTML reports are generated from data in DemultiplexingStats.xml and ConversionStats.xml. The reports reside in Reports\html in the output directory or in the directory specified by the --reports-dir option.
The flow cell summary contains the following information:
• Clusters (Raw)
• Clusters (PF)
• Yield (MBases)
For patterned flow cells, the number of raw clusters is equal to the number of wells on the flow cell.
The lane summary provides the following information for each project, sample, and index sequence specified in the sample sheet:
• Lane #
• Clusters (Raw)
• % of the Lane
• % Perfect Barcode
• % One Mismatch
• Clusters (Filtered)
• Yield
• % PF Clusters
• % Q30 Bases
• Mean Quality Score
The Top Unknown Barcodes table in the HTML report provides the count and sequence for the 10 most common unmapped index adapters in each lane.