Choose your preferred format for downstream analysis of sequencing data

Numerous options are available for converting data to compatible sequence file formats such as FASTQ files, and for downstream analysis of next-generation sequencing (NGS) data. Illumina sequencing systems are designed so data can be easily streamed into cloud-based Illumina informatics platforms for data management, analysis, and collaboration.
Raw data files are provided in sequence file formats that are compatible with, or easily converted to, standardized data formats for streamlined aggregation and mining of large cohorts.
FASTQ is a text-based sequencing data file format that stores both raw sequence data and quality scores. FASTQ files have become the standard format for storing NGS data from Illumina sequencing systems, and can be used as input for a wide variety of secondary data analysis solutions.
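As an illustration of how sequence and quality data sit together in a FASTQ file, the following minimal Python sketch reads four-line records (header, sequence, separator, quality string) and decodes the quality string into numeric Phred scores. It assumes unwrapped four-line records and the Phred+33 ASCII offset used by current Illumina output; the names `FastqRecord`, `parse_fastq`, and `phred_scores` are hypothetical and are not part of any Illumina software.

```python
import io
from typing import Iterator, List, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    read_id: str    # header line without the leading "@"
    sequence: str   # base calls, e.g. "GATTACA"
    quality: str    # ASCII-encoded quality string, one character per base


def parse_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a FASTQ stream, assuming four lines per record."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        header = header.rstrip("\r\n")
        if not header:
            continue  # tolerate blank lines between records
        sequence = handle.readline().rstrip("\r\n")
        separator = handle.readline().rstrip("\r\n")
        quality = handle.readline().rstrip("\r\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"Malformed FASTQ record near header: {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"Sequence/quality length mismatch in {header!r}")
        yield FastqRecord(header[1:], sequence, quality)


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Convert an ASCII quality string to Phred scores (assumes Phred+33)."""
    return [ord(char) - offset for char in quality]


# Usage with an in-memory example record (made-up data for illustration).
example = io.StringIO("@read1\nGATTACA\n+\nIIIIII#\n")
for record in parse_fastq(example):
    print(record.read_id, record.sequence, phred_scores(record.quality))
```

Reading one record at a time keeps memory use constant, which matters because real FASTQ files are usually far too large to load whole.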
FASTQ files can contain millions of entries and range from several megabytes to many gigabytes in size, which often makes them too large to open in a typical text editor. Generally, it is not necessary to view FASTQ files, since they are intermediate output files used as input for tools that perform downstream data analysis.
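When a quick check of a large FASTQ file is needed, it can be streamed line by line rather than opened in an editor. The sketch below counts reads and bases in a plain or gzip-compressed FASTQ file using only the Python standard library. It assumes four-line, unwrapped records; the function names `open_fastq` and `summarize_fastq` and the file path are hypothetical, and fastq.ora files are not handled here.

```python
import gzip
from typing import TextIO, Tuple


def open_fastq(path: str) -> TextIO:
    """Open a FASTQ file as text, transparently handling .gz compression."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def summarize_fastq(path: str) -> Tuple[int, int]:
    """Return (read_count, base_count) without loading the file into memory."""
    reads = 0
    bases = 0
    with open_fastq(path) as handle:
        for line_number, line in enumerate(handle):
            # In a four-line record, the second line (index 1) holds the bases.
            if line_number % 4 == 1:
                reads += 1
                bases += len(line.strip())
    return reads, bases


# Example call (hypothetical path):
# reads, bases = summarize_fastq("sample_R1.fastq.gz")
# print(f"{reads} reads, {bases} bases")
```

Because only one line is held in memory at a time, the same function works for files of any size.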
FASTQ Original Read Archive (ORA) files are lossless data compression files that make it easier to store, manage, and share large NGS data files. This file format reduces file size, time to transfer, and data storage costs. FASTQ ORA files are up to 5x smaller than FASTQ files in traditional fastq.gz format, without compromising data integrity. FASTQ ORA files can be generated with Illumina DRAGEN secondary analysis software.
All fastq.ora files can be read using the free DRAGEN ORA Decompression Software provided by Illumina. Once installed, a simple command can be used to pipe the decompressed output into popular mapping tools such as BWA [1], STAR [2], and Bowtie [3].
Binary base call (BCL) files contain raw data generated by Illumina sequencing systems. The BCL sequence file format requires conversion to FASTQ format for use with user-developed or third-party data analysis tools.
DRAGEN secondary analysis offers rapid BCL conversion to FASTQ files as part of its suite of pipelines. Illumina also offers BCL Convert software to convert BCL files to FASTQ files. BCL Convert is a standalone software solution that demultiplexes data and converts BCL files to standard FASTQ file formats for downstream analysis.
FASTQ files are the typical starting format for sequencing data analysis. However, BaseSpace Sequence Hub can create other file formats that are common to secondary and tertiary analysis programs.
During secondary or tertiary analysis of NGS data, Illumina software platforms and apps often convert FASTQ files to other sequence file formats (eg, .vcf, .bam) as part of the analysis workflow.
DRAGEN secondary analysis pipelines support various NGS experiment types, including genome, exome, transcriptome, and methylome studies.
Store, process, and share large genomic and NGS datasets in the cloud with built-in speed, security, and scalability.