Long-Read Sequencing Technology

Deeper insights into complex genomic regions with long reads

Long-read sequencing technology can help resolve challenging regions of the genome, such as highly repetitive and highly homologous regions.

What is long-read sequencing?

Long-read sequencing is a DNA sequencing approach that enables the sequencing of much longer DNA fragments than traditional short-read sequencing methods. While short reads can capture the majority of genetic variation, long-read sequencing allows the detection of complex structural variants in the genome that may be difficult to detect with short reads. These include large inversions, deletions, or translocations, some of which have been implicated in genetic disease.

Advantages of long-read sequencing

Long-read sequencing technology can help resolve challenging regions of the genome by sequencing thousands of bases to:

  • Resolve traditionally difficult-to-map genes or regions of the genome, such as those containing highly variable or highly repetitive elements
  • Perform phased sequencing to identify co-inherited alleles, haplotype information, and phase de novo mutations
  • Generate long reads for de novo assembly and genome finishing applications
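
The first point in the list above can be made concrete with a small sketch. It checks whether a read covers an entire repetitive element plus some unique flanking sequence on both sides, which is what lets a read be placed unambiguously. The function name, the coordinates, and the 50-base flank are illustrative assumptions, not values from any Illumina product.

```python
def spans_repeat(read_start, read_end, repeat_start, repeat_end, min_flank=50):
    """Return True if the read covers the whole repeat plus at least
    `min_flank` unique bases on each side.

    Coordinates are 0-based and half-open: [start, end).
    `min_flank` is an assumed, illustrative threshold.
    """
    return (read_start <= repeat_start - min_flank
            and read_end >= repeat_end + min_flank)


# Hypothetical 5 kb repeat at positions 10,000-15,000.
repeat = (10_000, 15_000)

short_read = (12_000, 12_150)  # 150-base read that lands inside the repeat
long_read = (8_000, 20_000)    # 12 kb read that covers the repeat and both flanks

print(spans_repeat(*short_read, *repeat))  # False
print(spans_repeat(*long_read, *repeat))   # True
```

A read that sits entirely inside the repeat could have come from any copy of it, whereas a read anchored in unique sequence on both sides can only be placed in one location.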
Illumina technology innovations

We have a broad range of innovations in development, including constellation mapped read technology, which uses a highly simplified NGS workflow that enables on-flow cell library prep and standard short-reads with cluster proximity information.


Long-range genomic insights

Illumina mapped read technology provides long-distance genomic information that can help scientists detect large structural variants and resolve challenging-to-map regions. While technically a short-read technology, Illumina mapped reads leverage on-flow cell library preparation and novel informatics that incorporate proximity information from clusters in neighboring nanowells to generate accurate long-range genomic insights. The unique workflow maintains the link between the original long DNA template and the resulting short sequencing reads, enabling enhanced detection of structural variants, ultra-long phasing of genetic variants, and improved mapping in low-complexity regions.


How does Illumina mapped read technology work?

DNA templates are extracted from samples using standard or high molecular weight methods and introduced directly to the flow cell surface, where they are captured, transformed into clusters, and sequenced. By introducing long DNA templates directly to the flow cell, proximal nanowells produce a constellation-like pattern that allows clusters to be mapped back to the original template using novel algorithms in DRAGEN secondary analysis. This significantly improves mapping reads to a reference genome and allows scientists to unlock long-range genomic insights with the accuracy and scalability of short-read SBS sequencing.
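
As a toy illustration of the proximity idea only, the sketch below groups clusters whose flow cell coordinates lie close together, on the assumption that nearby clusters are more likely to derive from the same long template. This is not the DRAGEN algorithm, which is not described here. The coordinates, the distance threshold, and the single-linkage grouping are all assumptions chosen for simplicity.

```python
from math import hypot


def group_by_proximity(clusters, max_dist):
    """Single-linkage grouping of clusters by distance on the flow cell.

    clusters: list of (x, y) positions (hypothetical nanowell coordinates).
    max_dist: assumed distance threshold for treating two clusters as neighbors.
    Returns one group label per cluster; equal labels mean the same group.
    """
    parent = list(range(len(clusters)))

    def find(i):
        # Follow parent links to the group representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, (xi, yi) in enumerate(clusters):
        for j in range(i + 1, len(clusters)):
            xj, yj = clusters[j]
            if hypot(xi - xj, yi - yj) <= max_dist:
                parent[find(j)] = find(i)  # merge the two groups

    return [find(i) for i in range(len(clusters))]


# Two hypothetical "constellations" far apart on the flow cell.
clusters = [(0, 0), (1, 0), (2, 1), (50, 50), (51, 50)]
print(group_by_proximity(clusters, max_dist=2))  # [0, 0, 0, 3, 3]
```

In this toy example the first three clusters form one group and the last two form another. A production method would also have to use the sequence alignments themselves, since unrelated templates can land near each other.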

Additional benefits of long-read sequencing

Long-read sequencing technology has the potential to improve the efficiency and accuracy of some existing DNA sequencing applications while increasing the resolution of some clinically important genes.

These advantages allow for the phased resequencing of human genomes and rapid de novo sequencing of plant and animal genomes.

In phasing applications, the long reads produced typically span more than one heterozygous SNP, which helps assign them to the correct maternal or paternal chromosome.

Long reads can also span large repetitive motifs, which simplifies mapping in challenging sequences and de novo sequencing.
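
The link between read length and phasing can be sketched in a few lines. A read is informative for phasing only if it overlaps at least two heterozygous SNPs, because only then does it show which alleles travel together on one chromosome. The SNP positions, read coordinates, and function names below are hypothetical and serve only to illustrate the idea.

```python
from bisect import bisect_left


def het_snps_in_read(read_start, read_end, het_positions):
    """Return the heterozygous SNP positions inside [read_start, read_end).

    het_positions must be sorted in ascending order.
    """
    lo = bisect_left(het_positions, read_start)
    hi = bisect_left(het_positions, read_end)
    return het_positions[lo:hi]


def is_phase_informative(read_start, read_end, het_positions):
    """A read links alleles only if it overlaps two or more het SNPs."""
    return len(het_snps_in_read(read_start, read_end, het_positions)) >= 2


# Hypothetical heterozygous SNP positions on one chromosome (sorted).
het_positions = [1_200, 4_800, 9_500, 15_300]

short_read = (4_700, 4_850)   # 150-base read: overlaps one SNP
long_read = (1_000, 10_000)   # 9 kb read: overlaps three SNPs

print(is_phase_informative(*short_read, het_positions))  # False
print(is_phase_informative(*long_read, het_positions))   # True
```

With heterozygous SNPs spaced thousands of bases apart, as in this example, a 150-base read rarely overlaps two of them, while a read of several kilobases often does.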

Alternative long-range genomics technology: Linked reads

Transposase enzyme-linked long-read sequencing (TELL-Seq) technology uses linked reads to generate non-contiguous, long-range data to inform de novo assembly or ultra-long distance (> 1 Mb) phasing. This alternative sequencing data type can be used to complement standard short reads for novel or complex genomes.

Ultra-long-range phasing with TELL-Seq

TELL-Seq technology generates ultra-long phasing blocks, providing an accessible solution to perform genome phasing studies.

Microbial de novo assembly with TELL-Seq

TELL-Seq demonstrates exceptional performance for microbial WGS, even for challenging samples or regions with high GC content.

Insect genome assembly with TELL-Seq

Learn how researchers use transposase enzyme-linked long-read sequencing (TELL-Seq) to sequence and assemble genomes of nine insect species in this recorded webinar.

FAQ
Illumina short-read sequencing by synthesis (SBS) produces highly accurate reads 50 to 600 DNA bases long. Short-read SBS is easily scalable and can be targeted to focused parts of genomes up to full genomes. Long-read sequencing can produce reads tens of thousands of bases long. Long-read sequencing has some advantages for sequencing genomes without a complete reference, identifying large structural rearrangements, and sequencing through low-complexity regions of a genome.

The main advantages of Illumina short-read sequencing are the high accuracy of SBS chemistry, flexibility of assay designs from small panels up to full genomes, and scalability with a range of instruments and solutions for everything from investigating individual genomes to large-scale population initiatives. While short-read sequencing is able to sequence the vast majority of the human genome, there is a very small percentage with repetitive motifs, motifs with homology to other genomic regions, or large structural elements that can be difficult to resolve. Illumina mapped read technology provides long-distance genomic information that can help scientists resolve these challenging-to-map regions.


It can be advantageous to combine long-read data with complementary short-read information. Many long-read sequencing technologies have laborious workflows as well as highly variable results (references 1-4). Short reads (typically 50–600 bp) offer high data quality and sequencing depth at low cost. With advanced data analysis, short-read sequencing can generate whole-genome variant calls with outstanding accuracy. In addition, a small fraction of genomic regions can benefit from long-read information to improve resolution of difficult-to-map genes.

Linked-read sequencing is a category of methods that aim to combine the long-distance information of long-read sequencing with the accuracy of short-read sequencing. These methods modify long DNA templates to introduce a chemical tag or sequence barcode that is used during the analysis to map longer sequences. The main downside to linked-read sequencing is that the template modification and additional analysis steps increase complexity and cost, which limits their usability and scalability for many labs. Long reads may still be advantageous when mapping genomes without a reference, accurately mapping in repetitive regions, phasing of variants, and identifying large structural variants.

Synthetic long-read sequencing involves tagging long DNA templates with unique sequence barcodes or chemical tags, as well as fragmenting the DNA. The DNA fragments are then sequenced on a short-read sequencing instrument and assembled into synthetic long reads using specialized bioinformatics software. The software uses the barcodes to map the DNA fragments back to the original long DNA templates. The main drawback to synthetic long-read sequencing is that the DNA template modifications and additional computational analysis steps increase complexity and cost, which limits usability and scalability for many labs.
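
The barcode step described above can be sketched as follows. Short reads that share a barcode are assumed to come from the same long template, so grouping aligned reads by barcode recovers the approximate genomic span of that template. This is a minimal illustration under stated assumptions, not the method used by any specific product: the barcodes, coordinates, and function name are hypothetical, and the sketch ignores the case where several templates share one barcode.

```python
def barcode_spans(reads):
    """Compute the genomic span covered by each barcode on each chromosome.

    reads: iterable of (barcode, chrom, start, end) for aligned short reads.
    Returns {(barcode, chrom): (min_start, max_end)}.
    Assumes one template per barcode per chromosome, which real data
    does not guarantee.
    """
    spans = {}
    for barcode, chrom, start, end in reads:
        key = (barcode, chrom)
        if key in spans:
            cur_start, cur_end = spans[key]
            spans[key] = (min(cur_start, start), max(cur_end, end))
        else:
            spans[key] = (start, end)
    return spans


# Hypothetical aligned short reads carrying template barcodes.
reads = [
    ("AAAC", "chr1", 1_000, 1_150),
    ("AAAC", "chr1", 30_000, 30_150),
    ("GGTT", "chr1", 500_000, 500_150),
    ("AAAC", "chr1", 52_000, 52_150),
]

for (barcode, chrom), (start, end) in barcode_spans(reads).items():
    print(barcode, chrom, start, end, end - start)
# AAAC chr1 1000 52150 51150
# GGTT chr1 500000 500150 150
```

Here three 150-base reads with the same barcode together imply a template of roughly 51 kb, even though no single read is longer than 150 bases.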

Related applications

Rare disease whole-genome sequencing

Whole-genome sequencing is the most comprehensive test for rare disease, with the potential for superior diagnostics and outcomes.

Human whole-genome sequencing

Human whole-genome sequencing provides the most detailed view into the complex genetic variants that make us unique.

Cancer whole-genome sequencing

Get a comprehensive base-by-base view of the unique genomic abnormalities in cancer.

Related Solutions

Sequencing platforms

Compare next-generation sequencing (NGS) platforms by application, throughput, and other key specs. Find tools to help you choose the right sequencer.

Technological advancements

Read articles about recent genomics breakthroughs and advances in bioinformatics and clinical research from Illumina scientists and thought leaders.

DNA sequencing

During DNA sequencing, the bases of a fragment of DNA are identified. Illumina DNA sequencers can produce terabases of sequence data from a single run.

References
  1. Pacific Biosciences. Preparing DNA for PacBio HiFi sequencing—Extraction and quality control. pacb.com/wp-content/uploads/Technical-Note-Preparing-DNA-for-PacBio-HiFi-Sequencing-Extraction-and-Quality-Control.pdf.
  2. Pacific Biosciences. Preparing whole genome and metagenome libraries using SMRTbell prep kit 3.0. pacb.com/wp-content/uploads/Procedure-checklist-Preparing-whole-genome-and-metagenome-libraries-using-SMRTbell-prep-kit-3.0.pdf.
  3. Oxford Nanopore Technologies. Ligation Sequencing Kit. store.nanoporetech.com/us/ligation-sequencing-kit110.html.
  4. Pacific Biosciences. Low Yield Troubleshooting Guide. pacb.com/wp-content/uploads/Guide-Low-Yield-Troubleshooting.pdf.