NovaSeq X software update 1.3 improves data quality and customer usability

Published January 10, 2025

The first shipments of the NovaSeq X Series introduced XLEAP-SBS chemistry on Illumina’s next-generation high-throughput platform. With the 25B flow cell, NovaSeq X Series software update 1.2 enabled up to 16 terabases in a dual flow cell sequencing run. Software update 1.3 continues the evolution of the NovaSeq X Series by enabling significant improvements to data quality and instrument robustness.

In this article, we highlight:

  • Higher sequencing data quality, including significant improvements in flow cell yield, sample demultiplexing, and accuracy under low-diversity conditions
  • Enhanced NGS secondary analysis, including an upgrade to DRAGEN v4.3, faster BCL Convert run times, and more flexible configuration options
  • Improved usability, with a reduction in the PhiX spike-in requirements for low-diversity applications, the elimination of special recipes for RNA applications, and turnkey integration with a LIMS

Higher sequencing data quality

Higher yield:

With software update 1.3, higher yield was achieved through optimizations of the sequencing recipe—the set of events driving the sequencing process—resulting in enhanced signals emanating from the flow cell nanowells. Improvements were achieved through changes to thermal and fluidic steps, including adjustments to rates and dwell times. Key steps in the clustering protocol were optimized to grow brighter, more robust clusters. The post-run wash was improved to more efficiently flush out the fluidic system between runs. These changes have resulted in higher signal-to-noise ratios, which in turn improve yield and quality scores.

In addition to recipe improvements, NovaSeq X software update 1.3 includes image processing algorithm improvements that are critical to Illumina’s pursuit of higher data quality. On high-throughput flow cells, the alignment of image pixels to well locations in the patterned flow cell is particularly challenging, since well location estimation can be affected by jitter and distortion. Jitter is usually caused by unintended camera movement, instrument vibration, or image repositioning, whereas distortion is induced by the geometry of the camera lens. In software update 1.3, a novel approach has been implemented that shows superior performance in pixel-to-well location synchronization by reducing spatial errors on the flow cell. The improvement is highlighted in Figure 1. Circles show the actual location of a cluster within a well, while X marks indicate the estimated position. The heat map shows the spatial error in the x (horizontal) direction on a section of the flow cell, with a significant reduction observed after the software 1.3 algorithm is applied.

Figure 1: Spatial well location error before and after the software 1.3 update
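The x-direction error plotted in Figure 1 can be pictured as the difference between the estimated and actual well positions. The following is a minimal sketch, not the algorithm used in software update 1.3: the coordinates are made up, reference positions are assumed to be known, and jitter is modeled as a single shared offset that is removed by subtracting the mean displacement. Lens distortion, which varies across the image, is not modeled.

```python
from statistics import mean

def x_errors(actual, estimated):
    """Signed x-direction error (estimated minus actual) for each well."""
    return [est[0] - act[0] for act, est in zip(actual, estimated)]

def remove_shared_offset(actual, estimated):
    """Translation-only correction: subtract the mean displacement in x and y."""
    dx = mean(est[0] - act[0] for act, est in zip(actual, estimated))
    dy = mean(est[1] - act[1] for act, est in zip(actual, estimated))
    return [(x - dx, y - dy) for x, y in estimated]

# Hypothetical positions in pixels; these are not measured data.
actual = [(10.0, 10.0), (20.0, 10.0), (30.0, 10.0)]
estimated = [(10.5, 10.0), (20.5, 10.0), (30.5, 10.0)]

before = mean(abs(e) for e in x_errors(actual, estimated))
corrected = remove_shared_offset(actual, estimated)
after = mean(abs(e) for e in x_errors(actual, corrected))
print(f"mean |x error| before: {before:.2f} px, after: {after:.2f} px")
```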

Recipe enhancements, along with jitter and distortion correction, contribute prominently to the improved yield on the system. These improvements are highlighted in Figures 2 and 3, which compare NovaSeq X software 1.2 with software update 1.3 using TruSeq DNA PCR-Free (TSPF) and Illumina DNA PCR-Free (IDPF) libraries. For all flow cell types on the system, the distribution of clusters passing filter (PF) shifts markedly to the right. The distribution has also narrowed, reflecting lower run-to-run variation. The result is an increase in flow cell yield.

Figure 2: Pass filter distribution for NovaSeq X software 1.2 for three flow cell types

Figure 3: Pass filter distribution for NovaSeq X software update 1.3 for three flow cell types

This improvement in pass filter rate was achieved at no cost to base quality. In fact, the percentage of bases that exceed Q30, when averaged over all cycles, also improves with NovaSeq X software update 1.3. The improvement is shown in Figure 4 for the IDPF and TSPF libraries over all three NovaSeq X flow cell types.

Figure 4: Improved mean % > Q30 for all three NovaSeq X flow cell types
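For readers less familiar with the metric, a quality score Q relates to the estimated base-call error probability P by the Phred definition Q = -10 log10(P), so Q30 corresponds to a 1-in-1,000 chance of a miscall. The sketch below computes the Q30 percentage from a list of per-base scores. The scores are invented for illustration, and counting scores at or above 30 (rather than strictly above) is an assumption about the convention.

```python
def error_probability(q):
    """Phred definition: Q = -10 * log10(P), so P = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)

def percent_q30(qualities, threshold=30):
    """Percentage of base calls with a quality score at or above the threshold."""
    if not qualities:
        return 0.0
    passing = sum(1 for q in qualities if q >= threshold)
    return 100.0 * passing / len(qualities)

# Hypothetical per-base quality scores, for illustration only.
scores = [38, 38, 24, 38, 11, 38, 38, 24, 38, 38]
print(f"%Q30: {percent_q30(scores):.1f}")
print(f"error probability at Q30: {error_probability(30):.4f}")
```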

Improved sample demultiplexing:

Usable yield depends not only on the number of clusters on the flow cell passing filter, but also on the number of reads demultiplexed into appropriate samples.

Sample demultiplexing depends on base-calling accuracy during index cycles, and that can vary depending on the plexity—the number of samples per lane—used in the run. Miscalls during index cycles are measured, with mismatch rates of zero, one, and two bases reported for the run. NovaSeq X software 1.2 shows relatively lower demultiplexing efficiency (that is, higher mismatch rates), causing a loss of throughput for some samples, especially in low-plexity runs (4–10 samples per lane). With software 1.2, some indexes show a cycle-dependent drop in the expected intensity, causing a higher rate of miscalls for that index and hence a reduction in the overall demultiplexing efficiency.
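To make the mismatch-rate terminology concrete, the sketch below assigns index reads to samples by Hamming distance and tallies how many reads matched with zero or one mismatch. It is a simplified illustration, not the BCL Convert implementation: the index sequences, the reads, and the rule of rejecting ties are all assumptions.

```python
from collections import Counter

def hamming(a, b):
    """Number of positions at which two equal-length index reads differ."""
    return sum(x != y for x, y in zip(a, b))

def assign(read_index, sample_indexes, max_mismatches=1):
    """Return (sample, mismatches) for the closest index, or (None, None)."""
    scored = sorted((hamming(read_index, idx), name)
                    for name, idx in sample_indexes.items())
    best, best_name = scored[0]
    ambiguous = len(scored) > 1 and scored[1][0] == best
    if best > max_mismatches or ambiguous:
        return None, None
    return best_name, best

# Hypothetical index sequences and reads, for illustration only.
samples = {"S1": "ACGTACGT", "S2": "TTGCAAGC"}
reads = ["ACGTACGT", "ACGTACGA", "TTGCAAGC", "GGGGGGGG"]

counts = Counter(assign(r, samples)[1] for r in reads)
total = len(reads)
for k in (0, 1):
    print(f"{k}-mismatch rate: {counts[k] / total:.2f}")
print(f"undetermined: {counts[None] / total:.2f}")
```

A higher one-mismatch rate means more index reads needed error tolerance to be assigned, which is why it serves as a proxy for base-calling accuracy during index cycles.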

In NovaSeq X software update 1.3, an improvement was made to mitigate that issue: the expected levels of intensity are now adaptively learned, and that knowledge is transferred into the base-calling algorithm during index cycles. The resulting improvement is illustrated in Figure 5A, which shows a marked reduction in the one-mismatch rate, particularly under low-plexity conditions, with NovaSeq X software update 1.3. Additionally, Figure 5B illustrates that index-dependent mismatch rates are significantly improved with the 1.3 software.

Figure 5A: One-mismatch rate versus sample plexity for IDPF and TSPF
Figure 5B: Mismatch rates versus index sequence for IDPF and TSPF

The improved pass filter rates, combined with the higher demultiplexing efficiency, translate directly to a higher usable yield. Highlighted in Figure 6 is the improvement to usable yield offered by NovaSeq X software update 1.3.

Figure 6: Improved usable yield for NovaSeq X software update 1.3
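Because usable yield is the product of the pass filter rate and the demultiplexing efficiency, gains in each compound. The short calculation below illustrates this with invented numbers; none of the values are taken from Figure 6.

```python
def usable_reads(clusters, pf_fraction, demux_fraction):
    """Reads that both pass filter and are assigned to a sample."""
    return clusters * pf_fraction * demux_fraction

# Hypothetical values chosen for illustration; they are not measured results.
clusters = 1_000_000
baseline = usable_reads(clusters, pf_fraction=0.70, demux_fraction=0.90)
improved = usable_reads(clusters, pf_fraction=0.80, demux_fraction=0.95)
gain = (improved - baseline) / baseline
print(f"{baseline:.0f} -> {improved:.0f} usable reads ({gain:.1%} more)")
```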

Low diversity enhancements:

In some sequencing libraries, the four possible base types are not all sufficiently abundant in the pool of sequenced clusters at a given cycle. Runs in which one or more of the four bases are significantly underrepresented in the data population are low-diversity runs. Maintaining high data quality and base-calling accuracy in low-diversity conditions is challenging for real-time base calling. With NovaSeq X software 1.2, customers with low-diversity applications are advised to spike in a sufficient level of PhiX, a high-diversity library, along with the actual sequenced sample. When a sufficient level of spiked-in PhiX (for example, 15%) is used, the low-diversity nature of the library is masked from the downstream algorithms, maintaining data quality comparable to that of high-diversity libraries. However, that spike-in requirement comes at the cost of the throughput occupied by the PhiX library (in this example, 15%).
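One way to see whether a library is low diversity is to tabulate the base composition at each cycle. The sketch below does this for a handful of invented reads and flags cycles where any base falls below a cutoff. The 10% cutoff is an arbitrary assumption for illustration, not a threshold used by the instrument software.

```python
from collections import Counter

def base_fractions(reads, cycle):
    """Fraction of A, C, G, and T across all reads at one cycle (0-based)."""
    counts = Counter(read[cycle] for read in reads)
    total = sum(counts[b] for b in "ACGT")
    return {b: counts[b] / total for b in "ACGT"}

def low_diversity_cycles(reads, min_fraction=0.10):
    """Cycles where at least one base falls below min_fraction."""
    n_cycles = min(len(r) for r in reads)
    return [c for c in range(n_cycles)
            if min(base_fractions(reads, c).values()) < min_fraction]

# Hypothetical reads in which every read starts with the same base.
reads = ["TACG", "TCGA", "TGAT", "TTCC", "TAGT", "TCTA", "TGCG", "TTAC"]
print("low-diversity cycles:", low_diversity_cycles(reads))
print("cycle 0 composition:", base_fractions(reads, 0))
```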

NovaSeq X software update 1.3 offers improved base-calling performance under low-diversity settings, resulting in a much lower PhiX spike-in requirement (5%). (Note that for libraries in which the level of diversity may vary significantly from cycle to cycle, the previous guideline of 15% PhiX spike-in still applies.) The performance improvement is shown in Figures 7A and 7B, which indicate that a high percentage of clusters passing filter and a high percentage of bases > Q30 can be maintained for low-diversity libraries with NovaSeq X software update 1.3, even at low PhiX spike-in levels.

Figure 7A: Percentage pass filter versus PhiX spike-in
Figure 7B: Percentage > Q30 versus PhiX spike-in
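The throughput effect of the lower spike-in can be worked out directly. The calculation below assumes, as the text does, that the PhiX percentage equals the share of reads lost to the sample library; it uses only the 15% and 5% levels cited above.

```python
def sample_fraction(phix_percent):
    """Share of reads left for the sample library after the PhiX spike-in."""
    return 1.0 - phix_percent / 100.0

old = sample_fraction(15)   # spike-in level cited for software 1.2
new = sample_fraction(5)    # spike-in level cited for software update 1.3
relative_gain = (new - old) / old
print(f"sample share: {old:.0%} -> {new:.0%} (+{relative_gain:.1%} relative)")
```

Under that assumption, moving from a 15% to a 5% spike-in leaves roughly 12% more reads for the sample library.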

Enhanced secondary analysis

Upgrade to DRAGEN v4.3:

The upgrade to DRAGEN v4.3 brings significant accuracy improvements, driven by an update to the multigenome mapper and pangenome reference. The pangenome reference now extends the sample population to 128, covering 26 different ancestries. Advancements in the multigenome mapper take better advantage of the new reference, reducing SNP false positives (FP) plus false negatives (FN) by 49.0%, and indel FP + FN by 19.6%, on average across populations, compared to DRAGEN v4.1. These improvements are demonstrated in Figure 8, which shows a reduction ranging from 35% to 50% across the set of HG001–HG007 samples. Errors in European ancestry samples are also reduced by 40.2% compared to the pangenome reference of DRAGEN v4.1, with a reduction of 47.0% on non-European samples.

Figure 8: SNP and indel false positive + false negative counts for DRAGEN v4.1.23 and v4.3.13
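The reductions quoted above are relative changes in the combined FP + FN count. As a reminder of how such a figure is derived, here is a minimal calculation; the counts are invented and are not taken from Figure 8.

```python
def percent_reduction(errors_before, errors_after):
    """Relative reduction in combined FP + FN, as a percentage."""
    return 100.0 * (errors_before - errors_after) / errors_before

# Hypothetical counts for one sample; not taken from Figure 8.
before = {"fp": 4000, "fn": 6000}
after = {"fp": 2500, "fn": 3500}
reduction = percent_reduction(sum(before.values()), sum(after.values()))
print(f"FP + FN reduction: {reduction:.1f}%")
```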

Faster BCL Convert:

In addition to data quality improvements, BCL Convert has been accelerated when either the on-instrument version or the BaseSpace Sequence Hub application is used. Figure 9A shows a nearly 5× reduction in run time of the newly released on-instrument DRAGEN v4.3.13 compared to DRAGEN v4.1.23. The run-time improvement is achieved by a combination of on-instrument acceleration and more flexible configuration options—for example, no lane splitting (NLS) and FASTQC metric generation, in addition to the previously provided ORA configuration. The configuration options and corresponding run times for a 25B flow cell are shown in Figure 9B.

Figure 9A: BCL Convert run time improvements (hours) for a 25B flow cell
Figure 9B: BCL Convert run time versus configuration parameters

More flexible configurations:

In addition to run time improvements, DRAGEN BCL Convert v4.3.13 now supports additional configuration options (for example, sample projects and no lane splitting). In addition, more flexibility has been added to flow cell configurations; DRAGEN v4.3.13 now supports up to 12 workflow / genome pairs on a single flow cell, with up to 32 unique configurations per workflow / genome pair. The increased flexibility is summarized in Table 1.

DRAGEN version    Workflow / genome pairs    Configurations per pair
v4.1.23           3 + BCL Convert            8
v4.3.13           12 + BCL Convert           32

Table 1: Configuration flexibility for DRAGEN v4.1.23 versus DRAGEN v4.3.13
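When planning a flow cell, the limits in Table 1 can be checked before a run is set up. The helper below is a hypothetical sketch, not part of any Illumina tool: the function name, the plan structure, and the workflow and genome names are all placeholders. It treats BCL Convert as separate from the pair count, as the table suggests.

```python
# Limits per DRAGEN version, taken from Table 1.
LIMITS = {
    "v4.1.23": {"pairs": 3, "configs_per_pair": 8},
    "v4.3.13": {"pairs": 12, "configs_per_pair": 32},
}

def check_plan(plan, version="v4.3.13"):
    """Return a list of problems; an empty list means the plan fits the limits.

    plan maps a (workflow, genome) pair to its number of unique configurations.
    """
    limits = LIMITS[version]
    problems = []
    if len(plan) > limits["pairs"]:
        problems.append(f"{len(plan)} pairs exceeds {limits['pairs']}")
    for pair, n_configs in plan.items():
        if n_configs > limits["configs_per_pair"]:
            problems.append(f"{pair}: {n_configs} configurations exceeds "
                            f"{limits['configs_per_pair']}")
    return problems

# Hypothetical plan; workflow and genome names are placeholders.
plan = {("germline", "hg38"): 10, ("rna", "hg38"): 4,
        ("enrichment", "hg38"): 2, ("germline", "hg19"): 1}
print("v4.3.13:", check_plan(plan, "v4.3.13"))
print("v4.1.23:", check_plan(plan, "v4.1.23"))
```

The same plan passes under the v4.3.13 limits but fails under v4.1.23 on both the number of pairs and the configurations per pair.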


BAM/CRAM transfers to the cloud:

Many users of the NovaSeq X System execute mapping and alignment directly on the instrument using DRAGEN, with specialized post-processing (for example, custom variant calling) executed offline. With this approach, users can exploit DRAGEN’s fast processing times, since the mapping and alignment will finish before the next sequencing runs start. With software update 1.3, these users can now send FASTQ/ORA and BAM/CRAM directly to BaseSpace Sequence Hub.

Further details on DRAGEN v4.3.13 for NovaSeq X can be found in the release notes.

Improved usability

RNA processing without special recipes:

The addition of adapters in some library preparation methods results in low base diversity (for example, a single T base) within the first cycles of a sequencing run. Historically, this has been addressed using custom recipes with “dark cycles”—that is, cycles in which no imaging is performed. However, these custom recipes may not be compatible on the same flow cell with other libraries that require first-cycle imaging.

With improvements in NovaSeq X software update 1.3, custom dark cycle recipes are no longer required for sequencing these unique libraries. As a result, these libraries may be multiplexed on the same flow cell to more flexibly enable multiomic use cases. The improvement is shown in Table 2, which compares primary metrics for the sequencing of RNA libraries under three conditions: software 1.2 with no dark cycle recipe (a force failure condition), software 1.2 with a custom dark cycle recipe, and NovaSeq X software update 1.3 with the default (no dark cycle) recipe. As expected in the force failure condition, the resulting sequencing quality is poor, with a high error rate and a low percentage of clusters passing filter (%PF). When no dark cycle recipe is used with software 1.3, the sequencing quality is improved, and it benefits further from the %PF improvements in the software 1.3 recipe, with error rate and pass filter metrics comparable to software 1.2 with a dark cycle recipe. Note that for those who wish to continue to use dark cycle recipes, software 1.3 includes custom dark cycle recipes that also incorporate the %PF recipe improvements.

Table 2: Percentage error and percentage cluster passing filter improvements for software 1.3 with no dark cycles

Secondary analysis results also show good consistency between NovaSeq X software update 1.3 with no dark cycle recipe and software 1.2 with a dark cycle recipe, as shown in Figure 10. A correlation analysis of gene expression, measured in transcripts per million (TPM), identified high concordance, with R² = 0.999 for both 1.5B and 10B flow cells. Note that with software 1.3, the first-cycle T base must be trimmed using override cycles in BCL Convert (N1Y74;I10;I10;N1Y74).

Figure 10: High TPM concordance: software 1.2 with dark cycles versus software 1.3 without dark cycles
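The override cycles string tells BCL Convert how to treat each cycle of each read. The sketch below decodes such a string, assuming the semicolon-delimited form that BCL Convert uses to separate reads (N1Y74;I10;I10;N1Y74), where N marks cycles to skip, Y marks sequence read cycles, and I marks index cycles. The parser itself is illustrative and not part of BCL Convert.

```python
import re

def parse_override_cycles(spec):
    """Split an override-cycles string into per-read lists of (code, cycles)."""
    reads = []
    for segment in spec.split(";"):
        parts = re.findall(r"([A-Z])(\d+)", segment)
        reads.append([(code, int(n)) for code, n in parts])
    return reads

# N = skipped (trimmed) cycles, Y = sequence read cycles, I = index cycles.
spec = "N1Y74;I10;I10;N1Y74"
parsed = parse_override_cycles(spec)
for read in parsed:
    print(read, "total cycles:", sum(n for _, n in read))
```

Here the leading N1 in each sequence read drops the first cycle, which carries the low-diversity T base, and keeps the remaining 74 cycles.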

Illumina Clarity LIMS integration:

Illumina Run Manager provides turnkey integration between Clarity LIMS and the NovaSeq X System for customers who are fully on-premises.

Key features include:

  • Out-of-the-box NovaSeq X workflows with step-by-step guidance on pooling, diluting, denaturing, etc.
  • Sequencing run and analysis planning
  • Sequencing run status and metrics tracking
  • Sample sheet validation
  • Analysis run status and metrics tracking

Third-party LIMS support via APIs:

With the latest updates, it is easier than ever for NovaSeq X users to integrate third-party or home-brewed LIMS through robust API capabilities. Here's how these enhancements simplify and streamline your LIMS workflow:

  • Secure authorization via bearer tokens: Users can now create an authorization client, securely generate credentials, and use them to obtain a bearer token. This token allows seamless and secure interaction with Illumina Run Manager (IRM) endpoints.
  • Real-time notifications with webhooks: Stay informed with event-triggered notifications! Users can set up webhooks to automatically deliver updates to external web servers whenever specific events occur in IRM, ensuring timely and efficient monitoring.
  • Automated run management: Simplify your operations with APIs that enable authorized clients to create planned runs directly from your LIMS. These runs can then be selected and initiated on the instrument, reducing manual intervention and saving time.
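The three capabilities above can be pictured with a small client-side sketch. Everything specific in it is an assumption made for illustration: the host, the endpoint path, the payload fields, and the webhook field names are invented and are not the documented IRM API. The code only builds a request and parses a payload; it sends nothing over the network.

```python
import json
from urllib.request import Request

# Placeholder values: the host, path, and payload fields below are invented
# for illustration and are NOT the documented IRM API.
BASE_URL = "https://irm.example.local"
PLANNED_RUNS_PATH = "/api/planned-runs"  # hypothetical endpoint

def planned_run_request(token, run):
    """Build (but do not send) a POST that carries a bearer token."""
    return Request(
        BASE_URL + PLANNED_RUNS_PATH,
        data=json.dumps(run).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )

def handle_webhook(body):
    """Parse a webhook delivery and return (event type, run name)."""
    event = json.loads(body)
    return event.get("eventType"), event.get("runName")  # hypothetical fields

req = planned_run_request("TOKEN-FROM-AUTH-CLIENT", {"runName": "demo-run"})
print(req.get_method(), req.full_url)
print(handle_webhook(b'{"eventType": "RunCompleted", "runName": "demo-run"}'))
```

In a real integration, the token would come from the authorization client described above, and the actual endpoint paths and payload schema would be taken from the IRM API documentation.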

Simpler disk space management:

With the high throughput of the NovaSeq X System comes the challenge of managing high volumes of data. NovaSeq X software update 1.3 simplifies data management with two new features. First, the user can elect for the system to automatically delete secondary analysis data from the instrument once the data has been successfully transferred to the off-instrument storage location. Second, the user can start a second sequencing run as long as there is sufficient room to store its primary analysis data on the instrument. With the latter feature, the user can remove data from prior runs from the instrument at any time before the end of the sequencing run. If secondary analysis starts and there is insufficient storage, the secondary analysis run is automatically cancelled.
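The storage rules described above amount to two checks: one when the run starts and one when secondary analysis starts. The sketch below models them with invented capacities and run sizes; the class and function names are hypothetical and do not correspond to the instrument software.

```python
from dataclasses import dataclass

@dataclass
class Storage:
    capacity_tb: float
    used_tb: float

    @property
    def free_tb(self):
        return self.capacity_tb - self.used_tb

def can_start_run(storage, primary_tb):
    """A run may start if its primary analysis data fits on the instrument."""
    return storage.free_tb >= primary_tb

def secondary_outcome(storage, secondary_tb):
    """Secondary analysis is cancelled if there is not enough room for it."""
    return "run" if storage.free_tb >= secondary_tb else "cancelled"

# Hypothetical sizes in terabytes; real capacities and run sizes will differ.
disk = Storage(capacity_tb=10.0, used_tb=7.0)
print("can start:", can_start_run(disk, primary_tb=2.0))
disk.used_tb += 2.0   # primary data written during the run
print("no cleanup:", secondary_outcome(disk, secondary_tb=3.0))
disk.used_tb -= 5.0   # operator removed data from prior runs in time
print("after cleanup:", secondary_outcome(disk, secondary_tb=3.0))
```

The example shows why clearing prior-run data before the sequencing run ends matters: the run can start with limited space, but secondary analysis proceeds only if enough room has been freed by then.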

Coming next

Staggered start: Will give users the ability to initiate a run on a second flow cell while the first flow cell is still in progress.

File-based LIMS integration: Will offer run automation after first loading consumables, to minimize operator error and to enforce traceability for clinical specialty and biopharma customers.

1.5B 600 Cycle Kit: Will unlock the high-throughput tier for applications like shotgun metagenomics, immune repertoire profiling, and amplicon sequencing, enabling deeper coverage and longer read lengths in a single run.

Data quality enhancements: Will provide additional data quality improvements, ensuring higher accuracy and reliability for customer workflows.

Conclusion

Software update 1.3 on the NovaSeq X Series enables significant improvements to both data quality and instrument usability. The higher data quality is reflected in improved per-sample yield, achieved through changes to the sequencing recipe, image processing algorithms, and low-diversity enhancements. Additionally, secondary analysis has been enhanced with the upgrade to DRAGEN v4.3 and includes faster BCL Convert run times and more flexible configuration options. Finally, the update offers improved usability, with a reduction in the PhiX spike-in requirements for low-diversity applications, the elimination of special recipes for RNA applications, and turnkey integration with a LIMS.