The first shipments of the NovaSeq X Series introduced XLEAP-SBS chemistry on Illumina’s next-generation high-throughput platform. With the 25B flow cell, NovaSeq X Series software update 1.2 enabled up to 16 terabases in a dual flow cell sequencing run. Software update 1.3 continues the evolution of the NovaSeq X Series by enabling significant improvements to data quality and instrument robustness.
In this article, we highlight:
- Higher sequencing data quality, including significant improvements in flow cell yield, sample demultiplexing, and accuracy under low-diversity conditions
- Enhanced NGS secondary analysis, including an upgrade to DRAGEN v4.3, faster BCL Convert run times, and more flexible configuration options
- Improved usability, with a reduction in the PhiX spike-in requirements for low-diversity applications, the elimination of special recipes for RNA applications, and turnkey integration with a LIMS
Higher sequencing data quality
Higher yield:
With software update 1.3, higher yield was achieved through optimizations of the sequencing recipe—the set of events driving the sequencing process—resulting in enhanced signals emanating from the flow cell nanowells. Improvements were achieved through changes to thermal and fluidic steps, including adjustments to rates and dwell times. Key steps in the clustering protocol were optimized to grow brighter, more robust clusters. The post-run wash was improved to more efficiently flush out the fluidic system between runs. These changes have resulted in higher signal-to-noise ratios, which in turn improve yield and quality scores.
In addition to recipe improvements, NovaSeq X software update 1.3 includes image processing algorithm improvements that are critical to Illumina’s pursuit of higher data quality. On high-throughput flow cells, aligning image pixels to well locations in the patterned flow cell is particularly challenging, since well location estimation can be affected by jitter and distortion. Jitter is usually caused by unintended camera movement, instrument vibration, or image repositioning, whereas distortion is induced by the geometry of the camera lens. Software update 1.3 implements a novel approach that improves pixel-to-well location synchronization by reducing spatial errors on the flow cell. The improvement is highlighted in Figure 1. Circles show the actual location of a cluster within a well, while X marks indicate the estimated position. The heat map shows the spatial error in the x (horizontal) direction on a section of the flow cell, with a significant reduction observed after the software 1.3 algorithm is applied.
Improved sample demultiplexing:
Usable yield depends not only on the number of clusters on the flow cell passing filter, but also on the number of reads demultiplexed into appropriate samples.
Sample demultiplexing depends on base-calling accuracy during index cycles, and that can vary depending on the plexity—the number of samples per lane—used in the run. Miscalls during index cycles are measured, with mismatch rates of zero, one, and two bases reported for the run. NovaSeq X software 1.2 shows relatively lower demultiplexing efficiency (that is, higher mismatch rates), causing a loss of throughput for some samples, especially in low-plexity runs (4–10 samples per lane). With software 1.2, some indexes show a cycle-dependent drop in the expected intensity, causing a higher rate of miscalls for that index and hence a reduction in the overall demultiplexing efficiency.
In NovaSeq X software update 1.3, an improvement was made to mitigate that issue: the expected intensity levels are now learned adaptively, and that knowledge is passed to the base-calling algorithm during index cycles. The resulting improvement is illustrated in Figure 5A, which shows a marked reduction in the one-mismatch rate, particularly under low-plexity conditions, with NovaSeq X software update 1.3. Additionally, Figure 5B illustrates that index-dependent mismatch rates are significantly improved with the 1.3 software.
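To make the zero-, one-, and two-mismatch terminology concrete, the following is a minimal sketch of index-based demultiplexing by Hamming distance. It is an illustration only, not the algorithm used by the instrument or by BCL Convert; the function names, the sample names, and the index sequences are all hypothetical.

```python
from collections import Counter


def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length index sequences differ."""
    if len(a) != len(b):
        raise ValueError("index sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def assign_sample(index_read, sample_indexes, max_mismatches=1):
    """Return (sample, mismatches) for a unique best match, else (None, None).

    sample_indexes maps a hypothetical sample name to its expected index.
    """
    scored = sorted(
        (hamming(index_read, idx), name) for name, idx in sample_indexes.items()
    )
    if not scored:
        return None, None
    best_dist, best_name = scored[0]
    if best_dist > max_mismatches:
        return None, None  # too many miscalls to trust the assignment
    if len(scored) > 1 and scored[1][0] == best_dist:
        return None, None  # ambiguous: two samples are equally close
    return best_name, best_dist


def mismatch_summary(index_reads, sample_indexes, max_mismatches=1):
    """Tally reads by mismatch count, plus reads left undetermined."""
    counts = Counter()
    for read in index_reads:
        sample, dist = assign_sample(read, sample_indexes, max_mismatches)
        counts["undetermined" if sample is None else f"{dist}_mismatch"] += 1
    return counts


# Hypothetical two-sample lane with four index reads.
indexes = {"S1": "ACGTACGT", "S2": "TTGCAAGC"}
reads = ["ACGTACGT", "ACGTACGA", "TTGCAAGC", "GGGGGGGG"]
print(mismatch_summary(reads, indexes))
```

In this toy input, one read carries a single miscall and is still assigned, while the all-G read matches neither index and is left undetermined. Fewer miscalls during index cycles shift reads from the one-mismatch and undetermined bins into the zero-mismatch bin, which is the effect the text attributes to software update 1.3.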
Low diversity enhancements:
In some sequencing libraries, not all four base types are sufficiently abundant in the pool of sequenced clusters at a given cycle. Runs in which one or more of the four bases are significantly underrepresented in the data population are referred to as low-diversity runs. Maintaining high data quality and base-calling accuracy in low-diversity conditions is challenging for real-time base calling. With NovaSeq X software 1.2, customers with low-diversity applications are advised to spike in a sufficient level of PhiX, a high-diversity library, along with the actual sequenced sample. When a sufficient level of spiked-in PhiX (for example, 15%) is used, the low-diversity nature of the library is masked from the downstream algorithms, maintaining data quality comparable to that of high-diversity libraries. However, that spike-in requirement comes at the cost of losing the throughput occupied by the PhiX library (in this example, 15%).
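As an illustration of what "low diversity at a given cycle" means, the sketch below computes the per-cycle base composition of a set of reads and flags cycles where any base is underrepresented. The 10% threshold and the function names are assumptions chosen for the example; they are not Illumina's definition of low diversity.

```python
from collections import Counter

BASES = "ACGT"


def base_fractions_per_cycle(reads):
    """For each cycle (read position), return the fraction of A, C, G, T calls."""
    if not reads:
        return []
    n_cycles = min(len(r) for r in reads)
    fractions = []
    for cycle in range(n_cycles):
        counts = Counter(r[cycle] for r in reads)
        total = sum(counts[b] for b in BASES)  # ignores N and other symbols
        fractions.append(
            {b: (counts[b] / total if total else 0.0) for b in BASES}
        )
    return fractions


def low_diversity_cycles(reads, min_fraction=0.10):
    """Return 1-based cycle numbers where any base falls below min_fraction.

    min_fraction is an illustrative threshold, not a vendor specification.
    """
    return [
        i + 1
        for i, fracs in enumerate(base_fractions_per_cycle(reads))
        if min(fracs.values()) < min_fraction
    ]


# Hypothetical reads that all start with T, as some adapter designs produce.
example_reads = ["TACG", "TCGA", "TGAC", "TTTT"]
print(low_diversity_cycles(example_reads))
```

Here only the first cycle is flagged, because every read begins with the same base while later cycles contain all four bases in equal proportion.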
NovaSeq X software update 1.3 offers improved performance for base calling under low-diversity settings, resulting in a much lower requirement for spiked-in PhiX percentage (5%). (Note that for libraries in which the level of diversity may vary significantly from cycle to cycle, the previous guideline of 15% PhiX spike-in still applies.) The performance improvement is shown in Figures 7A and 7B, which indicate that high levels of percentage pass filter and percentage > Q30 data quality can be maintained for low-diversity libraries in NovaSeq X software update 1.3, even at low PhiX spike-in levels.
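The throughput cost of the spike-in can be worked through with simple arithmetic. The sketch below assumes that PhiX consumes output in direct proportion to its spike-in percentage, as the 15% example in the text suggests, and uses the 16-terabase dual flow cell figure quoted earlier purely as an illustrative total; actual usable yield depends on the run.

```python
def usable_output_tb(total_output_tb: float, phix_fraction: float) -> float:
    """Output left for the sample library after subtracting the PhiX share.

    Assumes PhiX occupies output in proportion to its spike-in fraction.
    """
    if not 0.0 <= phix_fraction < 1.0:
        raise ValueError("phix_fraction must be in [0, 1)")
    return total_output_tb * (1.0 - phix_fraction)


total_tb = 16.0  # illustrative run output in terabases
at_15_percent = usable_output_tb(total_tb, 0.15)
at_5_percent = usable_output_tb(total_tb, 0.05)

print(f"15% PhiX: {at_15_percent:.1f} Tb usable")
print(f" 5% PhiX: {at_5_percent:.1f} Tb usable")
print(f"Recovered: {at_5_percent - at_15_percent:.1f} Tb")
```

Under these assumptions, lowering the spike-in from 15% to 5% returns about one tenth of the run's output to the sample library.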
Enhanced secondary analysis
Upgrade to DRAGEN v4.3:
The upgrade to DRAGEN v4.3 brings significant accuracy improvements, driven by an update to the multigenome mapper and pangenome reference. The pangenome reference now extends the sample population to 128, covering 26 different ancestries. Advancements in the multigenome mapper take better advantage of the new reference, resulting in a 49.0% reduction in SNP false positives (FP) plus false negatives (FN) and a 19.6% reduction in indel FP + FN, on average across the population, compared to DRAGEN v4.1. These improvements are demonstrated in Figure 8, which shows a reduction ranging from 35% to 50% across the set of HG001–HG007 samples. Errors in European ancestry samples are also reduced by 40.2% compared to the pangenome reference of DRAGEN v4.1, with a reduction of 47.0% on non-European samples.
Faster BCL Convert:
In addition to data quality improvements, BCL Convert has been accelerated when either the on-instrument version or the BaseSpace Sequence Hub application is used. Figure 9A shows a nearly 5× reduction in run time of the newly released on-instrument DRAGEN v4.3.13 compared to DRAGEN v4.1.23. The run-time improvement is achieved by a combination of on-instrument acceleration and more flexible configuration options—for example, no lane splitting (NLS) and FASTQC metric generation, in addition to the previously provided ORA configuration. The configuration options and corresponding run times for a 25B flow cell are shown in Figure 9B.
More flexible configurations:
Beyond run time improvements, DRAGEN BCL Convert v4.3.13 now supports additional configuration options (for example, sample projects and no lane splitting). Flow cell configurations are also more flexible: DRAGEN v4.3.13 now supports up to 12 workflow / genome pairs on a single flow cell, with up to 32 unique configurations per workflow / genome pair. The increased flexibility is summarized in Table 1.
DRAGEN version | Workflow / genome pairs | Configurations per pair |
---|---|---|
v4.1.23 | 3 + BCL Convert | 8 |
v4.3.13 | 12 + BCL Convert | 32 |
Table 1: Configuration flexibility for DRAGEN v4.1.23 versus DRAGEN v4.3.13
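The limits in Table 1 can be expressed as a simple pre-flight check on a planned flow cell. The sketch below is a hypothetical illustration: the plan representation, the workflow and genome names, and the choice to exclude BCL Convert from both limits (one reading of "+ BCL Convert" in the table) are assumptions, not the validation the instrument software performs.

```python
# Limits taken from Table 1; the dictionary layout itself is an assumption.
LIMITS = {
    "v4.1.23": {"max_pairs": 3, "max_configs_per_pair": 8},
    "v4.3.13": {"max_pairs": 12, "max_configs_per_pair": 32},
}

BCL_CONVERT = "BCLConvert"  # hypothetical workflow label


def validate_plan(plan, dragen_version):
    """Return a list of human-readable problems; an empty list means OK.

    plan maps (workflow, genome) tuples to a list of configuration names.
    """
    limits = LIMITS[dragen_version]
    problems = []
    # Assumption: BCL Convert is counted separately from the pair limit.
    pairs = [pair for pair in plan if pair[0] != BCL_CONVERT]
    if len(pairs) > limits["max_pairs"]:
        problems.append(
            f"{len(pairs)} workflow/genome pairs exceeds {limits['max_pairs']}"
        )
    for pair in pairs:
        n_configs = len(set(plan[pair]))  # only unique configurations count
        if n_configs > limits["max_configs_per_pair"]:
            problems.append(
                f"{pair}: {n_configs} configurations exceeds "
                f"{limits['max_configs_per_pair']}"
            )
    return problems


# Hypothetical plan with four analysis pairs plus BCL Convert.
plan = {
    ("Germline", "hg38"): ["default"],
    ("Germline", "hg19"): ["default"],
    ("RNA", "hg38"): ["default"],
    ("Enrichment", "hg38"): ["panelA", "panelB"],
    (BCL_CONVERT, "none"): ["default"],
}
print(validate_plan(plan, "v4.1.23"))
print(validate_plan(plan, "v4.3.13"))
```

With these assumptions, the same plan is rejected under the v4.1.23 limits because it has four pairs, and it passes under the v4.3.13 limits.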
BAM/CRAM transfers to the cloud:
Many users of the NovaSeq X System execute mapping and alignment directly on the instrument using DRAGEN, with specialized post-processing (for example, custom variant calling) executed offline. With this approach, users can exploit DRAGEN’s fast processing times, since the mapping and alignment will finish before the next sequencing runs start. With software update 1.3, these users can now send FASTQ/ORA and BAM/CRAM directly to BaseSpace Sequence Hub.
Further details on DRAGEN v4.3.13 for NovaSeq X can be found in the release notes.
Improved usability
RNA processing without special recipes:
The addition of adapters in some library preparation methods results in low base diversity (for example, a single T base) within the first cycles of a sequencing run. Historically, this has been addressed using custom recipes with “dark cycles”—that is, cycles in which no imaging is performed. However, these custom recipes may not be compatible on the same flow cell with other libraries that require first-cycle imaging.
With improvements in NovaSeq X software update 1.3, custom dark cycle recipes are no longer required for sequencing these unique libraries. As a result, these libraries may be multiplexed on the same flow cell to more flexibly enable multiomic use cases. The improvement is shown in Table 2, which compares primary metrics for the sequencing of RNA libraries under three conditions: software 1.2 with no dark cycle recipe (a force failure condition), software 1.2 with a custom dark cycle recipe, and NovaSeq X software update 1.3 with the default (no dark cycle) recipe. As expected in the force failure condition, the resulting sequencing quality is poor, with a high error rate and a low percentage of clusters passing filter (%PF). When no dark cycle recipe is used with software 1.3, the sequencing quality is improved, and it benefits further from the %PF improvements in the software 1.3 recipe, with error rate and pass filter metrics comparable to software 1.2 with a dark cycle recipe. Note that for those who wish to continue to use dark cycle recipes, software 1.3 includes custom dark cycle recipes that also incorporate the %PF recipe improvements.
Illumina Clarity LIMS integration:
Illumina Run Manager provides turnkey integration between Clarity LIMS and the NovaSeq X System for customers who are fully on-premises.
Key features include:
- Out-of-the-box NovaSeq X workflows with step-by-step guidance on pooling, diluting, denaturing, etc.
- Sequencing run and analysis planning
- Sequencing run status and metrics tracking
- Sample sheet validation
- Analysis run status and metrics tracking
Third-party LIMS support via APIs:
With the latest updates, it is easier than ever for NovaSeq X users to integrate third-party or home-brewed LIMS through robust API capabilities. Here's how these enhancements simplify and streamline your LIMS workflow:
- Secure authorization via bearer tokens: Users can now create an authorization client, securely generate credentials, and use them to obtain a bearer token. This token allows seamless and secure interaction with Illumina Run Manager (IRM) endpoints.
- Real-time notifications with webhooks: Stay informed with event-triggered notifications! Users can set up webhooks to automatically deliver updates to external web servers whenever specific events occur in IRM, ensuring timely and efficient monitoring.
- Automated run management: Simplify your operations with APIs that enable authorized clients to create planned runs directly from your LIMS. These runs can then be selected and initiated on the instrument, reducing manual intervention and saving time.
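The general shape of such an integration can be sketched as follows. This is not the Illumina Run Manager API: the base URL, the `/planned-runs` path, and every JSON field name below are placeholders invented for illustration, and the real endpoints and payloads should be taken from the IRM API documentation. The sketch only builds a request carrying a bearer token and parses a webhook body; it does not contact any server.

```python
import json
from urllib.request import Request


def build_planned_run_request(base_url, bearer_token, planned_run):
    """Build (but do not send) a POST request that carries a bearer token.

    The '/planned-runs' path is a hypothetical placeholder.
    """
    body = json.dumps(planned_run).encode("utf-8")
    return Request(
        url=f"{base_url.rstrip('/')}/planned-runs",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {bearer_token}",
            "Content-Type": "application/json",
        },
    )


def parse_webhook_event(raw_body: bytes):
    """Extract an event type and run identifier from a webhook payload.

    'eventType' and 'runId' are hypothetical field names.
    """
    event = json.loads(raw_body.decode("utf-8"))
    return event.get("eventType", "unknown"), event.get("runId")


# Hypothetical values; a real token would come from the authorization client.
request = build_planned_run_request(
    "https://irm.example.invalid/api",
    "example-token",
    {"runName": "demo-run", "flowCellType": "25B"},
)
print(request.get_method(), request.full_url)

event_type, run_id = parse_webhook_event(
    b'{"eventType": "RunCompleted", "runId": "demo-run"}'
)
print(event_type, run_id)
```

In a LIMS, the request would be sent with `urllib.request.urlopen` or an HTTP client of choice, and the webhook parser would sit behind whatever web server receives the notifications.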
Simpler disk space management:
With the high throughput of the NovaSeq X System comes the challenge of managing high volumes of data. NovaSeq X software update 1.3 simplifies data management by offering two new data management features: First, the user can elect for the system to automatically delete secondary analysis data from the instrument once the data has been successfully transferred to the off-instrument storage location. Second, the user can start a second sequencing run if there is sufficient room to store primary analysis data on the instrument. With the latter feature, the user can remove data from prior runs from the instrument any time before the end of the sequencing run. If secondary analysis starts and there is insufficient storage, then the secondary analysis run will be automatically cancelled.
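The storage rules described above can be summarized as a small decision sketch. The function names and all capacity figures are hypothetical and chosen only to illustrate the logic; the instrument's actual thresholds and checks are not specified in this article.

```python
def can_start_run(free_tb: float, primary_needed_tb: float) -> bool:
    """A run may start if there is room for its primary analysis data."""
    return free_tb >= primary_needed_tb


def secondary_analysis_action(free_tb: float, secondary_needed_tb: float) -> str:
    """Decide what happens when secondary analysis is about to start."""
    return "run" if free_tb >= secondary_needed_tb else "cancel"


def free_after_transfer(
    free_tb: float, transferred_tb: float, auto_delete: bool
) -> float:
    """Free space after a transfer completes, with optional auto-delete."""
    return free_tb + transferred_tb if auto_delete else free_tb


# Hypothetical numbers: 5 Tb free, the next run needs 4 Tb for primary data
# and 3 Tb for secondary analysis output.
free_tb = 5.0
print(can_start_run(free_tb, primary_needed_tb=4.0))

# Primary data now occupies 4 Tb, leaving 1 Tb at the end of sequencing.
remaining = free_tb - 4.0
print(secondary_analysis_action(remaining, secondary_needed_tb=3.0))

# Removing 6 Tb of prior-run data before sequencing ends avoids the cancel.
remaining = free_after_transfer(remaining, transferred_tb=6.0, auto_delete=True)
print(secondary_analysis_action(remaining, secondary_needed_tb=3.0))
```

The example shows why clearing prior-run data before the sequencing run ends matters: the run can start with room for primary data alone, but secondary analysis is cancelled if space has not been freed by the time it begins.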
Coming next
Staggered start: Will give users the ability to initiate a run on a second flow cell while the first flow cell is still in progress.
File-based LIMS integration: Will offer run automation after first loading consumables, to minimize operator error and to enforce traceability for clinical specialty and biopharma customers.
1.5B 600 Cycle Kit: Will unlock the high-throughput tier for applications like shotgun metagenomics, immune repertoire profiling, and amplicon sequencing, enabling deeper coverage and longer read lengths in a single run.
Data quality enhancements: Will provide additional data quality improvements, ensuring higher accuracy and reliability for customer workflows.
Conclusion
Software update 1.3 on the NovaSeq X Series enables significant improvements to both data quality and instrument usability. The higher data quality is reflected in improved per-sample yield, achieved through changes to the sequencing recipe, image processing algorithms, and low-diversity enhancements. Additionally, secondary analysis has been enhanced with the upgrade to DRAGEN v4.3 and includes faster BCL Convert run times and more flexible configuration options. Finally, the update offers improved usability, with a reduction in the PhiX spike-in requirements for low-diversity applications, the elimination of special recipes for RNA applications, and turnkey integration with a LIMS.