 
19 March 2024
Cardiovascular disease is the leading cause of death in the United States, killing one person every 33 seconds. In 2019, heart disease cost the US nearly $240 billion. The good news is that, with the right tools and information, it can be predicted and prevented, because the disease is linked to genetics. Using next-generation sequencing and data analysis software, researchers are improving their understanding of this seemingly ubiquitous medical condition.
Bioinformaticians on Illumina’s DRAGEN Secondary Analysis team have devised a breakthrough method to measure long, repetitive DNA sequences in the gene LPA, which is known to impact cardiovascular health.
Samuel Strom, PhD, a principal scientist in Illumina Research and Development, presented these findings at the annual American College of Medical Genetics and Genomics (ACMG) Clinical Genetics Meeting, held this year in Toronto.
The gene LPA encodes lipoprotein a. Elevated lipoprotein a levels are correlated with an increased risk of heart attack or stroke, and those levels are dictated by a repetitive DNA sequence embedded within LPA called a variable number tandem repeat (VNTR). VNTRs vary in length; the repeat unit in LPA is about 5500 base pairs long, and humans can have anywhere from one to 60 copies of it.
“These alleles can be 300 kilobases or larger, far longer than Illumina’s or any other technology’s read length,” Strom says. “Using our historical variant callers, the reads that come from that region don’t even map because they’re not unique. For years, that data has been getting thrown away.”
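The scale Strom describes follows from simple arithmetic: the length of the repeat region is roughly the repeat unit length multiplied by the copy number. The sketch below uses the article's figures of about 5500 base pairs per unit and up to 60 copies. The 150-base-pair read length is an assumed, typical short-read value and not a figure from the article, and the function names are illustrative only.

```python
REPEAT_UNIT_BP = 5500   # approximate repeat unit length, from the article
MAX_COPIES = 60         # upper end of the copy range, from the article
ASSUMED_READ_BP = 150   # assumption: a typical short-read length


def allele_span_bp(copies, unit_bp=REPEAT_UNIT_BP):
    """Approximate length of the repeat array for a given copy number."""
    return copies * unit_bp


def reads_to_tile(span_bp, read_bp=ASSUMED_READ_BP):
    """Minimum number of non-overlapping reads needed to cover span_bp."""
    return -(-span_bp // read_bp)  # ceiling division on integers


longest = allele_span_bp(MAX_COPIES)
print(f"{MAX_COPIES} copies span about {longest / 1000:.0f} kb")
print(f"At least {reads_to_tile(longest)} end-to-end reads of {ASSUMED_READ_BP} bp are needed")
```

Under these assumptions, no single short read can span more than a small fraction of one repeat unit, which is why reads from this region cannot be placed uniquely.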
However, the bioinformatics team working on DRAGEN found a way to analyze these reads that are normally discarded. The method they developed can accurately detect and quantify VNTRs in LPA.
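The article does not describe how the DRAGEN method works, so the following is not that method. It is only a generic read-depth illustration of why reads from a repeat carry copy-number information: if reads from every copy are piled onto a single reference unit, the depth there scales with the total number of copies. The function name, parameters, and all numbers are hypothetical.

```python
def estimate_total_copies(repeat_read_count, read_bp, unit_bp, haploid_depth):
    """Estimate total repeat copies (summed over both alleles) from read depth.

    repeat_read_count: reads assigned to the repeat, collapsed onto one unit
    read_bp:           read length in base pairs
    unit_bp:           length of one repeat unit in base pairs
    haploid_depth:     average coverage contributed by a single genome copy
    """
    if haploid_depth <= 0 or unit_bp <= 0:
        raise ValueError("haploid_depth and unit_bp must be positive")
    # Depth observed when all repeat-derived reads are stacked on one unit.
    collapsed_depth = repeat_read_count * read_bp / unit_bp
    # Each copy contributes one haploid depth's worth of coverage.
    return collapsed_depth / haploid_depth


# Hypothetical numbers: 30x diploid coverage is about 15x per genome copy.
copies = estimate_total_copies(
    repeat_read_count=22000, read_bp=150, unit_bp=5500, haploid_depth=15.0
)
print(f"Estimated total copies: {copies:.1f}")
```

This toy estimate ignores coverage bias, sequence differences between repeat units, and the split of copies between the two alleles, all of which a real caller would have to address.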
The DRAGEN team’s VNTR detection method also reduces the need to study the LPA gene based on polymorphisms alone, Strom says. This is important because some variations may be prevalent only in patients of a particular ethnic background and not in others. After applying the method to a test cohort of over 2300 individuals of African, European, and Hispanic descent, Strom and his colleagues found that the DRAGEN analysis is “a more equitable way to look at this region, compared to looking at single-nucleotide variations,” he says.
Since this method is part of the secondary analysis step, it can be applied to genomic data already available in shared databases. Strom envisions being able to run this new analysis on population studies, such as the UK Biobank, to quantify and further study population risk levels, especially as researchers are beginning to appreciate the role of VNTRs in health and disease.
Decoding and understanding VNTRs in LPA offers a proof of principle that DRAGEN can work on long, repetitive sequences without discarding them.
“This is the tip of the iceberg for VNTR studies by whole-genome analysis,” Strom says. “No lab has ever done this before because it’s so technically difficult. If we know how to look for them, Illumina’s technology does a great job, even in things that are very complicated.”
He believes that there may be many other clinically relevant repetitive sequences similar to those in LPA; the ability to assess VNTRs in general, and not just in LPA, holds considerable promise for the scientific community. “This is a dark part of the genome,” Strom says. “One of our next steps with the DRAGEN team is working on generalizing this method to be able to detect other VNTRs.”
He hopes that this and future Illumina-developed methods will shed new light on rare disease and oncology research.
To learn in greater detail how this VNTR detection method works and how it was tested, read the article by Jonathan Belyeu, Vitor Onuchic, and Mitchell Bekritsky on Illumina’s Genomics Research Hub.


