Just as a flashlight casts a broader beam than the brightest candle on a darkened trail, long-read genomic sequencing seems to reveal a broader picture of DNA mutations than short-read sequencing.
New EMBL research, published recently in Cell Genomics, indicates that long-read genomic sequencing can reveal important patterns of chromosomal structural rearrangement that had previously eluded the short-read sequencing that predominates in cancer genomics.
A collaboration co-led by EMBL Heidelberg, the German Cancer Research Center (DKFZ), and EMBL-EBI researchers applied new technologies to harness long-read sequencing in a way that could potentially be applied to clinical settings.
Short- vs. long-read sequencing
Scientists have long explored the mutational landscapes of cancer using mostly short-read genomic sequencing.
Short-read genome sequencing technology has high throughput but reads DNA only as many short segments, which researchers then piece together with computational tools to identify mutations in the genome.
Researchers suspected, however, that this approach left some mutation patterns undetected.
That is why they have sought better methods to analyse the effects of somatic structural variations (SSVs) on cell function.
These SSVs are rearrangements of large DNA sections (such as deletions and duplications) that are known to be associated with a majority of cancer-causing mutations.
Newer long-read sequencing methods (like the Oxford Nanopore technology used in this EMBL research) potentially offer a better way to detect mutations in cancer genomes.
Nanopore sequencing allows researchers to carry out real-time sequencing of long DNA or RNA fragments.
It works by monitoring changes to an electrical current as nucleotides – the building blocks of DNA and RNA – pass through a protein nanopore.
The resulting signal is computationally decoded to give the specific DNA or RNA sequence.
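To make the idea of decoding a current signal concrete, here is a deliberately simplified toy sketch. It assumes that each base produces one characteristic current level and picks the closest match. The reference levels and the function name `decode_signal` are invented for illustration; real nanopore basecallers rely on trained neural networks, and the measured current depends on several neighbouring bases at once.

```python
# Toy illustration only: the current levels below are invented values,
# not measurements from any real nanopore device.
REFERENCE_LEVELS = {"A": 80.0, "C": 95.0, "G": 110.0, "T": 125.0}


def decode_signal(levels):
    """Map each measured current level to the base with the nearest reference level."""
    bases = []
    for level in levels:
        # Pick the base whose hypothetical reference level is closest to the measurement.
        nearest = min(REFERENCE_LEVELS, key=lambda b: abs(REFERENCE_LEVELS[b] - level))
        bases.append(nearest)
    return "".join(bases)


if __name__ == "__main__":
    noisy_signal = [81.2, 124.0, 109.5, 96.1]
    print(decode_signal(noisy_signal))
```

The nearest-level rule stands in for the much more sophisticated statistical decoding used in practice; it only shows that a sequence of current measurements can be turned back into a sequence of letters.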
The equipment for long-read sequencing is smaller and faster than that for short-read sequencing, and it can read longer DNA strands.
So, like a puzzle with fewer, bigger pieces, the sequences are easier to assemble.
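The puzzle analogy can be illustrated with a small sketch. It assumes a made-up toy genome containing one repeated segment and measures what fraction of reads of a given length occur at more than one position, which makes their placement ambiguous. The sequence, the read lengths, and the function name `ambiguous_fraction` are hypothetical choices for illustration, not part of the published analysis.

```python
from collections import Counter


def ambiguous_fraction(genome: str, read_len: int) -> float:
    """Fraction of all error-free reads of length read_len that match more than one position."""
    reads = [genome[i:i + read_len] for i in range(len(genome) - read_len + 1)]
    counts = Counter(reads)
    ambiguous = sum(1 for read in reads if counts[read] > 1)
    return ambiguous / len(reads)


# Invented toy genome: the 8-base segment REPEAT appears twice between unique flanks.
REPEAT = "ACGTTGCA"
TOY_GENOME = "TTGACCA" + REPEAT + "GGATCTA" + REPEAT + "CCTAGAA"

if __name__ == "__main__":
    for read_len in (4, 12):
        share = ambiguous_fraction(TOY_GENOME, read_len)
        print(f"read length {read_len}: {share:.2f} of reads are ambiguous")
```

In this toy case, reads shorter than the repeat can land in either copy, while reads long enough to span the repeat and some of its unique surroundings can be placed unambiguously. Real genomes and real reads (which contain errors) are far messier, but the same intuition applies.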
Additionally, long-read sequencing can allow researchers to understand changes to the epigenome in cancer.
“We knew we weren’t getting a full picture, using short-read sequencing,” said Tobias Rausch, senior bioinformatician in the Korbel research group at EMBL Heidelberg and lead author on the Cell Genomics paper.
“The technology is now at a point where we can really use long-read sequencing and uncover what was missing.”
How to identify a previously undiscovered genomic pattern
Using cells from a single medulloblastoma – a primary childhood brain tumour – collected at diagnosis and following treatment, the researchers used new long-read sequence analysis methods to identify a novel mutational pattern leading to the rearrangement of longer sections of the genome. They were then able to confirm the pattern in other cancer types.
“Right from the start, we understood that the development of methods needed to be an essential part of our work,” Rausch said.
“How can we use long-read sequencing best in a cancer genomic situation? Methods delivery was an important part of this project, and a number of tools came out of it that will hopefully be useful to the wider community.”
Beyond methodology, the scientists were also able to identify and name a rather complex pattern that they believe is tied to a particular form of mutation in cancer genomes, especially in liposarcoma, a rare but sometimes fatal cancer known for often having a highly unstable genome.
Previously, this pattern went undetected with short-read sequencing.
“It’s not too surprising to see a pattern of mutations in genomic sequencing, but to do so with just one sample, and to have it be something that people had not seen before, was quite striking,” said Jan Korbel, who leads the EMBL research group that Rausch is part of.
“But that’s also because short-read sequencing couldn’t piece it together. Now, we are able to observe such complex rearrangements and actually view their internal structure.”
Important expertise through collaboration
An important part of the research process hinged on collaborating within EMBL.
This included colleagues from GeneCore, which delivers part of the actual sequencing after conferring with the collaborators to select the right approach, as well as EMBL’s European Bioinformatics Institute in Hinxton, United Kingdom, which provided expertise with respect to Oxford Nanopore sequencing.
For the Korbel group, discussions with colleagues at EMBL-EBI about working together on this project started almost five years ago, but the project could only be realised once the technology had matured enough for them to implement their scientific vision with this long-read approach and subsequent analysis tools.
“Long-read sequencing provides a new way to see genome information – both in structural variation and DNA modifications such as methylation,” said Ewan Birney, EMBL Deputy Director General, Joint Director of EMBL-EBI, and one of the research group leaders collaborating on this project.
“It is wonderful to see this new mutational process being illuminated by this new technology.”
Likewise, engaging with DKFZ not only helped procure tissue samples but also brought important biological insights into the work.
On the horizon
Having identified the mutational pattern in just a single sample, the researchers realise the need for follow-up studies with larger cohorts to understand the pattern better and determine if it is clinically relevant.
Right now, very few samples have been studied with long-read genomic sequencing.
“There really is a lot of excitement now for long-read sequencing,” Korbel said.
“Already we have plans to continue our work on a larger scale, and with this work, we will again rely on the collaborations we’ve started – some of which are now piloting ways to apply this long-read sequencing into the clinical setting, where, in general, patients tend to have higher survival rates when sequencing has been involved.”