Long read sequencing is making chromosome-scale assemblies possible, including assemblies of diploid genomes, and is therefore improving our understanding of human genetic variation. However, the rapid growth in long read sequencing capacity has been held back by the difficulty of extracting high molecular weight DNA. Magnetic bead-based high molecular weight DNA extraction limits DNA fragmentation, and is also less laborious and more cost-effective than other methods.
What is high molecular weight DNA?
Long read sequencing, also known as third generation sequencing, can read DNA fragments longer than 10 kb, which makes complex genomes easier to assemble. High molecular weight DNA is generally taken to mean DNA fragments at least 50 kb long, but long read sequencing and studies of large-scale genomic variation increasingly call for even longer fragments. For this reason, high molecular weight DNA is often defined more narrowly as 100-300 kb long, while ultra-high molecular weight DNA is longer than 300 kb.
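The size classes above can be expressed as a simple lookup. The following is a minimal sketch, not a standard tool: the function name `classify_fragment` and the labels are hypothetical, and it assumes the broader 50 kb lower bound for high molecular weight DNA (swap in 100 kb if you use the narrower definition).

```python
# Thresholds in kilobases, taken from the definitions in the text above.
# HMW_MIN_KB uses the broader "at least 50 kb" definition (an assumption;
# the narrower definition would use 100 kb instead).
HMW_MIN_KB = 50
UHMW_MIN_KB = 300


def classify_fragment(length_kb: float) -> str:
    """Return a size-class label for a DNA fragment length given in kb."""
    if length_kb < 0:
        raise ValueError("fragment length must be non-negative")
    if length_kb > UHMW_MIN_KB:
        return "ultra-high molecular weight"
    if length_kb >= HMW_MIN_KB:
        return "high molecular weight"
    return "below high molecular weight"


if __name__ == "__main__":
    # Hypothetical fragment lengths, for illustration only.
    for length in (20, 120, 350):
        print(f"{length} kb: {classify_fragment(length)}")
```

Keeping the thresholds as named constants makes it easy to adapt the sketch to whichever definition your lab or sequencing platform uses.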
What is long read sequencing?
Long read sequencing (LRS) involves reading DNA sequences that are at least 10,000 base pairs (10 kb) long in a single read. By contrast, next generation sequencing, which has driven numerous genomic advances in the last decade, involves cutting DNA into small fragments that are a few hundred base pairs long. These fragments are then amplified (copied many times) with PCR and read by a sequencer. Finally, the genome is reassembled from the short reads using specialized software.
What are the benefits of long-read sequencing?
The long read data generated from sequencing high molecular weight DNA is particularly useful for:
- Sequencing long fragments of DNA as single molecules, which means there is no need to fragment or amplify the DNA. Omitting the PCR step also removes PCR-related GC bias.
- Identifying changes to large sections of DNA, such as from insertions, inversions, duplications, deletions, and translocations.
- Facilitating extraction of high molecular weight DNA from plant and fungal species, which is otherwise difficult and often not cost-effective.
- Facilitating the discovery of new heritable mechanisms of disease.
- Identifying copy number variations (CNVs), in which the number of copies of a specific segment of DNA varies between individuals’ genomes. CNVs are thought to make up around 12% of the human genome, and can have implications for disease. For example, the number of ‘CAG’ repeats near the start of the huntingtin (HTT) gene determines whether an individual will develop Huntington's disease.
- Assembling diploid genomes, helping to advance understanding of human genetic variation.
- Providing information on variants on two homologous copies of a chromosomal region, and the impact this has on gene expression.
- Improving understanding of genome region variation, helping to uncover genotype-phenotype associations.
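To make the ‘CAG’ repeat example from the list above concrete, here is a minimal sketch of counting the longest uninterrupted run of CAG units in a sequence string. The function name `longest_cag_run` and the toy sequence are hypothetical and are not real HTT data. The sketch illustrates the counting only. Real repeat genotyping from long reads involves alignment and error handling that are not shown here, and no clinical thresholds are implied.

```python
import re


def longest_cag_run(sequence: str) -> int:
    """Return the number of CAG units in the longest uninterrupted CAG run."""
    # Find every stretch made up only of back-to-back "CAG" units.
    runs = re.findall(r"(?:CAG)+", sequence.upper())
    # Each unit is 3 bases long; return 0 if the sequence has no CAG at all.
    return max((len(run) // 3 for run in runs), default=0)


if __name__ == "__main__":
    # Toy sequence for illustration: 5 CAG units, an interruption, then 2 more.
    toy_read = "ATG" + "CAG" * 5 + "CCG" + "CAG" * 2
    print(longest_cag_run(toy_read))
```

Because a long read can span the whole repeat region in one molecule, a count like this can in principle be taken from a single read rather than inferred from many short fragments.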
High Molecular Weight DNA extraction for long read sequencing
A range of protocols has now been developed for high molecular weight DNA extraction, and the extracted DNA can then be used with long read technologies. Methods for extracting high molecular weight DNA include:
Emulsion PCR
Conventional bulk PCR can lead to the loss of unique sequences due to template mispairing and GC bias. Emulsion PCR encapsulates DNA repertoires in liquid droplets, which removes the effects of mispairing in DNA libraries. Unlike true long read sequencing, emulsion PCR is only a quasi-single-molecule technique, but it does reduce errors. However, its high cost means it is not widely used.
Phenol-chloroform extraction and ethanol precipitation
Phenol-chloroform extraction is a traditional and inexpensive method for extracting high molecular weight DNA. However, it is labor-intensive and involves working with hazardous chemicals that require special disposal. DNA also tends to clump during the extraction process, which makes it harder to work with. Newer methods, such as magnetic bead-based extraction, do not involve chloroform.
Magnetic bead-based extraction
A magnetic bead-based high molecular weight DNA extraction protocol removes the need for spin columns and high-speed centrifugation, which limits DNA fragmentation. This method produces high yields of high molecular weight DNA and has been demonstrated to yield fragments 20-200 kb long. Magnetic bead-based protocols are scalable and cost-effective compared to kit-based protocols. Multiple magnetic bead-based protocols have been developed in recent years (Jones et al., 2021; Maghini et al., 2021). The benefits of magnetic bead protocols include:
- Reproducible isolation of high molecular weight DNA
- High yields
- Quick and straightforward protocols
Find out more about using magnetic separation to extract high molecular weight DNA
The ability to efficiently extract high molecular weight DNA is underpinning the new biological insights gained from long read sequencing, and magnetic beads are an increasingly common way to achieve this. We know that you will want to be sure of the benefits before switching to a new DNA extraction protocol. If you have questions about how magnetic bead-based high molecular weight DNA extraction could work for your lab or application, get in touch via the comments or by contacting us directly.