HISTORY AND METHODS OF DNA SEQUENCING

Technologies for sequencing nucleic acids have developed rapidly for many years. Today, manufacturers of DNA sequencing platforms compete to provide faster, cheaper and simpler solutions. In this article, we describe the history and methods of DNA sequencing.


First generation sequencing


The history of sequencing began in 1953, when James Watson and Francis Crick, drawing on X-ray crystallographic data, proposed the three-dimensional double-helix structure of DNA. Since then, intensive efforts have been made to develop a method for reading the sequence of nucleotides in a genome. The first success came in 1965, when Robert Holley determined the nucleotide sequence of a tRNA molecule isolated from yeast.

Later, in 1977, Frederick Sanger proposed a method that dominated sequencing for over 30 years: a technique commonly known as the dideoxy method, the chain-termination method, or simply the Sanger method. It relies on modified nucleotides, the so-called dideoxynucleotides, whose incorporation into the newly synthesized strand terminates the reaction. In the original protocol, the nucleotides were labelled with radioisotopes and the resulting fragments were separated by high-resolution gel electrophoresis. Owing to the great interest in sequencing, the method was improved and automated over time, leading to a technique still used today that combines fluorescently labelled nucleotides with capillary electrophoresis and digital signal reading. These improvements enabled the development of the first commercially available sequencer, capable of sequencing fragments up to 1,500 base pairs in length. The Human Genome Project, launched in 1990, relied on first-generation sequencing and published the first draft of the human genome sequence in 2001.
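
The chain-termination principle can be illustrated with a small Python sketch. It is an idealized model, not a description of any real instrument: it assumes that each of the four dideoxynucleotide reactions produces a terminated fragment at every position where its base occurs, and that electrophoresis separates fragments perfectly by length. The function names sanger_fragments and read_gel are hypothetical and chosen only for this example.

```python
def sanger_fragments(synthesized: str) -> dict[str, list[int]]:
    """Return, for each ddNTP reaction (A, C, G, T), the lengths of the
    terminated fragments. Idealized assumption: termination happens at
    every position where the matching base is incorporated."""
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(synthesized, start=1):
        lanes[base].append(length)
    return lanes


def read_gel(lanes: dict[str, list[int]]) -> str:
    """Read the sequence from the shortest fragment to the longest,
    noting which reaction (lane) each fragment came from."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


lanes = sanger_fragments("GATTACA")
print(lanes["A"])       # [2, 5, 7]
print(read_gel(lanes))  # GATTACA
```

Ordering the fragments by length and noting which terminator ended each one is sufficient to recover the sequence. In the automated version of the method, the four lanes are replaced by four fluorescent dyes read in a single capillary.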


Second generation sequencing


Over time, growing interest in nucleic acid sequencing led to the emergence and development of high-throughput sequencing techniques, which made it possible to analyze many samples simultaneously in a single run. Next-generation sequencing (NGS) yields millions of reads in far less time than the classic Sanger method.

To increase throughput and significantly shorten the analysis, a number of sequencing methods were developed that avoid the tedious steps of cloning DNA fragments. Instead, in second-generation methods, linker or adapter sequences are ligated directly to the fragmented DNA. The library is then amplified on specially prepared surfaces or beads to which the template DNA fragments are attached. The incorporation of nucleotides is monitored directly, by measuring emitted light (fluorescence or luminescence) or by detecting released protons. These solutions enabled large-scale sequencing of DNA and RNA within a reasonable time. Despite the high resolution and quality of the resulting sequences, a significant limitation of second-generation methods is read length, which does not exceed several hundred nucleotides. This causes difficulties in later stages of sequence analysis, such as de novo genome assembly, especially in repetitive or low-complexity regions.
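
The difficulty that repeats pose for short reads can be shown with a toy Python example. It is a sketch under simplifying assumptions: reads are error-free, all have the same length, and every position of the genome is covered. The two short "genomes" and the helper read_spectrum are invented for illustration only. Both genomes contain the same repeat three times, but the unique segments between the copies are swapped.

```python
from collections import Counter


def read_spectrum(genome: str, read_length: int) -> Counter:
    """Multiset of all error-free reads of a fixed length from a linear genome."""
    last_start = len(genome) - read_length
    return Counter(genome[i:i + read_length] for i in range(last_start + 1))


REPEAT = "ACGTTGCA"  # 8-base repeat present three times in each toy genome

# Same building blocks, but the unique segments CCC and GGG are swapped.
genome_a = "TT" + REPEAT + "CCC" + REPEAT + "GGG" + REPEAT + "AA"
genome_b = "TT" + REPEAT + "GGG" + REPEAT + "CCC" + REPEAT + "AA"

# Reads shorter than the repeat give identical read sets for both genomes,
# so no assembler could tell the two arrangements apart.
print(read_spectrum(genome_a, 6) == read_spectrum(genome_b, 6))    # True

# Reads long enough to span a whole repeat plus its flanks differ.
print(read_spectrum(genome_a, 12) == read_spectrum(genome_b, 12))  # False
```

When no read spans a repeat together with the unique sequence on both sides, the reads alone cannot determine the order of the segments between the repeat copies. Longer reads resolve this ambiguity.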


Third generation sequencing


Third-generation sequencing is also called single-molecule sequencing, because it reads individual nucleic acid molecules without prior amplification. It yields much longer reads than the technologies of previous generations while reducing the cost of the analysis. Third-generation sequencing also takes place in real time, which significantly speeds up the entire sequencing process.

Currently, two third-generation platforms on the commercial market can read the sequence of nucleotides in single DNA and RNA molecules: SMRT by PacBio and nanopore sequencing by Oxford Nanopore Technologies. In theory, both methods can produce reads as long as the DNA fragment under study. In practice, however, SMRT technology works best with DNA fragments that yield reads of approximately 10,000 base pairs. In contrast, the length of reads obtained with Oxford Nanopore technology has no apparent upper limit: reads of 300,000 to 500,000 base pairs are often reported, and record reads can be up to 2 million nucleotides long.