DNA sequencing is the laboratory technique for determining the exact order of nucleotides (A, T, C, and G) in a DNA molecule.
Every organism has a unique nucleotide sequence in its genome. Determining the DNA sequence of different organisms helps scientists identify how those organisms are related. Likewise, comparing the sequence patterns of indigenous populations with those of populations carrying more recent genetic changes helps reveal diseases associated with particular variants and trace the ancestral relationships between groups.
- Sequencing a DNA molecule reveals the positions of the nucleotide bases adenine, guanine, cytosine, and thymine within it.
- The first practical DNA sequencing methods were published in 1977, by Frederick Sanger and colleagues and by Allan Maxam and Walter Gilbert.
- Alteration in the sequence of the DNA can lead to the production of different amino acids and proteins.
- Advances in technology have made it possible to sequence a whole genome in a matter of hours using high-throughput sequencing instruments.
- As a result, many companies can sequence a whole genome for a few thousand dollars or less and offer their services to customers at home.
- Scientists can also extract and sequence the DNA of diverse organisms to understand evolutionary patterns among related and unrelated species.
- Many companies offer single nucleotide polymorphism (SNP) tests, which focus on individual nucleotides within genes.
- Sequencing the DNA helps us to understand the gene functions and other parts of the genome.
- The launch of the Human Genome Project spurred rapid development of sequencing technologies.
- As DNA sequencing has become more efficient and less costly, more and more data are deposited in databases, advancing the field of genomics.
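The point above that an alteration in the DNA sequence can change the encoded amino acid, and the idea behind single nucleotide tests, can be illustrated with a small Python sketch. The sequences and the five-entry codon table are toy values chosen for this illustration only; a real analysis would use the full 64-codon genetic code.

```python
# Minimal codon table covering only the codons used in this illustration.
CODON_TABLE = {
    "ATG": "Met", "CCT": "Pro", "GAG": "Glu", "GTG": "Val", "TAA": "Stop",
}

def translate(dna):
    """Translate a coding-strand DNA sequence codon by codon."""
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "Stop":
            break
        protein.append(amino_acid)
    return protein

def point_differences(ref, alt):
    """Return (position, ref_base, alt_base) for each mismatch, 1-based."""
    return [(i + 1, a, b) for i, (a, b) in enumerate(zip(ref, alt)) if a != b]

reference = "ATGCCTGAGGAG"  # hypothetical reference sequence
variant = "ATGCCTGTGGAG"    # same sequence with one A -> T substitution

print(point_differences(reference, variant))
print(translate(reference))
print(translate(variant))
```

Here a single substituted base at position 8 turns the third codon from GAG (Glu) into GTG (Val), so the two sequences encode different proteins.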
The principle of DNA sequencing
DNA sequencing is the process of identifying the order of nucleotide bases in a DNA molecule. Two methods of DNA sequencing were introduced twenty-four years after the discovery of the structure of DNA: the chain termination method and the chemical degradation method. Of the two, the chain termination method is generally preferred. It is practical because single-stranded DNA molecules that differ in length by even a single nucleotide can be separated by polyacrylamide gel electrophoresis.
- During this process, the DNA to be sequenced is converted into single strands; this single-stranded DNA is called the template DNA.
- A short oligonucleotide is then annealed to the same position on each template molecule. Oligonucleotides are short fragments of DNA, usually single-stranded.
- The oligonucleotide acts as a primer: it binds to a specific region of the single-stranded DNA, and from it a new DNA strand complementary to the template strand is synthesized.
- Synthesis of the new DNA strand requires the following components:
- DNA template,
- DNA polymerase: the enzyme that synthesizes DNA,
- Primer: a short oligonucleotide, which may be labeled with a light-emitting (fluorescent) or radioactive tag,
- Four deoxynucleotides: dATP, dTTP, dCTP, and dGTP, and
- One dideoxynucleotide per reaction: ddGTP, ddATP, ddTTP, or ddCTP.
- DNA polymerase extends the primer. Once the first deoxynucleotide has been added opposite its complementary base, the polymerase continues to add bases one by one along the template strand.
- Synthesis of the strand continues until elongation is stopped by the incorporation of a dideoxynucleotide.
- A dideoxynucleotide lacks the 3’ OH (hydroxyl) group, so it cannot form a bond with the next nucleotide, and synthesis terminates.
- Only a small amount of dideoxynucleotide is added to each reaction, so termination occurs at random positions and strands of many different lengths are produced.
- Whenever DNA polymerase inserts a dideoxynucleotide in place of the corresponding deoxynucleotide, that strand is terminated, so each reaction yields fragments of different lengths that all end in the same base.
- The newly generated fragments are then read by electrophoresis, with the four reactions run side by side on a polyacrylamide sequencing gel.
- One lane of the gel is loaded with the molecules generated in the presence of ddATP, and the remaining three lanes with the molecules generated in the presence of ddCTP, ddGTP, and ddTTP.
- After electrophoresis, the bands are ordered by fragment length, and the DNA sequence can be read from the order of the bands across the four lanes.
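The chain termination principle described above can be sketched in Python. This is an idealized illustration, not a laboratory protocol: it assumes that a terminated fragment is produced at every possible position, counts fragment lengths from the end of the primer, and uses a made-up template.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def synthesized_strand(template_3_to_5):
    """Strand the polymerase builds (5'->3') along a template written 3'->5'."""
    return "".join(COMPLEMENT[base] for base in template_3_to_5)

def termination_lengths(new_strand, dd_base):
    """Fragment lengths obtained when chains stop at every dd_base position."""
    return [i + 1 for i, base in enumerate(new_strand) if base == dd_base]

def read_gel(lanes):
    """Read the bands from shortest to longest fragment across the four lanes."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)

template = "TACGGATC"  # hypothetical template, written 3'->5'
new_strand = synthesized_strand(template)

# One reaction (and one gel lane) per dideoxynucleotide.
lanes = {base: termination_lengths(new_strand, base) for base in "ACGT"}

print(lanes)
print(read_gel(lanes))
```

Reading the simulated gel from the shortest band to the longest recovers the newly synthesized strand, which is the complement of the template.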
Steps or Procedure of DNA Sequencing
- Sample preparation (DNA extraction)
The first step in sequencing is extraction of the DNA of interest. DNA can be isolated from animals, plants, bacteria, plasmids, or environmental samples. Different reagents and methods are used for extraction, such as proteinase K digestion and the spin-column method, which gives a high yield of DNA. DNA with an A260/A280 absorbance ratio of about 1.8 is generally considered pure.
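The purity check can be written as a short calculation. The sketch below is a minimal example: the acceptance window of 1.7 to 2.0 around the 1.8 target is an assumption made for illustration (laboratories set their own limits), the function names are hypothetical, and the concentration estimate uses the common approximation that an A260 of 1.0 corresponds to about 50 ng/µL of double-stranded DNA.

```python
def purity_ratio(a260, a280):
    """Ratio of absorbance at 260 nm to absorbance at 280 nm."""
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    return a260 / a280

def is_acceptably_pure(a260, a280, low=1.7, high=2.0):
    """True if the ratio falls inside the assumed acceptance window."""
    return low <= purity_ratio(a260, a280) <= high

def dsdna_concentration_ng_per_ul(a260, dilution_factor=1.0):
    """Estimate dsDNA concentration, assuming A260 of 1.0 is about 50 ng/uL."""
    return a260 * 50.0 * dilution_factor

print(round(purity_ratio(0.9, 0.5), 2))
print(is_acceptably_pure(0.9, 0.5))
print(dsdna_concentration_ng_per_ul(0.5, dilution_factor=10))
```

A ratio well below the window usually points to protein contamination, in which case the sample would be cleaned up again before going further.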
- PCR amplification of targeted sequences
The polymerase chain reaction (PCR) is one of the important steps in sequencing. It amplifies the gene or DNA region of interest, producing many copies of it. The primers bind to sites flanking the targeted region, so only the sequence of interest is amplified and the rest of the DNA is not copied. The reaction cycles through denaturation, annealing, and extension at different temperatures: denaturation at 94°C, annealing at 55°C to 65°C, and extension at 72°C. The PCR product is then run on a 2% agarose gel along with a DNA ladder to check its size.
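Two aspects of PCR lend themselves to a small Python sketch: which region a primer pair selects, and how quickly copies accumulate. The template and primers below are invented for illustration, primer matching is exact (real primers tolerate some mismatch), and the copy-number formula is the usual idealized model in which each cycle multiplies the copies by (1 + efficiency).

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def amplicon(template, forward_primer, reverse_primer):
    """Region bounded by the two primers, or None if either site is absent."""
    start = template.find(forward_primer)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse_primer)
    end = template.find(reverse_site, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]

def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Idealized copy number after a given number of cycles."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles

template = "GGATTCAGCTAGGCTTACCGATAA"  # hypothetical template strand
product = amplicon(template, "ATTCAG", "TCGGTA")

print(product)
print(pcr_copies(1, 30))
```

With perfect efficiency a single template molecule yields 2^30, roughly a billion, copies after 30 cycles, which is why a faint target becomes visible on the agarose gel.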
- Amplicon purification
The purity of the PCR product is checked from the ratio of its absorbance at 260 nm and 280 nm; a ratio of about 1.8 indicates that the amplicons are pure. If the DNA is contaminated, the subsequent amplification may fail. It is therefore necessary to purify the amplicon from unbound primers, DNA polymerase, primer-dimers, template DNA, and reaction buffer components. For maximum purity, a spin-column PCR purification kit is used rather than alcohol precipitation.
- Sequencing pre-preparation
Sequencing pre-preparation is similar to PCR and is a necessary step in DNA sequencing. During this process, both ends of the DNA are ligated to adaptor DNA sequences. Amplification is then carried out and the nucleotides are labeled, either radioactively or, in most current instruments, with fluorescent dyes.
- DNA sequencing
The reaction tubes prepared after purification of the DNA are placed into the sequencer. A series of denaturation, annealing, and extension steps takes place in the sequencer, and the signals from the labeled nucleotides are recorded. The recorded data are then sent to a computer for analysis.
- Data analysis
After the DNA is sequenced, the data are recorded in a standard file format such as FASTA or FASTQ. The data are then compared with related sequences in databases. Various software tools are used to interpret and compare the sequences.
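As a minimal sketch of this comparison step, the Python example below parses FASTA-formatted text and computes the percent identity between a sample and a reference. The record names and sequences are invented, and the comparison assumes the two sequences are already aligned and of equal length; real pipelines rely on dedicated alignment tools instead.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict mapping each header to its sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {header: "".join(parts) for header, parts in records.items()}

def percent_identity(a, b):
    """Percentage of positions that match between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)

fasta_text = """>sample
ACGTAC
GTTA
>reference
ACGTACGTAA
"""

records = parse_fasta(fasta_text)
print(records)
print(percent_identity(records["sample"], records["reference"]))
```

The sample differs from the reference at one of ten positions, giving 90% identity.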
Methods or Types of DNA Sequencing
- Maxam-Gilbert sequencing
- Chain termination (Sanger) method
- Shotgun sequencing
- Bridge PCR sequencing
- Massively parallel signature sequencing
- Polony sequencing
- Illumina sequencing
- SOLiD sequencing
- DNA nanoball sequencing
Maxam-Gilbert Sequencing
Maxam-Gilbert sequencing was introduced by A.M. Maxam and W. Gilbert in 1977 and is also called the chemical cleavage method. Its principle is to cleave end-labeled single-stranded DNA at specific bases with chemicals and then separate the fragments on a polyacrylamide gel.
Maxam-Gilbert Sequencing Steps
- First, the DNA is extracted and heat-denatured to generate single-stranded DNA.
- The 5’ end of the DNA is labeled with radioactive 32P after the existing 5’ phosphate is removed.
- Two enzymes are involved: a phosphatase removes the phosphate group from the 5’ end of the DNA, and a kinase then adds 32P to the 5’ end.
- To cleave the DNA at four kinds of position, four base-specific chemical treatments are used. Hydrazine, alone or with NaCl, attacks pyrimidine nucleotides, while formic acid and dimethyl sulfate attack purine nucleotides. Piperidine is then used in every reaction to break the strand at the modified bases.
- Hydrazine: T + C
- Hydrazine + NaCl: C
- Formic acid: A + G
- Dimethyl sulfate: G
- The labeled ssDNA sample is divided into four equal volumes, and each is treated in a separate tube with one of the four chemicals.
- After the samples have been incubated for a set period of time, the fragments are separated by electrophoresis.
- Because the fragments carry the 32P label at one end, the DNA bands are visualized by autoradiography.
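The logic of reading a Maxam-Gilbert autoradiograph can be sketched in Python. The sketch labels the four lanes by the bases they cleave (G, A+G, C+T, and C) rather than by chemical, records each cut by the position of the cleaved base counted from the labeled end, and assumes an idealized reaction that cuts every susceptible position. The input sequence is made up for illustration.

```python
def cleavage_lanes(seq):
    """Positions (1-based, from the labeled end) cut in each of the four lanes."""
    reactions = {"G": "G", "A+G": "AG", "C+T": "CT", "C": "C"}
    return {
        lane: {i for i, base in enumerate(seq, start=1) if base in bases}
        for lane, bases in reactions.items()
    }

def read_autoradiograph(lanes, length):
    """Deduce the base at each position by comparing the four lanes."""
    calls = []
    for pos in range(1, length + 1):
        if pos in lanes["G"]:
            calls.append("G")   # band in both the G and A+G lanes
        elif pos in lanes["A+G"]:
            calls.append("A")   # band in the A+G lane only
        elif pos in lanes["C"]:
            calls.append("C")   # band in both the C and C+T lanes
        elif pos in lanes["C+T"]:
            calls.append("T")   # band in the C+T lane only
        else:
            calls.append("N")   # no band: base cannot be called
    return "".join(calls)

sequence = "GATCCTAG"  # hypothetical end-labeled strand
lanes = cleavage_lanes(sequence)

print(lanes)
print(read_autoradiograph(lanes, len(sequence)))
```

Because two of the lanes cleave at two bases each, A and T are identified by elimination: a band that appears in the A+G lane but not the G lane must be an A, and likewise for T.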
At first glance, this method may seem more direct than the Sanger sequencing technique because purified DNA is used directly for sequencing, without a synthesis step. In practice, however, the chain termination method is the one generally preferred.
Sanger Sequencing
Sanger sequencing is also known as the first-generation sequencing method. It is also called the chain termination method and was developed by Frederick Sanger and his co-workers.
In the Sanger method, a primer binds to the denatured DNA molecule and initiates the synthesis of a new complementary strand in the presence of the DNA polymerase enzyme.
During synthesis, a covalent phosphodiester bond is formed between the 3’ OH of the growing strand and the 5’ phosphate of the incoming nucleotide.
Dideoxynucleotides (ddNTPs) are used in this process to terminate further synthesis, which is why it is called the chain termination method.
Sanger Sequencing Steps
- First, the DNA is extracted, with proteinase K used to digest proteins, and then denatured. PCR amplification is then carried out in the presence of the PCR reagents.
- Because the dideoxynucleotide lacks the 3’ hydroxyl group, no bond can form with the next nucleotide, and the chain ends there.
- The process is divided into three steps:
- DNA extraction: DNA is extracted using a method suited to the source organism.
- PCR amplification: primers, dNTPs, ddNTPs, DNA polymerase, and PCR buffer are used.
- Identification of the amplified fragments: different techniques like PAGE, gel electrophoresis, or autoradiography are used.
- The amplified product is loaded on a polyacrylamide gel, where the negatively charged DNA fragments migrate toward the positively charged electrode and separate by size. Smaller fragments move faster than larger ones.
- The bands are then visualized, by autoradiography for radiolabeled fragments or by fluorescence detection for dye-labeled fragments.
- Sanger sequencing is widely used in research and in diagnostics. The recorded signals are further analyzed computationally.
Shotgun Sequencing
- Shotgun sequencing is the technique of DNA sequencing that is used for sequencing long DNA strands or the whole genome.
- In this process, the DNA is broken into many smaller fragments, each fragment is sequenced, and overlapping regions are identified between the individual reads.
- By several rounds of fragmentation and sequencing, multiple overlapping reads for the targeted DNA are obtained.
- A continuous sequence is then formed by assembling the overlapping reads with computer programs.
- The shotgun technique was used to sequence the first complete genome of a free-living organism, Haemophilus influenzae, in 1995.
- A whole-genome shotgun approach was also used for a draft sequence of the human genome published in 2001.
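The assembly of overlapping reads can be illustrated with a greedy overlap merge in Python. This is a simplified stand-in for real assemblers, which use more robust graph-based methods and must cope with sequencing errors and repeats. The reads and the minimum overlap of 3 bases are assumptions chosen for this example.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            break  # no remaining overlaps: the contigs stay separate
        merged = reads[best_i] + reads[best_j][best_len:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads

reads = ["ATGCGTAC", "GTACGTTA", "GTTAGC"]  # hypothetical overlapping reads
contigs = greedy_assemble(reads)

print(contigs)
```

The three reads collapse into one contig. If a stretch of sequence occurred twice in the genome, more than one merge order would fit the reads, which is one reason repetitive regions are hard to assemble.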
Pyrosequencing
- Pyrosequencing is a sequence detection technology based on sequencing by synthesis.
- It detects the pyrophosphate released each time a nucleotide is incorporated, rather than relying on chain termination.
- In this method, the single-stranded DNA template is hybridized with a primer and then incubated with the enzymes.
- Detection relies on the activity of DNA polymerase together with other enzymes: ATP sulfurylase, luciferase, and apyrase.
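The read-out of pyrosequencing can be sketched as a simulated pyrogram in Python. Nucleotides are dispensed one at a time in a fixed order, and the light signal is taken to be proportional to the number of bases incorporated in that dispensation. The dispensation order, the target sequence, and the noise-free integer signals are simplifying assumptions for illustration.

```python
def pyrogram(to_synthesize, dispensation_order="ACGT", max_cycles=50):
    """Return (nucleotide, signal) for each dispensation.

    The signal is the number of consecutive bases incorporated, which is
    how a run of identical bases gives a proportionally taller peak.
    """
    signals = []
    pos = 0
    for _ in range(max_cycles):
        for nucleotide in dispensation_order:
            if pos >= len(to_synthesize):
                return signals
            run = 0
            while pos < len(to_synthesize) and to_synthesize[pos] == nucleotide:
                run += 1
                pos += 1
            signals.append((nucleotide, run))
    return signals

def call_bases(signals):
    """Convert the pyrogram back into a sequence."""
    return "".join(nucleotide * count for nucleotide, count in signals)

strand = "AAGTCC"  # hypothetical strand being synthesized
signals = pyrogram(strand)

print(signals)
print(call_bases(signals))
```

A dispensation with a signal of 0 means the nucleotide did not match the next template position, and a signal of 2 marks two identical bases in a row.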
High Throughput Sequencing
- High-throughput sequencing is a more recent DNA sequencing technology and an alternative to microarrays.
- It can measure factors that affect gene regulation, and it is relatively more expensive than microarrays.
- It has become one of the most widely used methods for sequencing short DNA fragments as well as whole genomes.
- Compared with the Sanger method, it differs in three main ways. The first is that a cell-free system was developed to replace the cloning of DNA fragments.
- Previously, the DNA to be sequenced was cloned into a bacterial plasmid, amplified in bacteria, and then extracted and purified.
- High throughput sequencing doesn’t rely on these time-consuming and expensive methods.
- It can run millions of sequencing reactions in parallel.
- It has vastly expanded the applications of genomics in the field of research, science, medicine, medical diagnostics, and forensics.
Applications of DNA Sequencing
Since DNA sequencing has become an integral part of modern biology, its applications have reached diverse areas, including:
- Medical science: DNA sequencing is an integral part of medicine for the study of hereditary genetic diseases. Sequencing is also used to determine the causes of diseases and the conditions associated with them, and it can help detect new mutations.
- Forensic science: DNA sequencing is in growing demand in forensic science for identifying individuals, because each individual has a unique DNA sequence. It is mostly used at crime scenes to identify suspects from evidence such as hair, blood, nail, or skin samples.
- Agriculture: The agricultural sector has advanced through DNA sequencing technology. Microorganisms found in the soil have been sequenced and mapped, and beneficial microorganisms useful for food and crop plants have been selected. For example, bacterial genes that confer resistance to certain insects have been introduced into plants to increase their productivity and nutritional value. Similarly, genetic studies have been done to improve the quality and quantity of milk production in livestock.
- DNA sequencing is also used for the construction of genome maps, whole chromosomal maps, and restriction digestion maps.
- It helps in identifying protein-coding regions, open reading frames, and non-coding regions.
- DNA sequencing is also used in the detection of tandem repeats, introns and exons, and other repeat sequences.
- Modern sequencing methods also support gene editing and gene manipulation, for example by confirming the resulting sequence. Sequencing can also detect new variants in nature.
- DNA sequencing methods such as pyrosequencing have made the study of metagenomics possible.
- Identifying bacterial species was a major challenge before sequencing technology was developed; sequencing is now used for microbial identification and for the discovery of new bacterial species.
- New mutations and genes can be identified in microorganisms by comparing sequences between related and unrelated species.
- Modern DNA sequencing technologies such as next-generation sequencing (NGS) are widely used in cancer research and oncology. Many genes responsible for causing cancer have been identified with this technique.
- The study of evolutionary relationships and phylogeny has been made possible by advances in DNA sequencing technology and methods.
- DNA sequencing also helps in studying asymptomatic populations at high risk of disease.
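One of the applications listed above, identifying open reading frames, can be sketched in a few lines of Python. This minimal example scans only the three forward reading frames for an ATG start followed by an in-frame stop codon; real gene finders also scan the reverse strand and use statistical models. The input sequence and the `min_codons` threshold are assumptions for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=2):
    """Return (start, end, orf) tuples for ATG...stop ORFs on the given strand.

    Coordinates are 0-based and half-open; min_codons counts the codons
    before the stop codon, including the ATG.
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, seq[start:i + 3]))
                start = None
    return sorted(orfs)

sequence = "CCATGGCTTGATAATGAAATAG"  # hypothetical sequenced fragment
orfs = find_orfs(sequence)

print(orfs)
```

The fragment contains two short open reading frames in different frames, each running from an ATG to the first in-frame stop codon.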
Advantages of DNA Sequencing
- Whole-genome sequencing has become possible through advances in DNA sequencing technology.
- Genetic diseases such as Alzheimer’s disease, cystic fibrosis, and many others caused by the failure of a gene to function properly can be studied.
- Accuracy: when sequencing and whole-genome data are generated and interpreted to high standards, the result is accurate data about the genome.
- Cost saving: owing to advances in sequencing technology and methods, a whole genome can be sequenced for less than a thousand dollars. From there, the action and function of a single gene can be traced to identify possible causes of disease.
- Physiological benefits: some people suffer from genetic diseases that hamper their physical and mental development. Identifying the gene responsible through DNA sequencing can help save lives and guide the design of drugs for those specific diseases. It can also give patients confidence and mental strength.
- Sense of empowerment: identifying the causes of diseases and studying the genes involved helps scientists think a step ahead. They can develop solutions for diseases caused by different microorganisms at an early stage. From whole-genome data, researchers can track mutations and predict future diseases that a specific microorganism might cause. In this way, we can stay ahead of possible disease outbreaks.
- Lifetime use: the genomic data obtained from an individual can be used to study that person's whole genetic makeup. Because each individual has a distinct genetic makeup, the data can be reused for study over a lifetime.
Limitations of DNA Sequencing
- Substantial high-speed computing resources are required for DNA sequencing, as it relies on computational data processing.
- It cannot properly resolve some sequences, such as tandem repeats, fragmented genes, repetitive DNA, and other duplicated regions.
- Samples that are not pure can cause a significant financial loss, since the sequencing run may be wasted.
- A test may sometimes fail because of external environmental conditions and other factors.
Some Well-Known DNA Sequencers
Illumina Next-generation sequencer
Illumina next-generation sequencing is based on sequencing by synthesis with reversible dye terminators: as each labeled terminator is incorporated into the growing DNA strand, the single base is identified. It consists of several steps:
- Library preparation,
- Cluster generation,
- Sequencing,
- Alignment and data analysis.
SOLiD sequencer
SOLiD (Sequencing by Oligonucleotide Ligation and Detection) is an enzymatic method of sequencing that uses DNA ligase, an enzyme that joins DNA strands.
DNA nanoball sequencer
The DNA nanoball sequencer is an instrument used for determining the entire genomic sequence of an organism. It is a modern high-throughput sequencing technology that uses rolling circle replication to amplify small fragments of genomic DNA into DNA nanoballs.