Genetics is the science that studies the inheritance of biological characteristics by living things. This subject, the study of heredity, is a wide-ranging science that examines:
- The transmission of biological properties (traits) from parents to offspring.
- The expression and variation of those traits.
- The structure and function of the genetic material, and
- How this material changes or evolves.
The nature of Genetic material
Deoxyribonucleic acid (DNA) is the central molecule of genetics, although it was once thought to be too simple a molecule to store genetic information. Scientists believed that a molecule of much greater complexity must house the genetic information of a cell. It was therefore argued that proteins, which are composed of 20 different amino acids, would be better candidates for this function.
However, the early work of Frederick Griffith in 1928 on the transfer of virulence in the pathogen Streptococcus pneumoniae, commonly called pneumococcus, set the stage for research showing that DNA, not protein, stores genetic information. Griffith found that if he boiled virulent bacteria and injected them into mice, the mice were not affected and no pneumococci could be recovered from the animals. When he injected a combination of killed virulent bacteria and a living non-virulent strain, the mice died; moreover, he could recover living virulent bacteria from the dead mice. Griffith called this change of non-virulent bacteria into virulent pathogens transformation.
Oswald Avery and his colleagues then set out to discover which constituent in the heat-killed virulent pneumococci was responsible for Griffith’s transformation. These investigators selectively destroyed constituents in purified extracts of virulent pneumococci (S cells) using enzymes that would hydrolyze DNA, RNA, or protein. They then exposed non-virulent pneumococcal strains (R strains) to the treated extracts. Transformation of the non-virulent bacteria was blocked only if the DNA was destroyed, suggesting that DNA was carrying the information required for transformation. This work by Avery and his colleagues in 1944 provided the first evidence that Griffith’s transforming principle was DNA and therefore that DNA carried genetic information.
Some years later (1952), Alfred Hershey and Martha Chase performed several experiments indicating that DNA was the genetic material in a bacterial virus called T2 bacteriophage. Some luck was involved in their discovery, for the genetic material of many viruses is RNA and the researchers happened to select a DNA virus for their studies. Imagine the confusion if T2 had been an RNA virus! The controversy surrounding the nature of genetic information might have lasted considerably longer than it did. Hershey and Chase made the virus’s DNA radioactive with 32P, or they labeled its protein coat with 35S. They mixed radioactive bacteriophage with Escherichia coli and incubated the mixture for a few minutes. The suspension was then agitated violently in a blender to shear off any adsorbed bacteriophage particles. After centrifugation, the radioactivity in the supernatant (where the virus remained) was compared with that in the bacterial cells in the pellet. They found that most of the radioactive protein was released into the supernatant, whereas the 32P-labeled DNA remained within the bacteria. Since genetic material was injected and T2 progeny were produced, DNA must have been carrying the genetic information for T2 (Plate 1).
The Levels of structure and function of the Genome
The genome is the sum total of genetic material carried within a cell. Although most of the genome exists in the form of chromosomes, genetic material can appear in non-chromosomal sites as well. For example, bacteria and some fungi contain tiny extra pieces of DNA (plasmids), and the mitochondria and chloroplasts of eukaryotes are equipped with their own functional chromosomes.
Genomes of cells are composed exclusively of DNA, but viruses contain either DNA or RNA as the principal genetic material.
A chromosome is a discrete cellular structure composed of a neatly packaged DNA molecule. The chromosomes of eukaryotic and bacterial cells differ in several respects. A eukaryotic chromosome consists of a DNA molecule tightly wound around histone proteins, whereas a bacterial chromosome is condensed and secured into a packet by means of a different type of protein.
Eukaryotic chromosomes are located in the nucleus; they vary in number from a few to hundreds; they can occur in pairs (diploid) or singles (haploid); and they are linear in format. In contrast, most bacteria have a single circular chromosome, although some have multiple chromosomes and a few have linear chromosomes.
All chromosomes contain a series of basic informational ‘packets’ called genes. A gene can be defined from more than one perspective. In classical genetics, the term refers to the fundamental unit of heredity responsible for a given trait in an organism. In a molecular and biochemical sense, it is a portion of the chromosome that provides information for a given function. Specifically, it is a segment of DNA that contains the necessary code to make a protein or an RNA molecule.
Genes fall into three basic categories:
i. Structural genes that code for proteins,
ii. Genes that code for RNA, and
iii. Regulatory genes that control gene expression.
The sum of all these types of genes constitutes an organism’s distinctive genetic makeup or genotype. The expression of the genotype creates traits (certain structures or functions) referred to as the phenotype. For example, a person inherits a combination of genes (genotype) that gives a certain eye color or height (phenotype), a bacterium inherits genes that direct the formation of a flagellum or the ability to metabolize a certain substrate, and a virus has genes that dictate its capsid structure.
Note: All organisms contain more genes in their genotypes than are expressed as a phenotype at any given time. In other words, the phenotype can change depending on which genes are ‘turned on’ or expressed.
The basic unit of DNA structure is a nucleotide, composed of a phosphate, a deoxyribose sugar, and a nitrogenous base (Plate 4 and 5). Each deoxyribose sugar bonds covalently in a repeating pattern with two phosphates. One of the bonds is to the number 5’ (read ‘five prime’) carbon on deoxyribose and the other is to the 3’ carbon, an arrangement that specifies the order and direction of each strand. This formation results in an elongate strand with a sugar-phosphate backbone.
The nitrogenous bases, purines and pyrimidines, attach by covalent bonds at the 1’ position of the sugar. They span the center of the molecule and pair with appropriate complementary bases from the other strand, thereby forming a double-stranded helix.
The paired bases are so aligned as to be joined by hydrogen bonds. Such weak bonds are easily broken, allowing the molecule to be ‘unzipped’ into its complementary strands. This is important so as to gain access to the information encoded in the nitrogen base sequence.
It should be noted that the purine adenine (A) pairs with the pyrimidine thymine (T), and the purine guanine (G) pairs with the pyrimidine cytosine (C). Adenine forms two hydrogen bonds with thymine, and cytosine forms three hydrogen bonds with guanine. This AT and GC base pairing means that the two strands in a DNA double helix are complementary.
It should also be noted that the two strands of the DNA are not oriented in the same direction. One side of the helix runs in the opposite direction of the other, in what is called an antiparallel arrangement. The order of the bond between the carbon on deoxyribose and the phosphate is used to keep track of the direction of the two sides of the helix. Thus, one strand runs in the 5’ to 3’ direction and the other runs in the 3’ to 5’ direction. This characteristic is a significant factor in DNA synthesis and translation.
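The pairing rules and the antiparallel arrangement can be illustrated with a short Python sketch. The sequence used here is hypothetical, and the function names are illustrative only; the sketch assumes strict A-T and G-C pairing with two and three hydrogen bonds respectively, as described above.

```python
# Watson-Crick pairing rules from the text: A pairs with T, G pairs with C.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
# Hydrogen bonds per base pair: two for A-T, three for G-C.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def complementary_strand(strand_5_to_3: str) -> str:
    """Return the partner strand, also written 5' to 3'.

    Because the two strands are antiparallel, the partner is the
    complement of the input read in reverse order.
    """
    return "".join(PAIR[base] for base in reversed(strand_5_to_3))


def hydrogen_bonds(strand: str) -> int:
    """Count the hydrogen bonds holding this strand to its partner."""
    return sum(H_BONDS[base] for base in strand)


strand = "ATGCGT"  # hypothetical sequence, written 5' to 3'
print(complementary_strand(strand))  # ACGCAT
print(hydrogen_bonds(strand))        # 15
```

Applying complementary_strand twice returns the original sequence, which is one way to see that each strand carries the information needed to rebuild the other.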
The flow of Genetic information
The pathway from DNA to RNA to protein is conserved in all cellular forms of life and is often called the central dogma. Genetic information flows in two ways:
the flow of information from one generation to the next (replication), and the flow of information within a single cell, a process called gene expression.
DNA functions as a storage molecule holding genetic information for the lifetime of a cellular organism and allowing that information to be duplicated and passed on to its progeny. Synthesis of the duplicated DNA is directed by both strands of the parental molecule and is called replication. This process is catalyzed by DNA polymerase enzymes.
The genetic information stored in DNA is divided into units called genes. For an organism to function properly and reproduce, its genes must be expressed at the appropriate time and place. Gene expression begins with the synthesis of an RNA copy of the gene. This process of DNA-directed RNA synthesis is called transcription because the DNA base sequence is rewritten as an RNA base sequence. The enzyme RNA polymerase catalyzes transcription.
Note: Although DNA has two complementary strands, only one strand, the template strand, of a particular gene is transcribed. If both strands of a single gene were transcribed, two different RNA molecules would result, yielding two different products. However, different genes may be encoded on opposite strands; thus both strands of DNA can serve as templates for RNA synthesis, depending on the orientation of the gene on the DNA.
Transcription yields three major types of RNA, depending on the gene transcribed. These are messenger RNA (mRNA), transfer RNA (tRNA), and ribosomal RNA (rRNA). During the last phase of gene expression, translation, genetic information in the form of an RNA base sequence in the messenger RNA (mRNA) is decoded and used to govern the synthesis of a polypeptide. Thus, the amino acid sequence of a protein is a direct reflection of the base sequence in mRNA. In turn, the mRNA nucleotide sequence is complementary to a portion of the DNA genome (Plate 3).
Ribonucleic acid (RNA) contains the bases adenine, guanine, cytosine and uracil (instead of thymine, although tRNA contains a modified form of thymine). The nucleotides are joined by a phosphodiester bond, just as they are in DNA. The sugar in RNA is ribose. Most RNA molecules are single-stranded molecules that can assume the secondary and tertiary levels of complexity due to bonds within the molecule, leading to specialized forms of RNA (mRNA, tRNA, rRNA).
Note: All types of RNA are formed through transcription of a DNA gene, but only mRNA is further translated into protein.
Messenger RNA (mRNA): this is the transcribed version of a structural gene or genes in DNA. It is synthesized by a process similar to the synthesis of the leading strand during DNA replication, and the complementary base-pairing rules ensure that the code will be faithfully copied in the mRNA transcript. The message of this transcribed strand is later read as a series of triplets called codons (Plate 6a).
Transfer RNA (tRNA): these are also complementary copies of specific regions of DNA. However, they differ from mRNA because they are uniform in length, being 75 to 95 nucleotides long, and they contain sequences of bases that form hydrogen bonds with complementary sections of the same tRNA strand. At these points, the molecule bends back upon itself into several hairpin loops, giving the molecule a cloverleaf secondary structure (plate 6b). The bottom loop of the cloverleaf exposes a triplet, the anticodon, that both designates the specificity of the tRNA and complements mRNA codons.
At the opposite end of the molecule is a binding site for the amino acid that is specific for that tRNA’s anticodon. For each of the 20 amino acids there is at least one specialized type of tRNA to carry it. Binding of an amino acid to its specific tRNA requires a specific enzyme that can correctly match each tRNA with its amino acid.
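The codon-anticodon relationship can be sketched in Python as follows. This is a simplified illustration that assumes strict Watson-Crick pairing between the two triplets and ignores wobble pairing at the third codon position; the function name is illustrative only.

```python
# RNA pairing rules: A pairs with U, G pairs with C.
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon_5_to_3: str) -> str:
    """Return the anticodon (written 5' to 3') that pairs with a codon.

    The codon and anticodon bind in antiparallel fashion, so the
    anticodon is the complement of the codon read in reverse order.
    Assumption: strict pairing only, no wobble.
    """
    return "".join(RNA_PAIR[base] for base in reversed(codon_5_to_3))


print(anticodon_for("AUG"))  # CAU
```

In this simplified view, a tRNA with the anticodon 5'-CAU-3' recognizes the codon 5'-AUG-3'.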
Ribosome: the prokaryotic (70S) ribosome is a particle composed of tightly packaged ribosomal RNA and protein. The rRNA component of the ribosome is also a long polynucleotide molecule. It forms a complex three-dimensional structure that contributes to the structure and function of the ribosome.
The interaction of protein and rRNA creates the two subunits of the ribosome that engage in the final translation of the genetic code (Plate 7).
With the discovery and characterization of DNA, the gene was defined more precisely as a linear sequence of nucleotides with fixed start and end points. It should be noted that not all genes encode proteins; some code instead for rRNA and tRNA, as discussed earlier (plate 3). In addition, it is now known that some eukaryotic genes encode more than one protein. This finding contradicted the one gene-one polypeptide hypothesis, itself a refinement of the earlier view that each gene contains the information for the synthesis of one enzyme. Thus, a gene might be defined as a polynucleotide sequence that codes for a functional product (i.e., a polypeptide, tRNA, or rRNA).
The nucleotide sequences of protein-coding genes are distinct from RNA-coding genes and noncoding regions because, when transcribed, the resulting mRNA can be ‘read’ in discrete sets of three nucleotides, each set being a codon. Each codon codes for a single amino acid. The sequence of codons is ‘read’ in only one way to produce a single product. That is, the code is not overlapping, and there is a single starting point with one reading frame, or way in which nucleotides are grouped into codons (plate 14).
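The idea of a reading frame can be made concrete with a small Python sketch. The mRNA sequence below is hypothetical, and the helper name is illustrative only. The same nucleotides grouped from three different starting points give three different sets of codons, which is why a single, fixed starting point matters.

```python
def codons(mrna: str, frame: int = 0) -> list[str]:
    """Split an mRNA into non-overlapping triplets.

    `frame` is the offset (0, 1, or 2) at which reading starts.
    Trailing nucleotides that do not fill a complete triplet are dropped.
    """
    usable = mrna[frame:]
    complete = len(usable) - len(usable) % 3
    return [usable[i:i + 3] for i in range(0, complete, 3)]


mrna = "AUGGCUUAA"  # hypothetical message, written 5' to 3'
for frame in range(3):
    print(frame, codons(mrna, frame))
# 0 ['AUG', 'GCU', 'UAA']
# 1 ['UGG', 'CUU']
# 2 ['GGC', 'UUA']
```

Only frame 0 begins with the start codon AUG and ends with a stop codon; shifting the starting point by one or two nucleotides produces entirely different triplets.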
Each strand of DNA therefore usually consists of gene sequences that do not overlap one another (plate 15a). However, there are exceptions to the rule: some viruses, such as the phage φX174, have overlapping genes (plate 15b), and parts of genes overlap in some bacterial genomes.
As will be discussed later, prokaryotic and viral gene structure differs greatly from that of eukaryotes. In prokaryotic and viral systems, the coding information within a gene normally is continuous. However, in eukaryotic organisms many genes contain coding information (exons) interrupted periodically by noncoding sequences (introns). The introns must be spliced out of the initial RNA transcript before the protein is made.
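The removal of introns can be sketched in Python as joining the exon segments of a transcript. The sequence and the exon coordinates below are hypothetical and chosen only for illustration; in a real cell the splice sites are recognized by the splicing machinery rather than supplied as coordinates.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join the exon segments of a primary transcript.

    Each exon is given as a (start, end) pair of 0-based indices,
    with `end` exclusive. Everything between exons is treated as
    intron and discarded.
    """
    return "".join(pre_mrna[start:end] for start, end in exons)


# Hypothetical transcript: exon 1 + intron + exon 2.
pre_mrna = "AUGGCU" + "GUAAGUCAG" + "UUUUAA"
exons = [(0, 6), (15, 21)]

print(splice(pre_mrna, exons))  # AUGGCUUUUUAA
```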
Genes that code for protein
As discussed earlier, in order for genetic information in the DNA to be used, it must first be transcribed to form an RNA molecule. The RNA product of a gene that codes for a protein is messenger RNA (mRNA). Recall from our discussion on information flow that although DNA is double stranded, only one strand of a gene contains coded information and directs RNA synthesis. This strand is called the template strand, and the complementary strand is known as the coding strand because it has the same nucleotide sequence as the mRNA, except that it contains thymine where the mRNA contains uracil (plate 16). Because the mRNA is made from the 5’ to the 3’ end, the polarity of the DNA template strand is 3’ to 5’. Therefore, the beginning of the gene is at the 3’ end of the template strand.
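The relationship between the template strand, the coding strand, and the mRNA can be checked with a short Python sketch. The template sequence is hypothetical and the function names are illustrative only. The template is written 3' to 5' so that the mRNA built from it reads 5' to 3'.

```python
# Base pairing during transcription: DNA template base -> RNA base.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
# Base pairing between the two DNA strands.
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Build the mRNA (5' to 3') from a template strand written 3' to 5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


def coding_strand(template_3_to_5: str) -> str:
    """Return the coding strand (5' to 3') that pairs with the template."""
    return "".join(DNA_PAIR[base] for base in template_3_to_5)


template = "TACCGAATT"  # hypothetical template strand, written 3' to 5'
mrna = transcribe(template)
coding = coding_strand(template)

print(mrna)    # AUGGCUUAA
print(coding)  # ATGGCTTAA
# The coding strand matches the mRNA once T is replaced by U.
print(coding.replace("T", "U") == mrna)  # True
```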
An important site, the promoter, is located at the start of the gene. The promoter is a recognition/binding site for RNA polymerase, the enzyme that synthesizes RNA. The promoter is neither transcribed nor translated; it functions strictly to orient RNA polymerase a specific distance from the first DNA nucleotide that will serve as a template. The promoter is very important in regulating when and where a gene will be transcribed or expressed.
The transcription start site (labeled +1 in plate 16) represents the first nucleotide in the mRNA synthesized from the gene. However, the initially transcribed portion of the gene does not necessarily code for amino acids. Instead, it is a leader sequence that is transcribed into mRNA but is not translated into amino acids. The leader sequence includes a region called the Shine-Dalgarno sequence that is important in the initiation of translation. The leader sometimes is also involved in regulation of transcription and translation.
Immediately next to (and downstream of) the leader is the most important part of the gene, the coding region (plate 16). In genes that direct the synthesis of proteins, the coding region typically begins with the template DNA sequence 3’-TAC-5’. This produces the codon 5’-AUG-3’, which in bacteria codes for N-formylmethionine, a specially modified amino acid used to initiate protein synthesis. The remainder of the coding region consists of a sequence of codons that specifies the sequence of amino acids for that particular protein. The coding region ends with a special codon called the stop codon, which signals the end of the protein and stops the ribosome during translation.
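The layout of leader, start codon, coding region, stop codon, and trailer can be illustrated with a simplified Python sketch. The mRNA below is hypothetical, and the codon table is deliberately partial, listing only the codons used in the example. The sketch also assumes that translation begins at the first AUG in the message, which is a simplification: in bacteria the start site is selected with the help of the Shine-Dalgarno sequence.

```python
# Partial codon table: only the codons used in this example.
CODON_TABLE = {"AUG": "Met", "GCU": "Ala", "UUU": "Phe", "AAA": "Lys"}
STOP_CODONS = {"UAA", "UAG", "UGA"}


def translate(mrna: str) -> list[str]:
    """Translate from the first AUG until a stop codon is reached.

    Nucleotides before the AUG (the leader) and after the stop codon
    (the trailer) are not translated. Codons missing from the partial
    table are reported as '???'.
    """
    start = mrna.find("AUG")
    if start == -1:
        return []  # no start codon, so nothing is translated
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break
        peptide.append(CODON_TABLE.get(codon, "???"))
    return peptide


# Hypothetical message: leader + coding region + stop codon + trailer.
mrna = "GGAGGU" + "AUGGCUUUUAAA" + "UAA" + "GC"
print(translate(mrna))  # ['Met', 'Ala', 'Phe', 'Lys']
```

In bacteria, the amino acid inserted at the initiating AUG is N-formylmethionine rather than the unmodified methionine shown in the table.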
The stop codon is immediately followed by the trailer sequence (plate 16), which is needed for proper expression of the coding region of the gene. The stop codon is not recognized by RNA polymerase during transcription. Instead, a terminator sequence is used to stop transcription by dislodging the RNA polymerase from the template DNA.
Genes that code for tRNA or rRNA
The DNA segments that code for tRNA and rRNA also are considered genes, although they give rise to important RNA rather than protein. In E. coli the genes for tRNA are fairly typical, consisting of a promoter and transcribed leader and trailer sequences that are removed during the process of tRNA maturation (plate 17a). The precise function of the leader is not clear; however, the trailer is required for transcription termination. Genes for tRNA may code for more than a single tRNA molecule or type of tRNA (plate 17a). The segments coding for tRNAs are separated by short spacer sequences that are removed after transcription by special ribonucleases, at least one of which contains catalytic RNA. RNA molecules with catalytic activity are called ribozymes.
The genes for rRNA also are similar in organization to genes coding for proteins because they have promoters, trailers, and terminators. Interestingly, all the rRNAs are transcribed as a single, large precursor molecule that is cut up by ribonucleases after transcription to yield the final rRNA products. E. coli pre-rRNA spacer and trailer regions even contain tRNA genes. Thus the synthesis of tRNA and rRNA involves posttranscriptional modification, a relatively rare process in prokaryotes.