
Sequencing Nic's Exons

DNA Sequencing Machine
Roche Company

Back in 1968, Erwin Chargaff (a famous biochemist) was quoted as saying:

"A detailed determination of the nucleotide sequence of a DNA molecule is beyond our present means, nor is it likely to occur in the near future. Even the smallest functional DNA varieties seen, those occurring in certain small phages, must contain something like 5,000 nucleotides in a row. We may therefore, leave the task of reading the complete nucleotide sequence of a DNA to the 21st Century, which will however have other worries."

Chargaff's pessimistic view of the future of DNA sequencing was quickly proven wrong. Only ten years later (1978), two different DNA sequencing methods had been described. Maxam and Gilbert had described a method based on the base-specific "chemical degradation" of DNA labeled at one of its ends, and Fred Sanger had described a "sequencing by synthesis" method in which chain-terminating nucleotide analogs were used to generate a collection of DNA fragments, each ending at one of the four nucleotides. (The fluorescently-labeled terminators familiar from automated sequencers were a later refinement of Sanger's method.) In each of these sequencing methods, high-resolution gel electrophoresis was used to separate the resulting DNA fragments that differed in length by a single nucleotide.

Further enhancements of this Sanger chain-terminator DNA sequencing technology were used to determine the first draft sequence of the human genome, published in 2001 (the first year of the 21st century). This Human Genome Project was a collaboration / competition among many DNA sequencing centers around the world, both public and private, at a cost of approximately 600 million dollars.

Today, ten years after the first human genome was sequenced, the technology that was used to sequence the first genome is viewed as "old–fashioned". During the past ten years, "next generation" DNA sequencing technologies have been developed using a mind-boggling array of clever tricks that simultaneously increased the speed and decreased the cost of sequencing – to the point that it was both technically possible and financially affordable to sequence Nic's exome as a way to arrive at a diagnosis of his disease.

 

A Brief Overview of the DNA Sequencing Method That Was Used to Sequence Nic's Exome:

  1. DNA was extracted from the cells in a small blood sample (~ 1 milliliter) taken from Nic.

  2. Nic's DNA was randomly broken into many short fragments, each approximately 1000 nucleotides long.

  3. The 1.5% of Nic's DNA fragments that contained exon sequences were isolated by passing the DNA over an "exome-capture slide": a glass microscope slide onto which an array of 2.1 million different exome sequences, known from the original DNA reference sequence, was permanently attached. (The creation of this exome-capture slide is another story of ingenuity, not dealt with here.)

  4. These exome-DNA fragments were recovered from the capture slide and then stuck onto small beads – such that each bead contained only one DNA fragment – with a unique nucleotide sequence.

  5. Each bead, with its unique exome–DNA fragment, was then trapped inside a water bubble in oil (an emulsion). The water bubble also contained DNA polymerase enzymes, nucleotides and other components that resulted in the amplification (PCR) of the original DNA fragment. At the conclusion of this step in the process, each "hairy bead" now contained several thousand copies of the original DNA fragment, each with the same unique nucleotide sequence.

  6. These hairy beads were then passed over a slide containing a regular array of picoliter-sized wells, each one just large enough to accommodate one (but not two) hairy bead. Even smaller beads were then added to the wells, trapping the one hairy bead (unique nucleotide sequence) in each well. The smaller beads are coated with two enzymes that work together to generate light in the well every time a nucleotide is added to the end of the DNA that is being synthesized in that well (see animation that follows).

 

A Schematic Illustration of This Sequencing Method

DNA sequencing machine tiles

 

The Sequencing Machine at Work

Once the sequencing array is prepared, as illustrated above, the actual sequencing can begin. The machine flows one nucleotide triphosphate (either dATP, dGTP, dCTP or dTTP) at a time over all of the pits in the array. If the nucleotide is the complement of the next base in the DNA sequence on a bead, it is added to the growing strand, and the pyrophosphate (PPi) that is released when the nucleotide monophosphate is added is used as a substrate by the enzymes coating the smaller beads to produce LIGHT in the pit. A charge-coupled device (CCD) camera then photographs the sequencing slide, recording the pattern of wells that "light up" as the four nucleotide triphosphates are successively flowed over the slide. By repeatedly flowing the four different dNTPs over the DNA-covered beads, each of the sequences in the various pits can be determined.
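The flow-by-flow logic described above can be sketched in a few lines of Python. This is only a toy model of a single well, not the machine's actual software: the function name simulate_flows, the flow order "TACG", and the "light units" are assumptions made for illustration. It also assumes that when the template contains a run of identical bases, all of the matching nucleotides are added during one flow, giving a proportionally brighter flash.

```python
# Watson-Crick pairing: the base on the template decides which dNTP can be added.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def simulate_flows(template, flow_order="TACG", cycles=4):
    """Return (nucleotide, light_units) for every flow over one well.

    `template` is the bead-bound strand, written in the order in which the
    polymerase copies it. `flow_order` and `cycles` are illustrative
    assumptions, not values taken from a real instrument.
    """
    pos = 0          # next template base waiting to be copied
    signals = []
    for _ in range(cycles):
        for nt in flow_order:
            added = 0
            # Keep extending while the flowed nucleotide pairs with the
            # next template base (handles runs of identical bases).
            while pos < len(template) and COMPLEMENT[template[pos]] == nt:
                added += 1
                pos += 1
            # One PPi is released per nucleotide added, so the light
            # recorded for this flow is modeled as the number added.
            signals.append((nt, added))
    return signals


if __name__ == "__main__":
    for nt, light in simulate_flows("ATGGC", cycles=1):
        print(nt, light)
```

For the template ATGGC, the well stays dark whenever the flowed nucleotide does not match, and the C flow produces a double-strength signal because two G bases sit side by side in the template.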

Special software analyzes each photo and reports the nucleotide sequence of the new DNA strand being synthesized in each well of the sequencing slide – based on the pattern of light generated in that well. Watch the animation below for a schematic representation of this process.
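As a rough illustration of what such software has to do for one well, the sketch below converts a list of measured light intensities (one per flow) back into a nucleotide sequence. The function name call_bases, the flow order "TACG", and the rule of rounding each intensity to the nearest whole number of nucleotides are all simplifying assumptions; real base-calling software is considerably more sophisticated.

```python
def call_bases(flow_signals, flow_order="TACG"):
    """Turn per-flow light intensities from one well into the new strand's sequence.

    `flow_signals` holds one intensity per flow, scaled so that 1.0 is
    roughly the light from a single added nucleotide (an assumption made
    for this sketch). Flows cycle through `flow_order` repeatedly.
    """
    bases = []
    for i, intensity in enumerate(flow_signals):
        nt = flow_order[i % len(flow_order)]   # which dNTP was flowed
        count = max(0, round(intensity))       # nucleotides added in this flow
        bases.append(nt * count)               # dark flows contribute nothing
    return "".join(bases)


if __name__ == "__main__":
    # Slightly noisy readings for the flows T, A, C, G.
    print(call_bases([1.04, 0.97, 2.10, 0.92]))
```

A reading near 2.0 is interpreted as two identical nucleotides added in the same flow, and readings near zero are treated as flows in which nothing was incorporated.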

 

The next animation schematically shows the incorporation of complementary nucleotides into the DNA being synthesized in each pit. These nucleotides are actually being incorporated into the growing DNA strands by DNA polymerase enzymes. (These polymerases are not shown in the animation.) The pyrophosphate (PPi) – that is liberated from the dNTP (deoxynucleotide triphosphate) as each dNMP (deoxynucleotide monophosphate) is incorporated into the growing DNA strand – is used as a substrate for an enzyme (sulphurylase) that coats the smaller capture beads in each pit, and leads to the generation of light – by a luciferase enzyme that also coats the capture beads.

 

By sequencing Nic's genes (exome) in this way, researchers at the MCW Human and Molecular Genetics Center found 16,125 differences between Nic's genes and the presumed "normal sequences" in the human reference sequence.

Bioinformaticists then proceeded to analyze these differences, systematically eliminating those that were not responsible for Nic's disease, until they were left with only one mutation: a G to A mutation in codon 203 of Nic's XIAP gene.