
DNA Contains the Information Needed to Make Proteins

While double-stranded DNA has become one of the most iconic structures in modern biology, it is only part of the story. DNA is only important in that it contains the information the cell needs to make proteins.

Proteins are linear polymers of amino acids that spontaneously fold up into compact 3D shapes following basic principles of chemistry and physics. Humans are made up of approximately 50,000 different kinds of proteins. Each protein has a unique sequence of amino acids, and that sequence is encoded in a sequence of nucleotides in your genome. Each folded protein is an amazing molecular machine that performs a unique job. For example, the beta-globin protein safely binds to and transports oxygen throughout your body; aquaporin transports water through phospholipid bilayers; and influenza hemagglutinin allows the flu virus to infect our respiratory cells.


XIAP Protein
Based on 1i3o.pdb


Based on 1g73.pdb

HA Protein

Based on 1i3o.pdb

More information on protein structure is available at the MSOE Center for BioMolecular Modeling Protein Structure pages.


The Flow of Genetic Information: DNA to RNA to Protein

The sequence of amino acids in a protein is encoded by a sequence of nucleotides in the DNA that makes up the gene for that protein. (Read this over, and over again, and think about it until it makes sense to you.) But if you are a eukaryote (which you are, if you are human), your genes are a little more complicated than those of a prokaryote. Your genes are split. That means that the nucleotides that encode the amino acids in any given protein are interrupted by other nucleotides that do not encode any part of the protein. The nucleotides that encode amino acids, and are therefore expressed as protein, are called exons. The intervening nucleotides that do not code for amino acids in your protein are called introns.

When a gene is "expressed" (that is, made into protein), it is first copied into messenger RNA (mRNA) by the process known as transcription and then translated into protein by a ribosome. But because eukaryotic genes contain introns, we have to add another step to this flow of genetic information: the introns of the initial RNA transcript (the precursor mRNA) must be spliced out to generate the mature mRNA, which consists of one continuous sequence of protein-encoding nucleotides (exons).

Introns and Exons

The following three steps are involved in the expression of a human (eukaryotic) gene.

1. Transcription

In order to make the proteins that are needed to perform the unique functions required in each of your cells, you must first express a specific subset of your genes as messenger RNAs. RNA polymerase is an enzyme that synthesizes mRNA. Messenger RNA is complementary to the template strand of DNA, following the Watson-Crick base pairing rules (an A in the DNA template pairs with a U in the RNA, a T pairs with an A, and G pairs with C). This process of RNA synthesis, known as transcription, is shown below.
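The pairing rules described above can be sketched in a few lines of Python. This is only an illustration: the function name `transcribe` and the nine-nucleotide template are made up for the example, and the real enzyme of course does far more than look up letters in a table.

```python
# Base-pairing rules for copying a DNA template strand into RNA.
# Each DNA base (key) pairs with the RNA base shown (value).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the mRNA (read 5'->3') complementary to a DNA template written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


template = "TACGGATTC"  # toy template strand (hypothetical), written 3'->5'
mrna = transcribe(template)
print(mrna)  # AUGCCUAAG
```

Writing the template 3'->5' keeps the example simple, because the mRNA can then be read left to right in the usual 5'->3' direction.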


2. mRNA Splicing

RNA polymerase cannot tell the difference between exon and intron sequences. Therefore, the initial messenger RNA that is copied from the template strand of DNA contains both exon and intron sequences. Eukaryotic cells contain "spliceosomes" that recognize the junctions between exons and introns and splice out the introns from the precursor mRNA to generate the mature mRNA that is ready to be translated into protein. This RNA splicing reaction is animated below.
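As a rough sketch of what splicing accomplishes, the Python below joins the exons of a toy precursor mRNA and drops the introns. The sequence, the exon coordinates, and the function name `splice` are all invented for illustration. A real spliceosome finds the exon-intron junctions from signals in the RNA itself; here the coordinates are simply given.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join exon segments (0-based start, end-exclusive) in order, dropping the introns."""
    return "".join(pre_mrna[start:end] for start, end in exons)


# Toy precursor mRNA: exons in upper case, introns in lower case for readability.
pre_mrna = "AUGGCC" + "guaaguuuucag" + "UUUAAG" + "guacgcag" + "UGA"

# Hypothetical exon coordinates within the toy precursor.
exons = [(0, 6), (18, 24), (32, 35)]

mature_mrna = splice(pre_mrna, exons)
print(mature_mrna)  # AUGGCCUUUAAGUGA
```

The toy introns begin with "gu" and end with "ag", which mirrors the most common pattern at intron boundaries.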


3. Translation

Once the eukaryotic mRNA has been spliced to remove the introns, the mRNA is bound by a ribosome that translates this sequence of nucleotides into a sequence of amino acids, i.e., a protein. This nucleotide sequence is translated "three nucleotides at a time" based on the code that is illustrated in The Standard Genetic Code. Note that the code is degenerate, meaning that most amino acids are encoded by more than one triplet codon. Also notice that the Codon Chart shown below is color-coded according to the chemical properties of the amino acid that is encoded. It is these chemical properties of the individual amino acids that will direct the spontaneous folding of the protein into a compact 3D shape following its synthesis by the ribosome.
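Reading an mRNA "three nucleotides at a time" can be sketched as follows. The table is the standard genetic code written compactly with one-letter amino acid symbols ("*" marks a stop codon); the function name `translate` and the short mRNA are made up for the example, and the sketch assumes the reading frame starts at the first nucleotide.

```python
from collections import Counter
from itertools import product

# Standard genetic code: codons enumerated in U, C, A, G order (UUU, UUC, UUA, ...),
# paired with one-letter amino acid symbols; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(mrna: str) -> str:
    """Translate an mRNA one codon at a time, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("AUGGCCUUUAAGUGA"))  # MAFK

# Degeneracy: count how many codons encode each amino acid.
codons_per_amino_acid = Counter(CODON_TABLE.values())
print(codons_per_amino_acid["L"], codons_per_amino_acid["M"])  # 6 1
```

The last line shows the degeneracy mentioned above: leucine (L) is encoded by six different codons, while methionine (M) has only one.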

For more information on protein folding based on individual amino acid chemical properties, visit the MSOE Center for BioMolecular Modeling Protein Structure pages.

Codon Chart


The decoding of the mRNA and the synthesis of a protein by a ribosome are animated below. As the mRNA moves through the ribosome one triplet codon at a time, small tRNA molecules bring the appropriate amino acid to the ribosome for addition to the growing protein chain. Note the complementary base pairs (GC and AU) that form between the codons of the mRNA and the anticodons of the tRNAs.
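The codon-anticodon pairing can be sketched in Python as a complement-and-reverse operation. This is a simplified illustration: the function name `anticodon` is made up, and the sketch assumes strict GC and AU pairing at all three positions, whereas real tRNAs tolerate some flexibility at the third position of the codon.

```python
# RNA-RNA base pairs that form between an mRNA codon and a tRNA anticodon.
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon(codon: str) -> str:
    """Return the anticodon (written 5'->3') that pairs with an mRNA codon (written 5'->3')."""
    # The two strands pair in opposite directions, so the codon is read backwards.
    return "".join(RNA_PAIR[base] for base in reversed(codon))


print(anticodon("AUG"))  # CAU
```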


Expression of the XIAP Gene

The XIAP gene, which figures prominently in the 2012 Science Olympiad Protein Modeling Event, undergoes transcription, mRNA splicing, and translation to produce the final XIAP protein, which is composed of 497 amino acids.
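Because each amino acid is specified by three nucleotides, the protein length quoted above fixes the size of the protein-coding part of the mRNA. A quick check of the arithmetic (the variable names are just for illustration):

```python
AMINO_ACIDS = 497            # length of the XIAP protein, from the text above
NUCLEOTIDES_PER_CODON = 3

coding_nucleotides = AMINO_ACIDS * NUCLEOTIDES_PER_CODON   # codons that specify amino acids
with_stop_codon = coding_nucleotides + NUCLEOTIDES_PER_CODON  # plus one stop codon

print(coding_nucleotides, with_stop_codon)  # 1491 1494
```

The mature mRNA itself is longer than this, since it also carries untranslated nucleotides before and after the coding sequence.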

Examine the Human XIAP mRNA → Protein Map provided below to see the four domains of known protein structure that make up this protein and the G → A mutation that occurs at nucleotide 641 of Nic's gene. As you explore Nic's story further, you will understand how this change in a single nucleotide of his XIAP gene is believed to be the molecular basis of his disease.
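To see how a single-nucleotide change can alter a protein, the sketch below applies a G → A substitution to a short, made-up mRNA and translates both versions. None of this is the real XIAP sequence: the toy mRNA, the position of the change, the coding start `CDS_START`, and the function names are all hypothetical. Locating nucleotide 641 in a particular codon would require the numbering and coding start given on the map.

```python
from itertools import product

# Standard genetic code (codons in U, C, A, G order; "*" marks a stop codon).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def codon_position(nucleotide: int, cds_start: int) -> tuple[int, int]:
    """Map a 1-based mRNA position to (codon number, position within codon), both 1-based."""
    offset = nucleotide - cds_start
    return offset // 3 + 1, offset % 3 + 1


def substitute(mrna: str, nucleotide: int, new_base: str) -> str:
    """Return a copy of the mRNA with the base at a 1-based position replaced."""
    i = nucleotide - 1
    return mrna[:i] + new_base + mrna[i + 1:]


def translate(mrna: str, cds_start: int) -> str:
    """Translate from the 1-based coding start until the first stop codon."""
    protein = []
    for i in range(cds_start - 1, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


normal = "GGAUGGCCUGUAAAUGA"  # toy mRNA: 2 untranslated bases, then AUG GCC UGU AAA UGA
CDS_START = 3                 # hypothetical 1-based position of the A in the start codon
mutant = substitute(normal, 10, "A")  # G -> A at toy position 10

print(codon_position(10, CDS_START))  # (3, 2)
print(translate(normal, CDS_START))   # MACK
print(translate(mutant, CDS_START))   # MAYK
```

In this toy case the change falls in the second position of the third codon, turning UGU (cysteine, C) into UAU (tyrosine, Y), so the two proteins differ by exactly one amino acid.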

Click here to explore the Human XIAP mRNA → Protein Map in detail as a PDF.