DNA sequencing
Nucleic Acid Chemistry
Third term 2022-2023
Randy Bryant
Department of Biochemistry and Molecular Biology
Johns Hopkins Bloomberg School of Public Health
DNA polymerases
Catalyze the synthesis of DNA chains
[Figure labels: 5´, 3´, dNTP. Source: Biochemistry, 6th ed., Freeman 2007]
Use deoxynucleoside triphosphates (dNTPs) as precursors
Extend DNA chains by adding one nucleotide unit at a time to
the 3´-end of a primer strand
Template-directed DNA polymerization
[Figure labels: primer, template. Source: Biochemistry, 6th ed., Freeman 2007]
DNA polymerase uses single-stranded DNA as a template
Selects and adds the nucleotide that is complementary to the
corresponding nucleotide in the template strand
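The base-pairing rule the polymerase follows can be sketched in a few lines of Python. This is only an illustration: the function name `synthesize_strand` is made up for this example, and the template is assumed to be written 3´→5´ so that the new strand reads 5´→3´.

```python
# Watson-Crick pairing rules used to pick each incoming nucleotide.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def synthesize_strand(template_3to5: str) -> str:
    """Return the strand (5'->3') built on a template written 3'->5'."""
    return "".join(PAIR[base] for base in template_3to5)

print(synthesize_strand("CTGATGGATCGATCGAGC"))  # GACTACCTAGCTAGCTCG
```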
Dideoxy DNA sequencing
Fred Sanger and dideoxy DNA sequencing
Fred Sanger
Dideoxy chain termination
deoxyribonucleoside triphosphate (dNTP): extends DNA synthesis
dideoxyribonucleoside triphosphate (ddNTP): terminates DNA synthesis
Dideoxy chain termination
[Figure: ddNTP with H at the 3´ position; no -OH here]
Cannot add another nucleotide when a ddNTP has been incorporated
into a growing DNA chain
Dideoxy chain terminators
ddATP ddGTP
ddCTP ddTTP
Dideoxy sequencing
Convert DNA to be sequenced into a single-stranded form
Anneal short primer adjacent to region to be sequenced
Carry out primer extension reaction using DNA polymerase
with:
High concentration of the four normal dNTPs
and
Low concentration of one ddNTP (about 1 ddNTP per 100 dNTPs)
Under these conditions:
The chance of incorporating a ddNTP in place of a dNTP will
be small – so some DNA synthesis will occur
But a ddNTP will be incorporated at some point during the
reaction and terminate DNA synthesis
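A rough way to see why a low ddNTP concentration gives a spread of fragment lengths: if each eligible site has the same small chance of taking a ddNTP, the stopping point follows a geometric distribution. The sketch below assumes the polymerase does not discriminate between dNTP and ddNTP, so a 1:100 ratio gives a per-site termination probability of 1/101. That is a simplification, and the function name is hypothetical.

```python
def termination_fractions(p_dd: float, n_sites: int) -> list[float]:
    """Fraction of chains that stop at each successive eligible site (geometric)."""
    return [(1 - p_dd) ** (k - 1) * p_dd for k in range(1, n_sites + 1)]

p = 1 / 101  # 1 ddNTP per 100 dNTPs, assuming no polymerase discrimination
fractions = termination_fractions(p, 100)
still_extending = (1 - p) ** 100  # chains that passed all 100 sites without a ddNTP

print(f"stopped at the first eligible site: {fractions[0]:.4f}")
print(f"still extending after 100 eligible sites: {still_extending:.3f}")
```

Raising the ddNTP fraction shifts the distribution toward shorter fragments; lowering it favors longer ones.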
Example: If the reaction was carried out with four normal dNTPs and
ddATP:
primer    5´ GACT
template  3´ CTGATGGATCGATCGAGC 5´

primer    5´ GACTA                  5 nt
template  3´ CTGATGGATCGATCGAGC 5´

primer    5´ GACTACCTA              9 nt
template  3´ CTGATGGATCGATCGAGC 5´

primer    5´ GACTACCTAGCTA          13 nt
template  3´ CTGATGGATCGATCGAGC 5´
Would generate a mixture of fragments corresponding to termination at
each position where there was a T in the template
Example: If the reaction was carried out with four normal dNTPs and
ddTTP:
primer    5´ GACT
template  3´ CTGATGGATCGATCGAGC 5´

primer    5´ GACTACCT               8 nt
template  3´ CTGATGGATCGATCGAGC 5´

primer    5´ GACTACCTAGCT           12 nt
template  3´ CTGATGGATCGATCGAGC 5´

primer    5´ GACTACCTAGCTAGCT       16 nt
template  3´ CTGATGGATCGATCGAGC 5´
Would generate a different mixture of fragments corresponding to
termination at each position where there was an A in the template
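The two examples can be reproduced with a short sketch. It assumes the template is written 3´→5´ with the primer annealed at its 3´ end; `termination_lengths` is a name invented for this illustration.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def termination_lengths(primer: str, template_3to5: str, dd_base: str) -> list[int]:
    """Lengths (nt) of products that end where dd_base was incorporated."""
    full = "".join(PAIR[b] for b in template_3to5)  # fully extended strand, 5'->3'
    if not full.startswith(primer):
        raise ValueError("primer does not anneal at the 3' end of the template")
    # Only positions added after the primer can carry a terminator.
    return [i + 1 for i in range(len(primer), len(full)) if full[i] == dd_base]

template = "CTGATGGATCGATCGAGC"
print(termination_lengths("GACT", template, "A"))  # [5, 9, 13]
print(termination_lengths("GACT", template, "T"))  # [8, 12, 16]
```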
Dideoxy DNA sequencing
Molecular Cell Biology, Freeman 2008
Carry out four parallel reactions using the four normal dNTPs and each
of the four ddNTPs
Analyze the products of each individual reaction by polyacrylamide
gel electrophoresis
DNA sequencing gel
Separates DNA strands differing in length by a single nucleotide
Termination products appear as a ladder of bands
[Figure labels: longer, shorter, primer]
Can read the sequence of the synthesized strand directly off the gel
Will be complementary to the sequence of the template strand being
analyzed
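Reading a gel amounts to sorting the bands from all four lanes by length. The sketch below uses made-up band positions that are consistent with the earlier primer/template example; `read_gel` and the `lanes` data are illustrative, not taken from a real gel.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def read_gel(lanes: dict[str, list[int]]) -> str:
    """Read the synthesized strand (5'->3') from the shortest band to the longest."""
    bands = sorted((length, base) for base, lengths in lanes.items() for length in lengths)
    return "".join(base for _, base in bands)

# Hypothetical fragment lengths (nt) seen in each of the four ddNTP lanes.
lanes = {"A": [5, 9, 13], "C": [6, 7, 11, 15], "G": [10, 14], "T": [8, 12, 16]}

new_strand = read_gel(lanes)                            # synthesized strand, 5'->3'
template_3to5 = "".join(PAIR[b] for b in new_strand)    # complementary template, 3'->5'
print(new_strand)      # ACCTAGCTAGCT
print(template_3to5)   # TGGATCGATCGA
```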
Dideoxy DNA sequencing
Fred Sanger DNA sequencing gel
Sanger et al. “DNA sequencing with chain-terminating inhibitors” PNAS 74, 5463 (1977)
Can sequence ~300-500 bases per reaction
Sanger dideoxy DNA sequencing
Sanger used Klenow polymerase in his original paper
But:
Is a relatively slow polymerase: 30-45 nucleotides/sec
Is not very processive: 10-50 nucleotides/binding event
The 3´-5´ exonuclease tends to degrade primer strand
and interfere with efficient primer extension
So Klenow polymerase is not an optimal polymerase
for DNA sequencing
T7 DNA polymerase
Has higher rate of polymerization: 300 nucleotides/sec
Has higher processivity: > 1000 nucleotides/binding event
(with thioredoxin subunit)
Has no 5´-3´ exonuclease activity
So is a more efficient polymerase than Klenow
But the 3´-5´ exonuclease activity still limits the efficiency of
the reaction for DNA sequencing
Sequenase™
Chemically modified version of T7 DNA polymerase
Treatment with oxidizing agent inactivates the
3´-5´ exonuclease activity (0.01%)
But retains normal polymerase activity
So was useful for dideoxy sequencing
First introduced in 1987 and was sold commercially
Sequenase™ version 2.0
Modified form of T7 DNA polymerase
Has a 28 amino acid deletion that removes the
3´-5´ exonuclease domain
So does not degrade primers
Polymerase activity is not altered by the deletion of the
3´-5´ exonuclease domain
Incorporates dideoxy-NTPs and other nucleotide analogs
Sold commercially: Used for dideoxy DNA sequencing
Automated DNA sequencing
DNA sequencing with fluorescent chain terminators
Fluorescently-labeled dideoxy chain terminators
[Figure: the four fluorescently labeled chain terminators: T, C, A, G]
Each of the fluorophores has a distinctive emission maximum and can be
distinguished by fluorescence spectroscopy
Prober et al. “A system for rapid DNA sequencing with fluorescent chain-terminating dideoxynucleotides” Science 238, 336 (1987)
Demonstration of the fluorescently-labeled dideoxy terminator method
normal labeled
Worked with Sequenase, but not
with Klenow polymerase
Automated dideoxy DNA sequencing
Uses the four ddNTPs, each labeled with
a different fluorescent group (color)
Termination products are separated
by capillary gel electrophoresis
The colors of the different termination products are determined using
a fluorescence detector as they elute from the gel
Produces a DNA sequence trace that can be converted into a DNA
sequence
Can sequence up to 700-900 bases per reaction
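Converting a trace into a sequence can be pictured as picking the strongest dye channel at each peak. The sketch below is a toy version under strong assumptions: peaks are already detected and evenly spaced, and the intensities are invented. Real base callers also correct for dye mobility and spectral overlap, which is not shown here.

```python
def call_bases(peaks: list[dict[str, float]]) -> str:
    """Call each peak as the dye channel with the strongest signal."""
    return "".join(max(peak, key=peak.get) for peak in peaks)

# Made-up intensities for three successive peaks, keyed by the base each dye reports.
trace = [
    {"A": 0.9, "C": 0.1, "G": 0.0, "T": 0.1},
    {"A": 0.1, "C": 0.8, "G": 0.1, "T": 0.0},
    {"A": 0.0, "C": 0.2, "G": 0.1, "T": 0.7},
]
print(call_bases(trace))  # ACT
```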
Automated dideoxy DNA sequencing
Applied Biosystems DNA sequencer DNA sequence trace
Uses a modified form of Taq I DNA polymerase
dNTP/ddNTP selectivity
Compared the dNTP/ddNTP selectivity
of:
E. coli DNA polymerase I
Taq I DNA polymerase
T7 DNA polymerase
dNTP/ddNTP selectivity
Primer extension reactions were carried
out with a 6:1 ratio of dNTPs/ddNTPs:
G, A, T, C
T7 polymerase incorporates ddNTPs
more efficiently than do Pol I and
Taq I polymerase
Tabor and Richardson, “A single residue in DNA polymerases of the Escherichia coli DNA polymerase I family is critical for distinguishing between deoxy- and dideoxyribonucleotides” Proc. Natl. Acad. Sci. USA 92, 6339 (1995)
Basis for dNTP/ddNTP selectivity
Pol I polymerase: Phe762
Taq I polymerase: Phe667
T7 polymerase: Tyr526
dNTP binding site (Pol I)
Hypothesized that a hydroxyl group at either the 3´-position of the
substrate (dNTP) or in the active site (Tyr) was required to stabilize
a catalytic Mg2+ ion
dNTP/ddNTP selectivity
Replacing Tyr526 of T7 polymerase with
Phe decreased ddNTP incorporation
and
Replacing Phe762 of Pol I, or Phe667 of
Taq I, with Tyr increased ddNTP
incorporation
Were able to re-engineer the dNTP/ddNTP
specificity of these DNA polymerases
Automated dideoxy DNA sequencing
Applied Biosystems DNA sequencer DNA sequence trace
Uses a modified form of Taq I DNA polymerase (F667Y)
Human genome sequencing project
Automated dideoxy sequencing was used
for the first human genome sequencing
project (2001)
Cost ~$100,000,000
Was a composite sequence derived from
the DNA from a number of individuals
DNA pyrosequencing
DNA sequencing by pyrophosphate release
Pyrosequencing reaction steps
DNA polymerase: (NA)n + nucleotide → (NA)n+1 + PPi
ATP sulfurylase: PPi + APS → ATP + SO4^2-
luciferase: ATP + luciferin + O2 → AMP + PPi + oxyluciferin + CO2 + light
1. DNA polymerase
Each nucleotide incorporation step is accompanied by the
release of one pyrophosphate
2. ATP sulfurylase
pyrophosphate (PPi) + adenosine 5´-phosphosulfate (APS) → adenosine 5´-triphosphate (ATP) + sulfate
Each pyrophosphate is used to produce one ATP
3. Luciferase
Each ATP is used to convert one molecule of luciferin to oxyluciferin
and is accompanied by the emission of light
Pyrosequencing reaction steps
[Figure: DNA polymerase + dNTP + APS/ATP sulfurylase/luciferase]
Is a sequencing by synthesis method
Uses normal dNTPs
Original pyrosequencing reaction scheme
The four different dNTPs were added sequentially to a primer/template
Nucleotide incorporation was detected by the sulfurylase and luciferase reactions, as a flash of light
Signal was proportional to the number of nucleotides that were incorporated
Unincorporated dNTPs were washed away and the cycle was repeated
[Figure labels: template, primer, DNA polymerase, extended primer]
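The sequential-addition scheme can be simulated in a few lines. This sketch assumes an idealized reaction in which the signal is exactly the number of nucleotides added in each flow. The flow order and the function name `pyrogram` are choices made for this example, not taken from the original paper.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def pyrogram(template_3to5: str, flow_order: str = "ACGT", cycles: int = 4) -> list[tuple[str, int]]:
    """Return (dNTP, signal) per flow; signal = number of bases added in that flow."""
    pos, flows = 0, []
    for _ in range(cycles):
        for dntp in flow_order:
            added = 0
            # The polymerase keeps adding this dNTP while it pairs with the template.
            while pos < len(template_3to5) and PAIR[template_3to5[pos]] == dntp:
                pos += 1
                added += 1
            flows.append((dntp, added))  # leftover dNTP is washed away before the next flow
    return flows

result = pyrogram("TGGATC", cycles=2)
print(result)
# [('A', 1), ('C', 2), ('G', 0), ('T', 1), ('A', 1), ('C', 0), ('G', 1), ('T', 0)]
```

The `('C', 2)` entry shows how a run of two identical template bases gives a double-height signal in a single flow.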
Enzymatic Luminometric Inorganic pyrophosphate Detection Assay
Ronaghi et al. “Real-time DNA sequencing using detection of pyrophosphate release” Anal. Biochem. 242, 84 (1996)
Demonstration of the pyrosequencing method
[Figure labels: primer, template, dNTP addition, sequence]
This experiment was carried out with Klenow (exo-) polymerase
Also tried Klenow polymerase and Sequenase 2.0
Automated DNA pyrosequencing
Qiagen DNA pyrosequencer DNA pyrogram
Uses Klenow DNA polymerase
454 DNA sequencing
The first next-generation DNA sequencing technology
Introduced in 2005
Features the use of a solid-phase version of the
pyrosequencing method
Adapted so that many sequencing reactions could
be carried out in parallel
Used for genomic sequencing
454 sequencing
Genomic DNA is fragmented
DNA fragments are ligated to
adapters which contain specific
PCR primer binding sequences
Ligated fragments are then
separated into single strands
Rothberg and Leamon, “The development and impact of 454 sequencing”, Nature Biotech. 26, 1117 (2008)
454 sequencing
Fragments are annealed to primers
that are attached to beads - one
fragment per bead
Individual beads are isolated in the droplets of a PCR reaction mixture/oil emulsion
PCR amplification occurs within each droplet (microreactor)
One strand of the amplified DNA will be connected to the primer attached to the bead
[Figure labels: primer, DNA, PCR amplification]
Clonal DNA amplification
454 sequencing
Emulsion is broken
DNA strands are denatured
Beads carrying the covalently
attached single-stranded DNA are
deposited into wells of a fiber-
optic slide
One bead per well
454 sequencing
Smaller beads containing
immobilized pyrosequencing
enzymes are added to each
well:
ATP sulfurylase and luciferase are covalently attached to the smaller beads
454 sequencing
Add: Sequencing primer
DNA polymerase
Four dNTPs
Pyrosequencing reactions begin
454 sequencing
The four dNTPs are flowed sequentially
across the plate
Sequencing is monitored electronically as
flashes of light from the individual wells
when a nucleotide is incorporated
Many reactions can be carried out
in parallel
454 sequencing
Can sequence ~250 bases per well
Have 1,600,000 wells per plate
Can sequence up to 400,000,000 bases per plate (8 hrs)
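The per-plate figure follows directly from the per-well read length and the well count. A quick check, assuming every well gives a full-length read (which is why the slide says "up to"):

```python
bases_per_well = 250
wells_per_plate = 1_600_000
run_hours = 8

bases_per_plate = bases_per_well * wells_per_plate
print(f"{bases_per_plate:,} bases per plate")                # 400,000,000 bases per plate
print(f"{bases_per_plate / run_hours:,.0f} bases per hour")  # 50,000,000 bases per hour
```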
First individual human genome sequence
454 sequencing was used for the first
individual human genome sequence
project (2008)
Cost < $1,000,000
Genome sequence: James Watson
Reversible terminator sequencing
Basis for the Illumina DNA sequencing method
Reversible terminators
Bentley et al. “Accurate whole human genome sequencing using reversible terminator chemistry” Nature 456, 53 (2008)
Reversible terminator chemistry
Add all four 3´-O-azidomethyl dNTPs, each labeled with a different fluorophore
Add DNA polymerase
Determine identity of incorporated nucleotide by fluorescence (only one nucleotide is added at a time)
Remove fluorophore and 3´-O-azidomethyl blocking group with tris(2-carboxyethyl)phosphine (TCEP)
Regenerates a free 3´-hydroxyl group for another cycle of nucleotide addition
[Figure labels: DNA-OH, polymerase, linker scar]
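The cycle can be written out as a loop in which exactly one base is added, read, and deblocked per round. This is a schematic sketch, not a model of the chemistry: the function name and the boolean flag standing in for the 3´ blocking group are invented for illustration.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def sequence_by_reversible_termination(template_3to5: str, cycles: int) -> str:
    """Add, read, and deblock one labeled terminator per cycle."""
    read = []
    three_prime_blocked = False
    for pos in range(min(cycles, len(template_3to5))):
        assert not three_prime_blocked     # a free 3'-OH is needed to extend
        base = PAIR[template_3to5[pos]]    # polymerase adds exactly one terminator
        three_prime_blocked = True         # 3'-O-azidomethyl group stops further addition
        read.append(base)                  # fluorescence identifies the base
        three_prime_blocked = False        # TCEP removes the fluorophore and blocking group
    return "".join(read)

print(sequence_by_reversible_termination("TGGATC", cycles=4))  # ACCT
```

Because only one nucleotide is added per cycle, a run of identical bases (the two Cs here) is read over two separate cycles rather than as one stronger signal.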
Bridge amplification
[Figure: DNA → fragment → ligate → PCR]
Genomic DNA fragments are generated by random shearing
Fragments are ligated to a pair of forked adaptor oligonucleotides
Ligated products are amplified by PCR (using PCR primers that
are complementary to the adaptor sequences)
Forms double-stranded blunt-ended products with a different
adaptor sequence on each end
Bridge amplification
[Figure: denature → anneal]
DNA fragments are denatured
Separated single strands are annealed to complementary PCR
primers that are covalently attached to a flow cell surface
Bridge amplification
[Figure: copy → denature]
A new strand is copied from the original strand in an extension
reaction (Bst I polymerase) that is primed from the 3´-end of
the surface-bound PCR primer
The original strand is then removed by denaturation; the copy
strand remains covalently attached to the surface-bound PCR
primer
Bridge amplification
[Figure: 1. bridge, 2. copy → denature → repeat]
The adaptor sequence at the 3´-end of the copy strand can anneal to
a second surface-bound PCR primer
The bridging strand serves as a template for an extension reaction
that is primed from the 3´-end of the second PCR primer
Multiple cycles of annealing, extension, and denaturation result in
the growth of clonal DNA clusters
Clonal single molecule arrays
Can generate ~1000 copies of the DNA segment per cluster
And can have 100,000,000 clusters per cm²
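As a back-of-the-envelope check on the ~1000 copies per cluster, the sketch below counts how many amplification cycles would be needed if each cycle multiplied the strand count by a fixed factor. Perfect doubling and the 50% efficiency value are assumptions for illustration only. The actual number of cycles used is not given here.

```python
def cycles_needed(target_copies: int, efficiency: float = 1.0) -> int:
    """Cycles to reach target_copies from one strand, multiplying by (1 + efficiency) each cycle."""
    copies, cycles = 1.0, 0
    while copies < target_copies:
        copies *= 1 + efficiency
        cycles += 1
    return cycles

print(cycles_needed(1000))       # 10 (ideal doubling: 2**10 = 1024)
print(cycles_needed(1000, 0.5))  # 18 (hypothetical 50% per-cycle efficiency)
```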
Reversible terminator sequencing
[Figure: linearize → denature → sequence]
The DNA in each cluster is linearized by cleavage within one adaptor
sequence and then denatured
Generates a unique single-stranded template which can then be
used for reversible terminator sequencing
Illumina DNA sequencing
Can only sequence ~35 bases per
reaction
But can monitor millions of reactions
simultaneously (in parallel)
Illumina DNA sequencing
Illumina DNA sequencer 9° N polymerase
Reengineered the active site of 9° N DNA polymerase to improve the
efficiency of incorporation of the reversible terminator dNTPs
Ion Torrent sequencing
[Figure: DNA polymerase, H+ released]
Each nucleotide incorporation step is accompanied by
the release of a proton (H+)
Ion Torrent sequencing
In both 454 and Ion Torrent sequencing:
DNA fragments are immobilized and amplified
on beads, which are then placed in the wells
of a chip
The four dNTPs are then flowed sequentially
over the wells of the chip
In 454: nucleotide incorporation is detected
by the release of PPi (which triggers a series of
enzymatic reactions that results in the emission
of light)
In Ion Torrent: nucleotide incorporation is
detected by the release of H+ (which results
in a decrease in the pH of the reaction solution
in the well)
Churko et al. “Overview of high throughput sequencing technologies” Circ. Res. 112, 1613 (2013)
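Because both methods flow one dNTP at a time, the raw data in either case is a list of per-flow signals (light for 454, pH change for Ion Torrent) that has to be turned back into bases. A minimal sketch, assuming signals are already normalized so that 1.0 means one incorporation; the flow order and signal values are made up for this example.

```python
def decode_flowgram(signals: list[float], flow_order: str = "TACG") -> str:
    """Round each flow's signal to a homopolymer length and emit that many bases."""
    read = []
    for i, signal in enumerate(signals):
        base = flow_order[i % len(flow_order)]  # which dNTP was flowed at this step
        read.append(base * round(signal))       # e.g. a signal near 2 means two bases
    return "".join(read)

# Made-up normalized signals for eight successive flows.
decoded = decode_flowgram([0.1, 1.0, 2.1, 0.0, 0.9, 1.1, 0.0, 1.0])
print(decoded)  # ACCTAG
```

Rounding is where errors in long runs of a single base come from in flow-based methods: a signal of 4.4 versus 4.6 changes the called length.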
Ion Torrent DNA sequencing
Ion Torrent DNA sequencer DNA ionogram
Ion Torrent website