L1: Design of oligonucleotides

To design oligonucleotides for your PCR you need a map of the DNA and protein sequences. You also need a codon usage table to select a codon used by Escherichia coli.

To introduce the mutation by PCR you need two oligos that both contain the mutation and are complementary to each other. A third, shorter oligo will be used in a PCR hybridization analysis.

This example shows one way to approach the oligo design problem. Suppose we want to mutate H89 to N. We can do this with a single base exchange, CAT to AAT. Other possibilities should also be considered, and where possible one should select a high-frequency codon.

wt
81 I  P  L  S  V  L  D  D  H  P  R  I  D  L  A  I  D  G  A  D        
  ATTCCGCTCTCCGTTCTCGATGATCATCCTCGAATTGACCTCGCCATTGATGGCGCCGAT


mutant
81 I  P  L  S  V  L  D  D  N  P  R  I  D  L  A  I  D  G  A  D       
  ATTCCGCTCTCCGTTCTCGATGATAATCCTCGAATTGACCTCGCCATTGATGGCGCCGAT
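The codon choice discussed above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the Asn frequencies are copied from the E. coli codon usage table further down this page. It lists each Asn codon with the number of base changes needed from the wild-type CAT codon.

```python
# Asn codon frequencies (%) copied from the codon usage table on this page,
# in the order (class I, class II, class III).
ASN_CODONS = {"AAT": (40.87, 17.25, 64.06), "AAC": (59.13, 82.75, 35.94)}

def base_changes(old_codon, new_codon):
    """Number of positions at which two codons differ."""
    return sum(a != b for a, b in zip(old_codon, new_codon))

def rank_codons(wt_codon, candidates):
    """Return (codon, changes, frequencies) tuples, fewest base changes first."""
    rows = [(codon, base_changes(wt_codon, codon), freqs)
            for codon, freqs in candidates.items()]
    return sorted(rows, key=lambda row: row[1])

options = rank_codons("CAT", ASN_CODONS)
for codon, changes, (c1, c2, c3) in options:
    print(f"CAT -> {codon}: {changes} base change(s); "
          f"class I {c1}%, class II {c2}%, class III {c3}%")
```

AAT needs one base change, while the more frequently used AAC needs two. This is the kind of trade-off to weigh when considering other possibilities.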

1. We need two complementary oligonucleotides with the mutation in the center. The oligos should be about 18 nt on each side of the mutation (max. length of oligo = 46 nt) and the GC content about 40-50% (optimally 50%). If possible, the first and last nucleotide of the oligos should be a G or a C. Our first mutation oligo FS.H89N1 thus has the following sequence:

   5' CTCTCCGTTCTCGATGATAATCCTCGAATTGACCTCG 3'
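As a rough check of step 1, the oligo can be rebuilt from the wild-type fragment shown above. The helper names below are hypothetical, and the sketch assumes a single-base substitution with 18 nt kept on each side.

```python
# Wild-type fragment from the example above (codons 81-100).
WT = "ATTCCGCTCTCCGTTCTCGATGATCATCCTCGAATTGACCTCGCCATTGATGGCGCCGAT"

def mutagenic_oligo(template, position, new_base, flank=18):
    """Replace the base at `position` (0-based) and keep `flank` nt on each side."""
    if position - flank < 0 or position + flank >= len(template):
        raise ValueError("template too short for the requested flank")
    mutated = template[:position] + new_base + template[position + 1:]
    return mutated[position - flank: position + flank + 1]

def gc_percent(seq):
    """GC content in percent."""
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)

# H89 is the 9th codon of the fragment, so its first base is at index 8 * 3 = 24.
oligo = mutagenic_oligo(WT, 24, "A")
print(oligo, len(oligo), round(gc_percent(oligo), 1))
print("G/C at both ends:", oligo[0] in "GC" and oligo[-1] in "GC")
```

The result is the 37 nt sequence of FS.H89N1, with a GC content of about 48.6% and a C and a G at its ends.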

2. The second oligonucleotide is the exact complement of FS.H89N1. To write the complementary oligo in the 5' to 3' direction it is convenient to use the tool at http://arbl.cvmbs.colostate.edu/molkit/ (on a PC, you might need to use http://www.basic.northwestern.edu/biotools/oligocalc.html instead). Click on "Manipulate Sequences". Paste the oligo FS.H89N1 into the window and select "Inverse Complement". This will give you the sequence of mutation oligo FS.H89N2.

   5' CGAGGTCAATTCGAGGATTATCATCGAGAACGGAGAG 3'
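If the web tools are unavailable, the inverse complement can also be computed locally. The following is a minimal sketch in plain Python; the function name is hypothetical and only the four standard bases are handled.

```python
# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Complement every base, then reverse, so the result reads 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

fs_h89n1 = "CTCTCCGTTCTCGATGATAATCCTCGAATTGACCTCG"
fs_h89n2 = reverse_complement(fs_h89n1)
print("5'", fs_h89n2, "3'")
```

Applying the function twice returns the original oligo, which is a quick way to catch copy-and-paste errors.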

3. The melting temperature (Tm) of your mutation oligos is calculated using the following formula:

Tm = 81.5 + 0.41(%GC) - 675/N - %mismatch             (Tm should be ~78 °C)

where N is the length (in nucleotides) of your oligo.

Calculate the Tm for at least one of your mutagenesis oligos using the formula above. Use this Excel file (or this link) to check that your numbers are correct and to calculate the Tm for the remaining mutagenesis oligos.
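The formula in step 3 is easy to script for a hand check. The sketch below assumes that %GC and %mismatch are entered as whole-number percentages (for example 48.6, not 0.486) and that %mismatch is the number of mismatched bases divided by the oligo length. The function name is hypothetical.

```python
def mutagenesis_tm(oligo, n_mismatch=1):
    """Tm = 81.5 + 0.41(%GC) - 675/N - %mismatch, with percentages as whole numbers."""
    n = len(oligo)
    gc = 100.0 * sum(base in "GC" for base in oligo.upper()) / n
    mismatch = 100.0 * n_mismatch / n  # assumption: mismatched bases / length
    return 81.5 + 0.41 * gc - 675.0 / n - mismatch

tm = mutagenesis_tm("CTCTCCGTTCTCGATGATAATCCTCGAATTGACCTCG")
print(f"Tm = {tm:.1f} C")
```

For FS.H89N1 (37 nt, 18 G or C, 1 mismatch) this gives about 80.5 °C under these assumptions.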

4. Finally, our analytical hybridization oligo (e.g. FS.H89N3) is:

   5' CTCTCCGTTCTCGATGATA 3'

This oligo will be used in an analytical PCR together with a second oligo (RpiA reverse or RpiA forward), which binds to the 3' or 5' end, respectively, of the rpiA gene. These two oligos will amplify the DNA only if the mutation was successful.
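The reason the analytical oligo discriminates between the two templates can be illustrated by comparing it base by base with the wild-type and mutant fragments from the example. The sketch uses a hypothetical helper and only counts exact mismatches; it does not model real annealing behaviour.

```python
WT = "ATTCCGCTCTCCGTTCTCGATGATCATCCTCGAATTGACCTCGCCATTGATGGCGCCGAT"
MUTANT = "ATTCCGCTCTCCGTTCTCGATGATAATCCTCGAATTGACCTCGCCATTGATGGCGCCGAT"
PRIMER = "CTCTCCGTTCTCGATGATA"  # FS.H89N3

def mismatch_positions(primer, template, start):
    """0-based primer positions that differ from the template site at `start`."""
    site = template[start:start + len(primer)]
    return [i for i, (p, t) in enumerate(zip(primer, site)) if p != t]

start = MUTANT.find(PRIMER)  # where the primer matches the mutant exactly
on_mutant = mismatch_positions(PRIMER, MUTANT, start)
on_wt = mismatch_positions(PRIMER, WT, start)
print("mismatches on mutant:", on_mutant)
print("mismatches on wild type:", on_wt)
```

On the mutant there are no mismatches. On the wild type the only mismatch is the last base of the primer (index 18), that is, its 3' end.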

5. Calculate the melting temperature (Tm) of your analytical primer using the primer design software available at http://www.cybergene.se/primerdesign/genewalker/genewalker11.html (alternatively, http://www.basic.northwestern.edu/biotools/oligocalc.html). Type your oligo sequence into the "Primer 1 sequence" box, click on "2:ary structure", and check the result on the right. The melting temperature should be ~58 °C. NB: make sure to click "Clear results" before you start analyzing the next oligo. You can use the same tool to check the secondary structure of your primer.

6. Send your oligos by e-mail to Sanjeewani (sanjee.soori@icm.uu.se) with the following information included:

               Mutation (e.g. D6A)
               Oligo id (what you want to call your oligo), e.g. FS.D6A1, FS.D6A2, FS.D6A3 (analytical)
               Tm


RpiA protein sequence


        M   G   S   H   H   H   H   H   H
   CCC ATG GGA TCT CAT CAT CAT CAT CAT CAT

1   G   V   L   T   Q   D   D   L   K   K   L   A   A   E   K  
   GGA GTC TTA ACT CAA GAC GAT CTC AAG AAA CTC GCC GCC GAA AAA

16  A   V   D   S   V   K   S   G   M   V   L   G   L   G   T
   GCC GTC GAC TCC GTC AAA TCC GGC ATG GTT CTC GGT CTC GGA ACC

31  G   S   T   A   A   F   A   V   S   R   I   G   E   L   L
   GGA AGT ACT GCC GCA TTT GCT GTC TCG CGA ATC GGC GAG CTT CTC

46  S   A   G   K   L   T   N   I   V   G   I   P   T   S   K
   TCT GCC GGA AAA CTG ACC AAC ATC GTT GGA ATT CCT ACC TCG AAG

61  R   T   A   E   Q   A   A   S   L   G   I   P   L   S   V
   CGG ACC GCA GAG CAG GCG GCG TCT CTT GGA ATT CCG CTC TCC GTT

76  L   D   D   H   P   R   I   D   L   A   I   D   G   A   D
   CTC GAT GAT CAT CCT CGA ATT GAC CTC GCC ATT GAT GGC GCC GAT

91  E   V   D   P   D   L   N   L   V   K   G   R   G   G   A
   GAG GTT GAT CCT GAT CTT AAT CTG GTT AAG GGG CGC GGT GGG GCG

106 L   L   R   E   K   M   V   E   A   A   S   D   K   F   I
   CTC TTG AGA GAA AAG ATG GTT GAA GCT GCT AGT GAT AAA TTT ATT

121 V   V   V   D   D   T   K   L   V   D   G   L   G   G   S
   GTT GTT GTT GAT GAT ACT AAG CTT GTT GAT GGT TTG GGT GGT AGT

136 R   L   A   M   P   V   E   V   V   Q   F   C   W   K   Y
   CGT CTT GCT ATG CCT GTT GAA GTT GTT CAA TTT TGC TGG AAA TAT

151 N   L   K   R   L   Q   E   I   F   K   E   L   G   C   E
   AAT CTC AAG AGA TTA CAG GAG ATC TTT AAG GAG CTG GGT TGT GAG

166 A   K   L   R   M   E   G   D   S   S   P   Y   V   T   D
   GCA AAA TTG AGA ATG GAA GGG GAT AGC AGT CCT TAT GTG ACT GAC

181 N   S   N   Y   I   V   D   L   Y   F   P   T   S   I   K
   AAC TCG AAT TAC ATC GTG GAT TTA TAC TTC CCG ACC TCG ATT AAG

196 D   A   E   A   A   G   R   E   I   S   A   L   E   G   V
   GAT GCT GAA GCT GCA GGG AGA GAA ATT TCG GCC TTG GAA GGC GTA

211 V   E   H   G   L   F   L   G   M   A   S   E   V   I   I
   GTA GAA CAT GGG TTG TTC TTG GGT ATG GCT AGC GAA GTC ATC ATT

226 A   G   K   T   G   V   S   V   K   T   K  -
   GCT GGG AAA ACT GGA GTT AGT GTG AAA ACC AAG TGA

rpiA gene sequence in FASTA format 

(if you are having trouble with the format of the FASTA sequence, get it from this link instead)

CCCATGGGATCTCATCATCATCATCATCATGGAGTCTTAACTCAAGACGATCTCAAGAAA
CTCGCCGCCGAAAAAGCCGTCGACTCCGTCAAATCCGGCATGGTTCTCGGTCTCGGAACC
GGAAGTACTGCCGCATTTGCTGTCTCGCGAATCGGCGAGCTTCTCTCTGCCGGAAAACTG
ACCAACATCGTTGGAATTCCTACCTCGAAGCGGACCGCAGAGCAGGCGGCGTCTCTTGGA
ATTCCGCTCTCCGTTCTCGATGATCATCCTCGAATTGACCTCGCCATTGATGGCGCCGAT
GAGGTTGATCCTGATCTTAATCTGGTTAAGGGGCGCGGTGGGGCGCTCTTGAGAGAAAAG
ATGGTTGAAGCTGCTAGTGATAAATTTATTGTTGTTGTTGATGATACTAAGCTTGTTGAT
GGTTTGGGTGGTAGTCGTCTTGCTATGCCTGTTGAAGTTGTTCAATTTTGCTGGAAATAT
AATCTCAAGAGATTACAGGAGATCTTTAAGGAGCTGGGTTGTGAGGCaAAATTGAGAATG
GAAGGGGATAGCAGTCCTTATGTGACTGACAACTCGAATTACATCGTGGATTTATACTTC
CCGACCTCGATTAAGGATGCTGAAGCTGCAGGGAGAGAAATTTCGGCCTTGGAAGGCGTA
GTAGAACATGGGTTGTTCTTGGGTATGGCTAGCGAAGTCATCATTGCTGGGAAAACTGGA
GTTAGTGTGAAAACCAAGTGA
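To confirm the reading frame after copying the FASTA sequence, a short stretch can be translated and compared with the protein sequence listed above. The sketch below builds the standard genetic code in plain Python and translates only the first FASTA line. The names are hypothetical, and translation starts at the first ATG.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(dna):
    """Translate complete codons from the start of `dna`; '*' marks a stop."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

first_line = "CCCATGGGATCTCATCATCATCATCATCATGGAGTCTTAACTCAAGACGATCTCAAGAAA"
start = first_line.find("ATG")  # skip the CCC that precedes the start codon
protein = translate(first_line[start:])
print(protein)
```

The output should begin with the His-tag (MGSHHHHHH), followed by the first residues of RpiA.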




RpiA reverse oligo sequence: 5'-CCAGCAATGATGACTTCGCTA-3'


RpiA forward oligo sequence: 5'-CTCAAGAAACTCGCCGCCGAA-3'
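The two RpiA oligos can be located in the gene with a simple string search, which also shows which strand each one matches. This is a sketch with hypothetical helper names. It looks for exact matches only, and the gene sequence is the FASTA sequence from this page.

```python
GENE = """
CCCATGGGATCTCATCATCATCATCATCATGGAGTCTTAACTCAAGACGATCTCAAGAAA
CTCGCCGCCGAAAAAGCCGTCGACTCCGTCAAATCCGGCATGGTTCTCGGTCTCGGAACC
GGAAGTACTGCCGCATTTGCTGTCTCGCGAATCGGCGAGCTTCTCTCTGCCGGAAAACTG
ACCAACATCGTTGGAATTCCTACCTCGAAGCGGACCGCAGAGCAGGCGGCGTCTCTTGGA
ATTCCGCTCTCCGTTCTCGATGATCATCCTCGAATTGACCTCGCCATTGATGGCGCCGAT
GAGGTTGATCCTGATCTTAATCTGGTTAAGGGGCGCGGTGGGGCGCTCTTGAGAGAAAAG
ATGGTTGAAGCTGCTAGTGATAAATTTATTGTTGTTGTTGATGATACTAAGCTTGTTGAT
GGTTTGGGTGGTAGTCGTCTTGCTATGCCTGTTGAAGTTGTTCAATTTTGCTGGAAATAT
AATCTCAAGAGATTACAGGAGATCTTTAAGGAGCTGGGTTGTGAGGCaAAATTGAGAATG
GAAGGGGATAGCAGTCCTTATGTGACTGACAACTCGAATTACATCGTGGATTTATACTTC
CCGACCTCGATTAAGGATGCTGAAGCTGCAGGGAGAGAAATTTCGGCCTTGGAAGGCGTA
GTAGAACATGGGTTGTTCTTGGGTATGGCTAGCGAAGTCATCATTGCTGGGAAAACTGGA
GTTAGTGTGAAAACCAAGTGA
""".replace("\n", "").upper()

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.upper().translate(COMPLEMENT)[::-1]

def binding_site(primer, gene):
    """Return (start, end, strand) for an exact match, or None if there is none."""
    start = gene.find(primer)
    if start != -1:
        return start, start + len(primer), "+"
    start = gene.find(reverse_complement(primer))
    if start != -1:
        return start, start + len(primer), "-"
    return None

forward_site = binding_site("CTCAAGAAACTCGCCGCCGAA", GENE)
reverse_site = binding_site("CCAGCAATGATGACTTCGCTA", GENE)
print("RpiA forward:", forward_site)
print("RpiA reverse:", reverse_site)
print("amplicon length:", reverse_site[1] - forward_site[0])
```

The forward oligo matches the sequence as written ("+"), while the reverse oligo matches its reverse complement ("-") near the 3' end of the gene.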


Codon usage in E. coli
from Hénaut and Danchin: Analysis and Predictions from Escherichia coli sequences. Escherichia coli and Salmonella, Vol. 2, Ch. 114:2047-2066, 1996, Neidhardt FC ed., ASM press, Washington, D.C.

 

Amino               Class                Amino               Class
Acid  Codon       I      II     III      Acid  Codon       I      II     III

Phe   ttt     55.09   29.08   67.14      Leu   ctt      9.70    5.56   19.00
      ttc     44.91   70.92   32.86            ctc     10.40    8.03    9.04
Leu   tta     10.99    3.44   20.09            cta      3.09    0.83    6.81
      ttg     13.02    5.47   15.05            ctg     52.79   76.67   29.99
Ser   tct     13.26   32.41   19.63      Pro   cct     13.71   11.23   28.30
      tcc     15.02   26.56   11.34            ccc     11.19    1.63   16.26
      tca     10.83    4.79   22.09            cca     18.63   15.25   31.50
      tcg     16.88    7.39   10.60            ccg     56.47   71.89   23.94
Tyr   tat     54.42   35.23   69.60      His   cat     56.80   29.77   61.69
      tac     45.58   64.77   30.40            cac     43.20   70.23   38.31
Stop  taa                                Gln   caa     33.40   18.65   37.06
      tag                                      cag     66.60   81.35   62.94
Cys   tgt     40.90   38.85   55.71      Arg   cgt     38.99   64.25   26.05
      tgc     59.10   61.15   44.29            cgc     42.23   32.97   21.94
Stop  tga                                      cga      5.52    1.07   12.80
Trp   tgg    100.00  100.00  100.00            cgg      8.97    0.80   13.62
Ile   att     51.20   33.49   47.57      Val   gtt     23.74   39.77   34.33
      atc     44.37   65.94   26.65            gtc     22.48   13.45   18.95
      ata      4.43    0.57   25.78            gta     14.86   19.97   21.78
Met   atg    100.00  100.00  100.00            gtg     38.92   26.81   24.94
Thr   act     14.85   29.08   26.83      Ala   gct     14.52   27.54   22.86
      acc     46.83   53.60   24.45            gcc     27.62   16.14   23.67
      aca     10.52    4.67   27.93            gca     19.63   24.01   31.27
      acg     27.81   12.65   20.80            gcg     38.23   32.30   22.19
Asn   aat     40.87   17.25   64.06      Asp   gat     62.83   46.05   70.47
      aac     59.13   82.75   35.94            gac     37.17   53.95   29.53
Lys   aaa     75.44   78.55   72.21      Glu   gaa     68.33   75.35   66.25
      aag     24.56   21.45   27.79            gag     31.67   24.65   33.75
Ser   agt     13.96    4.52   18.73      Gly   ggt     32.91   50.84   31.79
      agc     30.04   24.33   17.61            ggc     43.17   42.83   24.51
Arg   aga      1.75    0.62   15.63            gga      9.19    1.97   24.75
      agg      1.54    0.29    9.96            ggg     14.74    4.36   18.95

Genes are clustered into three classes by factorial correspondence analysis. Class I contains genes involved in most metabolic processes. Class II genes are highly and continuously expressed during exponential growth. Class III genes are implicated in horizontal transfer of DNA. The distribution of codons in class III genes is more or less even, whereas it is extremely biased in class II genes (in particular, codons ending in A are selected against).
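The difference in bias between the classes can be seen by looking at a single amino acid. The sketch below uses the six Leu codons with frequencies copied from the table above and reports the most used codon in each class. The helper name is hypothetical.

```python
# Leu codon frequencies (%) from the table above: (class I, class II, class III).
LEU = {
    "tta": (10.99, 3.44, 20.09),
    "ttg": (13.02, 5.47, 15.05),
    "ctt": (9.70, 5.56, 19.00),
    "ctc": (10.40, 8.03, 9.04),
    "cta": (3.09, 0.83, 6.81),
    "ctg": (52.79, 76.67, 29.99),
}

def most_used(codons, class_index):
    """Return (codon, frequency) of the most frequent codon in one class."""
    codon = max(codons, key=lambda c: codons[c][class_index])
    return codon, codons[codon][class_index]

top = {name: most_used(LEU, idx) for idx, name in enumerate(("I", "II", "III"))}
for name, (codon, freq) in top.items():
    print(f"class {name}: {codon} {freq}%")
```

In all three classes ctg is the most used Leu codon, but it accounts for 76.67% of Leu codons in class II and only 29.99% in class III.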


Last modified: S. Mowbray, 8 September, 2023.