Amino Acids

From Proteopedia

(Difference between revisions)
(Unusual Amino Acids)
Current revision (14:37, 11 May 2025)
(22 Standard Amino Acids and Mnemonics)
 
(106 intermediate revisions not shown.)
Line 1: Line 1:
For a general introduction to ''amino acids'', please see [http://en.wikipedia.org/wiki/Amino_acids Amino Acids in Wikipedia].
For a general introduction to ''amino acids'', please see [http://en.wikipedia.org/wiki/Amino_acids Amino Acids in Wikipedia].
-
==20 Standard Amino Acids and Mnemonics==
+
==22 Standard Amino Acids and Mnemonics==
-
Here are the names of the twenty standard amino acids, with their three and one-letter abbreviations. Mnemonic names are intended to help you to remember the one-letter codes, but are not the correct names.
+
Here are the names of the twenty-two standard amino acids, with their three and one-letter<ref>A one-letter notation for amino acid sequences (definitive rules). IUPAC—IUB Commission on Biochemical Nomenclature, pages 639-645, International Union of Pure and Applied Chemistry and International Union of Biochemistry, Butterworths, London, 1971. [https://old.iupac.org/publications/pac/1972/pdf/3104x0639.pdf Full Text].</ref> abbreviations. Mnemonic names are intended to help you to remember the one-letter codes, but are not the correct names.
 +
<center>
<table border=1 cellpadding=5>
<table border=1 cellpadding=5>
<tr>
<tr>
Line 14: Line 15:
Asn N Asparagi<b>n</b>e
Asn N Asparagi<b>n</b>e
<br>
<br>
-
Asp D Aspartic acid <br>&nbsp;(mnemonic:&nbsp;aspar<b>D</b>ic)
+
Asp D Aspartic acid <br>&nbsp;&nbsp;(mnemonic:&nbsp;aspar<b>D</b>ic)
<br>
<br>
Cys C <b>C</b>ysteine
Cys C <b>C</b>ysteine
</big></tt></td><td><tt><big>
</big></tt></td><td><tt><big>
-
Gln Q Glutamine<br>&nbsp;(mnemonic:&nbsp;<b>Q</b>uetamine)
+
Lys K Lysine<br>&nbsp;&nbsp;(mnemonic:&nbsp;li<b>K</b>esine)
<br>
<br>
-
Glu E Glutamic acid<br>&nbsp;(mnemonic:&nbsp;glu<b>E</b>tamic)
+
Met M <b>M</b>ethionine
<br>
<br>
-
Gly G <b>G</b>lycine
+
Phe F Phenylalanine<br>&nbsp;&nbsp;(mnemonic:&nbsp;<b>F</b>enylalanine)
<br>
<br>
-
His H <b>H</b>istidine
+
Pro P <b>P</b>roline
<br>
<br>
-
Ile I <b>I</b>soleucine
+
Pyl O Pyrr<b>o</b>lysine*
-
</big></tt></td><td><tt><big>
+
 
-
Leu L <b>L</b>eucine
+
</big></tt></td><td rowspan="2"><tt><big>
 +
&nbsp;
 +
&nbsp;
 +
&nbsp;
 +
mnemonic:
 +
<br>A Ala
 +
<br>B Asx&dagger;
 +
<br>C Cys
 +
<br>D Asp aspar<b>D</b>ic
 +
<br>E Glu glu<b>E</b>tamic
 +
<br>F Phe <b>F</b>enylalanine
 +
<br>G Gly
 +
<br>H His
 +
<br>I Ile
 +
<br><font color="#808080">J</font>
 +
<br>K Lys li<b>K</b>esine
 +
<br>L Leu
 +
<br>M Met
 +
 
 +
</big></tt></td><td rowspan="2"><tt><big>
 +
 
 +
&nbsp;
 +
&nbsp;
 +
&nbsp;
 +
mnemonic:
 +
<br>N Asn asparagi<b>N</b>e
 +
<br>O Pyl*
 +
<br>P Pro
 +
<br>Q Gln <b>Q</b>uetamine
 +
<br>R Arg a<b>R</b>ginine
 +
<br>S Ser
 +
<br>T Thr
 +
<br>U Sec*
 +
<br>V Val
 +
<br>W Trp t<b>W</b>yptophan
 +
<br>X Unk&dagger;
 +
<br>Y Tyr t<b>Y</b>rosine
 +
<br>Z Glx&dagger;
 +
 
 +
</big></tt></td></tr><tr><td><tt><big>
 +
Gln Q Glutamine<br>&nbsp;&nbsp;(mnemonic:&nbsp;<b>Q</b>uetamine)
<br>
<br>
-
Lys K Lysine<br>&nbsp;(mnemonic:&nbsp;li<b>K</b>esine)
+
Glu E Glutamic acid<br>&nbsp;&nbsp;(mnemonic:&nbsp;glu<b>E</b>tamic)
<br>
<br>
-
Met M <b>M</b>ethionine
+
Gly G <b>G</b>lycine
<br>
<br>
-
Phe F Phenylalanine<br>&nbsp;(mnemonic:&nbsp;<b>F</b>enylalanine)
+
His H <b>H</b>istidine
<br>
<br>
-
Pro P <b>P</b>roline
+
Ile I <b>I</b>soleucine
 +
<br>
 +
Leu L <b>L</b>eucine
 +
 
</big></tt></td><td><tt><big>
</big></tt></td><td><tt><big>
 +
Sec U Selenocysteine*<br>&nbsp;&nbsp;(mnemonic:&nbsp;seleni<b>U</b>m)
 +
<br>
Ser S <b>S</b>erine
Ser S <b>S</b>erine
<br>
<br>
Thr T <b>T</b>hreonine
Thr T <b>T</b>hreonine
<br>
<br>
-
Trp W Tryptophan<br>&nbsp;(mnemonic:&nbsp;t<b>W</b>yptophan)
+
Trp W Tryptophan<br>&nbsp;&nbsp;(mnemonic:&nbsp;t<b>W</b>yptophan)
<br>
<br>
Tyr Y T<b>y</b>rosine
Tyr Y T<b>y</b>rosine
Line 50: Line 96:
</big></tt></td>
</big></tt></td>
</tr>
</tr>
 +
<tr><td colspan="4"><center>
 +
<nowiki>*</nowiki>From the mid-20th century, 20 amino acids were considered standard. Since 2014,
 +
genetically encoded
 +
[[#Unusual_Amino_Acids|Selenocysteine and Pyrrolysine]] have been included as standard by the World Wide Protein Data Bank<ref name="pdb22" />.
 +
<br>
 +
&dagger; Asx: Asn or Asp; Glx: Gln or Glu; Unk: unknown.
 +
<br>
 +
[http://old.iupac.org/publications/pac/1972/pdf/3104x0639.pdf IUPAC/IUB 1971 publication] defining one-letter codes.
 +
</center></td></tr>
</table>
</table>
 +
 +
Click on the image below to see it full size with details on 21 amino acids.
 +
 +
{| align="center"
 +
|-
 +
|
 +
<imagemap>
 +
Image:1000px-Amino Acids.svg.png|200 px|
 +
default [http://proteopedia.org/wiki/images/1/13/1000px-Amino_Acids.svg.png]
 +
</imagemap>
 +
|}
 +
</center>
 +
There is also a [http://pdb101.rcsb.org/learn/guide-to-understanding-pdb-data/primary-sequences-and-the-pdb-format list of standard amino acids] at the Protein Data Bank.
 +
{{Clear}}
 +
 +
==L- and D-Amino Acids==
 +
 +
<table border='1' cellpadding='3' align='right' hspace="8" class="wikitable" style="margin:0px 0px 0px 8px;"><tr><td colspan='2'>
 +
PDB Nomenclature
 +
</td></tr><tr><td>
 +
L-form
 +
</td><td>
 +
D-form
 +
</td></tr><tr><td>
 +
ALA<br>ARG<br>ASN<br>ASP<br>CYS<br>GLN<br>GLU<br>HIS<br>ILE<br>LEU<br>LYS<br>MET<br>
 +
PHE<br>PRO<br>PYL<br>SEC<br>SER<br>THR<br>TRP<br>TYR<br>VAL
 +
</td><td>
 +
DAL<br>DAR<br>DSG<br>DAS<br>DCY<br>DGN<br>DGL<br>DHI<br>DIL<br>DLE<br>DLY<br>MED<br>
 +
DPN<br>DPR<br>?<ref name="ddd">In January, 2022, no D-selenocysteine nor D-pyrrolysine were present in the [[wwPDB]], according to David Armstrong of PDBe.</ref><br>?<ref name="ddd" /><br>DSN<br>DTH<br>DTR<br>DTY<br>DVA
 +
</td></tr></table>
 +
 +
The asymmetric &alpha;-carbon in 21 of the standard amino acids (all except glycine) is a chiral (stereogenic) center<ref name='chirality' />. Thus, each of these amino acids can exist as either of two optical isomers, or enantiomers (mirror images), designated the L-form and D-form. Nearly all naturally occurring amino acids are L-amino acids, so named because they are analogous to the L-form of glyceraldehyde<ref name='chirality'>See [http://en.wikipedia.org/wiki/Chirality_(chemistry) Chirality in Wikipedia].</ref>. (Not all L-amino acids are levorotatory; some rotate polarized light clockwise<ref name='chirality' />.)
 +
 +
More than 700 entries in the [[PDB]] contain one or more D-amino acids. Although D-amino acids are rare in nature, they do occur (see [http://en.wikipedia.org/wiki/Chirality_(chemistry)#D-Amino_Acid_Natural_Abundance natural abundance in Wikipedia]). D-alanine (DAL) occurs in over 100 entries in the [[PDB]]; D-leucine (DLE), D-phenylalanine (DPN), D-serine (DSN), and D-valine (DVA) each occur in over 50. Some of these entries describe chemically synthesized molecules rather than naturally occurring ones. Gramicidin is a naturally occurring antibiotic containing several D-amino acids, e.g. [[1nrm]]. [[4glu]] is the 102-amino-acid chain of VEGF-A synthesized entirely from D-amino acids.
==Unusual Amino Acids==
==Unusual Amino Acids==
-
Post-translational modifications of amino acids include phosphorylation and nitrosylation, e.g. to produce [[nitrotyrosine]]. However, there are at least two non-standard amino acids that are [http://en.wikipedia.org/wiki/Genetic_code genetically encoded], discussed below.
+
There are at least two historically &quot;non-standard&quot; amino acids now recognized to be [http://en.wikipedia.org/wiki/Genetic_code genetically encoded], discussed below. In 2014, the [[PDB]] began including these two to make 22 standard amino acids<ref name="pdb22">[https://www.wwpdb.org/news/news?year=2014#5764490799cccf749a90cddf Announcement: Standardization of Amino Acid Nomenclature], World Wide Protein Data Bank News, January 8, 2014.</ref>.
===Selenocysteine: the 21st amino acid===
===Selenocysteine: the 21st amino acid===
-
Rare proteins in all domains of life include '''[[Selenocysteine|selenocysteine (Sec, U)]]'''. Over a dozen entries in the [[PDB]] include selenocysteine, identified as CSE.
+
Rare proteins in all domains of life include '''[[Selenocysteine|selenocysteine (Sec, U)]]''', designated the 21st amino acid<ref name='21st'>PMID: 11028985</ref>. As of 2024, over 90 entries in the [[PDB]] included selenocysteine, identified as SEC.
===Pyrrolysine: the 22nd amino acid===
===Pyrrolysine: the 22nd amino acid===
-
Some proteins in methanogenic archaea include '''[[Pyrrolysine|pyrrolysine (Pyl)]]'''. In July 2009, it appears that no entries in the [[PDB]] include pyrrolysine. ''Methanosarcina barkeri'' monomethylamine methyltransferase (MtmB, [[1nth]]) was the first identified structure containing this amino acid, and in the crystal structure it is identified as BGX, with the rest of the amnio acid identified as LYS202<ref>PMID: 12029132</ref>.
+
'''[[Pyrrolysine|pyrrolysine (Pyl, O)]]'''<ref>See [http://en.wikipedia.org/wiki/Pyrrolysine Pyrrolysine in Wikipedia].</ref> is a genetically encoded amino acid that occurs naturally in methanogenic archaea, now designated the 22nd amino acid<ref>PMID: 15380192</ref>. It occurs in [https://www.rcsb.org/search?request=%7B%22query%22%3A%7B%22type%22%3A%22group%22%2C%22logical_operator%22%3A%22and%22%2C%22nodes%22%3A%5B%7B%22type%22%3A%22group%22%2C%22logical_operator%22%3A%22and%22%2C%22nodes%22%3A%5B%7B%22type%22%3A%22group%22%2C%22nodes%22%3A%5B%7B%22type%22%3A%22terminal%22%2C%22service%22%3A%22text_chem%22%2C%22parameters%22%3A%7B%22attribute%22%3A%22rcsb_chem_comp_container_identifiers.comp_id%22%2C%22operator%22%3A%22in%22%2C%22negation%22%3Afalse%2C%22value%22%3A%5B%22PYL%22%5D%7D%7D%5D%2C%22logical_operator%22%3A%22and%22%7D%5D%2C%22label%22%3A%22text_chem%22%7D%5D%7D%2C%22return_type%22%3A%22entry%22%2C%22request_options%22%3A%7B%22paginate%22%3A%7B%22start%22%3A0%2C%22rows%22%3A25%7D%2C%22results_content_type%22%3A%5B%22experimental%22%5D%2C%22sort%22%3A%5B%7B%22sort_by%22%3A%22score%22%2C%22direction%22%3A%22desc%22%7D%5D%2C%22scoring_strategy%22%3A%22combined%22%7D%2C%22request_info%22%3A%7B%22query_id%22%3A%22354d8ac467aafef9f60bd804ac85c1b3%22%7D%7D several entries] in the PDB identified as PYL.
 +
See [[Pyrrolysine|more details]].
 +
<!--
 +
Editor's note: Contributions here by [[User:Andrea Gorrell]] were moved to the page on [[Pyrrolysine]].
 +
-->
===No 23rd amino acid?===
===No 23rd amino acid?===
Line 68: Line 161:
The 21st and 22nd amino acids are specified by the [http://en.wikipedia.org/wiki/Genetic_code stop codons UGA and UAG] respectively, modified with downstream stem-loop structures in the mRNA.
The 21st and 22nd amino acids are specified by the [http://en.wikipedia.org/wiki/Genetic_code stop codons UGA and UAG] respectively, modified with downstream stem-loop structures in the mRNA.
Lobanov ''et al.''<ref name='23rd'>PMID: 16713651</ref> searched "16 archaeal and 130 bacterial genomes for tRNAs with anticodons corresponding to the three stop signals". Their data suggest that "the occurrence of additional amino acids that are widely distributed and genetically encoded is unlikely."
Lobanov ''et al.''<ref name='23rd'>PMID: 16713651</ref> searched "16 archaeal and 130 bacterial genomes for tRNAs with anticodons corresponding to the three stop signals". Their data suggest that "the occurrence of additional amino acids that are widely distributed and genetically encoded is unlikely."
 +
 +
===Selenomethionine===
 +
Please see [[Selenomethionine]].
 +
 +
===Post-translational modifications===
 +
 +
Many unusual amino acids are formed by enzyme-catalyzed reactions that modify a genetically encoded standard amino acid after it has been included in a polypeptide; these are called post-translational modifications. Among them:
 +
* phosphorylation of serine, threonine, or tyrosine
 +
** Example: C kinase [[1bkx]]
 +
* hydroxylation of proline (to yield hydroxyproline, e.g. in [[collagen]])
 +
* acetylation of lysine or the N terminus
 +
** Example of N-terminal acetylation: [[4mdh]]
 +
* methylation of lysine
 +
* carboxylation of aspartate or glutamate
 +
* nitration, e.g. of tyrosine to produce [[nitrotyrosine]]
 +
* acylation with a fatty acid (see N-terminal myristoylation of [[Recoverin]])
 +
* prenylation
 +
* glycosylation of serine, threonine or asparagine (yielding glycoproteins)
 +
** Example of N-linked glycosylation: [[1igy]]
==Structure, Properties, Behaviors in Proteins==
==Structure, Properties, Behaviors in Proteins==
-
Angel Herráez has provided a [http://biomodel.uah.es/en/model3/index.htm tutorial introduction to amino acid structure] in which each of the 20 amino acids may be visualized in 3D using Jmol.
+
Angel Herráez has provided a [http://biomodel.uah.es/en/model1/prot/aa-intro.htm tutorial introduction to amino acid structure] in which 20 amino acids may be visualized in 3D using JSmol.
-
The list below will gradually be expanded to include all 20 amino acids.
+
Please help to expand the list below to include all 22 amino acids.
 +
 
 +
===Glycine===
 +
Gly G: Small.
 +
 
 +
See also [[Glycine|Glycine in Proteopedia]] and [http://en.wikipedia.org/wiki/Glycine Glycine] in Wikipedia, where the structure is shown. Glycine is the smallest amino acid, since its side-chain is nothing but a single hydrogen. Glycine is frequently found in [[Turns in Proteins|turns]], since the steric clashes of larger amino acids would be problematic. Turns typically occur on the surfaces of proteins, between [[Secondary structure|beta-strands or alpha-helices]]. It is common to find '''[[Conservation, Evolutionary|highly conserved]] glycines in turns on protein surfaces''', since mutation to a bulkier residue would interfere with the fold. Glycine also occurs in [[Secondary structure|beta-strands and alpha-helices]], although its frequency in alpha helices is low (second lowest, after proline, p. 125 in <ref name="kesselbental">Kessel, Amit, and Nir Ben-Tal. ''Introduction to Proteins: Structure, Function, and Motion''. CRC Press, 2011. 653 pages.</ref>). Glycine is commonly found at the C-terminus of alpha helices, and is considered a ''helix terminator'' (p. 125 in <ref name="kesselbental" />).
===Histidine===
===Histidine===
Line 80: Line 197:
See [http://en.wikipedia.org/wiki/Histidine Histidine] in Wikipedia, where the structure is shown.
See [http://en.wikipedia.org/wiki/Histidine Histidine] in Wikipedia, where the structure is shown.
-
The sidechain of His can be positively charged (protonated), in which case both of the nitrogens in the sidechain imidazole ring have hydrogens, and the charge is delocalized between them. The pKa for protonation is 6.1. This means that, on average at any moment, half of the His sidechains are protonated when the pH is 6.1. At the pH of blood, 7.4 ±0.05, less than 10% of the His sidechains are positively charged. Therefore, His is not included in the usual list of positively charged amino acids ([[#Lysine|lysine]] and [[#Arginine|arginine]]).
+
The sidechain of His can be positively charged (protonated), in which case both of the nitrogens in the sidechain imidazole ring have hydrogens, and the charge is delocalized between them. The pKa for protonation is 6.1. This means that, on average at any moment, half of the His sidechains are protonated when the pH is 6.1. At the pH of blood, 7.4 ±0.05, His sidechains are positively charged less than 10% of the time. Therefore, His is not included in the usual list of positively charged amino acids ([[#Lysine|lysine]] and [[#Arginine|arginine]]).
 +
 
 +
===Phenylalanine===
 +
Phe F:
 +
Neutral, Aromatic, Bulky
 +
 
 +
See [http://en.wikipedia.org/wiki/Phenylalanine Phenylalanine] in Wikipedia, where the structure is shown. Phe participates in [[Cation-pi interactions]]. [http://en.wikipedia.org/wiki/Phenylketonuria Phenylketonuria] is a genetic disease in which the enzyme that converts phenylalanine to [[#Tyrosine|tyrosine]] is nonfunctional, leading to toxicity from excess phenylalanine, and tyrosine deficiency.
===Tryptophan===
===Tryptophan===
Line 87: Line 210:
See [http://en.wikipedia.org/wiki/Tryptophan Tryptophan] in Wikipedia, where the structure is shown.
See [http://en.wikipedia.org/wiki/Tryptophan Tryptophan] in Wikipedia, where the structure is shown.
-
In trans-membrane proteins, tryptophans often lie at the interface between water and lipid. An example is the potassium channel, e.g. [[1bl8]]. To see their positions, use the ''Find'' dialog in FirstGlance in Jmol.
+
In trans-membrane proteins, tryptophans often lie at the interface between water and lipid. An example is the potassium channel, e.g. [[1bl8]]. To see their positions, use the ''Find'' dialog in FirstGlance in Jmol. Trp participates in [[Cation-pi interactions]].
===Tyrosine===
===Tyrosine===
Line 93: Line 216:
Neutral, Polar, Aromatic, Bulky
Neutral, Polar, Aromatic, Bulky
-
See [http://en.wikipedia.org/wiki/Tyrosine Tyrosine] in Wikipedia, where the structure is shown.
+
See [http://en.wikipedia.org/wiki/Tyrosine Tyrosine] in Wikipedia, where the structure is shown, and [[Nitrotyrosine]]. Tyr participates in [[Cation-pi interactions]].
==See Also==
==See Also==
*[[Non-Standard Residues]].
*[[Non-Standard Residues]].
-
* [[Standard Residues]]
+
*[[Standard Residues]]
 +
*[http://en.wikipedia.org/wiki/Amino_acids Amino Acids in Wikipedia]
 +
*[[Basics of Protein Structure]]
==References==
==References==
<references />
<references />

Current revision

For a general introduction to amino acids, please see Amino Acids in Wikipedia.

22 Standard Amino Acids and Mnemonics

Here are the names of the twenty-two standard amino acids, with their three-letter and one-letter[1] abbreviations. Mnemonic names are intended to help you remember the one-letter codes, but are not the correct names.

Ala A Alanine
Arg R Arginine
Asn N Asparagine
Asp D Aspartic acid
  (mnemonic: asparDic)
Cys C Cysteine

Lys K Lysine
  (mnemonic: liKesine)
Met M Methionine
Phe F Phenylalanine
  (mnemonic: Fenylalanine)
Pro P Proline
Pyl O Pyrrolysine*

      mnemonic:
A Ala
B Asx†
C Cys
D Asp asparDic
E Glu gluEtamic
F Phe Fenylalanine
G Gly
H His
I Ile
J
K Lys liKesine
L Leu
M Met

      mnemonic:
N Asn asparagiNe
O Pyl*
P Pro
Q Gln Quetamine
R Arg aRginine
S Ser
T Thr
U Sec*
V Val
W Trp tWyptophan
X Unk†
Y Tyr tYrosine
Z Glx†

Gln Q Glutamine
  (mnemonic: Quetamine)
Glu E Glutamic acid
  (mnemonic: gluEtamic)
Gly G Glycine
His H Histidine
Ile I Isoleucine
Leu L Leucine

Sec U Selenocysteine*
  (mnemonic: seleniUm)
Ser S Serine
Thr T Threonine
Trp W Tryptophan
  (mnemonic: tWyptophan)
Tyr Y Tyrosine
Val V Valine

*From the mid-20th century, 20 amino acids were considered standard. Since 2014, the genetically encoded selenocysteine and pyrrolysine have been included as standard by the World Wide Protein Data Bank[2].
† Asx: Asn or Asp; Glx: Gln or Glu; Unk: unknown.
IUPAC/IUB 1971 publication defining one-letter codes.
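
The one-letter table above lends itself to a simple lookup. The following Python sketch is illustrative only: the names ONE_TO_THREE and to_three_letter are hypothetical and do not come from Proteopedia, IUPAC, or any PDB tool. It transcribes the table, including the ambiguity codes B, Z, and X, and leaves J unassigned as the table does.

```python
# Illustrative lookup transcribed from the one-letter table above.
# ONE_TO_THREE and to_three_letter are hypothetical names for this sketch.
ONE_TO_THREE = {
    "A": "Ala", "B": "Asx", "C": "Cys", "D": "Asp", "E": "Glu",
    "F": "Phe", "G": "Gly", "H": "His", "I": "Ile", "K": "Lys",
    "L": "Leu", "M": "Met", "N": "Asn", "O": "Pyl", "P": "Pro",
    "Q": "Gln", "R": "Arg", "S": "Ser", "T": "Thr", "U": "Sec",
    "V": "Val", "W": "Trp", "X": "Unk", "Y": "Tyr", "Z": "Glx",
}


def to_three_letter(sequence):
    """Convert a one-letter sequence into a list of three-letter codes.

    Raises ValueError for letters the table leaves unassigned (such as J).
    """
    codes = []
    for letter in sequence.upper():
        if letter not in ONE_TO_THREE:
            raise ValueError(f"no three-letter code for {letter!r}")
        codes.append(ONE_TO_THREE[letter])
    return codes


if __name__ == "__main__":
    # A short made-up sequence containing both Sec (U) and Pyl (O).
    print("-".join(to_three_letter("MKUO")))
```

Raising an error on an unassigned letter, rather than silently mapping it to Unk, is a design choice made for this sketch; X is already reserved in the table for residues that are genuinely unknown.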

Click on the image below to see it full size with details on 21 amino acids.

There is also a list of standard amino acids at the Protein Data Bank.

L- and D-Amino Acids

PDB Nomenclature

L-form

D-form

ALA
ARG
ASN
ASP
CYS
GLN
GLU
HIS
ILE
LEU
LYS
MET
PHE
PRO
PYL
SEC
SER
THR
TRP
TYR
VAL

DAL
DAR
DSG
DAS
DCY
DGN
DGL
DHI
DIL
DLE
DLY
MED
DPN
DPR
?[3]
?[3]
DSN
DTH
DTR
DTY
DVA

The asymmetric α-carbon in 21 of the standard amino acids (all except glycine) is a chiral (stereogenic) center[4]. Thus, each of these amino acids can exist as either of two optical isomers, or enantiomers (mirror images), designated the L-form and D-form. Nearly all naturally occurring amino acids are L-amino acids, so named because they are analogous to the L-form of glyceraldehyde[4]. (Not all L-amino acids are levorotatory; some rotate polarized light clockwise[4].)

More than 700 entries in the PDB contain one or more D-amino acids. Although D-amino acids are rare in nature, they do occur (see natural abundance in Wikipedia). D-alanine (DAL) occurs in over 100 entries in the PDB; D-leucine (DLE), D-phenylalanine (DPN), D-serine (DSN), and D-valine (DVA) each occur in over 50. Some of these entries describe chemically synthesized molecules rather than naturally occurring ones. Gramicidin is a naturally occurring antibiotic containing several D-amino acids, e.g. 1nrm. 4glu is the 102-amino-acid chain of VEGF-A synthesized entirely from D-amino acids.
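
The PDB nomenclature table above can be used to flag D-residues in a list of residue names. The following Python sketch is a minimal illustration under stated assumptions: the names L_TO_D, D_TO_L, chirality, and count_d_residues are hypothetical, the codes are copied from the table, and no PDB file parsing is attempted.

```python
# PDB three-letter codes transcribed from the nomenclature table above.
# None marks the two residues for which the table lists no D-form code.
# All names in this sketch are hypothetical, chosen for illustration.
L_TO_D = {
    "ALA": "DAL", "ARG": "DAR", "ASN": "DSG", "ASP": "DAS",
    "CYS": "DCY", "GLN": "DGN", "GLU": "DGL", "HIS": "DHI",
    "ILE": "DIL", "LEU": "DLE", "LYS": "DLY", "MET": "MED",
    "PHE": "DPN", "PRO": "DPR", "PYL": None, "SEC": None,
    "SER": "DSN", "THR": "DTH", "TRP": "DTR", "TYR": "DTY",
    "VAL": "DVA",
}

# Reverse lookup, skipping the residues without a listed D-form.
D_TO_L = {d: l for l, d in L_TO_D.items() if d is not None}


def chirality(residue_name):
    """Return 'L', 'D', or 'other' for a PDB three-letter residue name.

    'other' covers anything outside the table, including glycine (GLY),
    which is achiral and therefore has neither form.
    """
    name = residue_name.upper()
    if name in L_TO_D:
        return "L"
    if name in D_TO_L:
        return "D"
    return "other"


def count_d_residues(residue_names):
    """Count how many names in the iterable are D-form codes from the table."""
    return sum(1 for name in residue_names if chirality(name) == "D")


if __name__ == "__main__":
    # A made-up list of residue names, not taken from any PDB entry.
    example = ["VAL", "GLY", "DAL", "DLE", "ALA", "DVA"]
    print(count_d_residues(example))
```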

Unusual Amino Acids

There are at least two historically "non-standard" amino acids now recognized to be genetically encoded, discussed below. In 2014, the PDB began including these two to make 22 standard amino acids[2].

Selenocysteine: the 21st amino acid

Rare proteins in all domains of life include selenocysteine (Sec, U), designated the 21st amino acid[5]. As of 2024, over 90 entries in the PDB included selenocysteine, identified as SEC.

Pyrrolysine: the 22nd amino acid

Pyrrolysine (Pyl, O)[6] is a genetically encoded amino acid that occurs naturally in methanogenic archaea and is now designated the 22nd amino acid[7]. It occurs in several entries in the PDB, where it is identified as PYL. See the Pyrrolysine page for more details.

No 23rd amino acid?

The 21st and 22nd amino acids are specified by the stop codons UGA and UAG respectively, modified with downstream stem-loop structures in the mRNA. Lobanov et al.[8] searched "16 archaeal and 130 bacterial genomes for tRNAs with anticodons corresponding to the three stop signals". Their data suggest that "the occurrence of additional amino acids that are widely distributed and genetically encoded is unlikely."

Selenomethionine

Please see Selenomethionine.

Post-translational modifications

Many unusual amino acids are formed by enzyme-catalyzed reactions that modify a genetically encoded standard amino acid after it has been included in a polypeptide; these are called post-translational modifications. Among them:

  • phosphorylation of serine, threonine, or tyrosine
    • Example: C kinase 1bkx
  • hydroxylation of proline (to yield hydroxyproline, e.g. in collagen)
  • acetylation of lysine or the N terminus
    • Example of N-terminal acetylation: 4mdh
  • methylation of lysine
  • carboxylation of aspartate or glutamate
  • nitration, e.g. of tyrosine to produce nitrotyrosine
  • acylation with a fatty acid (see N-terminal myristoylation of Recoverin)
  • prenylation
  • glycosylation of serine, threonine or asparagine (yielding glycoproteins)
    • Example of N-linked glycosylation: 1igy

Structure, Properties, Behaviors in Proteins

Angel Herráez has provided a tutorial introduction to amino acid structure in which 20 amino acids may be visualized in 3D using JSmol.

Please help to expand the list below to include all 22 amino acids.

Glycine

Gly G: Small.

See also Glycine in Proteopedia and Glycine in Wikipedia, where the structure is shown. Glycine is the smallest amino acid, since its side-chain is nothing but a single hydrogen. Glycine is frequently found in turns, since the steric clashes of larger amino acids would be problematic. Turns typically occur on the surfaces of proteins, between beta-strands or alpha-helices. It is common to find highly conserved glycines in turns on protein surfaces, since mutation to a bulkier residue would interfere with the fold. Glycine also occurs in beta-strands and alpha-helices, although its frequency in alpha helices is low (second lowest, after proline, p. 125 in [9]). Glycine is commonly found at the C-terminus of alpha helices, and is considered a helix terminator (p. 125 in [9]).

Histidine

His H: Charged: Basic, Aromatic, Bulky

See Histidine in Wikipedia, where the structure is shown. The sidechain of His can be positively charged (protonated), in which case both of the nitrogens in the sidechain imidazole ring have hydrogens, and the charge is delocalized between them. The pKa for protonation is 6.1. This means that, on average at any moment, half of the His sidechains are protonated when the pH is 6.1. At the pH of blood, 7.4 ±0.05, His sidechains are positively charged less than 10% of the time. Therefore, His is not included in the usual list of positively charged amino acids (lysine and arginine).
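
The percentages quoted above follow from the Henderson-Hasselbalch relation: the protonated fraction is 1 / (1 + 10^(pH - pKa)). The following Python sketch works through the arithmetic. It assumes a single fixed sidechain pKa of 6.1, the value given in the text; in a real protein the local environment can shift the pKa of an individual histidine. The names fraction_protonated and PKA_HIS are hypothetical.

```python
def fraction_protonated(ph, pka):
    """Henderson-Hasselbalch: fraction of a titratable group that is protonated.

    Assumes one independent titration site with a fixed pKa.
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Sidechain pKa quoted in the text above (an assumption for real proteins).
PKA_HIS = 6.1

if __name__ == "__main__":
    # pH equal to the pKa, then the quoted range of blood pH (7.4 +/- 0.05).
    for ph in (6.1, 7.35, 7.4, 7.45):
        share = fraction_protonated(ph, PKA_HIS)
        print(f"pH {ph:.2f}: {share:.1%} protonated")
```

At pH equal to the pKa the fraction is exactly one half, and across the quoted range of blood pH it stays near 5%, consistent with the "less than 10%" statement above.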

Phenylalanine

Phe F: Neutral, Aromatic, Bulky

See Phenylalanine in Wikipedia, where the structure is shown. Phe participates in Cation-pi interactions. Phenylketonuria is a genetic disease in which the enzyme that converts phenylalanine to tyrosine is nonfunctional, leading to toxicity from excess phenylalanine, and tyrosine deficiency.

Tryptophan

Trp W: Neutral, Polar, Aromatic, Bulky

See Tryptophan in Wikipedia, where the structure is shown. In trans-membrane proteins, tryptophans often lie at the interface between water and lipid. An example is the potassium channel, e.g. 1bl8. To see their positions, use the Find dialog in FirstGlance in Jmol. Trp participates in Cation-pi interactions.

Tyrosine

Tyr Y: Neutral, Polar, Aromatic, Bulky

See Tyrosine in Wikipedia, where the structure is shown, and Nitrotyrosine. Tyr participates in Cation-pi interactions.

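See Also

  • Non-Standard Residues
  • Standard Residues
  • Amino Acids in Wikipedia
  • Basics of Protein Structure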

References

  1. A one-letter notation for amino acid sequences (definitive rules). IUPAC—IUB Commission on Biochemical Nomenclature, pages 639-645, International Union of Pure and Applied Chemistry and International Union of Biochemistry, Butterworths, London, 1971. Full Text.
  2. Announcement: Standardization of Amino Acid Nomenclature, World Wide Protein Data Bank News, January 8, 2014.
  3. In January, 2022, no D-selenocysteine nor D-pyrrolysine were present in the wwPDB, according to David Armstrong of PDBe.
  4. See Chirality in Wikipedia.
  5. Atkins JF, Gesteland RF. The twenty-first amino acid. Nature. 2000 Sep 28;407(6803):463, 465. PMID:11028985 doi:10.1038/35035189
  6. See Pyrrolysine in Wikipedia.
  7. Hao B, Zhao G, Kang PT, Soares JA, Ferguson TK, Gallucci J, Krzycki JA, Chan MK. Reactivity and chemical synthesis of L-pyrrolysine- the 22(nd) genetically encoded amino acid. Chem Biol. 2004 Sep;11(9):1317-24. PMID:15380192 doi:10.1016/j.chembiol.2004.07.011
  8. Lobanov AV, Kryukov GV, Hatfield DL, Gladyshev VN. Is there a twenty third amino acid in the genetic code? Trends Genet. 2006 Jul;22(7):357-60. Epub 2006 May 19. PMID:16713651 doi:10.1016/j.tig.2006.05.002
  9. Kessel, Amit, and Nir Ben-Tal. Introduction to Proteins: Structure, Function, and Motion. CRC Press, 2011. 653 pages.

Proteopedia Page Contributors and Editors

Eric Martz, Joel L. Sussman, Angel Herraez, Andrea Gorrell
