Introduction to Evolutionary Conservation

From Proteopedia

(Difference between revisions)
Current revision (18:54, 4 July 2024)
 
(21 intermediate revisions not shown.)
Line 1: Line 1:
-
<StructureSection load='' size='350' side='right' scene='Introduction_to_Evolutionary_Conservation/Conservation/1' caption='MeCp2 protein bound to DNA (crystal structure [[3c2i]]). Conservation calculated by [[ConSurfDB_vs._ConSurf|ConSurf-DB]].'>
+
<StructureSection load='' size='350' side='right' scene='Introduction_to_Evolutionary_Conservation/Conservation/1' caption='MeCp2 protein bound to DNA (crystal structure [[3c2i]]), or enolase [[4enl]]. Conservation calculated by [[ConSurfDB_vs._ConSurf|ConSurf-DB]].'>
Mutations occur spontaneously in each generation, randomly changing an amino acid here and there in a protein. Individuals with mutations that impair critical functions of proteins may have resulting problems that make them less able to reproduce. Harmful mutations are lost from the gene pool because the individuals carrying them reproduce less effectively. Since the harmful mutations are lost, the amino acids critical for the function of a protein are '''conserved''' in the gene pool. In contrast, harmless (or very rare beneficial) mutations are kept in the gene pool, producing '''variability''' in non-critical amino acids.
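The reasoning above can be illustrated with a toy simulation. This is only a sketch under strong simplifying assumptions that are not part of the original text: a fixed population size, asexual reproduction, a uniform per-residue mutation rate, and complete loss of any individual whose critical residue has changed. The function names, the example sequence, and the parameter values are all hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def simulate(ancestor, critical, generations=200, pop_size=100,
             mutation_rate=0.01, seed=0):
    """Toy model of purifying selection (an illustration, not a real method).

    ancestor: starting protein sequence shared by the whole population.
    critical: set of 0-based positions where any change is assumed lethal.
    """
    rng = random.Random(seed)
    population = [ancestor] * pop_size
    for _ in range(generations):
        offspring = []
        while len(offspring) < pop_size:
            # Copy a random parent, then mutate each residue with a small probability.
            child = list(rng.choice(population))
            for i in range(len(child)):
                if rng.random() < mutation_rate:
                    child[i] = rng.choice(AMINO_ACIDS)
            # Selection: a child with a changed critical residue leaves no descendants.
            if all(child[i] == ancestor[i] for i in critical):
                offspring.append("".join(child))
        population = offspring
    return population


def variable_positions(population):
    """Return the set of positions at which the population is not identical."""
    length = len(population[0])
    return {i for i in range(length) if len({s[i] for s in population}) > 1}


if __name__ == "__main__":
    final = simulate("MKTAYIAKQR", critical={2, 5})
    print("variable positions:", sorted(variable_positions(final)))
```

In this model the critical positions can never appear among the variable positions, because every individual that changes them is removed, while the remaining positions are free to drift. Real selection is rarely all-or-nothing, so actual conservation is a matter of degree.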
-
==Examples==
+
==Example==
 +
 
 +
===Rett Syndrome===
Consider the protein methyl CpG binding protein 2 (MeCP2; [http://www.uniprot.org/uniprot/P51608 UniProt MECP2_HUMAN]). Although its function is still unclear, it is expressed throughout the body, and disruption of its function causes problems with brain development and function<ref name="ghr">[http://ghr.nlm.nih.gov/gene/MECP2 MECP2 article] in the ''National Library of Medicine's Genetics Home Reference''</ref>. Some mutations in MeCP2 cause [http://workshops.molviz.org/slides/rett/rett.htm Rett Syndrome], a severely debilitating congenital condition affecting mostly women. These women are unlikely to have children; hence, the mutations in their MeCP2 genes are lost from the human gene pool. Because the mutations are lost, the amino acids at the mutated positions remain unchanged (identical) in the vast majority of people. That is, they are conserved.
Line 99: Line 101:
===Sophisticated Analysis of Conservation: ConSurf===
-
The above analysis assigns each amino acid in enolase to one of three categories: conserved, similar, or different. This is very simplistic. In contrast, the analysis used in Proteopedia (and in the above image) is sophisticated, using many more sequences, and weighting the impact of each sequence in the multiple sequence alignment according to the phylogenetic tree calculated from the alignment. This sophisticated determination of conservation and variability is done by the [[ConSurfDB vs. ConSurf|ConSurf Servers]] (see also a [[ConSurfDB_vs._ConSurf#The_ConSurf-DB_Mechanism|summary of their mechanism]]). ConSurf divides conservation into 9 levels, and colors them as follows:
+
The above analysis assigns each amino acid in enolase to one of three categories: conserved, similar, or different. This is very simplistic, and its results are fragile: adding or removing one or a few sequences from the alignment can have a large effect. In contrast, the analysis used in Proteopedia (and in the molecular view at right) is sophisticated, using many more sequences and weighting the impact of each sequence in the multiple sequence alignment according to the phylogenetic tree calculated from the alignment. This sophisticated determination of conservation and variability is done by the [[ConSurfDB vs. ConSurf|ConSurf Servers]] (see also summaries of their mechanism: [http://www.umass.edu/molvis/workshop/slides/consurf3.htm short version], or [[ConSurfDB_vs._ConSurf#The_ConSurf-DB_Mechanism|longer version]]). ConSurf's analysis is ''robust'': addition or removal of a few sequences has little effect. ConSurf divides conservation into 9 levels, and colors them as follows:
<center>{{Template:ColorKey_ConSurf_NoGray}}</center>
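For comparison with ConSurf, the simple three-category analysis (conserved, similar, or different) can be sketched in a few lines. This is a minimal illustration, not the method used to produce any result on this page: the similarity groups, the function names, and the short example sequences are all assumptions, and the sketch expects pre-aligned sequences of equal length without gaps. It is not ConSurf's algorithm, which additionally weights sequences by a phylogenetic tree.

```python
# Hypothetical similarity groups (an assumption for illustration only;
# real tools typically use substitution matrices instead).
SIMILAR_GROUPS = [
    set("ILVM"),  # aliphatic / hydrophobic
    set("FWY"),   # aromatic
    set("KRH"),   # positively charged
    set("DE"),    # negatively charged
    set("STNQ"),  # polar, uncharged
    set("AG"),    # small
    set("C"),
    set("P"),
]


def classify_column(residues):
    """Classify one alignment column as conserved, similar, or different."""
    distinct = set(residues)
    if len(distinct) == 1:
        return "conserved"
    # "similar" only if every residue in the column falls in one group.
    if any(distinct <= group for group in SIMILAR_GROUPS):
        return "similar"
    return "different"


def classify_alignment(sequences):
    """Classify every column of an ungapped, equal-length alignment."""
    return [classify_column(column) for column in zip(*sequences)]


if __name__ == "__main__":
    alignment = ["MKIDE", "MRLDE", "MKVDQ"]
    print(classify_alignment(alignment))
    # Adding a single sequence changes the call for the fourth column.
    print(classify_alignment(alignment + ["MKIGE"]))
```

In the example, the fourth column is called conserved with three sequences but different once a fourth sequence is added, which is the kind of sensitivity described above.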
 +
====Sequence Colored by Conservation====
When ConSurf's colors are applied to the 436 amino acids in the sequence of enolase (based on a multiple sequence alignment containing 150 sequences), this is the result:
[[Image:4enl consurf150 sequence wb.jpg|400 px|left]]
{{Clear}}
-
Notice that the conserved residues are scattered around the sequence with no obvious pattern. However, when the same <scene name='Introduction_to_Evolutionary_Conservation/Enolase_with_consurf_colors/1'>colors are applied to the amino acids in the 3D structure</scene>, they form a conserved patch around the catalytic site (marked with a <span style="background:black; color:#00ff00;">'''&nbsp;zinc ion colored green&nbsp;'''</span>.
+
Notice that the conserved residues are scattered around the sequence with no obvious pattern.
-
*Show [http://firstglance.jmol.org/fg.htm?mol=http%3A//bioinformatics.org/firstglance/fgij/localPDBFiles/4ENLA_ConSurf_DB_pipe.pdb conservation of enolase in FirstGlance in Jmol] ([[4enl]]).
+
 
 +
====3D Structure Colored by Conservation====
 +
However, when the same <scene name='Introduction_to_Evolutionary_Conservation/Enolase_with_consurf_colors/1'>colors are applied to the amino acids in the 3D structure</scene>, they form a conserved patch around the catalytic site (marked with a <span style="background:black; color:#00ff00;">'''&nbsp;zinc ion colored green&nbsp;'''</span>).
 +
*Show [http://firstglance.jmol.org/fg.htm?mol=http%3A//bioinformatics.org/firstglance/fgij/localPDBFiles/4ENLA_ConSurf_DB_pipe.pdb.gz conservation of enolase in FirstGlance in Jmol] ([[4enl]]).
Conserved surface patches identify functional regions of proteins. Less commonly, patches of high variability may also be functional. (Can you think of situations where high variability would be advantageous?<ref>Advantageous variability will be seen in these cases: [[5hmg]], [[2vaa]], [[3hi6]].</ref>)
For instructions on how to identify conserved regions of a molecule of interest, and how to show them in Proteopedia (for example with green links), please see [[How to see conserved regions]].
 +
 +
==Expected vs. Unexpected Conservation==
 +
Conservation is '''expected''' for those amino acids that support the 3D structure and functions of a protein. Common examples are listed in the table below. When there is no known structural or functional explanation for conservation of an amino acid, or a cluster of amino acids, the conservation is '''unexpected'''. Unexpected conservation may provide clues for discovering new functions or structural features, e.g. through functional analysis of mutants.
 +
 +
<table class="wikitable"><tr>
 +
<th colspan="2"><center>
 +
Expected Evolutionary Conservation
 +
</center>
 +
</th></tr><tr><th>
 +
Amino Acids
 +
</th><th>
 +
Reason for Conservation
 +
</th></tr><tr><td>
 +
Gly, Pro in turns between helices or beta strands
 +
</td><td>
 +
Required for [[Evolutionary_Conservation#Conservation_for_Domain_Folding|protein domain folding]]
 +
</td></tr><tr><td>
 +
Charged amino acid (Lys, Arg, Asp, Glu) in a salt bridge
 +
</td><td>
 +
Required for [[Salt bridges|protein stability]]
 +
</td></tr><tr><td>
 +
Cys in a disulfide bond
 +
</td><td>
 +
Required for protein stability
 +
</td></tr><tr><td>
 +
N-terminal Met
 +
</td><td>
 +
Start codon for protein synthesis
 +
</td></tr><tr><td>
 +
Amino acids in a large cluster of highly-conserved residues
 +
</td><td>
 +
Required for protein function, e.g. catalytic or binding site
 +
</td></tr></table>
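The table above can also be read as a small decision procedure. The sketch below encodes it directly; the class, field, and function names are hypothetical, and the structural flags (in a turn, in a salt bridge, and so on) are assumed to be supplied by the reader, for example after inspecting the structure in a molecular viewer. An empty result means the table offers no explanation, i.e. the conservation is unexpected.

```python
from dataclasses import dataclass


@dataclass
class ResidueContext:
    """Hypothetical description of one highly conserved residue."""
    name: str                          # three-letter code, e.g. "Gly"
    in_turn: bool = False              # in a turn between helices or beta strands
    in_salt_bridge: bool = False
    in_disulfide: bool = False
    is_n_terminal: bool = False
    in_conserved_cluster: bool = False  # part of a large highly-conserved cluster


def expected_reasons(res):
    """Return the reasons from the table that apply; empty list = unexpected."""
    reasons = []
    if res.name in ("Gly", "Pro") and res.in_turn:
        reasons.append("protein domain folding")
    if res.name in ("Lys", "Arg", "Asp", "Glu") and res.in_salt_bridge:
        reasons.append("protein stability (salt bridge)")
    if res.name == "Cys" and res.in_disulfide:
        reasons.append("protein stability (disulfide bond)")
    if res.name == "Met" and res.is_n_terminal:
        reasons.append("start codon for protein synthesis")
    if res.in_conserved_cluster:
        reasons.append("protein function, e.g. catalytic or binding site")
    return reasons


if __name__ == "__main__":
    print(expected_reasons(ResidueContext("Gly", in_turn=True)))
    print(expected_reasons(ResidueContext("Ala")))  # empty: conservation is unexpected
```

A residue can match more than one row, so the function returns a list rather than a single reason.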
 +
 +
[http://FirstGlance.Jmol.Org FirstGlance in Jmol] makes it easy to locate turns, salt bridges, disulfide bonds, or the N-terminus. In FirstGlance:
 +
* Touch the conserved residue of interest to get its name and sequence number, e.g. Gly236 (in enolase 4enl).
 +
* Use ''Find'' to put yellow halos around the residue of interest, e.g. enter ''Gly236'' in the ''Find'' slot.
 +
** Turns: Views tab, Secondary Structure.
 +
** Salt bridges: Tools tab, Salt Bridges.
 +
** Disulfide bonds: Tools tab, Disulfide Bonds.
 +
** N terminus: Views tab, N->C Rainbow. You may also wish to check ''Sequence Numbers'' and/or ''Residue Names'' near the bottom of the control panel (upper left panel).
{{Clear}}
Line 119: Line 166:
*[[Evolutionary Conservation]]
*[[ConSurfDB_vs._ConSurf]]
 +
*[[ConSurf/Index]]: links to all Proteopedia pages about ConSurf and evolutionary conservation.
==Notes and References==

Current revision

MeCp2 protein bound to DNA (crystal structure 3c2i), or enolase 4enl. Conservation calculated by ConSurf-DB.


See Also

Notes and References

  1. MECP2 article in the National Library of Medicine's Genetics Home Reference
  2. Advantageous variability will be seen in these cases: 5hmg, 2vaa, 3hi6.

Proteopedia Page Contributors and Editors

Eric Martz, Alexander Berchansky, Verónica Gómez Gil
