Practical Guide to Homology Modeling

From Proteopedia

Revision as of 19:22, 28 December 2014 by Eric Martz (Talk | contribs)

Terminology

  • Query sequence: The amino acid sequence for which a 3D model is wanted. It is more commonly called the target sequence, but this article uses "query" because "target" and "template" are easily confused.
  • Template: An empirically determined 3D protein structure with significant sequence similarity to the query.
  • Structure will be used in this article to mean three-dimensional structure.

What Is A Homology Model?

Homology models, also called comparative models, are obtained by folding a query protein sequence to fit an empirically determined template model. The registration between residues in the query and template is determined by an amino acid sequence alignment between the query and template sequences.

Imagine that the template’s polypeptide backbone is a folded glass tube. Now imagine that the query sequence is a thin metal chain that can be pulled through the tube. The chain (query) will adopt the same fold as the tube (template). The sequence alignment specifies how far the chain should be pulled into the tube; that is, how the residues in the query sequence match up with the structure of the template.
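The registration described above can be illustrated with a minimal sketch. It assumes the alignment is supplied as two equal-length strings that use "-" for gaps. The function name `map_query_to_template` and the example sequences are hypothetical, chosen only for illustration, and the sketch does not reproduce what any particular modeling program does.

```python
def map_query_to_template(aligned_query, aligned_template, gap="-"):
    """Map each query residue (0-based, ungapped index) to the template
    residue it is aligned with, or to None if it faces a gap."""
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    mapping = {}
    q_pos = 0  # index into the ungapped query sequence
    t_pos = 0  # index into the ungapped template sequence
    for q_char, t_char in zip(aligned_query, aligned_template):
        q_is_residue = q_char != gap
        t_is_residue = t_char != gap
        if q_is_residue:
            # A query residue opposite a template gap has no template position.
            mapping[q_pos] = t_pos if t_is_residue else None
            q_pos += 1
        if t_is_residue:
            t_pos += 1
    return mapping


# Hypothetical alignment: one insertion in the template, one in the query.
registration = map_query_to_template("MKT-AYLV", "MRTGA-LV")
print(registration)
```

Query residues that map to `None` fall in an alignment gap, so the template provides no backbone position for them.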

Errors or uncertainties in the sequence alignment result in errors or uncertainties in the homology model. Portions of the query sequence cannot be modeled reliably where the sequence alignment has gaps due to insertions/deletions ("indels"), or where portions of the template lack coordinates due to crystallographic disorder. Provided there is sufficient sequence identity between the query and template (at least 30%), the main chain in homology models is usually mostly correct. However, the sidechain rotamers in homology models are usually unreliable.
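As a rough illustration of the 30% identity guideline, the sketch below computes percent identity from a gapped pairwise alignment. It assumes identity is counted over columns where both sequences have a residue; other definitions use a different denominator and give different numbers. The names `percent_identity` and `IDENTITY_THRESHOLD` and the example sequences are hypothetical.

```python
IDENTITY_THRESHOLD = 30.0  # percent, the guideline quoted in the text above


def percent_identity(aligned_query, aligned_template, gap="-"):
    """Percent of identical residues among columns with no gap.

    Assumption: the denominator is the number of aligned (gap-free)
    columns, which is only one of several conventions in use.
    """
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    aligned_pairs = [
        (q, t)
        for q, t in zip(aligned_query, aligned_template)
        if q != gap and t != gap
    ]
    if not aligned_pairs:
        return 0.0
    matches = sum(q.upper() == t.upper() for q, t in aligned_pairs)
    return 100.0 * matches / len(aligned_pairs)


# Hypothetical alignment: 6 gap-free columns, 5 of them identical.
identity = percent_identity("MKT-AYLV", "MRTGA-LV")
meets_guideline = identity >= IDENTITY_THRESHOLD
print(round(identity, 1), meets_guideline)
```

A value above the threshold only suggests that the main chain is likely to be mostly correct; it says nothing about sidechain rotamers or regions that face alignment gaps.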

Nevertheless, homology models are useful for seeing low-resolution features, such as which residues are on the surface or buried, which are close to other features of interest (such as a putative active site), and the overall distribution of charges and evolutionary conservation.

Do you need a homology model?

You don’t need a homology model if the amino acid sequence of interest (the query sequence) already has an empirically determined 3D structure. Structures determined empirically, by X-ray crystallography or (much less often) by solution NMR, will almost always be more accurate than a homology model.

Is there an empirical model?

All published, empirically determined, atomic-resolution, macromolecular 3D structures are available in the Protein Data Bank (PDB, pdb.org).

Each model in the PDB has a unique 4-character identification code (PDB ID) that begins with a numeral and has letters or numerals for the last 3 characters. Examples are 1d66, 4mdh, 9ins.
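The ID format just described can be checked with a short regular expression. This is a minimal sketch of the format only: it cannot tell whether an ID actually exists in the PDB, and the name `looks_like_pdb_id` is hypothetical.

```python
import re

# One numeral followed by three letters or numerals, as described above.
PDB_ID_PATTERN = re.compile(r"[0-9][A-Za-z0-9]{3}")


def looks_like_pdb_id(text):
    """Return True if text has the 4-character PDB ID format."""
    return PDB_ID_PATTERN.fullmatch(text) is not None


for candidate in ["1d66", "4mdh", "9ins", "abcd", "1d6"]:
    print(candidate, looks_like_pdb_id(candidate))
```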

Here are two methods for finding out if your query amino acid sequence, or parts of it, have empirically-determined 3D structures in the PDB.

Proteopedia Page Contributors and Editors

Eric Martz, Juergen Haas, Jaime Prilusky
