Long interspersed element 1 (LINE1) open reading frame 1 (L1ORF1) is a protein encoded by LINE-1, a LINE-type retrotransposon (class I transposon) in humans. It is required for retrotransposition, a process that increases the DNA variety and size of the genome. The protein localizes to large L1 ribonucleoprotein particles, stress granules and the nucleus (ref).
Introduction
During retrotransposition, the RNA transcript is reverse transcribed into DNA and integrated into the genome at a different place than the original gene. The process is carried out by proteins encoded by retrotransposons, which often provide the reverse transcriptase and endonuclease functions required for it.
L1ORF1 localizes to ribonucleoprotein particles, stress granules and the nucleus. Although the protein has a general affinity for nucleic acids, it displays a strong cis preference, which makes it bind primarily the RNA transcript that encodes it (ref). In a process known as target-primed reverse transcription, the L1 RNA is reverse transcribed at the point of genomic integration (ref); L1ORF1 takes part in this process and assists with integration into the DNA.
Structure
The main part of the L1ORF1 structure was solved by X-ray crystallography by Khazina et al. (ref) in 2011. The crystallized part of the protein is a 338-residue, 40 kDa chain composed of 3 domains: the coiled coil, the RRM domain and the CTD. The N-terminal residues were not crystallized, but based on previous atomic force microscopy experiments (ref) they are expected to extend the alpha helix by approximately 50 residues. The domains are connected by two linker regions responsible for structural flexibility. Overall, the protein forms an L-shaped pocket made up of the N-terminal helix and the central region, with the flexible C-terminal domain “capping” the binding pocket (link to monomer).
Stabilization of the Trimeric Structure
L1ORF1 forms trimers in which the unusually long alpha helix (residues 111-153; according to the work of Januszyk, these helices begin with residue 25) provides the trimerization axis, forming a coiled-coil structure (trimer overall cartoon). The very long N-terminal helices are stabilized by three unique structural features:
- coordination of on the hydrophobic interface inside the coiled coil structure by three asparagines (Asn142) and three arginines (Arg135), which promotes the trimeric state of the coiled coil;
- externally stabilizing between coiled coil helices.
- hydrophobic interactions within the hydrophobic core of the coiled-coil.
The trimerization is additionally stabilized on the C-terminal side of the molecule by 1 hydrogen bond between domain, while the CTD regions remain more flexible and do not interact with one another, which plays a significant role in accommodating the RNA molecule.
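To put the length of the trimerization helix in perspective, the residue range quoted above (111-153) can be turned into a rough size estimate. The sketch below is illustrative only: it assumes an ideal, straight alpha helix with the textbook axial rise of 1.5 Å per residue and a 7-residue heptad repeat, neither of which is taken from the crystal structure itself, and the function name `helix_stats` is hypothetical.

```python
# Rough geometry of the coiled-coil helix, assuming an ideal alpha helix.
RISE_PER_RESIDUE_A = 1.5  # assumption: textbook axial rise per residue, in angstroms
HEPTAD_LENGTH = 7         # assumption: canonical coiled-coil heptad repeat


def helix_stats(first_residue, last_residue):
    """Return residue count, approximate heptad count and length in angstroms."""
    n_residues = last_residue - first_residue + 1
    return {
        "residues": n_residues,
        "heptads": n_residues / HEPTAD_LENGTH,
        "length_A": n_residues * RISE_PER_RESIDUE_A,
    }


# Residue range of the long helix as quoted in the text.
crystal_helix = helix_stats(111, 153)
print(crystal_helix)
```

Under these assumptions the helix spans 43 residues, or roughly six heptads. Supercoiling in a real coiled coil shortens the axial rise slightly, so the length should be read as an upper-bound estimate.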
Nucleic Acid Binding Surfaces
Here is an electrostatic map of the putative nucleic acid binding surfaces.
L1ORF1 is an RNA-interacting molecule, although its structure in the RNA-bound state has not been crystallized. Careful mapping of the electrostatic potential of the protein surface indicated the likely sites of RNA accommodation. The protein contains two major grooves with positively charged surfaces, where the affinity for negatively charged RNA is highest:
- horizontal cleft between the RRM and CTD domains of each monomer, whose local electrostatic potential reaches up to +15 kT/e;
- vertical, wide groove between monomers, with positive local electrostatics somewhat lower.
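For readers more used to volts than to thermal units, the +15 kT/e value quoted for the horizontal cleft can be converted as follows. This is a minimal sketch: the temperature of 298.15 K is an assumption (the temperature used for the electrostatic map is not stated here), and the function name is hypothetical.

```python
# Convert an electrostatic potential from kT/e units to millivolts.
BOLTZMANN_J_PER_K = 1.380649e-23        # exact SI value
ELEMENTARY_CHARGE_C = 1.602176634e-19   # exact SI value


def kt_per_e_to_millivolts(potential_kt_e, temperature_k=298.15):
    """Potential in mV for a value given in kT/e at the assumed temperature."""
    volts_per_kt_e = BOLTZMANN_J_PER_K * temperature_k / ELEMENTARY_CHARGE_C
    return potential_kt_e * volts_per_kt_e * 1e3


# +15 kT/e is the peak value quoted for the RRM-CTD cleft.
cleft_mv = kt_per_e_to_millivolts(15)
print(f"{cleft_mv:.0f} mV")
```

At the assumed temperature one kT/e is about 25.7 mV, so the cleft potential corresponds to a few hundred millivolts.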
It has been demonstrated by mutagenesis (ref Khazina) that arginine residues 220, 235 and 261, which localize to the surface between the RRM and CTD domains, are indispensable for the RNA affinity of the trimer. Therefore, in the RNA-bound state, the RNA is likely wound around the L1ORF1 trunk, with the strand passing through the positively charged vertical and horizontal grooves. So far, however, it has not been possible to confirm these RNA-protein interactions experimentally.
Structural Flexibility
The CTD domains are "hinged" to the RRM and coiled-coil domains, and are free to move relative to the rest of the structure. This might be important for reverse transcription, allowing the trimer to unwind the RNA slowly so as to prevent secondary structure formation.
3D Structures of L1 ORF1 Protein
Mus musculus (mouse)
Nuclear Magnetic Resonance: 2JRB, 2LDY, 2W7A
Homo sapiens (human)
X-Ray Crystallography: 2YKO, 2YKP, 2YKQ
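The entries listed above can be collected into download links for inspection in a molecular viewer. The sketch below only builds the URLs and does not access the network. It assumes the usual RCSB file-server pattern `https://files.rcsb.org/download/<ID>.pdb`, and the dictionary layout and function name are hypothetical conveniences rather than part of any official API.

```python
# PDB entries as listed on this page, grouped by organism and method.
STRUCTURES = {
    ("Mus musculus", "NMR"): ["2JRB", "2LDY", "2W7A"],
    ("Homo sapiens", "X-ray"): ["2YKO", "2YKP", "2YKQ"],
}

# Assumption: standard RCSB download URL pattern for PDB-format files.
RCSB_DOWNLOAD = "https://files.rcsb.org/download/{pdb_id}.pdb"


def download_urls(structures):
    """Map each PDB ID to its assumed RCSB download URL."""
    urls = {}
    for pdb_ids in structures.values():
        for pdb_id in pdb_ids:
            urls[pdb_id] = RCSB_DOWNLOAD.format(pdb_id=pdb_id)
    return urls


urls = download_urls(STRUCTURES)
for pdb_id, url in urls.items():
    print(pdb_id, url)
```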
A Cool Movie
["http://www.nature.com/nsmb/journal/v18/n9/extref/nsmb.2097-S2.mov"|Click]