User:Alexis Neyman/Sandbox 1
From Proteopedia
Biological Structure of SRp20
Introduction
Alternative RNA splicing is a significant post-transcriptional process that diversifies gene expression. A gene is first transcribed into pre-messenger RNA (pre-mRNA). The pre-mRNA contains introns, which are sequences that are not translated into protein, and exons, which code for protein. In alternative RNA splicing, exons are either removed or retained in the mRNA in different combinations, creating diverse mRNAs from one pre-mRNA. This process is carried out by splicing factors, which are proteins that remove introns and exons via spliceosomes [1]. The sequence-specific RNA-binding protein SRp20 (gene name SRSF3) is a splicing factor and one of the smallest members of the serine- and arginine-rich (SR) protein family, and it plays a significant role in alternative splicing of exons [1][2]. It is a highly conserved Homo sapiens protein [3].
Function
SRp20 is a splicing factor involved in the regulation of many genes through alternative splicing of exons, which it achieves by associating with cis-elements of RNA [1][4]. It is autoregulatory: it can alternatively splice its own mRNA to include exon 4, which reduces the length of its protein [5][6][7]. SRp20 has been speculated to contribute to termination of transcription, either by activating enzymes responsible for degrading the RNA sequence downstream of the cleavage site or by promoting the removal of RNA polymerase from the DNA [8]. SRp20 might also play a role in the export of mature mRNAs by promoting the recruitment of TAP, a factor that exports mRNA out of the nucleus [5]. SRp20 and PCBP2, a protein that binds to internal ribosome entry site (IRES) RNA sequences in picornavirus, have been found to interact with each other to initiate viral translation [9]. These findings indicate that SRp20 plays a role in protein translation. It is also suggested that SRp20 allows the 3' terminal exon to be recognized by polyadenylation factors [10].
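The combinatorial idea behind alternative splicing described in the introduction, where exons are retained or removed in different combinations, can be illustrated with a toy enumeration. This is only a sketch under a simplifying assumption: each optional ("cassette") exon is independently included or skipped, and all other exons are always kept. The exon names and the function `enumerate_isoforms` are hypothetical and do not represent real SRSF3 data or any actual splicing mechanism.

```python
from itertools import product


def enumerate_isoforms(exons, cassette):
    """Return every exon combination obtained by including or skipping
    each cassette exon.

    exons    -- ordered list of exon names in the pre-mRNA
    cassette -- set of exon names that may be skipped; all others are kept
    """
    optional = [e for e in exons if e in cassette]
    isoforms = []
    # One True/False decision (include/skip) per cassette exon.
    for choice in product([True, False], repeat=len(optional)):
        keep = dict(zip(optional, choice))
        # Constitutive exons are absent from `keep`, so they default to True.
        isoforms.append([e for e in exons if keep.get(e, True)])
    return isoforms


# Hypothetical five-exon gene with two cassette exons (illustrative only).
pre_mrna = ["E1", "E2", "E3", "E4", "E5"]
for isoform in enumerate_isoforms(pre_mrna, {"E2", "E4"}):
    print("-".join(isoform))
```

With two cassette exons the sketch yields four mature mRNAs from one pre-mRNA. Real genes also use other splicing modes that this toy model ignores.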
Structure
The structure of SRp20 was determined by heteronuclear single quantum coherence (HSQC) NMR. The protein is composed of one RNA recognition motif (RRM) at the N-terminus and one Ser/Arg (SR) domain at the C-terminus, where the Ser residues are phosphorylated [1]. The RRM of SRp20 has the β1α1β2β3α2β4 topology seen in other RRMs. The role of the RRM is to provide substrate specificity: through it, SRp20 interacts with splicing enhancer sequences in mRNA. No 3D structure of the SR domain has been determined, so its exact role is unclear. However, there has been some speculation that it aids protein-protein interactions in the spliceosome. SRp20 contains 164 amino acids, half belonging to the RRM and the other half to the SR domain (Figure 1), and has a molecular weight of 19 kDa [1].
Poor Solubility Problem
The SRp20 protein has poor solubility in its free state. This made it impossible to determine the structure of SRp20 by HSQC spectroscopy without modifying the free protein. The problem was resolved by fusing the RRM with the GB1 solubility tag (immunoglobulin G-binding domain 1 of streptococcal protein G) before purifying the protein (Figure 2) [2].
RNA Interactions
1H-15N HSQC results showed a large hydrophobic β-sheet on the RRM binding to the RNA, with all four bases interacting with one of the four aromatic residues via hydrophobic interactions [2]. β-hairpin amino acids are hydrogen bonded to bases on nucleic acid targets [11]. This suggests that the β-hairpin plays a role in SRp20 selectivity for specific ligands. The researchers used a smaller peptide chain to reduce the NMR broadening seen with longer peptides, which allowed structure determination at the cost of reduced binding affinity. The ligand used was the four-nucleotide RNA CAUC (C1, A2, U3, C4). The conformation of U3 and C4 shows that U3 bulges out while C4 partially stacks over A2. Interactions with the RRM included […] in β1 and […] in β3.
These aromatic side chains form hydrophobic interactions with the ligand when stacked (Figure 3). Also, the residue […]. The amino proton of C1 hydrogen bonds with the carbonyl oxygen of Leu 80 and the side-chain carbonyl oxygen of Glu 79. The N3 of C1 hydrogen bonds with the amide of Asn 82, and the O2 of C1 hydrogen bonds with the hydroxyl group of Ser 81 [2]. It was also noted that […] adopts an unusual syn conformation. U3 interacts with […] and with the β2-3 loop of the RRM. These residues are all hydrophobic, offering a large hydrophobic surface that helps bind the ligand and keeps solvent from binding. Additionally, C4 is maintained in its position by a […] [2].
RRM Stability
SRp20 has a hydrophobic core, which may contribute to the stability of the protein. A previous study of the RRM in TDP-43 suggested that the hydrophobic core may be a strong contributing factor to the protein's stability [12]. A different study determined that, in the U11/U12-65K protein, the β-sheet packs against the two α-helices by way of hydrophobic interactions and that the resulting stabilization could be critical for the proper folding and orientation of elements for RNA binding [13]. Because RRMs are conserved, it could be speculated that the hydrophobic core found in SRp20, between the β-sheet and the two α-helices, contributes to the stability of its RRM in a similar fashion. However, additional studies focusing specifically on SRp20 need to be completed to confirm this supposition.
RRM Specificity
Four nucleotides can be accommodated by the SRp20 RRM β-sheet, but recognition is only partially sequence specific. One study showed that C1 was recognized more specifically, while A2 and U3 were recognized less specifically. When the RNA ligand was changed to GAUC, the affinity of the SRp20 RRM for the RNA ligand decreased 10-fold. It is uncertain whether C4 is specifically recognized by the RRM.
It was also seen that A was preferred over G at the second position, but there was no indication of a preference over U or C. U3 is even less specific, as it could also be C, G or A. The recognition of C1 is functionally necessary because a C to G mutation within the histone mRNA can impair RNA export [2]. Because specific residue mutations have not been made in SRp20, it is difficult to determine exactly which residues of its RRM are essential to its function.
Advantages of low specificity
One advantage of low specificity is that it puts less evolutionary pressure on the bound RNA, which would be preferred for exonic sequences [11]. With lower specificity, a larger array of RNA sequences can be targeted. Additionally, SRp20 can associate with and dissociate from RNA more easily, which is important for highly dynamic RNA metabolism processes. RNA binding affinity can be modulated by protein-protein interactions, which depend on the level of phosphorylation [2]. In future research, a structural image of the SR domain would help in understanding how SR domain phosphorylation occurs and how it controls splicing and specificity.
Comparing RRMs
Other RRM-containing proteins typically contain RRMs that specifically recognize anywhere from 2 to 8 nucleotides of the RNA ligand. Aromatic residues in the β-sheets and the loops between β-strands and α-helices are the residues that specifically recognize the nucleotides. The SR protein ASF/SF2 has a histidine in the α2/β1 loop that is crucial for RNA binding and specificity. When this histidine is mutated to alanine (His183Ala), ASF/SF2 loses much of its ability to crosslink RNA. The number of RRMs present in a protein also affects the protein's specificity. In general, the more RRMs a protein contains, the more specifically it binds to the RNA ligand [11].
Mutating an RRM disrupts the specificity of the protein so that it can no longer recognize the correct RNA sequence, which ultimately leads to incorrect gene splicing or no mRNA export. Mutations in the RRM of ASF/SF2 are relevant to our understanding of SRp20 because both proteins operate as alternative splicing factors in Homo sapiens. While these specific point mutations have not been made in the RRM of SRp20, it can be speculated that related mutations in the SRp20 RRM might have a similar effect on its specificity and ability to bind a ligand.
Relationship to 9G8
SRp20 and splicing factor 9G8 are both sequence-specific RNA-binding proteins (Figure 4) and are the smallest members of the serine- and arginine-rich (SR) protein family. Their RNA recognition motifs (RRMs) have a similar βαββαβ topology. SRp20 and 9G8 are 80% identical. Figure 4 shows the sequence alignment of the RRMs of SRp20 and 9G8 [2]. SRp20 binds pyrimidine-rich regions while 9G8 binds purine-rich regions. This difference in binding comes from the zinc knuckle of 9G8, which recognizes GAC triplets [14]. In 9G8 the RRM is followed by a zinc knuckle and then the SR domain, whereas in SRp20 the RRM is followed directly by the SR domain. When 9G8 lacks its zinc knuckle, it binds pyrimidine-rich sequences like SRp20 [2]. The zinc knuckle of 9G8 contains highly conserved glycine residues at positions 5 and 8 and charged residues at positions 6 and 13 [14]. Due to the poor solubility problem, a structure for the zinc knuckle of 9G8 is not available to show in an image.
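A figure such as "80% identical" is typically derived from a pairwise sequence alignment. The sketch below shows one common way to compute percent identity from two already-aligned sequences: it counts matching residues over the columns where neither sequence has a gap. This is an assumption for illustration only, because the source does not state which convention or which region underlies the 80% value. The function name `percent_identity` is hypothetical, and the two short strings are placeholders, not the real SRp20 or 9G8 sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gapped columns are excluded from the denominator
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Placeholder aligned strings, NOT the real SRp20 or 9G8 sequences.
seq_a = "MKVYVGNLGT"
seq_b = "MRVYVGNL-S"
print(f"{percent_identity(seq_a, seq_b):.1f}% identical")
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, so the same alignment can give somewhat different percentages depending on the definition chosen.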
Disease
Cancer
There have been findings that support a role for SRp20 in cellular proliferation and maturation. SRp20 was found to be overexpressed in breast cancer tissues. When SRp20 was reduced in cancer cells via siRNA that targets SRp20 mRNA, cell proliferation decreased and cellular apoptosis increased. For example, it was speculated that SRp20 might be involved in alternative splicing of FoxM1, a transcription factor involved in cellular proliferation, through either the inclusion or the exclusion of exon 9 in the FoxM1 transcript. If exon 9 was excluded from the FoxM1 mRNA via SRp20, there was an increase in FoxM1 expression and cellular proliferation and a reduction in cell apoptosis [3]. Apoptosis is necessary to maintain homeostasis, and an imbalance in its regulation can lead to uncontrolled cell proliferation and tumor development. Through its alternative splicing activity, SRp20 affects many other genes involved in cancer, such as the CD44, PK-M, TAU, and TP53 genes, and it is involved in the Wnt signaling pathway [1]. Although SRp20 is understood to play a crucial role in cancer cells, the mechanism by which it affects these genes, and how its structure contributes to the development of oncogenic genes, is still unclear [3][15].
Relevance and Conclusions
Understanding the mechanisms in which SRp20 is involved can help in the treatment and management of cancer patients. SR proteins such as SRp20 may in the future be used for targeted therapy. Because no structure is known for the C-terminal domain, owing to an inability to obtain a structural image of it, most of the focus has been on the RRM domain. Little is understood about how the SR domain might recognize structures or other proteins.
References
