Methods
SPRITE:
Enter the PDB ID for the protein of interest and exclude 2-residue hits. Once the search is complete, browse the "List of Hits" results. View the "Full details" alignments, which also show the alignment between the protein of interest and each matched protein. View hits for each site of the protein using the "Arranged by sites" function. Review alignments with an RMSD below 2.0 Angstroms and determine whether the results are consistent with the established function of the protein. Capture images of the alignments that best match the protein of interest and record their RMSD values.
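The screening rule above (drop 2-residue hits, keep alignments with an RMSD below 2.0 Angstroms) can be sketched in Python once hits have been copied out of the results page by hand. This is only an illustration: the `Hit` fields and the example values are hypothetical placeholders, not SPRITE's output format or real results.

```python
from dataclasses import dataclass

RMSD_CUTOFF = 2.0  # Angstroms; the review threshold stated in the protocol


@dataclass
class Hit:
    """One manually recorded hit. Field names are illustrative, not SPRITE's own."""
    motif_id: str
    n_residues: int
    rmsd: float


def select_hits(hits, cutoff=RMSD_CUTOFF, min_residues=3):
    """Keep hits with more than two residues and RMSD below the cutoff, best first."""
    kept = [h for h in hits if h.n_residues >= min_residues and h.rmsd < cutoff]
    return sorted(kept, key=lambda h: h.rmsd)


# Placeholder values for illustration only; these are not real SPRITE results.
example_hits = [
    Hit("motif_A", 3, 1.2),
    Hit("motif_B", 2, 0.4),  # dropped: 2-residue hit
    Hit("motif_C", 4, 2.6),  # dropped: RMSD above the cutoff
    Hit("motif_D", 3, 0.8),
]

for hit in select_hits(example_hits):
    print(hit.motif_id, hit.rmsd)
```

Sorting by RMSD puts the candidates worth capturing as images at the top of the list.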
Chimera:
Enter the PDB ID of the protein of interest and click Fetch to display its structure. Using the results from SPRITE, load the protein of known function into Chimera using its PDB ID. Hide everything but the subunit of interest, and then align the active site motif (type "match..." followed by the atoms of interest). An RMSD value will be reported; use it to judge the quality of the alignment (anything below 2.0 Angstroms is considered high quality). To better visualize the alignment, replace "match" with "sel" in the command. Orient the structures so that they align in a similar way to the SPRITE view. Name and save the file for future use.
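For reference, the RMSD that Chimera reports is the root-mean-square distance between paired atoms. The sketch below only shows the arithmetic, assuming the two atom lists are already paired one-to-one and already superimposed; Chimera performs the fitting itself, and the coordinates here are made-up placeholders.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z) points.

    Assumes the atoms are already paired and superimposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))


# Placeholder coordinates: every atom in set_b is shifted by 1.0 Angstrom along x.
set_a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 1.5, 0.0)]
set_b = [(1.0, 0.0, 0.0), (2.5, 0.0, 0.0), (1.0, 1.5, 0.0)]

value = rmsd(set_a, set_b)
print(f"RMSD = {value:.3f} Angstroms; high quality: {value < 2.0}")
```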
Dali:
Open the website and click on PDB search lab, then enter the four-character PDB ID of the assigned protein (3B7F). Next, submit the structure with the chain identifier (never use a chain containing DNA; pick the one representing the sequence the group is interested in). Enter a meaningful description in the job name field, enter an email address, submit the job, and wait for a link to be emailed. Once the link arrives, download the results.
BLAST:
Go to the RCSB webpage, search for the protein by PDB ID, and explore the page for the protein. Select "FASTA sequence", then view and copy the protein sequence. Go to the NCBI BLAST search page and paste the protein sequence into the sequence field. Using the results that are returned, try to identify proteins similar to the query protein.
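FASTA text consists of a header line starting with ">" followed by one or more sequence lines. The following minimal sketch, assuming the text has been copied from the RCSB page, joins the sequence lines into one string ready to paste into BLAST. The header and residues in the example are placeholders, not the 3B7F sequence.

```python
def read_fasta(text):
    """Return a {header: sequence} dictionary from FASTA-formatted text."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


# Placeholder record for illustration only; not a real PDB entry.
example_text = """>example_chain_A (placeholder header)
MKTAYIAKQR
QISFVKSHFS
"""

for name, sequence in read_fasta(example_text).items():
    print(name, len(sequence), sequence)
```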
InterPro:
Perform an InterPro search for the sequence of 3B7F. Confirm the protein superfamily identification and the domains found. Identify related proteins in the domain organization. Think about what these related proteins have in common with the protein of interest, and what the functions of the domains are. Repeat the InterPro search using the "View a Structure" link on the main page. Explore the links that InterPro provides and compare these findings with the sequence search.
Molecular Docking with SwissDock:
Select
Structural Alignment Through SPRITE, Chimera, Dali, and BLAST
Based on the SPRITE and Chimera findings, 1XNY_C00 had the lowest RMSD value at 1.875 Angstroms. Therefore, it was initially hypothesized that 3B7F is a carboxylase.
Dali returned mainly xyloglucanase matches and no carboxylases, and 1XNY did not appear among the Dali matches. The hypothesis that 3B7F is a carboxylase was therefore rejected, and it is now hypothesized that 3B7F is a xyloglucanase.
Xyloglucanases break down xyloglucan, a hemicellulose in plant cell walls. The substrates of the (endo-)beta-1,4-xyloglucanases are xyloglucan and water, and a common cofactor is calcium.
The BLAST search returned glycosyl hydrolases with sequences similar to 3B7F.
Using InterPro to Predict Protein Function
Research shows that 4Q7Q is a member of the SGNH Hydrolase protein superfamily. BLAST and InterPro both suggested 4Q7Q’s inclusion in this family, and the conserved residues identified by SPRITE analysis (Serine, Glycine, Asparagine, and Histidine) line up with those observed throughout this family.D,E Notably, this superfamily is also referred to as the GDSL Hydrolase superfamily.D,E
4Q7Q’s inclusion in this family also supports its SPRITE-derived hypothetical function. Rhamnogalacturonan Acetylesterase, the enzyme with one of the best SPRITE-based alignments to 4Q7Q, is a member of this family.F Proteins in this family are also known for containing a “unique hydrogen bond network that [stabilizes]” the active site.F
Regarding what protein family 4Q7Q belongs to, DALI results suggest it is a part of a sub-family of the greater GDSL/SGNH superfamily. A PDB90% DALI search labels 4Q7Q as a part of the “Lipolytic Protein G-D-S-L Family,” which refers to enzymes that hydrolyze lipid substrates.I
Molecular Docking with SwissDock
The primary sequence of 4Q7Q shares several conserved sequences with esterase-like proteins. A GDSI sequence, similar to the GDSL sequence that gives its family and superfamily their name, appears in both 4Q7Q and enzymes like Isoamyl Acetate-Hydrolyzing Esterase. Other noteworthy sequences conserved between esterases and 4Q7Q include GxND and DGxH.
These enzymes also share similar secondary structures. Segments of alpha-helices and beta-sheet strands are almost entirely conserved across the esterases analyzed. A few conserved coils appear, but less often than the other two secondary structures.
Similar conserved sequences could be found between 4Q7Q and lipases. The GDSI, GxND, and DGxH sequences can be seen from lipases like 7BXD.? The same secondary structure segments can also be located in the lipases analyzed.
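As a rough illustration of how the GDSI, GxND, and DGxH motifs could be located in a sequence, the sketch below treats a lowercase "x" as "any residue" and reports 1-based positions. The example sequence is an invented placeholder, not the 4Q7Q or 7BXD sequence, and this is not necessarily how the alignments in this report were produced.

```python
import re

MOTIFS = ["GDSI", "GxND", "DGxH"]  # lowercase "x" stands for any residue


def motif_to_regex(motif):
    """Translate a motif such as 'GxND' into a regular expression."""
    return "".join("[A-Z]" if ch == "x" else re.escape(ch) for ch in motif)


def find_motifs(sequence, motifs=MOTIFS):
    """Return {motif: [(1-based start position, matched residues), ...]}."""
    hits = {}
    for motif in motifs:
        pattern = re.compile(motif_to_regex(motif))
        hits[motif] = [(m.start() + 1, m.group()) for m in pattern.finditer(sequence)]
    return hits


# Invented placeholder sequence for illustration only.
placeholder_sequence = "MAGDSITAAGANDLLDGFHPK"

for motif, found in find_motifs(placeholder_sequence).items():
    print(motif, found)
```

Running the same function on each esterase or lipase sequence would show whether the motifs fall at comparable positions.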
Protein Purification
Right-hand SPRITE analysis revealed that 4Q7Q has residues like those seen in enzymes that act on acetyl-like substrates. Specifically, residues Ser 30, Gly 69, Asn 97, Asp 251, and His 254 on the A and B chains of 4Q7Q line up with similarly positioned residues on esterases like Platelet-Activating Factor Acetylhydrolase (PAFA), which exhibited an RMSD of 0.25 Angstroms when compared to 4Q7Q.
Other notable proteins with similar motifs are Thioesterase I and Rhamnogalacturonan Acetylesterase, with RMSD values of 0.46 and 0.61 Angstroms, respectively. These alignments involve the same active site as the PAFA alignment, suggesting that the acetyl-like substrates of 4Q7Q are similar to esters.
PFAM graphics from DALI revealed significant structural equivalence between 4Q7Q, a lipase-like protein, Rhamnogalacturonan Acetylesterase, and Sialate O-acetylesterase.
Protein Concentration
SwissDock analysis showed a preference for larger molecules, specifically fatty acids. Lactide, Ethyl Butyrate, and Triethylene Glycol exhibited noticeably weak binding affinities to the theorized active site of 4Q7Q. These ligands may be ill-suited to act as substrates for 4Q7Q as they are remarkably polar, and lipids—one of the potential categories of substrates for 4Q7Q—are mostly non-polar.
Despite this, these ligands show noticeable hydrophobic interactions with the active site. This implies that 4Q7Q uses hydrophobic regions to guide substrates into the correct orientation for catalysis, and it further supports the possibility that 4Q7Q primarily acts on hydrophobic, lipid-based substrates. It may also explain why Methyl Acetate exhibited a relatively weak affinity for 4Q7Q: its smaller structure limited hydrophobic interactions.
Protein Analysis
Highlight the data that helped you come to your conclusion here, including any relevant figures. Make sure to include potential substrates and binding sites.