Part:BBa_K3187028
Sortase A7M (Ca2+-independent variant)
Profile
Name | Sortase A7M |
Base pairs | 450 |
Molecular weight | 17.85 kDa |
Origin | Staphylococcus aureus, synthetic |
Properties | Ca2+-independent, transpeptidase, linking sorting motif LPXTG to poly-glycine Tag |
Structure
Usage and Biology
Transpeptidase: Sortase
Sortases belong to the class of transpeptidases and are mostly found in gram-positive bacteria.
The high rate of resistance to several antibiotics targeting gram-positive bacteria is also based on the
property of this enzyme class. Sortases can non-specifically attach virulence and
adhesion‐associated proteins to the peptidoglycans of the cell-surface.
In general, sortases are divided into six groups (A-F) that have slightly different properties and
perform three tasks in cells. Groups A and B attach proteins to the cell surface, while groups C and D help
build pilin-like structures. Groups E and F have not yet been properly investigated, which is why their exact
function is not known.
For our project we are especially interested in the sortases of the group A since they
covalently attach various proteins or peptides on the cell membrane as long as their targeting
motif is at the C-terminus of the corresponding protein. In comparison to other transpeptidases,
Sortase A has the advantage that it is rather stable regarding variations in pH.
Sortase A catalyzes the formation and cleavage of a peptide bond between the C-terminal
LPXTG amino acid motif and an N-terminal poly-glycine motif. The enzyme originates from
Staphylococcus aureus and is able to connect any two proteins as long as they possess those
matching target sequences. In the pentapeptide motif LPXTG, X can be any amino acid except cysteine.
Sortase A is rather promiscuous with regard to the amino acid sequence directly upstream of
this motif, a fact that makes it optimal for labeling applications. Even better, amino acids C-terminal
of the poly-glycine motif are not constrained to a certain sequence.
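The recognition rules described above can be written down as a small sequence check. The following is a minimal sketch in Python and not part of the original work: the function names are hypothetical, the minimum of three N-terminal glycines is an assumption, and trailing glycines after the motif are allowed so that tags such as LPETGG also match.

```python
import re

# Sorting motif at the C-terminus: L-P-X-T-G, where X is any residue except
# cysteine (as stated in the text). Extra trailing glycines are tolerated,
# which is an assumption made so that LPETGG tags are accepted as well.
SORTING_MOTIF = re.compile(r"LP[^C]TG+$")

# Nucleophile: a run of glycines at the very N-terminus. Requiring at least
# three of them is an assumption for this sketch.
POLY_GLYCINE = re.compile(r"^G{3,}")


def has_sorting_motif(sequence: str) -> bool:
    """True if the sequence ends in an LPXTG sorting motif."""
    return SORTING_MOTIF.search(sequence.upper()) is not None


def has_poly_glycine(sequence: str) -> bool:
    """True if the sequence starts with a poly-glycine stretch."""
    return POLY_GLYCINE.match(sequence.upper()) is not None


def can_be_linked(donor: str, acceptor: str) -> bool:
    """True if donor carries the C-terminal motif and acceptor the N-terminal tag."""
    return has_sorting_motif(donor) and has_poly_glycine(acceptor)
```

With these rules, `can_be_linked("MKALPETGG", "GGGGSK")` is true, whereas an acceptor that still starts with another residue, such as `"MGGGGSK"`, fails the poly-glycine check.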
Reaction
To better understand how the enzymatic reaction works, it is necessary to look at the crystal structure of Sortase A. The enzyme consists of an eight-stranded β-barrel fold. The active site is hydrophobic and contains the catalytic cysteine residue Cys184 as well as a key histidine residue, His120, that can form a thiolate-imidazolium ion pair with the neighboring cysteine. An additional structural property that other sortases also show is the calcium binding site formed by the β3/β4 loop. The binding of a calcium ion slows the motion of the active site by coordinating to a residue in the β6/β7 loop. This helps to bind the substrate and increases the enzymatic activity nearly eightfold. When a substrate enters the active site, the cysteine attacks the amide bond between the threonine and the glycine in the LPXTG motif. The protonated imidazolium then serves as an acid for the departing glycine, which leaves with the free NH2 group of the former amide bond, while the rest of the motif remains bound to the cysteine residue as a thioester. A second glycine nucleophile, in its deprotonated form, is then necessary to attack the thioester and re-establish an amide bond at the LPET motif. The reaction reaches a dead end if the nucleophile is water. Because the mechanism depends on the protonation states of the catalytic residues, the reaction is quite pH-dependent. Although Sortase A in general is relatively stable between pH 3 and 11, the reaction works best around pH 8.
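At the sequence level, the mechanism above amounts to cutting the donor between threonine and glycine and joining the LPXT end to the incoming glycine nucleophile. The sketch below illustrates only this bookkeeping. The function names and example sequences are hypothetical, and kinetics, reversibility and pH are ignored.

```python
import re
from typing import Tuple

# LPXT followed by G; X is any residue except cysteine (as stated in the text).
_MOTIF = re.compile(r"LP[^C]T(?=G)")


def _cut_site(donor: str) -> int:
    """Index right after the threonine of the last LPXTG motif in the donor."""
    hits = list(_MOTIF.finditer(donor))
    if not hits:
        raise ValueError("donor has no LPXTG sorting motif")
    return hits[-1].end()


def transpeptidation(donor: str, nucleophile: str) -> Tuple[str, str]:
    """Return (ligation product, released fragment) for a glycine nucleophile."""
    if not nucleophile.startswith("G"):
        raise ValueError("nucleophile must start with glycine")
    cut = _cut_site(donor)
    return donor[:cut] + nucleophile, donor[cut:]


def hydrolysis(donor: str) -> Tuple[str, str]:
    """Water as nucleophile: the dead-end product keeps only the LPXT end."""
    cut = _cut_site(donor)
    return donor[:cut], donor[cut:]
```

For a made-up donor `"AAALPETGG"` and nucleophile `"GGGGKK"`, `transpeptidation` joins the LPET end to the glycines and releases the C-terminal `"GG"` fragment.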
Sortase variants
Because wild-type Sortase A shows rather slow kinetics, a pentamutant has been developed (Sortase A5M). This version of the enzyme carries the mutations P94R/D160N/D165A/K190E/K196T, which lead to a 140-fold increase in activity. Reaction rates are thereby improved even at low temperature; however, Sortase A5M is still Ca2+-dependent. This dependence interferes with potential in vivo usage, as the concentration of calcium in living cells can vary considerably. Hence, a sortase mutant that acts across a wide range of calcium concentrations, or even works completely Ca2+-independently, would be required for in vivo applications of sortase. To obtain a high-yield enzyme that is also calcium-independent, mutations conferring Ca2+ independence were combined with those of Sortase A5M, resulting in Sortase A7 variants such as Sortase A7M. The newly achieved calcium independence of these variants enables sortase applications not only in vitro but in vivo as well.
Sortase A7M
For our project we chose to work with this optimized Sortase A7M. Its size is about 17.85 kDa and it has been shown to be stable for several weeks in the fridge at 4 °C. It also possesses the same pH stability as other sortases but comes with the advantage of being calcium-independent. "Sortagging" applications have included the cyclization of proteins and peptides, the modification and labeling of antibodies, and the synthesis of protein conjugates with drugs, peptides, peptide nucleic acids and sugars. Moreover, it offers many advantages for linking two proteins in vivo, since its tags are relatively small, which avoids putting too much metabolic burden on the cells when expressing the proteins of interest. This also avoids disturbing the folding of the proteins of interest and their later biological functions, since Sortase A7M is able to work under physiological conditions. Other methods, such as intein-based labeling of surfaces, require large fusion proteins with the intein domain, which puts stress on the living cells and might cause folding and solubility issues. Another application for sortase-mediated systems is the anchoring of proteins on the cell wall of gram-positive bacteria, which can be used for display of heterologous proteins. It is also possible to attach non-biological molecules to the respective tag. The accessibility and flexibility of the sorting motif determine the ability of a sortase enzyme to recognize it and to catalyze the transacylation.
Methods
Cloning
The different sortase variants were cloned by restriction and ligation via NdeI and SalI and by Gibson assembly. Sortase A5M was cloned into the pET24(+) vector via restriction and ligation, with NdeI and SalI as restriction enzymes. The vector possesses a kanamycin resistance, and srta7m is controlled by a T7 promoter, which can be induced with IPTG. Sortase A7M is controlled by the same T7 promoter. Sortase A, introduced by iGEM Stockholm 2016, was cloned via Gibson assembly into pSB1C3. This backbone has a chloramphenicol resistance, and the gene is also controlled by a T7 promoter. Cloning of all products was verified by sequencing.
Expression and purification
After successfully transforming our sortase genes into BL21 cells, we inoculated 100 mL overnight cultures containing the respective antibiotic. The next day, 1 L cultures were inoculated with the overnight culture to reach OD600 = 0.1. The cultures were then incubated under constant shaking at 37 °C until they reached OD600 = 0.6, at which point they were induced with 0.5 mL of 1 M isopropyl-β-D-thiogalactopyranoside (IPTG). Gene expression was performed at 30 °C under constant shaking overnight. After expression of Sortase A7M, Sortase A5M, and Sortase A from Stockholm (BBa_K2144008) in BL21 cultures, the cells were lysed with an EmulsiFlex (Avestin) and the proteins were purified by affinity chromatography via Fast Protein Liquid Chromatography (FPLC) on an ÄKTA pure (GE Healthcare, Illinois, USA). A His-tag was used for purification of Sortase A7M and Sortase A (Stockholm), and a Strep-tag II was used for purification of Sortase A5M.
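The volumes in this protocol follow from simple dilution arithmetic (C1·V1 = C2·V2). As a minimal sketch, the helper functions below (hypothetical names) compute the final inducer concentration and the inoculum volume; adding 0.5 mL of 1 M IPTG to a 1 L culture gives about 0.5 mM. The overnight OD600 of 4.0 used in the example is a made-up value, and OD600 is assumed to scale linearly with dilution.

```python
def final_concentration_mM(stock_M: float, stock_mL: float, culture_mL: float) -> float:
    """Concentration in mM after adding a stock solution to a culture."""
    total_mL = culture_mL + stock_mL
    return stock_M * 1000.0 * stock_mL / total_mL


def inoculum_volume_mL(overnight_od: float, target_od: float, final_mL: float) -> float:
    """Volume of overnight culture needed so the diluted culture starts at target_od.

    Assumes OD600 scales linearly with dilution and that final_mL is the
    total volume after inoculation.
    """
    if overnight_od <= target_od:
        raise ValueError("overnight culture must be denser than the target")
    return target_od * final_mL / overnight_od


if __name__ == "__main__":
    # 0.5 mL of 1 M IPTG into a 1 L culture, as in the protocol above.
    print(round(final_concentration_mM(1.0, 0.5, 1000.0), 3))
    # An overnight OD600 of 4.0 is an invented value for illustration.
    print(inoculum_volume_mL(4.0, 0.1, 1000.0))
```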
SDS-PAGE
To verify the successful production of Sortase A7M, Sortase A5M, and Sortase A, SDS-PAGEs were performed. The resulting bands were compared to the molecular weights of the different sortase variants. SDS-PAGEs were also used to verify enzymatic activity in assays prior to measuring sortase properties via Fluorescence Resonance Energy Transfer (FRET).
Fluorescence Resonance Energy Transfer (FRET)
To determine the kinetics of our transpeptidase variants, FRET assays were performed in dark 384-well plates using a Tecan plate reader. FRET relies on the phenomenon that an excited fluorophore (donor) transfers energy to another fluorophore (acceptor), thereby exciting it. This process only works if both fluorescent molecules are in close proximity, and it depends on the FRET-pair. As energy is transferred from donor to acceptor, the donor's emission is reduced and the intensity of the acceptor's emission is increased. The efficiency depends on the distance between the fluorophores, their orientation and their spectral characteristics. The principle of FRET is shown in Fig. 2.
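The distance dependence mentioned above is commonly described by the Förster relation E = 1 / (1 + (r / R0)^6), where r is the donor-acceptor distance and R0 is the Förster radius, the distance at which half of the energy is transferred. The following sketch evaluates this relation; the function name is hypothetical, and any R0 value passed in (such as the 5 nm used in the note below) is a placeholder, not a measured value for the FRET pairs on this page.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float) -> float:
    """Transfer efficiency E = 1 / (1 + (r / R0)^6) for a donor-acceptor pair."""
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and Förster radius > 0")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)
```

Because of the sixth power, the efficiency drops steeply: with a placeholder R0 of 5 nm it is 0.5 at 5 nm but only about 1.5 % at 10 nm, which is why FRET reports on ligation only when donor and acceptor end up on the same molecule.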
Mass Spectrometry
To estimate the product yield of reactions catalyzed by Sortase A7M, we performed mass spectrometry. After desorption and ionization, the tested molecules can be assigned to products or educts. We used the electrospray ionization (ESI) technique for the mass spectrometry. This technique has a low resolution but is a very soft ionization method, which makes it well suited for biological molecules.
Results
Characterization of Sortase A7M (and comparison to BBa_K3187016)
How do we measure if our purified sortases are active?
After purification of the sortases, we first performed SDS-PAGEs to verify that they are pure and monomeric. You can see in Fig. 3 that the purifications were successful. Next, we tested if the purified sortases connect two proteins that carry the important Sortase-recognition tags, N-terminal polyG and C-terminal LPETGG. Therefore, we added the sortases to a mix of GGGG-mCherry and mCherry-LPETGG. The reactions were performed in different buffers, at different enzyme-to-substrate ratios and for different time spans. We performed an SDS-PAGE, and prior to Coomassie staining, we recorded fluorescent images of the gel. Thereby, we could identify mCherry bands in the gel.
How do we measure sortase reaction kinetics?
In the above described assays, we noticed the impact of enzyme-substrate ratio and reaction duration on the overall product yield. We thought about how to further measure the kinetics of the sortase reaction. In the literature, sortase reaction kinetics are often measured by FRET-assays. Therefore, we designed a suitable FRET-assay. In the end, we came up with a new FRET pair not described in the literature to date: 5-TAMRA-LPETG and GGGG-sfGFP.
Development of a new FRET pair
For characterization of the reaction kinetics of Sortase A7M, Sortase A5M and Sortase A, we decided to develop a suitable FRET pair. In order to find an optimal FRET pair, we first recorded emission and absorption spectra of 5-carboxytetramethylrhodamine-LPETG (TAMRA) and GGGG-mCherry to verify their suitability for FRET, checking for a possible overlap between the donor's emission and the acceptor's extinction.
TAMRA is a chemical fluorophore that has an absorbance maximum at 542 nm and an emission maximum at 570 nm. The terminal carboxy group of the dye was linked via a lysine linker to the LPETG sequence (see Fig. 5). mCherry has an N-terminal poly-glycine sequence and can therefore be linked to the LPETG motif of TAMRA via the Sortase A. For a sufficient FRET-effect, it is also necessary that the distance between donor and acceptor is lower than the Förster radius. The Förster radius describes the distance between two fluorophores at which 50 % of the energy is transferred.
First, we wanted to identify which concentrations are needed for our experiment, then set up the reaction and measured fluorescence intensities. Over time, a decline in the emission of TAMRA can be observed as Sortase A7M/A5M is converting more educts to products.
The emission and extinction spectra of TAMRA and mCherry show that the emission of TAMRA overlaps with the extinction of mCherry. Based on this result, a FRET-assay for the kinetics of Sortase A7M was performed to confirm whether the FRET-pair works. As TAMRA is excited with light of a lower wavelength than mCherry, the former serves as FRET donor and the latter as acceptor. We chose an excitation wavelength of 485 nm to prevent unnecessary “leak” excitation of mCherry. Nevertheless, direct excitation of mCherry could not be excluded and may reduce the visibility of the FRET effect.
The analysis of the data shown in Fig. 7 confirmed the aforementioned suspicion that mCherry is also excited at 485 nm, which makes differentiation of the fluorescence more difficult. Furthermore, Fig. 8 shows that the difference in the decline of TAMRA is not significant. Accordingly, a decline in the emission maximum of TAMRA over time is also visible in the negative control. One reason might be bleaching of TAMRA through the excitation by the laser. Nevertheless, conversion by the Sortase A7M can be observed by comparing the results with the negative control.
To confirm the functionality of the Sortase A7M, another, more suitable FRET-pair was developed. The measured absorbance and emission spectra indicated that TAMRA and superfolder green fluorescent protein (sfGFP) are a possible FRET-pair. The sfGFP has an N-terminal polyglycine sequence and can therefore be linked to TAMRA with the sorting motif, in the same way as mCherry was connected. In addition, the small overlap between the extinction spectra of sfGFP and TAMRA could solve the “simultaneous excitation” problem we observed for the mCherry-TAMRA FRET-pair. Because sfGFP has a lower excitation maximum than TAMRA, sfGFP was chosen as donor and TAMRA as acceptor. sfGFP was excited at 465 nm to minimize unnecessary leak excitation of TAMRA.
The transfer of energy from sfGFP to TAMRA can be seen in the decrease in emission of sfGFP and the increase in emission of TAMRA. Compared to TAMRA, sfGFP bleaches significantly less and is consequently more suitable as a donor for FRET. Furthermore, the aforementioned problem of simultaneous donor and acceptor excitation seems to be solved. It seems that we have found a FRET-pair with superior properties.
Based on the data collected for both FRET-pairs, we decided to use the TAMRA-LPETG and GGGG-sfGFP FRET-pair for further characterization of our Sortase A variants. Two reasons justify this decision:
- TAMRA bleaches more strongly than sfGFP when excited with a laser.
- The spectral overlap between TAMRA and mCherry disturbs “clean” energy transfer, so the FRET-effect would be less visible and could not be used for analysis of the sortase-mediated reaction.
For recording sortase reaction parameters we recommend using the FRET-pair sfGFP-TAMRA. As this pair of fluorophores proved to have nearly perfectly aligned spectra, and since the bleaching effect is visibly lower for sfGFP than for TAMRA, we chose to use this FRET-pair in most of our following assays. Nevertheless, we do not rule out the use of TAMRA-mCherry as a FRET-pair, since we used it in several FRET-assays as well.
Why are enzyme-substrate ratio and duration important parameters of the sortase reaction?
In one of our first FRET experiments, we addressed a simple hypothesis: more sortase in the reaction mix improves the initial product formation. For this, we used the TAMRA-LPETG : GGGG-mCherry FRET pair. We measured the FRET change over time in a multiwell plate reader (Fig. 15).
However, in this assay we observed a striking feature of the sortase reaction. In the reaction with more Sortase A7M present, the FRET change started to decrease after a certain maximum was reached! We suspected some kind of dead-end product formation, as the sortase does also catalyze the reverse reaction of product to educts. Therefore, the overall reaction duration is a very important parameter. We gathered more details about the role of the reverse reaction during our comparison of Sortase A7M and Sortase A5M. Just keep reading if you want to know more!
Who wins - Sortase A7M or Sortase A5M?
In our introduction we described that Sortase A7M and Sortase A5M are both fascinating enzymes, although each of them has a unique "selling point". Sortase A5M is faster, whereas Sortase A7M is Ca2+-independent. We confirmed both of these points in extensive FRET-assays. According to the literature, Sortase A5M works best with a Ca2+-concentration of 2 mM. In contrast, Sortase A7M is a calcium-independent mutant of the enzyme. Moreover, Ca2+ even seems to inhibit this enzyme variant slightly.
Firstly, we confirmed that, in contrast to Sortase A5M, Sortase A7M is Ca2+-independent. The results are shown in Fig. 16. Sortase A7M also works in the presence of Ca2+, but these FRET experiments made us suspect that Ca2+ may even inhibit Sortase A7M.
Secondly, we confirmed that Sortase A5M is inactive if Ca2+ is absent, which can be seen in Fig. 17. As expected, Sortase A5M shows increasing enzymatic activity with increasing Ca2+ levels. The reaction runs fastest with 2 mM Ca2+, and the maximal FRET change (in terms of ΔRFU) is reached after 37.5 min. Strikingly, the FRET change decreases afterwards. We observed this phenomenon before and assume it to be due to dead-end product formation caused by the reverse reaction.
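Reading the time of the maximal FRET change off a plate-reader trace can be automated. The sketch below is an illustration only: it assumes that ΔRFU is the signal minus its value at the first time point (the text does not define ΔRFU further), the function names are hypothetical, and the numbers in the usage note are invented.

```python
from typing import List, Sequence, Tuple


def delta_rfu(rfu: Sequence[float]) -> List[float]:
    """Signal change relative to the first time point (assumed definition)."""
    return [value - rfu[0] for value in rfu]


def peak_of_trace(times_min: Sequence[float], rfu: Sequence[float]) -> Tuple[float, float, bool]:
    """Return (time of maximal delta RFU, maximal delta RFU, declines afterwards?)."""
    if len(times_min) != len(rfu) or not rfu:
        raise ValueError("times and signal must be non-empty and equally long")
    change = delta_rfu(rfu)
    # Index of the first maximum of the trace.
    peak = max(range(len(change)), key=change.__getitem__)
    # A later drop below the maximum hints at the reverse reaction taking over.
    declines = peak < len(change) - 1 and change[-1] < change[peak]
    return times_min[peak], change[peak], declines
```

For an invented trace `peak_of_trace([0, 12.5, 25, 37.5, 50], [100, 160, 210, 240, 220])`, the function reports the maximum at 37.5 min and flags the subsequent decline.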
According to the results of this assay, Sortase A7M is definitely Ca2+-independent, since it shows linking activity without calcium in the vicinity. The enzyme mutant also works in presence of Ca2+ (Fig. 17), but these FRET experiments made us suspect that Ca2+ may even inhibit Sortase A7M, since it shows less activity with calcium around than without calcium.
To better address this question, an ELISA was performed. For this, a piece of paper functionalized with GGGβA was connected to a protein domain, which binds antibodies to the LPETG-tag. The results are shown in Fig. 18.
As shown in Fig. 18, the highest absorption was measured in well 2. Thus, Sortase A7M works more efficiently when no Ca2+ is present. The absorption is also relatively high for the negative control, which can be explained by poor washing before the substrate for horseradish peroxidase (HRP) was added. This assay shows the functionality of Sortase A7M even in the context of surfaces, since we confirmed that Sortase A7M is able to connect tags attached to paper. This suggests that the surface structure is not a limiting factor for the enzyme.
When we compare the reaction speed of Sortase A5M and Sortase A7M, Sortase A5M is the clear winner (see Fig. 19). However, this of course means that the reverse reaction is also faster in the case of Sortase A5M. Consequently, Sortase A7M is the best variant for in vivo modification of our VLPs, as it is Ca2+-independent. On the other hand, Sortase A5M is a suitable enzyme variant for in vitro modification due to its high efficiency.
What about other substrates?
Primary Amines
The literature describes Sortase A7M as somewhat "promiscuous" towards substrates other than GGGG (polyG), as long as the substrate possesses a primary amine. To confirm this, we performed additional assays with other substrates in the lab of Prof. Kolmar. The Sortase A7M used for this assay had been stored in the fridge at 4 °C for two weeks. The substrates were TAMRA with a KLPETG peptide bound to TAMRA via the lysine side chain, and 3-azidopropanamine as the example of a primary amine. The reaction was performed for two hours at 37 °C. It was then analyzed by electrospray ionization mass spectrometry (ESI-MS) (Fig. 20).
Fig. 20 shows the educt-peak in the mass spectrum. TAMRA with the LPETG-tag has a molecular weight of 1054 g/mol. Shown above in green are the singly charged molecule at m/z 1054.27 and the doubly charged molecule at m/z 528.75.
Fig. 21 shows the product-peak in the mass spectrum. The primary amine that was taken as an example has a molecular weight of 100 g/mol. After the reaction, the glycine of the LPETG-tag has been removed, and the product therefore consists only of TAMRA-KLPET-3-azidopropanamine. Adding the two molecular weights and subtracting the weight of glycine (about 75 g/mol) gives a total weight of about 1079 g/mol, which matches the singly charged peak at m/z 1079.37 (Fig. 21) within the small error margin of the ESI-MS we used. The peak in black is again the doubly charged peak, at m/z 541.55. This clearly shows that the sortase reaction took place. Furthermore, we can conclude that Sortase A7M accepts a primary amine other than polyG as a substrate. However, the mass spectrum does not show the ratio of educt and product, which is why we cannot estimate whether the turnover is as high as when using a polyG-tag as substrate. Additionally, this assay confirms our suspicion that Sortase A7M is stable at 4 °C and still functional if stored at said temperature for at least two weeks.
Yield
For the characterization of Sortase A7M, an assay was designed to show the coupling efficiency between TAMRA-LPETG and the tetrapeptide GGG-β-alanine (GGGβA), catalyzed by the sortase. The sortase reaction was performed for 1 h at 30 °C and was stopped by separating the enzyme through centrifugal filtration. Mass spectrometry (ESI-MS) was used for analysis. It enables differentiation between products and educts and allowed us to estimate the product yield. The calculated theoretical molecular masses are 1054 g/mol for TAMRA-LPETG and 1240 g/mol for TAMRA-LPETGGGβA. Therefore, peaks are expected at mass/n, with n ∈ N. By comparing the counts of the corresponding peaks, an estimate of the product yield is possible, as both molecules possess the same number of ionizable groups and the difference in their ionizability is thus negligible.
In Fig. 22, the 621.56 peak can be assigned to TAMRA-LPETGGGβA and the 528.85 peak to TAMRA-LPETG. The count ratio of the two molecules shows an excess of the product.
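The peak assignment above can be reproduced with a short calculation. The sketch below is an illustration, not the evaluation used for Fig. 22: it assumes positive-mode ions that carry one proton per charge (so it adds a proton mass per charge to the "mass/n" rule of the text), the 1.0 matching tolerance is arbitrary, the function names are hypothetical, and `product_fraction` only turns two peak counts into a fraction.

```python
from typing import Dict, Optional, Tuple

PROTON_MASS = 1.00728  # Da


def expected_mz(neutral_mass: float, charge: int) -> float:
    """m/z of an ion carrying `charge` protons (assumed ionization mode)."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def assign_peak(
    observed_mz: float,
    species: Dict[str, float],
    max_charge: int = 3,
    tolerance: float = 1.0,
) -> Optional[Tuple[str, int]]:
    """Return (species name, charge) of the closest expected peak, or None."""
    best = None
    for name, mass in species.items():
        for charge in range(1, max_charge + 1):
            error = abs(expected_mz(mass, charge) - observed_mz)
            if error <= tolerance and (best is None or error < best[0]):
                best = (error, name, charge)
    return None if best is None else (best[1], best[2])


def product_fraction(product_counts: float, educt_counts: float) -> float:
    """Share of product among product plus educt counts."""
    total = product_counts + educt_counts
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return product_counts / total
```

With the two theoretical masses from the text (1054 and 1240 g/mol), both observed peaks, 528.85 and 621.56, are matched to the doubly charged ions of educt and product, respectively.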
Is Sortase A7M able to attach cargo to P22 coat protein?
We performed the linking reaction with CP-LPETGG and GGGG-mCherry as substrates and analyzed the reaction by SDS-PAGE. We saw products at the expected size (28 kDa + 49 kDa = 77 kDa); thus the requirement is fulfilled. However, many additional bands appeared that we did not expect. These bands also appeared when only Sortase A7M and CP were mixed.
Figure 23:
a) The Sortase A7M band is at the expected height (17.85 kDa). The two negative controls, containing only GGGG-mCherry (28 kDa) and CP-LPETGG (49 kDa), run at their respective expected heights. b) Shown are sfGFP-SP and CP-LPETGG, each incubated with both Sortase A7M and Sortase A5M. Both gels display multimers when coat protein and a sortase variant are in a sample together.
To investigate this issue, we had a look at the literature and found a matching description in the publication of Patterson et al. They performed a similar experiment with P22 capsid proteins and observed the same multimers in their SDS-PAGEs. Comparing both SDS-PAGEs, we came to the following assumption:
Because of the promiscuity of Sortase A7M to accept primary amines as substrates, as we discussed previously, the formation of CP multimers occurs, unspecifically catalyzed by Sortase A7M.
Parallel to these experiments, we successfully modified the exterior of pre-assembled VLPs in vitro (VLP assembly). These modified VLPs were homogenous and overall correctly assembled. Therefore, we conclude that the described multimer problem only occurs when Sortase A7M encounters free CP.
Does methionine affect Sortase linking?
Sortase A7M preferably attaches N-terminal poly-G to C-terminal LPETGG. However, the first amino acid of a protein is methionine (to be specific, formylmethionine in bacteria). For our constructs that possess N-terminal polyG-tags, we have to ask ourselves the question: If the initial methionines are not cleaved off after the proteins have been produced, will this interfere with the Sortase reaction?
To investigate this, we cloned and purified two other proteins: TVMVsite-GGGG-mCherry and TEVsite-GGGG-sfGFP. We then treated these proteins with the respective proteases, resulting in *GGGG-mCherry and *GGGG-sfGFP. *GGGG-mCherry was then compared to the (M)GGGG-mCherry that we used in all previous assays. Assays were also conducted on the processed *GGGG-sfGFP substrate (Fig. 24). Fig. 24 confirmed our assumption that the unprocessed substrate cannot be linked to the sorting motif via Sortase A7M, whereas *GGGG-sfGFP (after protease digest) demonstrates successful linkage via sortase-mediated ligation.
Due to these findings we modified our VLPs with *GGGG-sfGFP.
We performed FRET-assays with TAMRA-LPETG and either of the following reaction partners:
- (M)GGGG-mCherry, a protein sample that might still carry an N-terminal methionine
- *GGGG-mCherry that does not carry any additional N-terminal residue
Before the FRET-assay was started, we adjusted the mCherry-concentrations of both fluorescent protein solutions to the same level. To do so, we diluted them until both showed the same fluorescence at 610 nm.
Figure 25: FRET of the sortase reaction connecting TAMRA-LPETG and GGGG-sfGFP mediated by Sortase A7M. The concentration of Sortase A7M was kept at the same level, while the concentration of sfGFP was either 7.8 mM or 1 mM. The graphs show that the reverse reaction happens earlier if the GGGG-substrate concentration is lower.
Strikingly, only the (M)GGGG-mCherry construct showed a clear decrease in delta RFU after the maximum delta RFU was reached (at about 160 min).
We assume the following: Although we adjusted the overall mCherry concentration by fluorescence, we cannot determine the absolute amount of MGGGG-mCherry in the (M)GGGG-mCherry sample. However, if this amount was relatively high, the effective substrate concentration that could enter the sortase reaction would be low. That is because MGGGG is a worse sortase substrate than GGGG – if any at all. If we furthermore consider that a low substrate concentration correlates with a faster reverse reaction, we can explain the observed decrease in delta RFU for the (M)GGGG-mCherry sample that contrasts the delta RFU trend of the *GGGG-mCherry sample.
On this basis we can assume that a certain, yet unknown portion of the (M)GGGG-mCherry sample still carries an N-terminal methionine.
These FRET-assays let us assume that methionine disturbs or at least interferes with the sortase reaction mechanism. Indeed, our modeling suggests that methionine affects the interaction of polyG and the flexible loop near the active site of Sortase A7M.
This strengthens our hypothesis: If there is any amino acid in front of the poly-glycine sequence, substrate binding to Sortase A7M is negatively influenced.
Modeling
Introduction
In synthetic biology, theoretical models are often used to gain insights into experiments and to predict and improve them. In our project we are modifying virus-like particles (VLPs) by attaching proteins to the surface of the P22 capsid through a linker. The linking is catalyzed by the enzyme Sortase A7M, which is a calcium-independent mutant of the wild-type Sortase A from Staphylococcus aureus. We performed modeling to predict the unknown structure of Sortase A7M and to improve the linker between proteins, thereby optimizing the modification efficiency.
Two different modeling approaches were used to determine the structure of Sortase A7M. We compared machine learning approaches to traditional comparative, Monte Carlo based modeling methods. The results were evaluated using an energy-scoring function and molecular dynamics (MD) simulations. The most promising Sortase A7M structures were used to perform a docking simulation to screen for optimal linkers.
Structure determination
In silico modeling and simulation of proteins requires a 3D structure, which can be obtained from the RCSB Protein Data Bank. However, if no 3D structure is deposited, as is the case with Sortase A7M, the structure has to be determined by other means. The structure prediction of Sortase A7M was done using two different approaches.
RosettaCM
Results
The run yielded 15,000 structures which have been compared using the Rosetta scoring functions (talaris2013). From the 15,000 structures generated, we inspected the ten best scoring structures.
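Picking the best-scoring models from such a run is a matter of sorting the score table. The sketch below assumes the usual whitespace-separated Rosetta score file layout, in which data lines start with `SCORE:`, the first such line names the columns, and the columns include `total_score` and `description`. The function name is hypothetical, and the score values in the sample are invented; only the model name S_14771 comes from the text.

```python
from typing import List, Tuple

SAMPLE_SCORES = """\
SEQUENCE:
SCORE: total_score rms description
SCORE: -305.2 1.8 S_00042
SCORE: -310.5 1.2 S_14771
SCORE: -298.7 2.4 S_00007
"""


def top_models(score_text: str, n: int = 10) -> List[Tuple[float, str]]:
    """Return the n lowest-energy (total_score, description) pairs."""
    header = None
    rows = []
    for line in score_text.splitlines():
        if not line.startswith("SCORE:"):
            continue
        fields = line.split()[1:]
        if header is None:
            header = fields  # first SCORE: line holds the column names
            continue
        if len(fields) != len(header):
            continue  # skip truncated lines
        record = dict(zip(header, fields))
        try:
            rows.append((float(record["total_score"]), record["description"]))
        except (KeyError, ValueError):
            continue  # skip repeated headers or malformed rows
    rows.sort()  # lower Rosetta energy means a better score
    return rows[:n]
```

Calling `top_models(SAMPLE_SCORES, n=2)` on the invented sample ranks S_14771 first, followed by S_00042.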
As can be seen in Fig. 27, the most prominent differences can be found in the regions close to the N- and C-terminus. As fluctuations in those regions are not untypical, we decided to use the best scoring structure, candidate S_14771 (Fig. 28), as the input for the simulations to follow.
In order to evaluate the secondary structure of the Sortase A7M candidate S_14771, a Ramachandran plot was created and compared to those of the five sortases used as input for the comparative modeling. Comparisons were also drawn with the sortase predicted by Deep Learning as well as with a database of randomly sampled proteins. Ramachandran plots of dihedral angles (Fig. 29) can be a first indicator of whether the computed structures are valid.
Figure 29: The Ramachandran plot of randomly sampled proteins and the input structures of the comparative modeling show similar secondary structures. Secondary structure analysis of both sortase candidates reveals an absence of secondary structures for the ML candidate. This is not the case with candidate S_14771, as the Ramachandran plot shows all relevant structures.
The Ramachandran plot (Fig. 29) showing α-helices and β-sheets is a strong indicator of a successful structure determination, as those secondary structures are crucial for the functionality of sortases.
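A Ramachandran plot is built from the backbone dihedral angles φ (C of the previous residue, N, Cα, C) and ψ (N, Cα, C, N of the next residue). The dependency-free sketch below shows how one such angle is computed from four atom positions and how a (φ, ψ) pair can be assigned to a coarse region. The region boundaries are rough assumptions chosen for illustration, and the function names are hypothetical; this is not the analysis code behind Fig. 29.

```python
import math
from typing import Sequence, Tuple

Vec = Tuple[float, float, float]


def _sub(a: Sequence[float], b: Sequence[float]) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Sequence[float], b: Sequence[float]) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a: Sequence[float], b: Sequence[float]) -> Vec:
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3) -> float:
    """Dihedral angle in degrees defined by four points around the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    length = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / length, b1[1] / length, b1[2] / length)
    # Components of b0 and b2 perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


def ramachandran_region(phi: float, psi: float) -> str:
    """Coarse region label; the box limits are illustrative assumptions."""
    if -160.0 <= phi <= -20.0 and -120.0 <= psi <= 50.0:
        return "alpha"
    if -180.0 <= phi <= -45.0 and (psi >= 90.0 or psi <= -150.0):
        return "beta"
    return "other"
```

A structure without secondary structure elements, as reported for the ML candidate, would show few residues in the "alpha" and "beta" boxes.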
Conclusion
We used machine learning methods as well as Monte Carlo simulations to determine the structure of the mutated transpeptidase Sortase A7M. The machine learning approach using AlQuraishi's Deep Neural Network yielded a structure which seemed not to have any secondary structures. To exclude the possibility of an error in the PyMOL visualization software by Schrödinger, a Ramachandran plot (figure xyz) was created. The plot shows that no typical secondary structures are present, which is a strong indicator of a failed approach to determine a structure.
The approach using Rosetta Comparative Modeling yielded 15,000 structures scored with the talaris2013 scoring function. The ten best structures were aligned and exhibited almost identical secondary structures (figure xzy). The greatest structural differences are present in the N- and C-terminal regions. Since terminal regions tend to fluctuate more strongly than non-terminal segments of the protein, we deemed those fluctuations non-relevant for the protein's functionality.
Being the best scoring candidate, structure S_14771 was analyzed structurally using a Ramachandran plot (figure xyz). The plot shows all the relevant and typical structures that sortases exhibit and serves as an indicator of a successful structure prediction.
In the steps to follow, a molecular dynamics (MD) simulation will be performed on both structures. Even though structure CASP12 does not seem to be a valid structure, refolding processes during an MD simulation might lead to a relaxation of the protein and allow for a promising prediction of the Sortase A7M structure.
Molecular dynamics
Results
The first possible indicators of a stable protein structure are a converging root-mean-square deviation (RMSD), small root-mean-square fluctuation (RMSF) values and a converging radius of gyration. Using Python and the Biotite package, we calculated these quantities and plotted the results for both candidate S_14771 and candidate CASP12.
Typical RMSDs and radii of gyration converge towards a value that depends on the size of the protein. Convergence of those quantities can be interpreted as a stable state of the protein structure. As can be seen in Figures x and y, both the RMSD and the radius of gyration stabilize at the same time, as the simulation reaches 110,000 ps (110 ns), suggesting a now stabilized structure of candidate S_14771 solvated in water. Another indicator of a functional protein is the RMSF. Instead of being averaged over all atoms, the RMSF is averaged over time for each amino acid. It provides insights into both protein stability and functionality. Fig xzf reveals the RMSF of residues 105 to 115 to be significantly higher than that of other residues. This hints at the presence of a functional unit along these residues. As commented on in the section describing our structure prediction approaches, the N- and C-terminal regions tend to fluctuate more strongly as a result of the absence of stabilizing structures.
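To make the three quantities concrete, the sketch below implements them in plain Python on lists of (x, y, z) coordinates. It is a simplified illustration and not the Biotite-based analysis described above: frames are assumed to be already superimposed on the reference, all atoms are weighted equally, and the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def _dist2(a: Sequence[float], b: Sequence[float]) -> float:
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2 + (a[2] - b[2]) ** 2


def rmsd(frame: Sequence[Coord], reference: Sequence[Coord]) -> float:
    """Deviation of one frame from the reference, averaged over all atoms."""
    return math.sqrt(sum(_dist2(a, b) for a, b in zip(frame, reference)) / len(reference))


def radius_of_gyration(frame: Sequence[Coord]) -> float:
    """Root-mean-square distance of the atoms from their centroid (equal masses)."""
    n = len(frame)
    centroid = tuple(sum(atom[i] for atom in frame) / n for i in range(3))
    return math.sqrt(sum(_dist2(atom, centroid) for atom in frame) / n)


def rmsf(frames: Sequence[Sequence[Coord]]) -> List[float]:
    """Per-atom fluctuation around the time-averaged position."""
    n_frames = len(frames)
    result = []
    for atom_index in range(len(frames[0])):
        positions = [frame[atom_index] for frame in frames]
        mean = tuple(sum(p[i] for p in positions) / n_frames for i in range(3))
        result.append(math.sqrt(sum(_dist2(p, mean) for p in positions) / n_frames))
    return result
```

RMSD and radius of gyration give one number per frame, which is why they are plotted against simulation time, whereas RMSF gives one number per atom or residue, which is how a flexible stretch such as residues 105 to 115 stands out.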
The RMSD and radius of gyration calculations for candidate CASP12 (figures x and y) provide evidence of folding. However, its RMSF values are significantly higher than those of candidate S_14771, an effect possibly caused by instability or refolding. Nevertheless, the strongest fluctuations, disregarding the terminal regions, can again be seen in the region of residues 105 to 115. This supports the theory that residues 105 to 115 might be part of a functional unit.
We were unsure whether candidate CASP12 can be considered a plausible structure and how to interpret the findings concerning the prominent fluctuations. Therefore, we decided to perform a Principal Component Analysis.
Principal component analysis
To analyze our system further, a Principal Component Analysis (PCA) was performed using GROMACS.
Animation 33: A Principal Component Analysis of a fast (blue) and a slow (red) mode showing the most prominent movements of the Cα-chain of candidate S_14771. Both modes show movement of the β6/β7 loop, consisting of residues 105 to 115, towards the active site. Thus we can assume that the closing β6/β7 loop is involved in the reaction mechanism.
The results of the Principal Component Analysis of candidate S_14771 (animation xy) show a movement of residues 105 to 115 towards the active site, supporting our theory that these residues are important for the reaction mechanism. Since the slow mode (red), which shows the most relevant movement of the sortase, moves further towards the active site, it is possible that the β6/β7 loop either closes the binding site of the ligand peptides or even transports one peptide towards the other.
Animation xyz shows the results of the Principal Component Analysis of candidate CASP12. As the RMSF calculations suggested (fig xyz), the whole protein seems to be moving randomly with no directed movement. In addition, the active site amino acids are spread across the protein, confirming our assumption that the protein is not in a stable or plausible conformation.
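The idea behind extracting a dominant mode from a trajectory can be sketched as follows. This is not the GROMACS workflow used for the animations. It is a minimal illustration that builds the covariance matrix of the atom coordinates and finds its largest eigenvector by power iteration. The function name and the toy frames are hypothetical, and the frames are assumed to be already superimposed.

```python
import math

def first_principal_component(frames, iterations=200):
    """Return (eigenvalue, eigenvector) of the largest mode of the coordinate covariance.

    frames: list of flattened coordinate vectors (x1, y1, z1, x2, ...),
    one per snapshot, assumed to be already superimposed.
    """
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[j] for f in frames) / n for j in range(dim)]
    centered = [[f[j] - mean[j] for j in range(dim)] for f in frames]
    # Covariance matrix of the coordinates around their mean.
    cov = [[sum(row[a] * row[b] for row in centered) / n for b in range(dim)]
           for a in range(dim)]
    # Power iteration from a fixed, non-uniform start vector. It fails only if
    # this start vector happens to be orthogonal to the dominant mode.
    vec = [float(j + 1) for j in range(dim)]
    for _ in range(iterations):
        new = [sum(cov[a][b] * vec[b] for b in range(dim)) for a in range(dim)]
        norm = math.sqrt(sum(v * v for v in new))
        if norm == 0.0:
            break
        vec = [v / norm for v in new]
    # Rayleigh quotient: variance of the trajectory along the mode.
    eigenvalue = sum(vec[a] * sum(cov[a][b] * vec[b] for b in range(dim))
                     for a in range(dim))
    return eigenvalue, vec

# Toy trajectory of a single atom that moves mostly along x (hypothetical data).
frames = [
    [-2.0, 0.0, 0.0],
    [2.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, -1.0, 0.0],
]
variance, mode = first_principal_component(frames)
```

The returned vector gives the direction of the largest collective motion, and the eigenvalue is the variance along it. Projecting each frame onto this vector yields the kind of movement that is visualized per mode in the animations.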
Conclusion
By applying several methods to analyse the structural stability and validity of our two Sortase A7M candidates, we gained evidence that at least one of our models is a valid and stable candidate. Candidate S_14771, which was generated using RosettaCM, appears to be a fitting candidate not only because of the successful analyses, but also because the residues of the active site are close enough to each other to catalyze a ligation reaction. Our model created through deep learning excelled only in terms of the RMSD and radius of gyration calculations. The RMSF and the Principal Component Analysis as well as the conformation of the active site indicate that candidate CASP12 is of no use for further calculations, as it does not portray a valid conformation of Sortase A7M.
Docking
Now that the binding site of the Sortase had been found, the peptide ligand needed to be inserted into the binding site to create a peptide-protein complex. The procedure of choice for the introduction of a ligand into the binding site of a protein is called docking. In the following sections, we will present the protocol and methods we used as well as the results they yielded.
Results
For each of the sequences MGGGGPPPPPP (M-polyG), GGGGPPPPPP (polyG) and PPPPPPLPETGG (LPETGG), 50,000 structures were created and clustered. After clustering, the sample consisted of 100 structures of docked complexes.
Analysis of the scores showed similar scores for all three dockings. The best-scoring results of the LPETGG docking show a tendency of the glycines to face the active site while also being in close proximity to it.
Figure lpetgg shows the docking result of the LPETGG peptide to the sortase. The results shown are the best-scoring structures of the clustering with respect to the total score, interface score and reweighted score. As the best-scoring structure is the same for the total score and the reweighted score, only two peptides are shown. This also applies to figures x and y. For both results the reacting glycine residues (yellow) are facing the active site. Additionally, the same residues are in close proximity to the active site.
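The selection step described above, picking the best structure per score and showing fewer structures when two scores agree, can be sketched as follows. The column names `total_score`, `I_sc` and `reweighted_sc`, the decoy names and all values are assumptions for illustration, not data from our runs. The sketch also assumes that lower scores are better, as is the convention for Rosetta energies.

```python
def best_by_score(decoys, score_terms):
    """Map each score term to the name of the decoy with the lowest (best) value."""
    return {term: min(decoys, key=lambda d: d[term])["name"] for term in score_terms}

# Hypothetical score table after clustering (values are made up).
decoys = [
    {"name": "decoy_a", "total_score": -310.2, "I_sc": -18.4, "reweighted_sc": -402.1},
    {"name": "decoy_b", "total_score": -305.7, "I_sc": -21.9, "reweighted_sc": -398.6},
    {"name": "decoy_c", "total_score": -298.3, "I_sc": -15.0, "reweighted_sc": -390.4},
]
terms = ["total_score", "I_sc", "reweighted_sc"]
best = best_by_score(decoys, terms)

# Unique structures to inspect, in first-seen order. If two score terms pick
# the same decoy, it is listed only once.
to_inspect = list(dict.fromkeys(best.values()))
```

In this toy table the total score and the reweighted score select the same decoy, so only two structures remain for visual inspection.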
Figures x and y show the docking of both polyG and M-polyG. While the polyG results align well and seem to interact with the β6/β7 loop rather than with the active site, this does not seem to be the case for M-polyG. Instead of both structures interacting with either the β6/β7 loop or the active site, one (best interface score; dark blue) interacts with the β6/β7 loop and the other (best reweighted/total score; light blue-gray) appears to interact with the active site.
Visualizing the docking result with the best total/reweighted score, as can be seen in figure 16, suggests an interaction between methionine and two active site residues, namely arginine139 and cysteine126. Figure 17 shows the interaction of M-polyG with the β6/β7 loop. The glycines still interact with the β6/β7 loop. However, instead of binding above the β6/β7 loop, as is the case for polyG (fig z), the interaction seems to be influenced by methionine. By interacting with residues in the β-sheet, methionine could potentially hinder binding of glycine to the β6/β7 loop through partial immobilization of the peptide. Overall, peptide binding and orientation are less uniform compared to polyG without the leading methionine, which could indicate a lower binding affinity of M-polyG towards the β6/β7 loop.
Conclusion
To computationally investigate the binding affinities of the polyG and M-polyG as well as the LPETGG tags, we performed docking simulations using the Rosetta FlexPepDock application. We used a modified version of the recommended protocol, as the modified version was easier to automate and served our purpose better than the standard protocol. From the calculated scores alone, we could not see a difference in binding affinities. Thus, we inspected the best-scoring structures regarding the total score, the interface score and the reweighted score using PyMOL. Since the best structures with respect to total score and reweighted score were the same for all simulations, only two structures were inspected per run. A polyproline tag was appended to all the peptides to simulate the modification of the VLPs with a small peptide.
As expected, the results showed that for LPETGG, the glycines of both peptides were oriented towards the active site. This is unsurprising, as peptides with the sequence LPXTGG are known to be substrates of the sortase. It was more surprising to see the polyG tag oriented away from the active site, since polyG is also a known substrate of the sortase. Both polyG peptides were facing the β6/β7 loop (residues 105 to 115) uniformly and appeared to be interacting with it. The M-polyG peptides did not show a uniform orientation or interaction scheme. On the one hand, the visualization of the best result concerning the total and reweighted score showed an interaction of methionine with cysteine126 and arginine139, two residues of the active site. On the other hand, the visualization of the best result with respect to the interface score showed the M-polyG facing the mobile β6/β7 loop. In contrast to the polyG peptide lacking the methionine, the M-polyG peptide is pulled down below the β6/β7 loop by the methionine interacting with one of the β-sheets leading to the active site. This is not the case for the polyG results, which lie in one plane with the β6/β7 loop.
Modeling Conclusion
For our project it was key to understand and characterize Sortase A7M. As there is no annotated 3D structure for this specific sortase, an in silico structure determination was performed. This problem was tackled using two different approaches. The deep learning approach did not yield a promising model, as later analysis also confirmed. However, comparative modeling with Rosetta produced valid structures. We used the best structure, candidate S_14771, for extensive characterization. We evaluated the model with regard to its secondary structure using Ramachandran plots, which suggested plausible secondary structures.
Molecular dynamics simulations were used to investigate the stability and dynamic properties of the candidate. The RMSD and radius of gyration stabilized over the course of the simulation, a first indicator of an equilibrated structure. Interestingly, RMSF analysis showed strong fluctuations of residues 105 to 115. We further investigated this by performing a Principal Component Analysis. Doing so, we extracted the principal movements of the model. We could observe movement of the β6/β7 loop towards the active site, suggesting the presence of a binding site. Consequently, we performed docking simulations.
FlexPepDock was used to conduct the docking simulations with target peptides. Each run yielded 50,000 structures. In multiple steps we reduced the number of complexes to 100 clusters with respect to total, reweighted and interface score. We extracted the best-scoring complexes and investigated their interactions.
For LPETGG we observed uniform binding to the active site, fulfilling our expectation. Strikingly, polyG appeared to bind to the β6/β7 loop in a uniform manner. As is known from the literature, polyG is a functioning ligand of sortase. Supported by the literature and our data, we postulate the following mechanism: the β6/β7 loop transports bound polyG towards the active site of Sortase A7M, thereby lowering the activation energy of the linking reaction.
As this theory is neither backed up nor contradicted by experimental data, further research is required.
Sequence and Features
- 10: COMPATIBLE WITH RFC[10]
- 12: COMPATIBLE WITH RFC[12]
- 21: INCOMPATIBLE WITH RFC[21] (Illegal XhoI site found at 445)
- 23: COMPATIBLE WITH RFC[23]
- 25: COMPATIBLE WITH RFC[25]
- 1000: COMPATIBLE WITH RFC[1000]
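A check for an illegal site such as the one reported in the list above can be sketched as a plain substring search. The XhoI recognition sequence CTCGAG is standard. The function name, the example sequence and the 1-based position convention are assumptions for illustration and are not taken from the part sequence.

```python
def find_sites(sequence, site):
    """Return the 1-based start positions of every occurrence of site in sequence."""
    sequence = sequence.upper()
    site = site.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start + 1)
        start = sequence.find(site, start + 1)
    return positions

# XhoI recognition site. It is palindromic, so scanning one strand is enough.
XHOI = "CTCGAG"

# Hypothetical example sequence, not the sequence of this part.
example = "ATGGCTAAACTCGAGGGTTAA"
xhoi_positions = find_sites(example, XHOI)
```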
None |