Coding

Part:BBa_K5033004

Designed by: Jonas Martin Westphal   Group: iGEM24_Aachen   (2024-09-16)

OncoBiotica: mFadA[B]_GSLinker_CDA[Epsilon]

If you are interested in an overview of the parts designed by the iGEM Team Aachen 2024, visit our Parts page.


This part, developed by iGEM Aachen 2024, is our 'Epsilon' mutant of the basic part BBa_K5033000. Within the codA cytosine deaminase (CDA) domain, it carries four mutations: V152A (the valine at position 152 is exchanged for alanine), F316C (the phenylalanine at position 316 is exchanged for cysteine), D317G (the aspartic acid at position 317 is exchanged for glycine) and D314A (the aspartic acid at position 314 is exchanged for alanine). In the numbering of our fusion protein, these correspond to V190A, F354C, D355G and D352A. This mutant was selected for research in the lab on the basis of literature research and modeling.
Its main use is to research and understand the catalytic activity of our fusion proteins.
It is a variant of BBa_K5033000, which served as the foundation for exploring the concept of microbiota-directed cancer therapy. This part encodes a fusion protein designed to combine two functionalities: binding specific bacteria and providing an optimized enzymatic function. This part is to be cloned into a vector based on an inducible expression system. iGEM Aachen 2024 used a pET21b(+) vector.
iGEM Aachen 2024 showed that the 'Epsilon' variant has no significant catalytic activity in comparison to the CDA wild-type fusion protein. This is one of five mutants analyzed by iGEM Aachen 2024 in addition to the CDA wild-type fusion protein.
See the other four variants: 'Alpha' (BBa_K5033001), 'Beta' (BBa_K5033002), 'Gamma' (BBa_K5033003) and 'Theta' (BBa_K5033005).
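
The shift between CDA numbering and fusion protein numbering can be reproduced with a short sketch. The offset of 38 residues is an assumption inferred from the mutation pairs listed above (for example V152A and V190A); the function name is purely illustrative.

```python
import re

# Assumption: the CDA domain starts 38 residues into the fusion protein,
# inferred from the pairs V152A/V190A, F316C/F354C, D317G/D355G, D314A/D352A.
CDA_OFFSET = 38


def to_fusion_numbering(mutation: str, offset: int = CDA_OFFSET) -> str:
    """Convert a point mutation such as 'V152A' from CDA to fusion numbering."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"not a point mutation: {mutation!r}")
    wild_type, position, substitute = match.groups()
    return f"{wild_type}{int(position) + offset}{substitute}"


epsilon_cda = ["V152A", "F316C", "D317G", "D314A"]
epsilon_fusion = [to_fusion_numbering(m) for m in epsilon_cda]
print(epsilon_fusion)
```

The same helper should work for the other variants as long as they share the same N-terminal binding domain and linker.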

Figure 1: Schematic view of the fusion protein's coding sequence.

Part Composition

The first protein domain is derived from the part BBa_K4990002 but has been codon-optimized for expression in E. coli. It is the mFadA B-domain, found in various Fusobacterium strains. This part has already been well described by the iGEM23_CPU-CHINA team. This domain should be able to bind, via self-assembly, to FadA pili on Fusobacterium nucleatum and its former subspecies (F. nucleatum, F. polymorphum, F. vincentii, F. animalis).
To further investigate the binding domain's functionality, iGEM Aachen 2024 created the basic part BBa_K5033006. This variant replaces the enzyme in our fusion protein with eGFP as a reporter protein.
The second functional protein domain is linked to the mFadA B-domain by a synthetic flexible linker consisting of glycine and serine in alternating order. This linker is eleven amino acids long.

This second functional domain of the fusion protein is a mutant of the codA cytosine deaminase (CDA) that is native to E. coli. This mutant contains the V152A, F316C, D317G and D314A mutations of the enzyme.
The fusion protein encoded by this part also contains a downstream hexa-histidine tag for protein purification.


Protein Modeling

To find promising mutations to investigate in the lab, our team used a research-based modeling approach.
Before transformation of this biological part (cloned into the pET21b(+) plasmid backbone), the structure of the expected fusion protein was modeled.

Selection of the Mutant

The 'Epsilon' mutant (V152A/F316C/D317G/D314A) has a predicted folding stability of -3.37 kcal/mol, making it more stable than the 'Theta' mutant (BBa_K5033005; V152A/F316C/D317G). This enhanced stability can likely be attributed to the D314A mutation (as in 'Alpha' (BBa_K5033001)), which shows the highest stability among the mutants predicted by CompassR [1] and thereby contributes to the overall stabilization of the protein.

Biochemical Properties

Fundamental biochemical properties such as molecular mass and extinction coefficient are important for much of the SynBio work done with proteins. Figure 2 gives an overview of these properties.

Figure 2: Biochemical properties of mFadA[B]_GSLinker_CDA[Epsilon] as described by Benchling.

Protein Structure Prediction

Figure 3: Tertiary structure of mFadA[B]_GSLinker_CDA[Epsilon] as predicted by AlphaFold2. From left to right: mFadA[B] as an alpha helix (blue), the flexible linker, the enzyme mutant and the freely accessible His-tag (orange).
The tertiary structure was predicted using AlphaFold2 by DeepMind. In this case it is especially important that the binding domain and the His-tag are freely accessible.

Modeling of Substrate and Active Site Interaction

Why RoseTTAFold All-Atom?

Proteins rarely act alone. Although substantial progress has been made in the prediction of protein structures, modeling proteins together with their ligands remains challenging. RoseTTAFold All-Atom (RFAA) aims to tackle this issue with a neural network that is trained to accurately model general biomolecules containing a wide range of non-protein components. In contrast to other tools that only include sequence-based modeling, RFAA incorporates a graph representation that models non-protein molecules at the atomic level, capturing their chemical bonds and interactions. In combination with a training data set that also includes ligand-bound protein structures from the Protein Data Bank (PDB), this allows RFAA to predict protein structures together with ions and non-protein ligands. Interestingly, during our project, DeepMind released a new AlphaFold version (v3) that includes selected ions and ligands. However, an earlier release would not have been advantageous for us, as 5-FC and cytosine are not among the selected ligands that AlphaFold3 includes. Nevertheless, the improvements made this year mark a significant step forward, paving the way for more refined and accurate modeling of proteins and ligands in the future.

We observed a good overall structural alignment of the wild-type enzyme's crystal structure [2] to the RoseTTAFold All-Atom model. Upon closer inspection of the active site, we noticed small differences in the torsion angles of the side chains, which naturally led to slight differences in bond lengths between amino acids and the ligand. However, these differences are inherent to the modeling process and do not reflect significant deviations. Therefore, they do not compromise the reliability of our approach for predicting structural changes in the mutants. This allowed us to apply the approach to the mutants generated by CompassR.

Modeling Results

Figure 4: Model of the interactions between the amino acids in the active site of the fusion protein and 5-fluorocytosine, as predicted using RoseTTAFold All-Atom.
In the predicted model with 5-FC as the ligand, the effect of the triple substitution (shift of D314) appears to be amplified, increasing the distance from 2.801 Å to 3.690 Å. This could be explained by the fact that D314A alone already shifts this distance, leading to a cumulative effect when combined with the triple substitution. Additionally, the replacement of the carboxyl group with a methyl group results in the same stabilizing effect observed in both the D314A and the triple mutation models, by providing a more favorable environment for the fluoride ion. Importantly, no critical interactions necessary for the catalytic mechanism, such as those involving E217, H245, or D313, are lost.
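
Distances such as the ones quoted above are Euclidean distances between two atom coordinates in the predicted structure. The following is a minimal sketch of that calculation; the coordinates are hypothetical placeholders and are not taken from our models, and only the two distances (2.801 Å and 3.690 Å) come from the text.

```python
import math


def atom_distance(atom_a, atom_b):
    """Euclidean distance between two (x, y, z) coordinates, in Å."""
    return math.dist(atom_a, atom_b)


# Hypothetical coordinates for illustration only (NOT from the RFAA models).
ligand_atom = (0.0, 0.0, 0.0)
side_chain_atom = (1.0, 2.0, 2.0)
example_distance = atom_distance(ligand_atom, side_chain_atom)
print(f"example distance: {example_distance:.3f} Å")

# Distances reported in the text for the 5-FC model.
distance_before = 2.801
distance_after = 3.690
shift = round(distance_after - distance_before, 3)
print(f"shift: {shift} Å")
```

In practice the coordinates would be read from the model's structure file with a structure parser rather than typed in by hand.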

In the model with cytosine, the pyrimidine ring is once again rotated, as in the V152A/F316C/D317G mutant 'Theta' (BBa_K5033005) and the R91T/D314A mutant 'Gamma' (BBa_K5033003), preventing key amino acids involved in the catalytic mechanism from interacting at their correct positions. This could indicate a shift in selectivity from cytosine to 5-FC. Although the model suggests a possible shift in substrate preference, a published study [3] has empirically demonstrated that mutations at positions 314 and 317 are mutually exclusive, contradicting the predicted shift from cytosine to 5-FC. To evaluate the catalytic function of the enzyme within our fusion protein, we decided to test whether we could reproduce this observation. The mutations on their own are very promising, and given the small chance that their combination works in our favour, we used the Epsilon mutant to evaluate this possibility.

Cloning of the Plasmid

To build the plasmid containing the gene for our Epsilon variant, we used the plasmid we already had for our wild-type fusion protein (BBa_K5033000; pET21b(+)_mFadA[B]_GSLinker_CDA[WT]). The gene sequence for this part contains a BamHI restriction site between the linker and the enzyme. The backbone contains an XhoI restriction site at the end of the gene insert.
After modeling the Epsilon variant, we ordered the gene fragment encoding this variant. We made sure to include the correct restriction sites.

The backbone was prepared using the BamHI and XhoI restriction enzymes. After digestion, the cut backbone was cleaned up using an agarose gel and a gel extraction kit. The same was done for the insert.

After gel cleanup, the cut backbone and insert were ligated using T4 DNA ligase.
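
Before ordering a gene fragment for a BamHI/XhoI double digest, it can help to confirm that each enzyme cuts the insert exactly once. The sketch below scans a sequence for the standard recognition sequences of BamHI (GGATCC) and XhoI (CTCGAG). The toy sequence and the function names are hypothetical and are not our actual insert or workflow.

```python
# Standard recognition sequences; both are palindromic, so scanning
# one strand is sufficient.
RECOGNITION_SITES = {"BamHI": "GGATCC", "XhoI": "CTCGAG"}


def find_sites(sequence, sites=RECOGNITION_SITES):
    """Return the 1-based start positions of each recognition site."""
    sequence = sequence.upper()
    hits = {}
    for enzyme, motif in sites.items():
        positions = []
        start = sequence.find(motif)
        while start != -1:
            positions.append(start + 1)  # convert to 1-based numbering
            start = sequence.find(motif, start + 1)
        hits[enzyme] = positions
    return hits


def each_enzyme_cuts_once(hits):
    """True if every enzyme has exactly one site in the sequence."""
    return all(len(positions) == 1 for positions in hits.values())


# Hypothetical toy insert: BamHI site, short filler, XhoI site.
toy_insert = "ATGGCT" + "GGATCC" + "GCTAGCAAAGGT" + "CTCGAG"
toy_hits = find_sites(toy_insert)
print(toy_hits, each_enzyme_cuts_once(toy_hits))
```

Run on the full part sequence, a scan like this should report the same kind of positions as the BamHI and XhoI entries in the assembly compatibility list at the end of this page.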

To enhance the efficiency of the plasmid transformation into E. coli BL21 (DE3), the plasmid was first propagated via transformation into E. coli DH5α.

The propagated pET21b(+)_mFadA[B]_GSLinker_CDA[Epsilon] plasmid could then be purified with a plasmid miniprep kit and used for transformation into the production organism E. coli BL21 (DE3).

Producing the Fusionprotein

After successful transformation of the pET21b(+)_mFadA[B]_GSLinker_CDA[Epsilon] plasmid into the production organism E. coli BL21 (DE3), the protein could be expressed and purified. In the pET21b(+) backbone, expression is controlled by a T7 promoter with a lac operator, and the backbone also encodes the lacI repressor, so expression can be induced with IPTG (IUPAC: propan-2-yl 1-thio-β-D-galactopyranoside).

Expression and Purification of the Fusionprotein

The fusion protein was expressed by adding IPTG to the medium to a final concentration of 1 mM.
The His-tagged protein was purified using a Protino Ni-IDA 2000 packed column by Macherey-Nagel.

Figure 5: SDS-PAGE gels showing the proteins in the elution fractions. The number in each label is the imidazole concentration (in mM) of the corresponding elution buffer. Example: E10 was eluted with a buffer containing 10 mM imidazole.
The fusion protein is expected to have a molecular weight of 51.95 kDa (cf. Fig. 2). This corresponds to the prominent bands visible on the gel.
This gel shows that the E10 fraction still contains many impurities. The E50 fraction was desalted and stored in 50 mM Tris buffer for use in the kinetic assays.


Kinetic Assays

High-Performance Liquid Chromatography (HPLC)

We used reverse-phase high-performance liquid chromatography for the quantitative analysis of 5-fluorocytosine (5-FC) and 5-fluorouracil (5-FU) in the same solution. All results shown below were measured with the same method (see Experiments page).
Standards between 10 µM and 500 µM were prepared to translate peak area into compound concentration. After measurement, the chromatograms were evaluated with “OpenChrom” by Lablicate: a baseline subtraction filter was applied first, followed by the standard first derivative peak detector and the trapezoid peak integrator. We identified para-aminobenzoic acid as a potential internal standard, but no problems arose that would have necessitated its use.
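
The translation from peak area to concentration can be sketched as a linear calibration. This is a generic illustration and not OpenChrom's implementation: it integrates a baseline-corrected peak with the trapezoid rule, fits a straight line through the standards by least squares, and inverts that line for a sample. All numerical values are hypothetical; only the 10 µM to 500 µM range of the standards comes from the text.

```python
def trapezoid_area(times, signal):
    """Integrate a baseline-corrected peak with the trapezoid rule."""
    area = 0.0
    for i in range(1, len(times)):
        area += (times[i] - times[i - 1]) * (signal[i] + signal[i - 1]) / 2
    return area


def fit_calibration(concentrations, areas):
    """Least-squares line: area = slope * concentration + intercept."""
    n = len(concentrations)
    mean_c = sum(concentrations) / n
    mean_a = sum(areas) / n
    sxx = sum((c - mean_c) ** 2 for c in concentrations)
    sxy = sum((c - mean_c) * (a - mean_a)
              for c, a in zip(concentrations, areas))
    slope = sxy / sxx
    intercept = mean_a - slope * mean_c
    return slope, intercept


def area_to_concentration(area, slope, intercept):
    """Invert the calibration line to get a concentration in µM."""
    return (area - intercept) / slope


# Hypothetical standards (µM) and peak areas (arbitrary units).
standards_um = [10.0, 50.0, 100.0, 250.0, 500.0]
standard_areas = [21.0, 101.0, 201.0, 501.0, 1001.0]
slope, intercept = fit_calibration(standards_um, standard_areas)

# Hypothetical triangular sample peak sampled at three time points.
sample_area = trapezoid_area([0.0, 1.0, 2.0], [0.0, 401.0, 0.0])
sample_um = area_to_concentration(sample_area, slope, intercept)
print(f"{sample_um:.1f} µM")
```

A separate calibration line would be needed for each compound (5-FC and 5-FU), since their detector responses generally differ.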

Figure 6: Example HPLC chromatogram obtained from a measurement using 400 µM 5-FC and 200 µM 5-FU; the first peak is 5-FC, the second 5-FU.


Enzyme Kinetics

HPLC measurements have shown that the Epsilon variant has no significant catalytic activity compared to the wild-type fusion protein.

Figure 7 shows an analysis of the reaction mixture, indicating no significant formation of product within the first five minutes of reaction time. Other variants such as Alpha (BBa_K5033001), Beta (BBa_K5033002) and Gamma (BBa_K5033003) showed nearly complete turnover of 5-FC to 5-FU within this time frame (when measured with the same concentrations).

Figure 8 displays the last sample, taken after 35 minutes of reaction time. In this time frame the wild-type fusion protein showed clear signs of catalytic activity. The Epsilon variant, on the other hand, only shows a minimal second peak, which may correspond to the formation of some 5-FU.
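
Once both peaks have been converted to concentrations, turnover can be expressed as the fraction of substrate converted. The sketch below assumes that one molecule of 5-FC yields one molecule of 5-FU and that no material is lost; the concentrations are hypothetical and are not our measured values.

```python
def conversion_fraction(conc_5fc_um, conc_5fu_um):
    """Fraction of 5-FC converted to 5-FU, assuming 1:1 stoichiometry."""
    total = conc_5fc_um + conc_5fu_um
    if total <= 0:
        raise ValueError("total concentration must be positive")
    return conc_5fu_um / total


# Hypothetical concentrations (µM) for a barely active and an active sample.
low_turnover = conversion_fraction(390.0, 10.0)
high_turnover = conversion_fraction(20.0, 380.0)
print(f"{low_turnover:.1%} vs {high_turnover:.1%}")
```

Comparing this fraction at the same time point and starting concentration gives a simple way to rank variants against the wild-type fusion protein.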

Figure 7: HPLC chromatogram obtained from a measurement after 5 minutes of reaction time.

Figure 8: HPLC chromatogram obtained from a measurement after 35 minutes of reaction time.

Conclusion

Because very little of our product 5-FU was formed within the analyzed time frame, we concluded that combining the triple mutation (described for Theta (BBa_K5033005)) with the very promising D314A mutation (as in Alpha (BBa_K5033001)) does not result in a potent fusion protein.
Instead, we showed that the catalytic activity is extremely low when these mutations are combined. This matches the results of the study cited in the modeling part [3].

In conclusion, the codA CDA seems to behave similarly within our fusion protein as it does on its own (the existing studies analyzed mutations of the isolated enzyme).

References

[1] Cui, H., Cao, H., Cai, H., Jaeger, K., Davari, M.D., Schwaneberg, U., 2020. Computer‐Assisted Recombination (CompassR) Teaches us How to Recombine Beneficial Substitutions from Directed Evolution Campaigns. Chemistry – A European Journal 26, 643–649. https://doi.org/10.1002/chem.201903994
[2] PDB Entry - 1RA0. https://doi.org/10.2210/pdb1RA0/pdb
[3]



Sequence and Features


Assembly Compatibility:
  • 10
    COMPATIBLE WITH RFC[10]
  • 12
    COMPATIBLE WITH RFC[12]
  • 21
    INCOMPATIBLE WITH RFC[21]
    Illegal BamHI site found at 115
    Illegal XhoI site found at 1396
  • 23
    COMPATIBLE WITH RFC[23]
  • 25
    INCOMPATIBLE WITH RFC[25]
    Illegal NgoMIV site found at 1254
    Illegal NgoMIV site found at 1341
  • 1000
    COMPATIBLE WITH RFC[1000]