Coding

Part:BBa_K5033000

Designed by: Jonas Martin Westphal   Group: iGEM24_Aachen   (2024-09-13)

OncoBiotica: mFadA[B]_GSLinker_CDA[WildType]

If you are interested in an overview of the parts designed by the iGEM Team Aachen 2024, visit our Parts page.


This part, developed by iGEM Aachen 2024, serves as the foundation for exploring the concept of microbiota-directed cancer therapy. It encodes a fusion protein designed to combine two functionalities: binding specific bacteria and catalyzing an enzymatic reaction. This part is to be cloned into a vector based on an inducible expression system. iGEM Aachen 2024 used a pET21b(+) vector.
iGEM Aachen 2024 successfully demonstrated that the enzymatic function remains intact within the fusion protein. The team also analyzed the catalytic behavior by testing five variants of the fusion protein.
See: 'Alpha' (BBa_K5033001), 'Beta' (BBa_K5033002), 'Gamma' (BBa_K5033003), 'Epsilon' (BBa_K5033004) and 'Theta' (BBa_K5033005).

Figure 1: Schematic view of the fusion protein's coding sequence.

Part Composition

The first protein domain is derived from the part BBa_K4990002 but has been codon-optimized for expression in E. coli. It is the mFadA B-domain, found in various Fusobacterium strains. This part has already been well described by the iGEM23_CPU-CHINA team. This domain should be able to bind, via self-assembly, to FadA pili on Fusobacterium nucleatum and on the species formerly classified as its subspecies: F. nucleatum, F. polymorphum, F. vincentii and F. animalis.
To further investigate the binding domain's functionality, iGEM Aachen 2024 created the basic part BBa_K5033006. This variant replaces the enzyme in our fusion protein with eGFP as a reporter protein.
The second functional protein domain is linked to the mFadA B-domain by a synthetic flexible linker consisting of glycine and serine in alternating order. This linker is eleven amino acids long.

This second functional domain of the fusion protein is the codA cytosine deaminase (CDA), which is native to E. coli. This enzyme converts cytosine to uracil in its host organism, but it is also able to convert 5-fluorocytosine (5-FC) to 5-fluorouracil (5-FU).[1]

In this case the enzyme can be used for an enzyme-directed prodrug therapy: 5-FC is the non-toxic substrate and 5-FU is the active chemotherapeutic agent.
The fusion protein encoded by this part also contains a downstream hexa-histidine tag for protein purification.


Protein Modeling

Before this biological part (cloned into the pET21b(+) plasmid backbone) was transformed, the structure of the expected fusion protein was modeled.

Biochemical Properties

The fundamental biochemical properties like molecular mass and extinction coefficient are important for a lot of synBio work done with proteins. To see an overview of these properties, have a look at figure 2.

Figure 2: Biochemical properties of mFadA[B]_GSLinker_CDA[WT] as described by Benchling.
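
As an illustration of how such properties follow from the amino acid sequence, here is a minimal sketch of a widely used estimate of the molar extinction coefficient at 280 nm (counting tryptophan, tyrosine and disulfide bonds) and of how it converts an absorbance reading into a concentration. Whether Benchling uses exactly these coefficients is an assumption, and the sequence below is a made-up placeholder, not the sequence of this part.

```python
# Sketch only: sequence-based estimate of the 280 nm extinction coefficient.
# The toy sequence is a hypothetical placeholder, NOT the BBa_K5033000 protein.

EPS_TRP = 5500      # M^-1 cm^-1 per tryptophan
EPS_TYR = 1490      # M^-1 cm^-1 per tyrosine
EPS_CYSTINE = 125   # M^-1 cm^-1 per disulfide bond


def extinction_coefficient_280(sequence: str, disulfides: int = 0) -> int:
    """Estimate the molar extinction coefficient at 280 nm from a sequence."""
    seq = sequence.upper()
    return (seq.count("W") * EPS_TRP
            + seq.count("Y") * EPS_TYR
            + disulfides * EPS_CYSTINE)


def concentration_molar(a280: float, epsilon: float, path_cm: float = 1.0) -> float:
    """Beer-Lambert law: c = A / (epsilon * path length)."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return a280 / (epsilon * path_cm)


toy_sequence = "MWYKLHHHHHH"  # one Trp, one Tyr, hypothetical
eps = extinction_coefficient_280(toy_sequence)
print("extinction coefficient:", eps, "M^-1 cm^-1")
print("concentration for A280 = 0.699:", concentration_molar(0.699, eps), "M")
```

For real work, the values reported in Figure 2 should be used rather than this estimate.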

Protein Structure Prediction

Figure 3: Tertiary structure of mFadA[B]_GSLinker_CDA[WT] as predicted by AlphaFold2. From left to right: mFadA[B] as an alpha helix (blue), the flexible linker, the enzyme and the freely accessible His-Tag (orange).
The tertiary structure was predicted using AlphaFold2 by DeepMind. For this fusion protein it is especially important that the binding domain and the His-Tag are freely accessible.

Modeling of Substrate and Active Site Interaction

Why RoseTTAFold All-Atom?

Proteins rarely act alone. Although substantial progress has been made in the prediction of protein structures, modeling proteins together with their ligands remains challenging. RoseTTAFold All-Atom (RFAA) aims to tackle this issue with a neural network that is trained to accurately model general biomolecules containing a wide range of non-protein components. In contrast to tools that rely on sequence-based modeling only, RFAA incorporates a graph-based representation that models non-protein molecules at the atomic level, capturing their chemical bonds and interactions. In combination with a training data set that also includes ligand-bound protein structures from the Protein Data Bank (PDB), this allows RFAA to predict protein structures together with ions and non-protein ligands. Interestingly, during our project, DeepMind released a new AlphaFold version (v3) that includes selected ions and ligands. However, an earlier release would not have been advantageous for us, as 5-FC and cytosine are not among the selected ligands that AlphaFold3 includes. Nevertheless, this shows that the improvements made this year mark a significant step forward, paving the way for more refined and accurate modeling of proteins and ligands in the future.

We observed a good overall structural alignment of the wild type enzyme's crystal structure [2] with the RoseTTAFold All-Atom model. Upon closer inspection of the active site, we noticed small differences in the torsion angles of the side chains, which naturally led to slight differences in bond lengths between amino acids and the ligand. However, these differences are inherent to the modeling process and do not reflect significant deviations.

Modeling Results

Figure 4: Model of the interactions between the amino acids in the active site of the fusion protein and 5-fluorocytosine, as predicted using RoseTTAFold All-Atom.

To compare the structural differences when using cytosine and 5-fluorocytosine (5-FC) as substrates, RoseTTAFold All-Atom was employed to model the interactions of both ligands within the enzyme's active site.

Both models revealed the presence of key amino acids required for substrate stabilization and enzymatic activity. Specifically, E217, H246, and D313 must adopt the correct conformation for water activation and subsequent proton transfer, as well as for metal coordination during catalysis [3]. These amino acids were observed in both cytosine- and 5-FC-bound structures. The hydrogen bonds modeled between E217, D313 and the amine group of the substrate suggest that the reaction is feasible with either substrate.

Additionally, D314 was positioned near the substrate in both structures, forming interactions. With cytosine, an extra hydrogen bond forms with the amine group, providing enhanced stabilization. In contrast, with 5-FC, a repulsive interaction occurs between the negatively charged carboxyl group of D314 and the partially negatively charged fluorine atom. This electrostatic repulsion likely explains the reduced activity of the wild type enzyme with 5-FC as a substrate.

Furthermore, Q156 forms a hydrogen bond with the pyrimidine ring in both substrate-bound structures, contributing to substrate stabilization. Literature suggests a second hydrogen bond may also form, though this bond appears to be absent, likely due to a slight difference in the torsion angle of Q156 [3]. As previously discussed, such minor inconsistencies in torsion angles seem to be a limitation of RoseTTAFold All-Atom predictions.


Producing the Fusion Protein

After successful transformation of the pET21b(+)_mFadA[B]_GSLinker_CDA[WT] plasmid into the production organism E. coli BL21 (DE3), the protein could be expressed and purified. The pET21b(+) backbone carries a T7 promoter controlled by a lac operator and encodes the lacI repressor, so expression can be induced with IPTG (IUPAC: propan-2-yl 1-thio-β-D-galactopyranoside).

Expression of the Fusion Protein

Figure 5: SDS-PAGE showing the proteins in the soluble fraction of the cells after expression of the fusion protein with 0.1 mM IPTG in TB medium (TB1), 0.5 mM IPTG in TB (TB2), 1.0 mM IPTG in TB (TB3) and 0.1 mM IPTG in LB medium (LB4), respectively.
The fusion protein is expected to have a molecular weight of 52.12 kDa (cf. Fig. 2). This corresponds to the prominent bands visible on the gel.
By testing different media (LB and TB) as well as different IPTG concentrations, we determined that TB medium induced with a final concentration of 1 mM IPTG works best. This condition resulted in the thickest band on the gel. Because all samples were handled identically, this indicates the highest concentration of soluble fusion protein in the cells induced this way.

Purification of the Fusion Protein

The His-tagged protein was purified using a Protino Ni-IDA 2000 packed column by Macherey-Nagel.

Figure 6: SDS-PAGE showing the proteins in the (diluted) cell lysate, flow-through, washing and elution fractions. The number in each elution label corresponds to the imidazole concentration (in mM) of the elution buffer. Example: E25 is an elution buffer with 25 mM imidazole.
This gel shows a rather early elution of the fusion protein from the Ni-IDA column. Consequently, the elution steps were adjusted for subsequent purifications (including the purifications of the five variants of this fusion protein). The new elution fractions were: E10, E12.5, E25, E50 and E100.


Kinetic Assays

Finally, we confirmed the functionality and activity of the purified enzyme using HPLC and NMR. These analyses verified that the enzyme retained its catalytic function, demonstrating that the expression and purification steps were successfully executed.
If you are interested in the methods used, take a look at our Experiments page.


High-Performance Liquid Chromatography (HPLC)

We used reversed-phase high-performance liquid chromatography for the quantitative analysis of 5-fluorocytosine and 5-fluorouracil in the same solution. The results shown below were all measured with the same method (found on the Experiments page). Standards between 10 µM and 500 µM were prepared to translate peak area into compound concentration. After measurement, the chromatograms were evaluated with “OpenChrom” by Lablicate: a baseline subtraction filter was applied first, and then the standard first derivative peak detector and the trapezoid peak integrator were run. We identified para-aminobenzoic acid as a potential internal standard, but no problems arose that would have necessitated its use.
Figure 7 shows an example chromatogram from an HPLC measurement with our substrate 5-FC and our product 5-FU.

Fig. 7: Example HPLC Chromatogram obtained from measurement using 400 µM 5-FC and 200 µM 5-FU; the first peak being the 5-FC, the second 5-FU.
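
The translation of peak areas into concentrations via external standards can be sketched as a linear calibration. The snippet below is only an illustration: the peak areas are synthetic, the intermediate standard concentrations are assumed (only the 10 µM to 500 µM range is taken from the text), and the function names are hypothetical. It is not the evaluation performed in OpenChrom.

```python
# Sketch only: external-standard calibration with an ordinary least-squares line.
# All peak areas below are synthetic; only the 10-500 µM range comes from the text.

def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line through (x, y)."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need at least two paired data points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard concentrations must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def area_to_concentration(area, slope, intercept):
    """Invert the calibration line: concentration = (area - intercept) / slope."""
    return (area - intercept) / slope


# Hypothetical standards (µM) and synthetic, perfectly linear peak areas
standards_um = [10, 50, 100, 250, 500]
peak_areas = [2.0 * c + 1.0 for c in standards_um]

slope, intercept = fit_line(standards_um, peak_areas)
print("slope:", slope, "intercept:", intercept)
print("sample concentration (µM):", area_to_concentration(401.0, slope, intercept))
```

A separate calibration line would be needed for each compound (5-FC and 5-FU), since their detector responses generally differ.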


19F-NMR

To gain insight into the enzyme's kinetics, NMR (nuclear magnetic resonance) experiments, in particular 19F-NMR, were performed for the wild type and the gamma mutant (BBa_K5033003). The results for the wild type are presented in the following.

Reaction monitoring of the wild type

Fig. 8 shows, as an example, the third 19F-NMR spectrum of the wild type's examination, which was taken 26 minutes after the start of the reaction. Note that this spectrum is shown to display all peaks and not for comparison with the gamma mutant's spectra.

Fig. 8: Third 19F-NMR spectrum of the wild type's examination using an NMR device as stated in the protocol.

Based on prior measurements of the substrate and the product, the singlet signals at chemical shifts δ = -167.98 ppm and δ = -169.14 ppm can be assigned to 5-fluorocytosine (5-FC) and 5-fluorouracil (5-FU), respectively. The compound causing the signal at δ = -119.77 ppm has not been identified, but it could be an impurity of the 5-FC because the supplier guaranteed a purity of only 97%. Moreover, the signal does not increase or decrease over time, which suggests that this compound is not a by-product of the enzymatic reaction. Looking at the integrals of 5-FC (I5-FC = 323.4E9) and 5-FU (I5-FU = 73.89E9), it is clear that early in the reaction the amount of substrate is higher than the amount of product, since the peak areas indicate the relative abundance of both compounds in the reaction mixture.

As mentioned in the protocol for NMR experiments on the enzyme's kinetics, the integrals Ii of the substrate's and product's signals are used to compute ratios of the product's and substrate's integrals. The integrals can be taken from analysis software such as MestReNova.

If the amount of substance of the substrate is known, it can be multiplied by the integral ratio introduced above to obtain the substrate's current amount of substance, as described in the protocol.
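
A minimal sketch of this calculation is given below. It assumes that the integral ratio is each signal's share of the summed 5-FC and 5-FU integrals and that the unidentified signal at δ = -119.77 ppm is ignored; the exact definition is given in the protocol and may differ. The integrals are those of the spectrum in Fig. 8, while the initial amount of substance is a hypothetical value.

```python
# Sketch only: converts 19F-NMR integrals into relative and absolute amounts.
# ASSUMPTION: ratio_i = I_i / (I_5FC + I_5FU); the protocol may define it differently.

def integral_fractions(i_substrate: float, i_product: float):
    """Return the substrate and product shares of the summed integrals."""
    total = i_substrate + i_product
    if total <= 0:
        raise ValueError("summed integrals must be positive")
    return i_substrate / total, i_product / total


def current_amounts(n_initial: float, i_substrate: float, i_product: float):
    """Scale a known initial amount of substrate by the integral fractions."""
    f_substrate, f_product = integral_fractions(i_substrate, i_product)
    return n_initial * f_substrate, n_initial * f_product


# Integrals of the third spectrum (Fig. 8)
I_5FC = 323.4e9
I_5FU = 73.89e9

f_fc, f_fu = integral_fractions(I_5FC, I_5FU)
print("fraction 5-FC:", round(f_fc, 3), "fraction 5-FU:", round(f_fu, 3))

n0_umol = 10.0  # hypothetical initial amount of 5-FC in µmol
print("current amounts (µmol):", current_amounts(n0_umol, I_5FC, I_5FU))
```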

The integrals shown in Fig. 9 were evaluated as stated in the protocol. Note that the y-axis gives the number of each spectrum and thus indicates progressing time.

Fig. 9: Stacked 19F-NMR spectra of the wild type's examination using an NMR device as stated in the manual. The left signal shows the decreasing amount of substrate and the right signal the increasing amount of product in the reaction mixture.

Focusing on the integrals, Fig. 10 shows the evolution of the substrate's and product's integrals over time.

Fig. 10: Evolution of the integrals of the wild type's examination obtained from the reaction monitoring.

To obtain the integrals, the integration range was set to δ = [-167.797, -168.198] for 5-FC and to δ = [-169.048, -169.297] for 5-FU. The green curve, which represents the integral of 5-FC's signal, decreases overall from 333.8E9 to 91.57E9. This matches our expectation because the substrate is consumed by the enzyme and its amount in the reaction mixture therefore decreases. The shape of this curve is similar to the one known from enzyme kinetics (concentration vs. time, also known as progress curve).

The data points do not form a perfect curve because there are some outliers (e.g. the spectrum measured at t = 1036 min). There are two possible reasons for them. First, each measurement is preceded by a shimming process that homogenizes the magnetic field, which may lead to small deviations. Second, side reactions could lead to products other than 5-FU. Since no additional signals appeared, as mentioned above, this second effect can be neglected.

The orange curve represents the increasing amount of product in the reaction mixture. Here, the integral value increases from 75.52E9 at the beginning to 269.8E9 at the end, which is also in accordance with the literature. The reason for the curve's behavior is stated above: as the substrate is consumed over time, more and more product is formed, leading to a higher amount of product in the reaction mixture and thus to an increasing integral value.

In summary, the integrals give a first indication of a successful reaction since the product is forming.

Conversion Rate of the Reactions Using the Wild Type Fusion Protein

An important quantity for reactions is the so-called conversion rate X, which describes how much of the substrate has been converted at a given time. To obtain this quantity, the concentrations ci of the substrate and the product have to be computed for each time point as described in the protocol on our Experiments page. To get the concentrations, the amounts of substance ni need to be divided by the volume of the reaction mixture (see protocol). In the next step, the conversion rate can be calculated using the following equation where ctotal = 20 mM.
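
A minimal sketch of such a calculation is given below. It assumes the conventional definition X = 1 - c_substrate / c_total, which is equivalent to c_product / c_total when no side products form; this definition is an assumption and not taken from the protocol. Only ctotal = 20 mM comes from the text, and the example amount and volume are hypothetical.

```python
# Sketch only: conversion from substrate concentration.
# ASSUMPTION: X = 1 - c_substrate / c_total (conventional definition).

C_TOTAL_MM = 20.0  # total concentration stated in the text, in mM


def concentration_mm(amount_mmol: float, volume_l: float) -> float:
    """Concentration in mM from amount of substance (mmol) and volume (L)."""
    if volume_l <= 0:
        raise ValueError("volume must be positive")
    return amount_mmol / volume_l


def conversion(c_substrate_mm: float, c_total_mm: float = C_TOTAL_MM) -> float:
    """Fraction of substrate converted, between 0 (none) and 1 (all)."""
    if c_total_mm <= 0:
        raise ValueError("total concentration must be positive")
    return 1.0 - c_substrate_mm / c_total_mm


# Hypothetical example: 0.009 mmol of substrate left in 0.6 mL of reaction mixture
c_s = concentration_mm(0.009, 0.0006)
print("substrate concentration (mM):", c_s)
print("conversion X:", conversion(c_s))
```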



Enzyme Kinetics

The enzyme kinetics of cytosine deaminase followed Michaelis-Menten kinetics; however, the limited solubility of 5-fluorocytosine (5-FC) prevented us from achieving substrate concentrations sufficient to reach the enzyme’s maximum velocity (Vmax). This issue introduced variability, complicating the reliable determination of key kinetic parameters such as the Michaelis constant (Km) and Vmax. Small fluctuations in substrate concentration and enzyme activity further impacted measurement accuracy, making direct kinetic analysis challenging.
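
To illustrate why a solubility-limited substrate range hampers the determination of Vmax, the following sketch evaluates the Michaelis-Menten equation v = Vmax·[S] / (Km + [S]) and the fraction of Vmax reached at a given ratio [S]/Km. The ratios used are arbitrary illustrative values, not measured parameters of this enzyme.

```python
# Sketch only: Michaelis-Menten rate law with illustrative, non-measured values.

def mm_rate(s: float, vmax: float, km: float) -> float:
    """Reaction velocity v = Vmax * [S] / (Km + [S])."""
    if s < 0 or km <= 0:
        raise ValueError("substrate must be non-negative and Km positive")
    return vmax * s / (km + s)


def fraction_of_vmax(s_over_km: float) -> float:
    """Fraction of Vmax reached when [S] is a given multiple of Km."""
    if s_over_km < 0:
        raise ValueError("[S]/Km must be non-negative")
    return s_over_km / (1.0 + s_over_km)


# Even at [S] = 9 * Km the enzyme only runs at 90 % of Vmax, so the plateau
# cannot be observed directly if solubility caps [S] well below that.
for ratio in (0.5, 1.0, 9.0):
    print(f"[S]/Km = {ratio}: v/Vmax = {fraction_of_vmax(ratio):.3f}")
```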

To comprehensively evaluate the wild type enzyme's performance, three key metrics were used: (1) the time required to reach half-maximal product formation t1/2RP, (2) the initial reaction rate (initial velocity) v0, and (3) the total product formed over time, which served as a measure of relative efficiency RFA. These parameters were determined for each method (HPLC and NMR) by generating graphs in which relative substrate and product concentrations were plotted against time. The time points at which half-maximal product formation was reached were determined graphically and are presented in Figure 10, which illustrates the substrate and product concentrations plotted against time for the wild type fusion protein.

To quantify the enzyme's relative efficiency, the area under the curve was calculated for each method over a defined time interval, assuming linearity between data points. This area represents the total product formed over time and serves as a measure of the enzyme’s efficiency. The initial reaction velocities were approximated using the linear slope of the first 80 seconds of the reaction for HPLC data. In NMR experiments, due to the lower enzyme concentrations and longer reaction times, the initial velocity was calculated over a longer interval. In Figure 11, the first data points for the wild type enzyme are plotted against time, with the slope of the linear regression representing the relative velocity.

Fig. 11: Graph of the relative amount of substance nrel plotted against the time t for A and B HPLC measurements of the wild type and C an NMR measurement of the wild type.
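
The three metrics described above can be sketched as follows: the area under the curve with the trapezoid rule (linearity between data points), the initial velocity as the regression slope over an early time window, and the half-maximal time by linear interpolation. The interpolation is a stand-in for the graphical determination used by the team, and the data points are synthetic.

```python
# Sketch only: the three metrics on synthetic progress-curve data.
# Linear interpolation replaces the graphical half-time determination.

def trapezoid_area(t, y):
    """Area under y(t), assuming linearity between neighbouring points."""
    return sum((t[i + 1] - t[i]) * (y[i] + y[i + 1]) / 2
               for i in range(len(t) - 1))


def initial_velocity(t, y, t_max):
    """Least-squares slope of all points with t <= t_max."""
    pts = [(ti, yi) for ti, yi in zip(t, y) if ti <= t_max]
    if len(pts) < 2:
        raise ValueError("need at least two points in the initial window")
    mean_t = sum(p[0] for p in pts) / len(pts)
    mean_y = sum(p[1] for p in pts) / len(pts)
    sxx = sum((p[0] - mean_t) ** 2 for p in pts)
    if sxx == 0:
        raise ValueError("time points in the window must differ")
    return sum((p[0] - mean_t) * (p[1] - mean_y) for p in pts) / sxx


def half_max_time(t, y):
    """First time at which y reaches half of its maximum (interpolated)."""
    half = max(y) / 2
    for i in range(len(t) - 1):
        if y[i] <= half <= y[i + 1] and y[i + 1] != y[i]:
            return t[i] + (half - y[i]) * (t[i + 1] - t[i]) / (y[i + 1] - y[i])
    raise ValueError("half-maximal value is not bracketed by the data")


# Synthetic relative product amounts over time (seconds)
time_s = [0, 40, 80, 160, 320]
n_rel = [0.0, 0.2, 0.4, 0.6, 0.8]

print("area under curve:", trapezoid_area(time_s, n_rel))
print("initial velocity (first 80 s):", initial_velocity(time_s, n_rel, 80))
print("half-maximal time (s):", half_max_time(time_s, n_rel))
```

The 80 s window matches the interval stated for the HPLC data; for NMR data a longer window would be passed as `t_max`.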


Conclusion

In conclusion, we successfully demonstrated that the wild type cytosine deaminase functions within the context of the fusion protein, effectively converting 5-fluorocytosine (5-FC) into its active product. Since the analytical techniques used are not sufficient to determine the product's structure unambiguously, further analyses should be performed for complete structural elucidation. For this reason, we contacted the mass spectrometry unit of the IOC (Institute of Organic Chemistry) to plan these for the future. However, as previously documented in the literature, the turnover rate of 5-FC for the wild type enzyme is suboptimal, and its affinity for cytosine is significantly higher. This poses a challenge for therapeutic applications, where selective activation of 5-FC is critical to avoid off-target effects in the body. As part of our protein engineering efforts, we sought to address this issue by improving both the catalytic activity and the selectivity of the enzyme. We used computational modeling to identify mutations that could enhance 5-FC conversion while reducing cytosine affinity. Details of these efforts can be found on the Dry Lab page. The laboratory results for these engineered variants are presented on the Results page for protein variants and on the respective part pages, which outline how our modifications translated into improved enzyme kinetics and selectivity.


References

[1] Aučynaitė, A., Rutkienė, R., Tauraitė, D., Meškys, R., Urbonavičius, J., 2018. Discovery of Bacterial Deaminases That Convert 5-Fluoroisocytosine Into 5-Fluorouracil. Frontiers in Microbiology 9. https://doi.org/10.3389/fmicb.2018.02375
[2] PDB Entry - 1RA0. https://doi.org/10.2210/pdb1RA0/pdb
[3] Hall, R.S., Fedorov, A.A., Xu, C., Fedorov, E.V., Almo, S.C., Raushel, F.M., 2011. Three-Dimensional Structure and Catalytic Mechanism of Cytosine Deaminase. Biochemistry 50, 5077–5085. https://doi.org/10.1021/bi200483k


Sequence and Features


Assembly Compatibility:
  • 10
    COMPATIBLE WITH RFC[10]
  • 12
    COMPATIBLE WITH RFC[12]
  • 21
    INCOMPATIBLE WITH RFC[21]
    Illegal BamHI site found at 115
    Illegal XhoI site found at 1396
  • 23
    COMPATIBLE WITH RFC[23]
  • 25
    INCOMPATIBLE WITH RFC[25]
    Illegal NgoMIV site found at 1254
    Illegal NgoMIV site found at 1341
  • 1000
    COMPATIBLE WITH RFC[1000]
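
A compatibility check of this kind can be sketched as a simple search for restriction enzyme recognition sites. The snippet below covers only the three enzymes flagged above, uses a made-up toy sequence rather than the sequence of this part, and reports 1-based start positions, which is an assumption about the Registry's convention. Because all three recognition sites are palindromic, scanning one strand is sufficient.

```python
# Sketch only: scan a DNA sequence for restriction sites flagged in the list above.
# The toy sequence is hypothetical; positions are reported 1-based (assumed convention).

SITES = {
    "BamHI": "GGATCC",
    "XhoI": "CTCGAG",
    "NgoMIV": "GCCGGC",
}


def find_sites(sequence: str, sites=SITES):
    """Return a list of (enzyme, 1-based start position), sorted by position."""
    seq = sequence.upper()
    hits = []
    for name, motif in sites.items():
        start = seq.find(motif)
        while start != -1:
            hits.append((name, start + 1))
            start = seq.find(motif, start + 1)  # allow overlapping matches
    return sorted(hits, key=lambda hit: hit[1])


toy_sequence = "ATGGGATCCAAACTCGAGTTTGCCGGCTAA"
for enzyme, position in find_sites(toy_sequence):
    print(f"Illegal {enzyme} site found at {position}")
```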