Part:BBa_K5007000
CsHCS1-T2A-CsOLS-T2A-CsOAC
Composite coding sequence for the enzymes needed to synthesize olivetolic acid, a cannabinoid precursor, from hexanoic acid, with the individual coding sequences separated by the T2A self-cleaving sequence. The composite contains parts BBa_K5007002, BBa_K5007003 and BBa_K4393003, separated by part BBa_K1993019.
This composite CDS is intended for eukaryotic chassis and is currently optimized for Saccharomyces cerevisiae. It is assembled without scars and is suited for cloning and assembly into plasmids and other parts through Gibson Assembly.
Usage and Biology
Sequence and Features
- RFC[10]: compatible
- RFC[12]: compatible
- RFC[21]: compatible
- RFC[23]: compatible
- RFC[25]: incompatible. Illegal NgoMIV sites found at 228 and 1531; illegal AgeI sites found at 1090 and 3271.
- RFC[1000]: incompatible. Illegal BsaI site found at 771.
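The RFC[25] and RFC[1000] conflicts listed above can be reproduced, or re-checked after re-optimizing the sequence for another chassis, with a simple motif scan. The following is a minimal sketch, not the Registry's own checker: the function names are hypothetical, the recognition sequences are the standard ones for NgoMIV, AgeI and BsaI, and the 1-based forward-strand position convention is an assumption that may differ from how the Registry reports positions.

```python
# Recognition sequences for the enzymes flagged on this page.
# NgoMIV and AgeI sites are palindromic; BsaI is not, so both strands are scanned.
ILLEGAL_SITES = {
    "NgoMIV": "GCCGGC",
    "AgeI": "ACCGGT",
    "BsaI": "GGTCTC",
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_sites(seq, sites=ILLEGAL_SITES):
    """List (enzyme, position) hits; positions are 1-based on the forward strand.

    The position convention is an assumption for illustration only.
    """
    seq = seq.upper()
    hits = []
    for enzyme, motif in sites.items():
        # A set avoids counting palindromic sites twice.
        for m in {motif, reverse_complement(motif)}:
            start = seq.find(m)
            while start != -1:
                hits.append((enzyme, start + 1))
                start = seq.find(m, start + 1)
    return sorted(hits, key=lambda hit: (hit[1], hit[0]))


# Toy sequence (not the part sequence) containing one site of each kind.
toy = "AAAGCCGGCTTTACCGGTAAAGAGACCAAA"
print(find_sites(toy))
```

Running the same scan on the actual part sequence, downloaded from the Registry, would show whether a silent codon change removes a given site without altering the encoded protein.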
Design Choices
This composite CDS was designed with the coding sequences for CsHCS1, OLS and OAC. Together, these enzymes synthesize olivetolic acid, a precursor to several cannabinoids, from hexanoate. The composite is designed for eukaryotic chassis because of the T2A self-cleaving sequence placed between the enzymes’ coding sequences. T2A is highly efficient at separating multiple enzymes translated from a monocistronic mRNA, such as the one encoded by this part, but it is only functional in eukaryotes.
Figure 1: Diagram for part BBa_K5007000
This composite, as is, is optimized for Saccharomyces cerevisiae, but it is designed so that further optimization for other eukaryotes, such as Pichia pastoris, Aspergillus nidulans or Nicotiana benthamiana, should be straightforward, allowing this part to be redesigned, improved and developed into different cassettes and chassis. This ease of use is intended to support the further development of an open-source cannabinoid cell factory, facilitating access to these important medications for patients in need.
Structural Modeling
The addition of the self-cleaving T2A sequence is not entirely flawless. It is highly efficient at separating different proteins translated from a single monocistronic mRNA, but once the synthesized proteins are separated, they are left with T2A fragments at the N-terminus, the C-terminus, or both. To ensure that these fragments do not affect each enzyme’s function, structural modeling and model evaluation were carried out for each enzyme, with each possible position of the T2A fragments.
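Which fragment ends up on which enzyme follows from where 2A ribosome skipping occurs: between the final glycine and proline of the 2A peptide. The sketch below illustrates this under stated assumptions. It uses the canonical T2A peptide `EGRGSLLTCGDVEENPGP`; the exact variant in this part (for example, with an added linker) is not stated here and may differ. The function name and the short placeholder protein sequences are hypothetical.

```python
# Canonical T2A peptide (assumption: the variant used in this part may differ).
T2A = "EGRGSLLTCGDVEENPGP"


def split_polyprotein(proteins, linker=T2A):
    """Return the mature peptides expected after 2A ribosome skipping.

    Skipping occurs between the last Gly and the final Pro of the linker, so
    the upstream protein keeps the linker minus its last residue at its
    C-terminus, and the downstream protein starts with that Pro.
    """
    upstream_tail, downstream_head = linker[:-1], linker[-1]
    mature = []
    for index, protein in enumerate(proteins):
        head = downstream_head if index > 0 else ""
        tail = upstream_tail if index < len(proteins) - 1 else ""
        mature.append(head + protein + tail)
    return mature


# Placeholder sequences standing in for CsHCS1, OLS and OAC (not real sequences).
for name, peptide in zip(["first", "middle", "last"],
                         split_polyprotein(["MAAA", "MCCC", "MDDD"])):
    print(name, peptide)
```

In a three-enzyme construct, the first enzyme carries only a C-terminal fragment, the middle enzyme carries both, and the last carries only the N-terminal proline, which is why the order of the enzymes determines which modified variant of each has to be modeled.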
Structural comparative modeling was carried out via the MODELLER extension on PyMOL, using the respective, previously modeled, native enzyme as template and the modified sequence as query. The template and query were aligned via MUSCLE, and the generated models were color-annotated.
The energetic and geometric quality of each generated model was evaluated with QMEAN and MolProbity, to ensure that the modified proteins had not lost their important structural and stability features. This composite was developed with the enzymes’ coding sequences positioned in their best configuration, that is, the arrangement in which the T2A fragments have the least effect on each protein’s structure and function.
The enzymes encoded in this composite are depicted below, with the T2A fragments shown in red.
Figure 2: Cartoon-surface models of CsHCS1 (in green), OLS (in tan) and OAC (in purple). T2A fragments shown in red. OLS and OAC are shown in monomeric states.
Phylogenetics and docking
Cannabis and cannabinoids were, for a long time, taboo and illegal. Although that is still the case in many countries and regions, research on these fascinating and pharmacologically relevant compounds is finally being conducted. Because efforts to elucidate the mechanism and structure of the enzymes needed to synthesize cannabinoids are relatively recent, few crystal structures are available to the scientific community, and information on the proteins’ catalytic and binding positions is also limited.
As the proteins encoded in this composite were modified, it was necessary to ensure that the modification would not compromise the structural features needed for the binding and conversion of substrates and cofactors. A phylogenetic analysis of each protein was therefore performed, and the results were cross-examined with docking studies of substrates and cofactors.
The phylogenetic analysis began with a BLAST search using the FASTA sequence of each protein in this composite. At least five sequences were selected and aligned with each protein, and the conservation of catalytic and binding positions across these similar proteins from different species was evaluated. The catalytic and binding pockets that are well characterized in other proteins were also present in the proteins contained in this composite, so it was possible to infer that these conserved regions are the sites of interest in the enzymes encoded here. It is worth noting that the binding and catalytic sites of most proteins in this composite were not well characterized, so this analysis provided important information not only for this project but also for the wider scientific community.
Docking studies were performed with HADDOCK, ClusPro 2.0 and CavityPlus, and the resulting models were superimposed in PyMOL to generate our final models.
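The conservation check described above can be illustrated with a simple per-column score over an existing alignment. This is a minimal sketch under stated assumptions, not the procedure actually used for this part: the function names, the gap character `-`, the identity-based score and the threshold are all illustrative choices, and the toy alignment is invented.

```python
from collections import Counter


def column_conservation(aligned):
    """Fraction of sequences sharing the most common residue in each column.

    Gaps ('-') are ignored when picking the most common residue but still
    count in the denominator, so gappy columns score lower.
    """
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    scores = []
    for column in zip(*aligned):
        residues = [r for r in column if r != "-"]
        if not residues:
            scores.append(0.0)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(column))
    return scores


def conserved_positions(aligned, threshold=1.0):
    """Return 1-based alignment columns whose score meets the threshold."""
    return [i + 1 for i, score in enumerate(column_conservation(aligned))
            if score >= threshold]


# Invented toy alignment; real input would come from the aligned BLAST hits.
toy_alignment = ["MKCH-A", "MRCHTA", "MKCHSA"]
print(conserved_positions(toy_alignment))
```

Columns that stay fully conserved and coincide with catalytic or binding residues annotated in a well-characterized homolog are the candidates to compare against the docking results.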
Figure 3: Phylogenetic analysis performed for CsHCS1. Substrate binding sites highlighted in green, cofactor binding sites highlighted in red.
Figure 4: Phylogenetic analysis performed for OLS. Catalytic residue highlighted in yellow.
Figure 5: Phylogenetic analysis performed for OAC. Substrate binding sites highlighted in green, cofactor binding sites highlighted in red. Catalytic residue highlighted in yellow.
The phylogenetic analysis supported the docking models, indicating that the modified proteins are predicted to retain their normal structural behavior. The following images show our docking results, obtained by cross-examining the phylogenetic analysis results with the models generated by the molecular docking servers.
Figure 6: Docking model for CsHCS1 in this composite. Main chain shown in purple, cavities shown in tan, phylogenetic analysis results in red, substrate shown in green. Cavity and substrate are in the vicinity of the positions predicted by the phylogenetic analysis.
Figure 7: Docking model for OLS (monomeric state) in this composite. Main chain shown in cyan, cavities shown in yellow, phylogenetic analysis results in red, substrates and products shown as ball-and-stick models. Cavity and substrate are in the vicinity of the positions predicted by the phylogenetic analysis.
Figure 8: Docking model for OAC (dimeric state) in this composite. Main chain shown in green, cavities shown in yellow, phylogenetic analysis results in red, substrates and products shown as pink ball-and-stick models. Substrates are in the vicinity of the positions predicted by the phylogenetic analysis.
Proteins: This part contains the CDS for the Cannabis sativa acyl-activating enzyme 1, 3,5,7-trioxododecanoyl-CoA synthase and olivetolic acid cyclase, which work sequentially to produce olivetolic acid, a precursor in cannabinoid biosynthesis. The final enzymes will carry T2A fragments at the N-terminus, the C-terminus, or both. The design of these parts is based on the references doi:10.1038/s41586-019-0978-9, doi:10.1016/j.jbiotec.2017.07.008 and doi:10.1371/journal.pone.0018556.