Part:BBa_K5007006

CsCBDAS-T2A-CsPT4-NoTP

Composite coding sequence for the two enzymes needed to synthesize cannabidiolic acid, a phytocannabinoid of pharmaceutical interest, from olivetolic acid and geranyl diphosphate. The two coding sequences are separated by the T2A self-cleaving sequence. The composite contains parts BBa_K5007004 and BBa_K5007001, joined by BBa_K1993019. This composite CDS is intended for eukaryotic chassis and is currently optimized for Saccharomyces cerevisiae. It is assembled without scars and is suited for cloning into plasmids and joining to other parts through Gibson Assembly.

Usage and Biology

Sequence and Features

Assembly Compatibility:

RFC[10]: compatible
RFC[12]: compatible
RFC[21]: incompatible
    Illegal BglII site found at 306
    Illegal BglII site found at 768
RFC[23]: compatible
RFC[25]: compatible
RFC[1000]: incompatible
    Illegal BsaI.rc site found at 2412
    Illegal SapI site found at 1986
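
The kind of check behind the compatibility report above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Registry's own checker: the function names are hypothetical, only the RFC[21] and RFC[1000] recognition sites are listed, and positions are assumed to be 1-based starts of the matched site.

```python
# Recognition sites that each assembly standard forbids inside a part.
ILLEGAL_SITES = {
    "RFC[21]": {"BglII": "AGATCT", "BamHI": "GGATCC", "EcoRI": "GAATTC", "XhoI": "CTCGAG"},
    "RFC[1000]": {"BsaI": "GGTCTC", "SapI": "GCTCTTC"},
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_illegal_sites(seq, standard):
    """List (site name, 1-based position) for every forbidden site in seq."""
    seq = seq.upper()
    hits = []
    for enzyme, site in ILLEGAL_SITES[standard].items():
        patterns = {enzyme: site}
        rc = reverse_complement(site)
        if rc != site:
            # Non-palindromic sites must also be searched on the other strand.
            patterns[enzyme + ".rc"] = rc
        for name, pattern in patterns.items():
            start = seq.find(pattern)
            while start != -1:
                hits.append((name, start + 1))
                start = seq.find(pattern, start + 1)
    return sorted(hits, key=lambda hit: hit[1])
```

Running `find_illegal_sites` on the part sequence with "RFC[21]" or "RFC[1000]" would list the sites to remove by synonymous substitution if compatibility with those standards is needed.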

Design Choices

This composite CDS combines the coding sequences for CsCBDAS and CsPT4, the enzymes responsible for the synthesis of cannabidiolic acid, the direct precursor to cannabidiol. The T2A self-cleaving sequence placed between the two coding sequences efficiently separates multiple enzymes translated from a single monocistronic mRNA, such as the one encoded by this part, but it is functional only in eukaryotes. For that reason, this composite is designed for eukaryotic chassis.

Figure 1: Diagram for part BBa_K5007006

As submitted, this composite is optimized for Saccharomyces cerevisiae, but it is designed so that it can be readily re-optimized for other eukaryotes, such as Pichia pastoris, Aspergillus nidulans or Nicotiana benthamiana. This allows the part to be redesigned, improved and developed into different cassettes and chassis. This ease of use is intended to support the further development of an open-source cannabinoid cell factory and to facilitate access to these important medications for patients in need.

Structural modeling

The self-cleaving T2A sequence has one drawback. It is highly efficient at separating the proteins encoded in a single monocistronic mRNA, but the separated proteins retain T2A fragments at the N-terminus, the C-terminus, or both. To ensure that these fragments do not affect each enzyme’s function, structural modeling and model evaluation were carried out for each enzyme, with each possible position of the T2A fragments.
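
To make the origin of these fragments concrete, the following Python sketch splits a polyprotein at a 2A peptide. It assumes the canonical T2A peptide EGRGSLLTCGDVEENPGP and separation between its final glycine and proline. The exact sequence of BBa_K1993019 may differ (for example, by including a linker), and the function name and the short placeholder proteins are hypothetical.

```python
# Assumption: canonical T2A peptide; the registry part may add a linker.
T2A = "EGRGSLLTCGDVEENPGP"


def split_at_2a(polyprotein, peptide=T2A):
    """Split a polyprotein between the final G and P of each 2A peptide.

    The upstream product keeps the 2A peptide minus its last residue as a
    C-terminal tail; the downstream product starts with the remaining P.
    """
    products = []
    rest = polyprotein
    while peptide in rest:
        cut = rest.index(peptide) + len(peptide) - 1
        products.append(rest[:cut])
        rest = rest[cut:]
    products.append(rest)
    return products


# Placeholder sequences standing in for the two enzymes (not real sequences).
upstream, downstream = split_at_2a("MAAA" + T2A + "MBBB")
```

Under these assumptions, the upstream enzyme carries a 17-residue C-terminal tail and the downstream enzyme gains a single N-terminal proline, which is why the order of the two coding sequences determines which terminus of each enzyme is modified.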

Comparative structural modeling was carried out with the MODELLER extension for PyMOL, using the previously modeled native enzyme as the template and the modified sequence as the query. Template and query were aligned with MUSCLE, and the generated models were color-annotated.

The energetic and geometric quality of each generated model was evaluated with QMEAN and MolProbity to ensure that the modified proteins had not lost important structural and stability features. This composite was built with the enzymes’ coding sequences in their best configuration, that is, the arrangement in which the T2A fragments have the least effect on each protein’s structure and function.

The enzymes encoded in this composite are depicted below, with the T2A fragments shown in red.

Figure 2: Cartoon-surface models of CsPT4 (left) and CsCBDAS (right). T2A fragments shown in red.

Phylogenetics and docking

For a long time, Cannabis, cannabinoids and related compounds were taboo and illegal. Although that is still the case in many countries and regions, a substantial amount of research is now being conducted on these pharmacologically relevant compounds. Because efforts to elucidate the mechanism and structure of the enzymes that synthesize cannabinoids are relatively recent, few crystal structures are available to the scientific community, and information on the proteins’ catalytic and binding positions is also scarce.

Because the proteins encoded in this composite were modified, it was necessary to verify that the modification would not compromise the structural features needed for binding and converting substrates and cofactors. A phylogenetic analysis of each protein was therefore performed, and its results were cross-examined with docking studies of substrates and cofactors.

The phylogenetic analysis began with a BLAST search using the FASTA sequence of each protein in this composite. At least five sequences were selected and aligned with each protein, and the conservation of catalytic and binding positions across these similar proteins from different species was evaluated. Catalytic and binding pockets that are well characterized in other proteins were also present in the proteins of this composite, so these conserved regions were inferred to be the sites of interest in the encoded enzymes. The binding and catalytic sites of most proteins in this composite had not been well characterized, so this analysis provides useful information both for this project and for the wider scientific community.
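
The conservation step described above can be illustrated with a minimal Python sketch that scores each column of a multiple sequence alignment. It is not the tool used for this part: the function names, the scoring rule (fraction of sequences sharing the most common residue, with gaps counted as mismatches) and the toy alignment are all assumptions made for illustration.

```python
from collections import Counter


def column_conservation(aligned):
    """Score each alignment column by the share of its most common residue."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    scores = []
    for column in zip(*aligned):
        residues = [res for res in column if res != "-"]
        if not residues:
            scores.append(0.0)  # column made only of gaps
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(column))
    return scores


def conserved_positions(aligned, threshold=1.0):
    """Return 1-based alignment columns whose score reaches the threshold."""
    scores = column_conservation(aligned)
    return [i + 1 for i, score in enumerate(scores) if score >= threshold]


# Toy alignment (hypothetical sequences, not the real homologs).
toy = ["MKHDA", "MRHDA", "MKH-A"]
```

With the toy alignment, columns 1, 3 and 5 are fully conserved. In practice, positions of this kind that coincide with characterized catalytic or binding residues in homologs would be the candidates mapped onto the modeled structure.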

Docking studies were performed with HADDOCK, ClusPro 2.0 and CavityPlus, and the resulting models were superimposed in PyMOL to generate the final models.

Figure 3: Phylogenetic analysis performed for CsPT4. Substrate binding sites highlighted in green, cofactor binding sites highlighted in red.

Figure 4: Phylogenetic analysis performed for CsCBDAS. Substrate binding sites highlighted in green, cofactor binding sites highlighted in red.

The phylogenetic analysis corroborated the docking models, indicating that the modified proteins are structurally predicted to behave normally. The following images show the docking results, cross-checked against the phylogenetic analysis and the models generated by the molecular docking servers.

Figure 5: Docking model for CsPT4 in this composite. Main chain shown in purple, cavities shown in tan, phylogenetic analysis results in red, substrates shown in ball-and-stick models. Cavity and substrate are in the vicinity of the phylogenetic analysis’ predicted positions.

Figure 6: Docking model for CsCBDAS in this composite. Main chain shown in purple, cavities shown in tan, phylogenetic analysis results in red, substrates shown in ball-and-stick models. Cavity and substrate are in the vicinity of the phylogenetic analysis’ predicted positions.
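
The statement that cavity and substrate lie "in the vicinity" of the predicted positions can be expressed as a simple distance test. The Python sketch below is one possible way to do so and is not the procedure used to produce the figures: the function names, the 5 Å cutoff and the example coordinates are assumptions.

```python
import math


def centroid(points):
    """Geometric center of a list of (x, y, z) coordinates."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def within_cutoff(ligand_atoms, site_atoms, cutoff=5.0):
    """True if any ligand atom lies within cutoff (in Å) of any site atom."""
    return any(
        math.dist(a, b) <= cutoff for a in ligand_atoms for b in site_atoms
    )


# Hypothetical coordinates in Å, standing in for atoms exported from a model.
ligand = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
predicted_site = [(4.0, 0.0, 0.0)]
distant_site = [(10.0, 0.0, 0.0)]
```

Here `within_cutoff(ligand, predicted_site)` is true and `within_cutoff(ligand, distant_site)` is false. The same test could be applied to the centroid of a detected cavity and the residues highlighted by the phylogenetic analysis.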

References

Luo X, Reiter MA, d'Espaux L, Wong J, Denby CM, Lechner A, Zhang Y, Grzybowski AT, Harth S, Lin W, Lee H, Yu C, Shin J, Deng K, Benites VT, Wang G, Baidoo EEK, Chen Y, Dev I, Petzold CJ, Keasling JD. Complete biosynthesis of cannabinoids and their unnatural analogues in yeast. Nature. 2019 Mar;567(7746):123-126. doi: 10.1038/s41586-019-0978-9. Epub 2019 Feb 27. Erratum in: Nature. 2020 Apr;580(7802):E2. PMID: 30814733.

Taura F, Sirikantaramas S, Shoyama Y, Yoshikai K, Shoyama Y, Morimoto S. Cannabidiolic-acid synthase, the chemotype-determining enzyme in the fiber-type Cannabis sativa. FEBS Lett. 2007 Jun 26;581(16):2929-34. doi: 10.1016/j.febslet.2007.05.043. Epub 2007 May 25. PMID: 17544411.

Kim JH, Lee SR, Li LH, et al. High cleavage efficiency of a 2A peptide derived from porcine teschovirus-1 in human cell lines, zebrafish and mice. PLoS One. 2011;6(4):e18556.