Part:BBa_K316012

TEV protease S219P autocatalysis resistant variant

TEV protease S219P autocatalysis resistant variant. This part has been reversed onto the 3' strand to reduce any read-through caused by upstream elements.

Introduction:

This is the nuclear inclusion protease endogenous to Tobacco Etch Virus, which the virus uses late in its lifecycle to cleave polyprotein precursors. The recognition sequence is ENLYFQG/S (ref. 1), and cleavage occurs between Q and G or between Q and S. Due to its stringent sequence specificity, TEV protease is commonly used to cleave genetically engineered proteins.
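The recognition rule above can be illustrated with a short sketch. It assumes a plain one-letter protein string and only the canonical ENLYFQ(G/S) motif; the fusion sequence and the function name `cleave_at_tev_sites` are hypothetical and chosen for illustration only.

```python
import re

# Canonical TEV recognition motif: ENLYFQ followed by G or S.
# The lookahead keeps G/S out of the match, so the match ends at the scissile bond.
TEV_SITE = re.compile(r"ENLYFQ(?=[GS])")


def cleave_at_tev_sites(protein: str) -> list[str]:
    """Split a protein sequence after the Q of every ENLYFQ(G/S) motif."""
    fragments = []
    start = 0
    for match in TEV_SITE.finditer(protein):
        cut = match.end()  # index just after the Q residue
        fragments.append(protein[start:cut])
        start = cut
    fragments.append(protein[start:])
    return fragments


# Hypothetical fusion: His-tag, TEV site, then a made-up target peptide.
fusion = "MHHHHHHENLYFQGMKTAYIAK"
fragments = cleave_at_tev_sites(fusion)
print(fragments)
```

With this rule the G or S residue stays on the C-terminal fragment, which is why a protein released by TEV cleavage typically starts with G or S.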

Uses:

TEV protease is used to cleave fusion proteins. It is useful because of its high degree of specificity (ref. 1) and its potential for use in both in vivo and in vitro applications.

Auto-inactivation:

Wild-type TEV protease also cleaves itself between Met 218 and Ser 219 (ref. 2). This leads to auto-inactivation of the TEV protease and progressive loss of activity. The rate of inactivation is proportional to the concentration of protease. More stable mutants have been produced by the single amino acid substitutions S219V (AGC (serine) to GTG (valine)) and S219P (AGC (serine) to CCG (proline)) (ref. 3).

Table I.

Kinetic parameters for wild-type and mutant TEV proteases with the peptide substrate TENLYFQSGTRR-NH2. From the original paper by Kapust et al., 2001 (ref. 3).

Enzyme	K_m (mM)	k_cat (s^-1)	k_cat /K_m (mM^-1 s^-1)
Wild type	0.061 ± 0.010	0.16 ± 0.01	2.62 ± 0.46
S219V	0.041 ± 0.010	0.19 ± 0.01	4.63 ± 1.16
S219P	0.066 ± 0.008	0.09 ± 0.01	1.36 ± 0.22

S219V* - retains the same activity as wild type

S219P* - virtually impervious to autocatalysis
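As a quick consistency check on Table I, the catalytic efficiency column can be recomputed from the K_m and k_cat columns. This is a minimal sketch that uses only the central values from the table and ignores the reported uncertainties; the dictionary and function names are illustrative.

```python
# (K_m in mM, k_cat in s^-1), central values copied from Table I.
KINETICS = {
    "Wild type": (0.061, 0.16),
    "S219V": (0.041, 0.19),
    "S219P": (0.066, 0.09),
}


def catalytic_efficiency(km_mM: float, kcat_per_s: float) -> float:
    """Return k_cat / K_m in mM^-1 s^-1."""
    return kcat_per_s / km_mM


efficiency = {
    name: catalytic_efficiency(km, kcat) for name, (km, kcat) in KINETICS.items()
}

wild_type = efficiency["Wild type"]
for name, value in efficiency.items():
    # Also report each enzyme relative to wild type.
    print(f"{name}: {value:.2f} mM^-1 s^-1 ({value / wild_type:.2f}x wild type)")
```

The recomputed values agree with the k_cat/K_m column of the table to two decimal places.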

Sequence and Features

Assembly Compatibility:

COMPATIBLE WITH RFC[10]
COMPATIBLE WITH RFC[12]
COMPATIBLE WITH RFC[21]
COMPATIBLE WITH RFC[23]
INCOMPATIBLE WITH RFC[25] (illegal AgeI site found at 32)
COMPATIBLE WITH RFC[1000]

References

1 pmid=8179197
2 pmid=7793070
3 pmid=2047602


For more information about our project please visit our wiki.

Contribution

Group: TecCEM, iGEM 2024
Authors: Giovana Andrea Osorio León, Ana Laura Torres Huerta, Aurora Antonio Pérez, Lorena Gallegos Solís
Summary: The sequences of TEV proteases S219P and 1LVM were compared to determine if TEV S219P exhibits greater efficiency and interaction with a substrate. Sequence alignment was performed using ESPript3, and both enzymes were modeled with AlphaFold for structural analysis. Additionally, heatmaps of protein stability were generated using Protein-Sol and molecular docking was carried out with HADDOCK 2.4 and BIOVIA Discovery Studio Visualizer to compare their interactions.

Documentation

The Tobacco etch virus (TEV) protease is the 27 kDa catalytic domain of the nuclear inclusion a (NIa) protein of TEV. It specifically recognizes the amino acid sequence ENLYFQG/S and cleaves between the Q and G/S residues. TEV belongs to the Potyviridae family, which includes other positive-strand RNA viruses. The TEV genome is translated into a single large polyprotein that is subsequently processed by virus-encoded proteins with proteolytic activity. Despite its substrate specificity, the use of TEV protease is limited by its self-inactivation through autocleavage and by its low solubility during purification, which is caused by its high hydrophobicity ^[1].

Studies have shown that TEV protease undergoes self-cleavage near its C terminus, after residue 218, at a site that does not follow the canonical sequence. This results in a truncated protein with reduced activity ^[2]. According to UniProt, the active-site residues of the TEV polyprotein are at positions 214, 223, 256, 649, 722, 2083, 2118, and 2188 ^[3].

In this study, the sequences of the mutant TEV protease S219P and the wild-type 1LVM were compared to determine whether TEV S219P exhibits greater efficiency and interaction with its recognition site or synthetic substrate (ENLYFQG). Structure-based sequence alignment was performed using ESPript3, and both enzymes were modeled with AlphaFold for structural analysis. Additionally, heatmaps of protein stability were generated using Protein-Sol, and molecular docking was carried out with HADDOCK 2.4 and BIOVIA Discovery Studio Visualizer to compare their interactions. Finally, so that the modeling and docking results can be corroborated experimentally, we designed the cloning of TEV S219P into the plasmid pET-24b(+) using Benchling.

Initially, the sequences of TEV protease S219P and TEV 1LVM were analyzed using Clustal Omega to generate a protein alignment in ALN format, which was subsequently processed with ESPript3. ESPript3 provides a detailed representation of sequence similarities and secondary structure information for the aligned sequences. As shown in Figure 1, both sequences exhibit a high degree of similarity. However, the TEV S219P sequence consists of 237 amino acids, while the TEV 1LVM sequence contains 229 amino acids, indicating that TEV S219P is longer.

In panel A, differences are observed between amino acids 1-8 of TEV 1LVM and TEV S219P, attributed to the presence of a His-tag, as well as at position 10. Additionally, amino acids 229-237 in TEV S219P represent seven extra residues not present in TEV 1LVM. The differing amino acids between the two proteases are highlighted in yellow. The catalytic sites for both proteases are located at amino acids 214 and 223, as indicated by green triangles in panel A of Figure 1. The peptidase C4 domain, marked in dark blue, spans residues 1-218, corresponding to the coding sequence shared by both proteins.

In panel B, the domains of chain A of TEV 1LVM, selected for comparison with TEV S219P, are illustrated. The domains of 1lvmA01 are shown in aqua, while those of 1lvmA02 are highlighted in pink. Table 1 in Figure 1 provides a comprehensive description of these domains.

Figure 1. Protein alignment of TEV 1LVM with TEV S219P and characteristics of the TEV 1LVM domains in chain A. The sequence of TEV S219P obtained from the iGEM Registry (BBa_K316012) does not contain the S219P mutation as stated. Therefore, the subsequent analyses were conducted with the mutation introduced as specified. Image A retrieved from ESPript. [6]
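The degree of similarity described for the alignment can be quantified as percent identity over the aligned, non-gap columns. The sketch below assumes two gapped strings of equal length, as produced by any pairwise or multiple alignment; the toy sequences are made up and are not the real TEV sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gap columns, e.g. an N-terminal His-tag overhang
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# Toy alignment (hypothetical): a tagged sequence versus an untagged one
# that differs at a single aligned position.
toy_a = "MHHHGESLFKGS"
toy_b = "----GESLFKGP"
identity = percent_identity(toy_a, toy_b)
print(f"{identity:.1f}% identity over aligned positions")
```

Excluding gap columns means that a tag or a C-terminal extension present in only one sequence does not lower the identity score; whether that is the desired convention depends on the comparison being made.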

To evaluate the impact of the sequence differences between the mutant and the wild-type protein, both structures were modeled and their interaction with the target peptide was simulated. ChimeraX was used to obtain 3D models of both TEV proteases with its AlphaFold structure-prediction tool, as shown in Figure 2. Panel A shows the model obtained for TEV protease S219P, and panel B shows the model for chain A of TEV 1LVM. The two proteases were superimposed to compare them, as shown in panel C, with TEV protease S219P in light blue and TEV 1LVM in purple to show where the structures align or differ. Most of the chains of both proteins align, including the main beta sheets and alpha helices, indicating that the S219P mutation does not significantly affect the native structure of the enzyme. However, conformational differences between TEV S219P and TEV 1LVM are observed at the ends of the chains. These differences may influence the function or stability of the protein.

Subsequently, the Protein-Sol program was used to generate energy and charge heatmaps, providing insights into the stability of the proteases (TEV 1LVM and TEV S219P) under varying pH and ionic strength (salt concentration) conditions. The Debye-Hückel (DH) method was employed to model interactions between ionizable groups and to calculate pKa values. Each heatmap consists of 91 combinations of pH and ionic strength. For qualitative comparison with the experiment, the CH2 and CH3 domains of the IgG1 structure (PDB 1HZH) were used, as pH and ionic strength variations have been reported to affect the stability of these domains in IgG^[4].

TEV 1LVM

The heatmap results for TEV 1LVM are shown in Figure 3. The color scale ranges from red (positive values) to green (negative values). Positive values (ranging from red to orange) suggest conditions where the protease has higher energy, potentially indicating instability or less favorable folding. Conversely, negative values (green) indicate more stable conditions with lower energy.

At a pH between 2.0 and 4.0 and high ionic strengths (0.1–0.3 M), the TEV 1LVM protease displays higher energy values, which may indicate enzyme instability. As the pH increases beyond 5.0 and the ionic strength decreases, the energy values drop, suggesting that these conditions favor a more stable folded state of the protease. However, the optimal conditions for this enzyme are observed at a pH of 6.5 to 8.0 and low ionic strengths (0.005–0.1 M), which promote the stability of the TEV 1LVM protease^[4].

**Figure 3. Energy heatmap for TEV 1LVM. Image obtained from Protein-Sol.[4]**

Figure 4 displays the net charge per amino acid (e/aa) under various pH and ionic strength conditions. The color scale ranges from blue (positive charges) to red (negative charges). At low pH values (between 2.0 and 4.0), the protease exhibits a positive charge per amino acid. As the pH increases (between 5.0 and 8.0), the charges approach zero, particularly in the pH range of 6.0 to 8.0. This suggests that within these pH ranges, the protease is near its isoelectric point, where the net charge is essentially neutral. Figure 4 indicates that the TEV protease (1LVM) is more positively charged under acidic conditions (low pH) and approaches a neutral charge under more neutral to basic conditions (pH 6.0–8.0) ^[4].

**Figure 4. Charge heatmap for TEV 1LVM. Image obtained from Protein-Sol [4].**
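The pH trend in the charge heatmap can be reproduced qualitatively with a simple Henderson-Hasselbalch estimate. This is only a sketch under strong assumptions: the pKa values are generic textbook-style approximations, ionic strength is ignored, and it is not the Debye-Hückel calculation that Protein-Sol performs. The example peptide is the TEV recognition sequence.

```python
# Approximate model pKa values (assumed, not Protein-Sol's parameters).
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0, "N_term": 9.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.2, "C": 8.3, "Y": 10.1, "C_term": 3.1}


def net_charge(sequence: str, ph: float) -> float:
    """Estimate the net charge of a peptide at a given pH."""
    counts = {aa: sequence.count(aa) for aa in "KRHDECY"}
    counts["N_term"] = 1
    counts["C_term"] = 1
    charge = 0.0
    for group, pka in POSITIVE_PKA.items():
        # Fraction of the basic group that is protonated (+1 each).
        charge += counts[group] / (1.0 + 10 ** (ph - pka))
    for group, pka in NEGATIVE_PKA.items():
        # Fraction of the acidic group that is deprotonated (-1 each).
        charge -= counts[group] / (1.0 + 10 ** (pka - ph))
    return charge


peptide = "ENLYFQG"
for ph in (2.0, 4.0, 6.0, 8.0):
    per_residue = net_charge(peptide, ph) / len(peptide)
    print(f"pH {ph}: {per_residue:+.3f} e/aa")
```

Dividing by the sequence length gives charge per amino acid (e/aa), the same unit used on the heatmap axis, so the sign change from positive at acidic pH to negative at higher pH can be compared directly.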

TEV Protease S219P

Figure 5 shows that TEV S219P is most stable (negative energy) at pH 6.5 to 8.0, particularly at low ionic strengths (0.005–0.1 M). In contrast, at low pH (2.0–4.0) the enzyme displays very high energy values (deep red) across the entire range of ionic strengths, indicating significant instability or a propensity for denaturation. This instability is attributed to the S219P mutation, in which a serine (AGC) was replaced with a proline (CCG). Proline can disrupt protein structure because it is the only amino acid whose side chain is covalently bonded to the alpha amine, forming a rigid cyclic structure. This ring introduces rigid turns in the peptide chain, which can also stabilize the edges of beta-sheets and alpha-helices and thereby facilitate protein folding. Additionally, peptides with consecutive proline residues can fold into a characteristic polyproline helix (PII helix), a common motif in protein-protein interactions ^[5].

However, no such effect was observed in the 3D model generated by AlphaFold, as both sequences are identical except for a single amino acid change and the addition of a histidine tag. Furthermore, the predicted model might not capture such stability changes. Several studies on peptide bond formation with proline suggest that proline can slow protein synthesis by increasing the entropy of peptide bond formation. In general, the substitution of any amino acid can destabilize the native conformation of the protein ^[5].

**Figure 5. Energy heatmap for TEV S219P. Image obtained from Protein-Sol [4].**

Figure 6 displays the net charge map for TEV S219P. The protein is unstable at low pH, where the energies are high. However, the magnitude of these values is lower than for TEV 1LVM, which could indicate that TEV S219P is slightly more tolerant of acidic conditions. The charge of the protein decreases as both pH and ionic strength increase, which is typical behavior for many proteins. The smaller change in charge at higher pH suggests that the residues affecting the net charge (probably basic groups) reach a point where their protonation is minimal ^[4].

**Figure 6. Charge heatmap for TEV S219P. Image obtained from Protein-Sol [4].**

Molecular docking to evaluate binding to the TEV recognition site

Proposal for in silico Cloning Design in Expression Vector

To corroborate the information obtained from modeling and docking, experiments are necessary. We planned to clone and express this mutant protease; however, this was not possible due to lack of time. We present our cloning design here and make it available in case someone wants to clone this BioBrick.

Benchling was used for the in silico cloning of the mutant TEV protease described above, inserting its sequence into the pET-24b(+) vector (which confers kanamycin resistance). The steps followed for cloning are described below.

First, the pET-24b(+) plasmid file was downloaded from SnapGene and imported into Benchling for further modification. The BioBrick sequence was downloaded from the iGEM Registry and copied into a new Benchling file. In that file, the restriction sites for the enzymes NdeI and XhoI were added, together with the extra base pairs that each enzyme requires: 3 bases next to the NdeI site at the 5' end and 3 bases next to the XhoI site at the 3' end, as shown in Figure 9 and Figure 10. In addition, 21 extra base pairs were added on each side of the sequence (5' and 3') to facilitate in silico cloning into the pET-24b(+) vector and to increase the sequence length to qualify for the IDT sponsorship. Afterward, the modified TEV protease sequence was translated in the forward direction.

Figure 9. Addition of the restriction site for NdeI and the 24 extra base pairs for the in silico cloning.

Figure 10. Addition of the restriction site for XhoI and the 24 extra base pairs for the in silico cloning.
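The flanking scheme described above (21 bp of padding, a 3-base spacer, then the restriction site on each side) can be sketched as a small helper. The padding and spacer values are taken from the ends of the insert sequence listed in Table 5; the coding sequence used here is a short hypothetical stand-in, and `build_insert` is an illustrative name, not a Benchling function.

```python
NDEI_SITE = "CATATG"
XHOI_SITE = "CTCGAG"


def build_insert(coding_sequence: str, pad_5prime: str, pad_3prime: str,
                 spacer: str = "AGC") -> str:
    """Flank a coding sequence with NdeI/XhoI sites, spacers, and padding."""
    cds = coding_sequence.upper()
    # Both sites are palindromic, so checking one strand is sufficient.
    for name, site in (("NdeI", NDEI_SITE), ("XhoI", XHOI_SITE)):
        if site in cds:
            raise ValueError(f"internal {name} site found; digestion would cut the insert")
    return (pad_5prime.upper() + spacer + NDEI_SITE + cds
            + XHOI_SITE + spacer + pad_3prime.upper())


# Padding copied from the ends of the Table 5 sequence (21 bp each).
PAD_5 = "acgtggggcaaatttccctat"
PAD_3 = "ATCGCTAGGCACACGTCGCCG"
# Hypothetical short stand-in for the protease sequence.
toy_cds = "ATGGGAAGATCGTTGTTTAAAGGACCA"

construct = build_insert(toy_cds, PAD_5, PAD_3)
print(len(construct), construct)
```

Each side adds 21 + 3 + 6 = 30 bases, so the construct is 60 bases longer than the sequence it wraps. Rejecting inserts with internal NdeI or XhoI sites up front mirrors the check one would make before choosing this enzyme pair.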

Next, a forward and a reverse primer were designed with the characteristics listed in Table 4.

Table 4. Characteristics of the forward and reverse primers.
Primer	Sequence	GC (%)	Length (bp)	Min ΔG Homodimer (kcal)	Min ΔG Monomer (kcal)	Tm (°C)
Primer Forward	agccatatgaatgggaag	44.44	18	-6.4	-2.4	54.3
Primer Reverse	cacagttaatgaacctcg	44.44	18	-3.9	-0.3	54.4
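The GC content and length columns of Table 4 can be recomputed directly from the primer sequences. In this sketch the reverse primer is written with c at position 15, matching the 3' end of the insert sequence given in Table 5 (…cacagttaatgaacCTCGAG…); that reading is an assumption about the intended sequence. Melting temperature and ΔG values depend on the thermodynamic model used by the analysis tool and are not recomputed here.

```python
def gc_percent(sequence: str) -> float:
    """Percentage of G and C bases in a DNA sequence."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)


PRIMERS = {
    "forward": "agccatatgaatgggaag",
    "reverse": "cacagttaatgaacctcg",  # assumed reading, see note above
}

for name, seq in PRIMERS.items():
    print(f"{name}: length {len(seq)} nt, GC {gc_percent(seq):.2f}%")

# The forward primer carries the full NdeI site (CATATG).
has_ndei = "CATATG" in PRIMERS["forward"].upper()
print("NdeI site in forward primer:", has_ndei)
```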

After both primers were designed, they were analyzed with the OligoAnalyzer tool to search for hetero-dimers, as shown in Figure 11.

**Figure 11. Hetero-dimers analyzer for both primers.**

Next, an in silico digestion with these enzymes was performed on pET-24b(+) and on the file containing the TEV protease S219P sequence. For both the pET-24b(+) plasmid and the construct, buffers 2.1, 3.1, and 4/CS gave an efficiency of 100%. After both digests, pET-24b(+) was ligated with the TEV protease S219P sequence using the assembly wizard tool. Finally, we confirmed that the cloning was successful by performing a reverse translation of the open reading frame after the lac operator: the amino acids of the modified TEV protease sequence were correctly translated, and the histidine tag that will be used to purify the protease was conserved as in the original plasmid, as shown in Figure 12.

**Figure 12. Successful cloning file of TEV S219P in pET-24b(+).**
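The double digest can be mimicked on a linear top strand with a few lines of code. This sketch assumes the standard cut positions for NdeI (CA^TATG) and XhoI (C^TCGAG), ignores the complementary strand and the resulting overhangs, and uses a short hypothetical sequence rather than the real insert or vector.

```python
# Recognition site and top-strand cut offset for each enzyme.
CUT_OFFSETS = {"NdeI": ("CATATG", 2), "XhoI": ("CTCGAG", 1)}


def digest(sequence: str) -> list[str]:
    """Cut a linear top strand at every NdeI and XhoI site."""
    seq = sequence.upper()
    cuts = []
    for site, offset in CUT_OFFSETS.values():
        pos = seq.find(site)
        while pos != -1:
            cuts.append(pos + offset)
            pos = seq.find(site, pos + 1)
    cuts.sort()
    fragments = []
    start = 0
    for cut in cuts:
        fragments.append(seq[start:cut])
        start = cut
    fragments.append(seq[start:])
    return fragments


# Hypothetical mini-construct: spacer, NdeI site, payload, XhoI site, spacer.
toy = "GGGAGCCATATGAAACCCGGGCTCGAGAGCTTT"
pieces = digest(toy)
print(pieces)
```

The middle fragment is the piece that would be ligated into the vector cut with the same two enzymes; using two different enzymes fixes the orientation of the insert.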

The Benchling files and other details of both cloning proposals are explained in Table 5.

Table 5. Cloning proposal of TEV S219P in pET-24b(+).
Proposal	Description and link	Sequence	Length (bp)
TEV protease sequence ready to clone. TEV-Protease-BB	The restriction sites for the enzymes XhoI and NdeI were added to the BioBrick sequence (BBa_K316012), together with the extra base pairs that each enzyme requires to cut correctly. In total, three extra bases were added at each of the 5' and 3' ends. Additionally, another 21 extra base pairs were added on each side to facilitate in silico cloning into the pET-24b(+) vector and to increase the sequence length to qualify for the IDT sponsorship. The insert is compatible with pET-24b(+) and pET-29a(+). A forward and a reverse primer were also designed to amplify the TEV sequence by PCR. - Link to PDF of Original BioBrick in Benchling - Link to PDF of Modified BioBrick ready to clone in Benchling - Link to TEV protease cloned in pET24b(+) in Benchling	acgtggggcaaatttccctat agccatatga atgggaagatcgttgtttaaaggaccacgtgac tataatccgattagctcgactatt tgccatctga ccaatgagagtgatggtcataccactagcttgtat ggcattggctttgggccattcatcatcacgaacaa acacctctttaggcgcaataatggtacact gttgg tacaatcccttcatggagtctttaaggtcaaaaac acaacgacgcttcagcaacatct gatagatggaag agacatgattatcattcgaatgccgaaagactttc caccgtttcctcagaaactcaagtttcgcgaacct cagcgtgaagaacggatctgct tagtcacaacaaa ctttcagaccaaatctatgtcctcaatggtatcag acactagctgtacattccctag ctctgatggcatc ttttggaagcattggattcagacaaaagatgggca atgtggctctcctcttgtgtca acacgggatgggt ttattgtgggcatacactctgcgtcaaacttcacc aatacgaataattactttacgagtgttcccaagaa cttcatggagttactgacgaac caagaagctcaac aatgggtttcaggctggagactgaatgcagattcc gttctttggggaggtcacaaag tgttcatggataa accggaagaaccgtttcaaccggttaaagaggcca cacagttaatgaacCTCGAGagc ATCGCTAGGCAC ACGTCGCCG	772

The steps for the in silico cloning of TEV protease S219P into the destination vector pET-24b(+) are summarized in Figure 13.

**Figure 13. Diagram of the steps followed for the in silico cloning of TEV protease S219P into the destination vector pET-24b(+).**

References

[1] Nam, H., Hwang, B. J., Choi, D. Y., Shin, S., & Choi, M. (2020). Tobacco etch virus (TEV) protease with multiple mutations to improve solubility and reduce self-cleavage exhibits enhanced enzymatic activity. FEBS Open Bio, 10(4), 619–626. https://doi.org/10.1002/2211-5463.12828
[2] Nunn, C. M., Jeeves, M., Cliff, M. J., Urquhart, G. T., George, R. R., Chao, L. H., Tsuchia, Y., & Djordjevic, S. (2005). Crystal structure of tobacco etch virus protease shows the protein C terminus bound within the active site. Journal of Molecular Biology, 350(1), 145–155. https://doi.org/10.1016/j.jmb.2005.04.013
[3] P04517 · POLG_TEV. (2024). UniProt. https://www.uniprot.org/uniprotkb/P04517/entry
[4] Hebditch, M., & Warwicker, J. (2019). Web-based display of protein surface and pH-dependent properties for assessing the developability of biotherapeutics. Scientific Reports, 9, 1969. https://doi.org/10.1038/s41598-018-36950-8
[5] Melnikov, S., Mailliot, J., Rigger, L., Neuner, S., Shin, B.-S., Yusupova, G., Dever, T. E., Micura, R., & Yusupov, M. (2016). Molecular insights into protein synthesis with proline residues. EMBO Reports, 17(12), 1776–1784. https://doi.org/10.15252/embr.201642943
