Part:BBa_K5398002

The 5' intron of td gene from T4 phage

This part is the 5' half of the td intron, a group I intron from the td gene of T4 phage. Together with its 3' counterpart, it can circularize an mRNA so that ribosomes translate the exon repeatedly. This year, we used the td intron to produce squid ring teeth proteins with long tandem repeats of various lengths. We explored different production and purification strategies for the target protein produced from the circular mRNA (cmRNA) and examined the function of the protein.

Owing to its internal structure, the td intron, also called an RNA cyclase ribozyme, can splice itself out without assistance from the spliceosome or other proteins; instead, it relies on a free guanosine nucleotide to initiate the splicing reaction in vivo. This process joins the flanking exons and circularizes the intervening intron to produce an intronic circRNA (Fig. 1). It therefore offers a strategy for producing circular RNAs in vivo.

Fig. 1 | Mechanism of group I introns. (GOMES R M O da S et al. 2024)

Therefore, an engineered cmRNA was designed using the RNA cyclase ribozyme mechanism. In this design, the exon is circularized to form a back-splice junction (BSJ) in a reaction initiated by guanosine. To ensure that ribosomes do not translate the open reading frame (ORF) of the gene of interest (GOI) from unprocessed linear mRNA, the ribosome binding site (RBS) and the start codon ATG were placed downstream of the GOI coding sequence. Consequently, these regulatory sequences lie upstream of the coding sequence only after the mRNA has circularized. To purify the resulting polypeptides, a His tag was incorporated into the GOI. Once the mRNA is circularized, the ribosome can travel around the cmRNA repeatedly, producing a long repeating polypeptide (Fig. 2).

Fig. 2 | Design of a circular mRNA based on td flanking introns.
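The design logic above can be illustrated with a minimal Python sketch. It only models the order of elements in the exon; the element names and the helper `has_functional_start` are hypothetical and chosen for illustration, not taken from the actual construct.

```python
# Toy model of the permuted exon: the RBS and ATG sit downstream of the GOI,
# so they precede the ORF only when the RNA is read as a circle.
# Element names are illustrative assumptions.
LINEAR_EXON = ["GOI_ORF", "RBS", "ATG"]  # 5'->3' order in the unprocessed transcript


def has_functional_start(elements, circular):
    """Return True if RBS and ATG directly precede the ORF in reading order."""
    n = len(elements)
    orf = elements.index("GOI_ORF")
    if circular:
        # On a circle, positions wrap around the back-splice junction.
        before = [elements[(orf - 2) % n], elements[(orf - 1) % n]]
    else:
        # On a linear molecule, nothing lies upstream of the first element.
        before = elements[max(orf - 2, 0):orf]
    return before == ["RBS", "ATG"]


print(has_functional_start(LINEAR_EXON, circular=False))
print(has_functional_start(LINEAR_EXON, circular=True))
```

In this toy model, the linear transcript has no start signal upstream of the ORF, while the circular form does, which is the property the design relies on.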

iGEM Gifu 2014 also used a similar part (BBa_K1332005). To learn more about the td intron, see: https://parts.igem.org/Part:BBa_K1332005

Usage and Biology

In our project, given the positive correlation between the number of repeat units and the magnitude of the cohesive force, we designed a circular mRNA in which the ORF of TRn5 (BBa_K5398001) is placed between the 3' and 5' introns of the td gene from T4 phage (BBa_K5398002 and BBa_K5398003). This strategy uses a short sequence to express highly repetitive squid ring teeth proteins. A self-splicing RNA cyclase ribozyme was incorporated to form the circular mRNA, allowing ribosomes to translate the sequence of interest repeatedly and produce proteins with different repeat numbers. In this way, we aimed to obtain proteins with exceptional self-healing properties.

Characterization

Protein expression

The synthetic plasmid pET-29a(+)-cmRNA(TRn5) was transformed into E. coli BL21 (DE3), and the recombinant proteins were expressed in LB medium (Fig. 3).

Fig. 3 | The plasmid map of pET-29a(+)-cmRNA(TRn5).

Optimization of incubation temperature

Aim: To determine which incubation temperature is better for protein expression using mRNA circularization.

Methods: The cells were cultured in LB medium at 37℃ for 5 h, 23℃ for 16 h, or 16℃ for 20 h. The cultures were induced with 1 mM IPTG to express the proteins, and the results were assessed by SDS-PAGE.

Results:

① Proteins formed a ladder on the gel

The TRn polypeptide is composed of repeating units of 16 kDa, each produced when the ribosome travels one round along the cmRNA. Because the number of rounds that a ribosome completes varies, the TRn sample was a mixture of proteins of various sizes that formed a ladder on the gel. Based on the protein marker, we estimated that the protein sizes ranged from about 8 to 96 kDa, indicating that the ribosome could travel along the cmRNA for at least 6 rounds (Fig. 4).

② The cmRNA strategy improved the solubility of TRn

According to the literature, TRn is generally expressed as inclusion bodies in E. coli. In our SDS-PAGE results, although part of the TRn was found in the precipitate, a substantial portion was present in the supernatant, suggesting that the cmRNA strategy could improve protein solubility.

③ Incubation temperature barely influenced the TRn expression

The SDS-PAGE of the cmRNA expression products at different incubation temperatures (Fig. 4) showed few differences among them. This suggests that expressing TRn from cmRNA is not sensitive to incubation temperature.

Fig. 4 | SDS-PAGE of expression products of cmRNA at different incubation temperatures.

a. SDS-PAGE of cmRNA expressed at 23℃. Lane 1: marker; lanes 2-4: whole-cell lysate, supernatant and pellet from induced cells, respectively. b. SDS-PAGE of cmRNA expressed at 37℃ and 16℃. Lane 1: marker; lanes 2-5: whole-cell lysate, supernatant, pellet and diluted pellet from induced cells at 37℃, respectively; lane 6: marker; lanes 7-9: whole-cell lysate, supernatant and pellet from induced cells at 16℃, respectively.
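The ladder arithmetic from result ① can be written out as a short Python sketch. It assumes each completed round adds exactly one 16 kDa unit (the value given in the text); the function names are illustrative.

```python
UNIT_KDA = 16  # size of one repeating unit, as stated in the results


def ladder_sizes(max_rounds, unit_kda=UNIT_KDA):
    """Expected band sizes (kDa) for 1..max_rounds complete rounds of translation."""
    return [n * unit_kda for n in range(1, max_rounds + 1)]


def rounds_for_band(band_kda, unit_kda=UNIT_KDA):
    """Number of rounds implied by an observed band size."""
    return band_kda / unit_kda


print(ladder_sizes(6))      # [16, 32, 48, 64, 80, 96]
print(rounds_for_band(96))  # 6.0
```

Under this assumption, the largest band of about 96 kDa corresponds to six rounds, which matches the estimate given for Fig. 4.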

Optimization of IPTG concentration

Aim: To determine which IPTG concentration is better for protein expression using mRNA circularization.

Methods: The cells were cultured in LB medium at 37℃ for 5 h. The cultures were induced with 0.5 mM or 1 mM IPTG to express the proteins, and the results were assessed by SDS-PAGE.

Results: The SDS-PAGE (Fig. 5) showed little difference in TRn expression level between the two IPTG concentrations (0.5 mM and 1 mM), and the proteins again formed a ladder on the gel.

Fig. 5 | SDS-PAGE of expression products of cmRNA induced with different IPTG concentrations.

Lane 1: marker; lanes 2-4: whole-cell lysate, supernatant and pellet from induced cells with 0.5 mM IPTG, respectively; lanes 5-7: whole-cell lysate, supernatant and pellet from induced cells with 1 mM IPTG, respectively.

Protein purification by Immobilized Metal Affinity Chromatography (IMAC)

Aim: To purify the protein by IMAC using Ni-NTA resin.

Methods: The cells were induced with 0.5 mM IPTG and cultured in LB medium at 37℃ for 5 h. The cultures were centrifuged to separate the supernatant and pellet. The following steps were then performed:

Denature the supernatant with 8 mM urea overnight;

Renature the denatured solution by dialysis against 20 mM Tris-HCl buffer for 20 h (changing the dialysate every 8 h);

Purify proteins on a HisTrap Ni-NTA column with different concentrations of imidazole;

Assess the results using SDS-PAGE.

Results: In the SDS-PAGE (Fig. 6), the amount of TRn was too low to be detected. We suspect that the His tag on TRn did not function well because, unlike in conventional constructs, it is not located at the N or C terminus of the target protein, which posed a challenge for protein purification.

Fig. 6 | SDS-PAGE of expression products of cmRNA purified by IMAC.

Lanes 1-6: induced cell samples at 16℃; lane 1: sample after being bound to Ni-NTA resin; lane 2: sample eluted with 20 mM Tris-HCl; lanes 3-6: samples eluted with 50, 150, 300 and 500 mM imidazole; lane 7: marker; lanes 8-13: induced cell samples at 37℃; lane 8: sample after being bound to Ni-NTA resin; lane 9: sample eluted with 20 mM Tris-HCl; lanes 10-13: samples eluted with 50, 150 and 300 mM imidazole.

Protein purification using a new protocol

Aim: To purify the protein using a new protocol containing 5% acetic acid.

Methods: The cells were induced with 0.5 mM IPTG and cultured in LB medium at 37℃ for 5 h. The cultures were centrifuged to separate the supernatant and pellet. The following steps were then performed:

Wash the pellets twice with 100 mL urea extraction buffer [100 mM Tris, pH 7.4, 5 mM EDTA, 2 M urea, 2% (vol/vol) Triton X-100] and centrifuge them to remove cell debris and other soluble proteins;

Wash the pellets with 100 mL washing buffer (100 mM Tris, pH 7.4, 5 mM EDTA) and centrifuge them to remove urea and Triton X-100;

Dissolve the pellets in 5% acetic acid;

Assess the results using SDS-PAGE.

Results: The SDS-PAGE (Fig. 7) showed that TRn dissolved in 5% acetic acid still formed a ladder on the gel. Owing to unpredictable and intermittent translation, the TRn bands were faint and difficult to recognize.

Fig. 7 | SDS-PAGE of expression products of cmRNA using a new protocol.

Lane 1: marker; lanes 2-4: whole-cell lysate, supernatant and pellet from induced cells at 37℃, respectively; lane 5: sample washed with 5% acetic acid.

Self-healing test

We obtained TRn protein samples by freeze-drying for 24 h. The final yield was about 187.2 mg/L of bacterial culture. Next, we dissolved the protein samples in 5% acetic acid to a concentration of 20 mg/μL, cast them into square molds and dried them at 70℃ for 3 h to obtain protein films.

Fig. 8 | The freeze-dried protein sample.

Model

To test the stability of proteins produced by different numbers of translation rounds of the cmRNA, we performed mathematical and biological simulations. In the mathematical simulation, we described protein stability through the hydrogen bond network formed between β-sheets.

Simulation of the individual hydrogen bond network

First, we selected proteins translated 1, 2, 3, 4, 5 and 6 times from the circular mRNA to study the stability of the intramolecular hydrogen bond network. In the model, each β-sheet was abstracted as a point, and the points were placed on a grid. To reflect the distribution of β-sheets in the proteins more realistically, we set the density of the points according to the parameters of the grid. To allow different points to interact, we set the connection rules according to the hydrogen bond length: a connection between two points means that they interact through hydrogen bonds. After the first hydrogen bond network was formed, we perturbed the system 20 times, mainly by perturbing the positions of the points to simulate the response of the protein to environmental changes, and regenerated the hydrogen bond network each time. We calculated the point-line ratio for each of the 20 networks and took the average as an index of the stability of the protein for that number of translations.
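The procedure can be sketched in Python as follows. This is a simplified illustration, not the model actually used: the random initial placement, the 2D coordinates, the clamping to the grid and all numeric parameters (`grid_size`, `cutoff`, `step`) are assumptions made for the example.

```python
import math
import random


def build_network(points, cutoff):
    """Connect every pair of points whose distance is within the bond-length cutoff."""
    edges = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= cutoff:
                edges.append((i, j))
    return edges


def point_line_ratio(points, edges):
    """Number of points divided by number of lines (infinite if there are no lines)."""
    return len(points) / len(edges) if edges else math.inf


def mean_ratio(n_points, grid_size=10.0, cutoff=3.0, n_perturb=20, step=0.5, seed=0):
    """Average point-line ratio over n_perturb perturbed networks."""
    rng = random.Random(seed)
    points = [(rng.uniform(0, grid_size), rng.uniform(0, grid_size))
              for _ in range(n_points)]
    ratios = []
    for _ in range(n_perturb):
        # Perturb each position slightly and keep it inside the grid.
        points = [
            (min(max(x + rng.uniform(-step, step), 0.0), grid_size),
             min(max(y + rng.uniform(-step, step), 0.0), grid_size))
            for x, y in points
        ]
        # Regenerate the network from the perturbed positions.
        edges = build_network(points, cutoff)
        ratios.append(point_line_ratio(points, edges))
    return sum(ratios) / len(ratios)


if __name__ == "__main__":
    for n in (5, 10, 15, 20, 25, 30):
        print(n, mean_ratio(n))
```

With a fixed grid, adding more points increases the number of pairs within the cutoff faster than the number of points, so the point-line ratio tends to fall as the point count grows.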

The ratio of the number of points to the number of lines reflects the stability of the network structure to a certain extent. A low point-line ratio means that the number of lines is large relative to the number of points, so the network is more tightly connected. Networks with dense lines usually have higher stability and resist external disturbances more effectively. Such networks are also less prone to deformation: when a line is broken, the surrounding lines can share its load and reduce the impact of the break on the overall structure. Therefore, in the microscopic structure of self-healing protein materials, networks with dense lines can often recover faster and show stronger self-healing ability. The figure below shows the changes in, and an analysis of, the intramolecular hydrogen bond network formed by proteins with different numbers of translations.

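[Figure: intramolecular hydrogen bond networks and point-line ratios for proteins with different numbers of translations]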

From Fig. 18, we can see that as the number of TRn repetitions increases, the point-line ratio of the hydrogen bond network decreases, indicating that the network changes from loose to tight. Specifically, when the number of TRn repetitions increases from 5 to 10, the point-line ratio drops sharply from 1.2167 to 0.3921, which means that the density of connections increases markedly and the stability of the network structure converges rapidly. When the number of repetitions increases further to 15, 20, 25 and 30, the point-line ratio falls more slowly but continues to decrease, to 0.3247, 0.3157, 0.2540 and 0.2109, respectively. This shows that a further increase in the number of repetitions still improves the stability of the structure.
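The size of each step can be checked directly from the reported values. The short Python snippet below uses only the numbers quoted above and computes the decrease between consecutive repetition counts.

```python
repeats = [5, 10, 15, 20, 25, 30]
ratios = [1.2167, 0.3921, 0.3247, 0.3157, 0.2540, 0.2109]  # values reported above

# Decrease in point-line ratio between consecutive repetition counts.
drops = [round(a - b, 4) for a, b in zip(ratios, ratios[1:])]

for (lo, hi), d in zip(zip(repeats, repeats[1:]), drops):
    print(f"TRn{lo} -> TRn{hi}: decrease of {d}")
```

Every step is a decrease, and the first step (5 to 10 repetitions) is by far the largest.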

In summary, a more stable hydrogen bond network can be formed in proteins with a large number of repetitions. This stable hydrogen bond network not only helps to maintain the integrity of the protein structure, but also improves the ability of the protein to bind to other molecules in biological processes, thereby promoting the realization of more complex biological functions. Therefore, the more times circular mRNA is translated, the higher the biological activity and functional stability of the protein.

Simulation of the overall hydrogen bond network

Subsequently, we introduced several proteins with the same number of translations and placed them in a three-dimensional space to simulate the overall hydrogen bond network. We represented the three-dimensional space as a "box", as in molecular dynamics simulations, abstracted each protein molecule as a sphere whose size was set according to its molecular weight, and abstracted each β-sheet as a point. We then placed four spheres in the space and placed evenly distributed points inside each sphere according to the number of β-sheets. To improve the accuracy and rationality of the simulation, we limited the distance between spheres to a certain range and constrained the distance from each sphere to the boundary of the space, preventing the proteins from leaving the "box" during the simulation.

Relevant literature shows that hydrogen bonds are more likely to form between β-sheets in the same molecule than between β-sheets in different molecules. Therefore, we set different connection probabilities depending on whether two points belong to the same sphere, and combined them with the hydrogen bond length to formulate the initial connection rules. When simulating the intramolecular hydrogen bond network, we did not consider the breaking of hydrogen bonds because of their high strength. In the simulation of the overall hydrogen bond network, however, we determined the probability of bond breaking from the degree of a point, that is, the number of lines extending from it: if a β-sheet forms more hydrogen bonds, each bond is relatively weaker. After the initial hydrogen bond network was formed, we perturbed the system 20 times, with each perturbation starting from the previous network. We determined the formation and disappearance of bonds from the degrees of the two points, the spheres they belonged to and the distance between them. Finally, we calculated the point-to-line ratio for each network and took its average and variance as indicators of the stability of a system composed of a given protein. The following is a simulation display and result analysis:

[Figures: simulation display and point-to-line ratios of the overall hydrogen bond network]

As can be seen from Fig. 18, the point-to-line ratio of the hydrogen bond network between multiple protein molecules is generally lower than that of the network within a single protein, indicating that hydrogen bond interactions between multiple proteins are denser and the resulting network is more strongly connected. Specifically, as the number of repetitions increases, the point-to-line ratio of the network decreases continuously from 0.3461 to 0.0687, reflecting the increasing density of connections in the structure. When the circular mRNA is translated once to generate TRn5, the hydrogen bond network has a higher point-to-line ratio, indicating that it is relatively sparse and resists disturbance poorly. As the number of TRn repetitions increases, the point-to-line ratio gradually decreases. When the number of repetitions reaches 30, the point-to-line ratio is only 0.0687, indicating that the overall structure is approaching saturation. At this stage, although new connections can still form, most points in the structure are already stabilized by sufficient connections, and further perturbations have less and less impact on the structure.

In summary, the more TRn is repeated, the more stable the hydrogen bond network between protein molecules. This can make the interacting protein molecules more resistant to deformation when facing external perturbations, and have higher self-healing ability and adaptability.
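The overall-network procedure described in this section can be sketched in Python. This is a simplified illustration under stated assumptions, not the team's model: point positions are sampled uniformly and kept fixed, all spheres share one radius, sphere centers lie on a line, and every numeric parameter (`cutoff`, `p_same`, `p_diff`, `break_per_degree`) is a placeholder.

```python
import math
import random


def make_points(centers, radius, pts_per_sphere, rng):
    """Place points inside each sphere; returns a list of (sphere_id, (x, y, z))."""
    points = []
    for sid, (cx, cy, cz) in enumerate(centers):
        placed = 0
        while placed < pts_per_sphere:
            # Rejection sampling: draw in the bounding cube, keep points inside the sphere.
            dx, dy, dz = (rng.uniform(-radius, radius) for _ in range(3))
            if dx * dx + dy * dy + dz * dz <= radius * radius:
                points.append((sid, (cx + dx, cy + dy, cz + dz)))
                placed += 1
    return points


def update_bonds(points, bonds, rng, cutoff=2.0, p_same=0.9, p_diff=0.5,
                 break_per_degree=0.02):
    """One perturbation: break bonds by degree, then form new bonds by distance."""
    degree = [0] * len(points)
    for i, j in bonds:
        degree[i] += 1
        degree[j] += 1
    kept = set()
    for i, j in bonds:
        # A bond is more likely to break when its end points carry many bonds.
        p_break = min(1.0, break_per_degree * (degree[i] + degree[j]))
        if rng.random() >= p_break:
            kept.add((i, j))
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if (i, j) in bonds:
                continue  # pairs bonded before this step are not re-formed in it
            if math.dist(points[i][1], points[j][1]) <= cutoff:
                # Points in the same sphere bond with higher probability.
                p_form = p_same if points[i][0] == points[j][0] else p_diff
                if rng.random() < p_form:
                    kept.add((i, j))
    return kept


def simulate(n_spheres=4, pts_per_sphere=5, n_perturb=20, seed=0):
    """Return mean and variance of the point-to-line ratio over the perturbations."""
    rng = random.Random(seed)
    centers = [(3.0 * k, 0.0, 0.0) for k in range(n_spheres)]
    points = make_points(centers, radius=2.0, pts_per_sphere=pts_per_sphere, rng=rng)
    bonds = update_bonds(points, set(), rng)  # initial network
    ratios = []
    for _ in range(n_perturb):
        bonds = update_bonds(points, bonds, rng)  # each step starts from the previous network
        ratios.append(len(points) / len(bonds) if bonds else math.inf)
    mean = sum(ratios) / len(ratios)
    var = sum((r - mean) ** 2 for r in ratios) / len(ratios)
    return mean, var


if __name__ == "__main__":
    print(simulate())
```

Increasing `pts_per_sphere` plays the role of increasing the number of β-sheets per protein, which is how more TRn repetitions would enter such a model.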

We then used GROMACS to perform molecular dynamics simulations and calculate the energy of the system, with the aim of verifying our mathematical model by molecular dynamics. However, because of the large molecular weight of our system and limited computing resources, we could simulate systems only up to TRn20.

It should be noted that, because GROMACS could not directly calculate the hydrogen bond energy of the system, we used the total energy of the system (including van der Waals and electrostatic contributions) to reflect the stability of the network. The total energy includes the hydrogen bond energy, while the energy from other sources was essentially the same for each TRn. In addition, to eliminate the influence of the number of protein molecules on the results, we added the same number of protein molecules to each system; the systems differed in the number of TRn repetitions (for example, 5 TRn5 versus 5 TRn10). When calculating the energy, we divided the total energy by the number of molecules and by the number of TRn repetitions, obtaining the energy per TRn unit in the system. We used this energy as a measure of system stability, and the results were as follows:

[Figure: energy per TRn unit for systems with different numbers of TRn repetitions]

The results show that as the number of TRn repetitions increases, the energy of the system decreases, so the protein becomes more stable. This is consistent with the results of our mathematical modeling and, to a certain extent, supports their accuracy.
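The normalization used for the molecular dynamics results can be expressed as a short Python function. The function name and the example numbers are placeholders for illustration and are not simulation outputs.

```python
def energy_per_unit(total_energy, n_molecules, n_repeats):
    """Total system energy divided by (number of molecules x TRn repetitions per molecule)."""
    if n_molecules <= 0 or n_repeats <= 0:
        raise ValueError("n_molecules and n_repeats must be positive")
    return total_energy / (n_molecules * n_repeats)


# Placeholder example: 5 molecules of TRn10 with a made-up total energy of -1000.0
print(energy_per_unit(-1000.0, n_molecules=5, n_repeats=10))  # -20.0
```

Dividing by both factors makes systems with different repeat numbers comparable on a per-unit basis.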

More information about the project for which the part was created: SAMUS (NAU-CHINA 2024).

Sequence and Features

Assembly Compatibility:

COMPATIBLE WITH RFC[10]
COMPATIBLE WITH RFC[12]
INCOMPATIBLE WITH RFC[21]: illegal XhoI site found at 35
COMPATIBLE WITH RFC[23]
COMPATIBLE WITH RFC[25]
COMPATIBLE WITH RFC[1000]
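A check like the RFC[21] result above can be reproduced with a simple scan for restriction sites. The Python sketch below assumes that the sites forbidden by RFC[21] are EcoRI, BglII, BamHI and XhoI and that positions are reported 1-based. The test sequence is a made-up toy string, not the sequence of this part.

```python
# Recognition sites assumed to be illegal under RFC[21]; all four are palindromic,
# so scanning one strand is sufficient.
RFC21_ILLEGAL_SITES = {
    "EcoRI": "GAATTC",
    "BglII": "AGATCT",
    "BamHI": "GGATCC",
    "XhoI": "CTCGAG",
}


def find_illegal_sites(seq, sites=RFC21_ILLEGAL_SITES):
    """Return (enzyme, 1-based position) for every occurrence of an illegal site."""
    seq = seq.upper()
    hits = []
    for name, site in sites.items():
        start = seq.find(site)
        while start != -1:
            hits.append((name, start + 1))  # convert 0-based index to 1-based position
            start = seq.find(site, start + 1)
    return sorted(hits, key=lambda hit: hit[1])


# Toy sequence with an XhoI site starting at position 35 (illustrative only).
toy = "A" * 34 + "CTCGAG" + "TTTT"
print(find_illegal_sites(toy))  # [('XhoI', 35)]
```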

Reference

[1] LIU L, WANG P, ZHAO D, et al. Engineering Circularized mRNAs for the Production of Spider Silk Proteins[J]. Appl. Environ. Microbiol., 2022, 88(8): e00028-22.

[2] PERRIMAN R, ARES M. Circular mRNA can direct translation of extremely long repeating-sequence proteins in vivo[J]. RNA, 1998, 4(9): 1047-1054.

[3] LEE S O, XIE Q, FRIED S D. Optimized Loopable Translation as a Platform for the Synthesis of Repetitive Proteins[J]. ACS Cent. Sci., 2021, 7(10): 1736-1750.

[4] OBI P, CHEN Y G. The design and synthesis of circular RNAs[J]. Methods, 2021, 196: 85-103.

[5] GOMES R M O da S, SILVA K J G da, THEODORO R C. Group I introns: Structure, splicing and their applications in medical mycology[J]. Genet. Mol. Biol., 2024, 47: e20230228.