SUMO fusion domain with the USP peptide
This construct encodes a protein-generating device with an N-terminal SUMO fusion domain preceded by a 6x-HIS tag.
The protein following this fusion tag is a synthetic star-peptide precursor termed the USP peptide. The precursor may be produced in the cell, extracted, and then treated with a protease to produce a branched peptide architecture. See the University of Melbourne 2014 iGEM team project page for background on the design rationale of the peptide.
Because the gene encodes a de novo designed peptide, it was synthesized by GenScript.
Note that any protein coding region can be cloned into this BioBrick expression vector by taking advantage of the AgeI site in SUMO at base pairs 355-360 (ACCGGT). The cloning strategy is to use the AgeI site at the 5' end of the insert and a restriction site in the BioBrick suffix at the 3' end (SpeI or PstI). Note that cutting the SUMO sequence with AgeI removes the last codon of the SUMO protein (GGT). Any insert cloned into the vector therefore needs a 5' GGT immediately after the AgeI restriction site. Cutting at the BioBrick suffix also removes the BioBrick terminator currently in the gene, so the insert must contain a compatible terminator sequence as well.
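The requirements above can be checked on a candidate insert before ordering or digesting it. The following is a minimal sketch, not a validated design tool: the function name `check_insert` and the toy insert are hypothetical, the SpeI and PstI recognition sequences are the standard ones (ACTAGT and CTGCAG), and the presence of a terminator is not checked because it cannot be inferred from restriction sites alone.

```python
# Illustrative pre-cloning check for the AgeI strategy described above.
# All names are hypothetical; this does not verify reading frame or terminator.

AGEI_SITE = "ACCGGT"  # AgeI recognition sequence, as given in the text
SUFFIX_SITES = {"SpeI": "ACTAGT", "PstI": "CTGCAG"}  # standard recognition sequences


def check_insert(insert, suffix_enzyme="SpeI"):
    """Return a list of problems found in a candidate insert (empty if none)."""
    seq = "".join(insert.split()).upper()
    suffix_site = SUFFIX_SITES[suffix_enzyme]
    problems = []

    # Exactly one AgeI site, immediately followed by the final SUMO codon GGT.
    agei_count = seq.count(AGEI_SITE)
    agei_pos = seq.find(AGEI_SITE)
    if agei_count == 0:
        problems.append("no AgeI site found")
    elif agei_count > 1:
        problems.append("more than one AgeI site; the insert would be cut internally")
    elif seq[agei_pos + 6:agei_pos + 9] != "GGT":
        problems.append("AgeI site is not followed by GGT (last SUMO codon)")

    # Exactly one suffix site, located downstream of the AgeI site.
    suffix_count = seq.count(suffix_site)
    if suffix_count == 0:
        problems.append("no " + suffix_enzyme + " site found")
    elif suffix_count > 1:
        problems.append("more than one " + suffix_enzyme + " site")
    elif agei_pos != -1 and seq.find(suffix_site) < agei_pos:
        problems.append(suffix_enzyme + " site lies upstream of the AgeI site")

    return problems


# Toy insert (NOT a real part): AgeI site + GGT + short ORF + SpeI site.
toy_insert = "ACCGGT" + "GGT" + "ATGAAATAA" + "ACTAGT"
print(check_insert(toy_insert, "SpeI"))
```

An empty list means none of the checked problems were found; it does not confirm that the insert carries a working terminator or is in frame with SUMO.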
Protein coding region
The part codes for a peptide which can be functionalised using chemical approaches. This peptide was designed to have flexible, unstructured arms and was termed the USP construct. The arms of this peptide serve not as active peptides themselves, but as inert structural linkers.
The arms were designed with the following elements in mind:
- Lack of structure. The arms were designed with a bioinspired approach, using the FxFG motif of nucleoporins (where x is a variable amino acid residue). Such segments naturally repeat in nucleoporins and are thought to lead to disorder/lack of stable secondary structure. Nucleoporins are found in mammalian cells, serving as flexible brushes around nuclear pores (C. Ader, S. Frey, W. Maas, H. B. Schmidt, D. Görlich and M. Baldus, Proc. Natl. Acad. Sci. U. S. A., 2010, 107, 6281-6285).
- Water-soluble. The arms were designed with several charged amino acids to improve solubility.
- Designed to form a disulfide bond. Although it is difficult to rationally ensure that a disulfide bond will form between the two cysteines in our peptide, we incorporated a beta turn between them, which may encourage the peptide to fold at the apex of the hairpin loop. This may bring the cysteines into closer proximity and make bond formation more likely.
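The three design elements listed above can be inspected directly on the amino acid sequence. The sketch below is only illustrative: the helper names are hypothetical, the two fragments are short stretches copied from the protein coding sequence given on this page, and treating D, E, K and R as the charged residues is a simplifying assumption (histidine is ignored).

```python
import re

# Hypothetical helpers for inspecting the arm design features on a sequence.


def fxfg_positions(seq):
    """0-based start positions of FxFG motifs (x = any residue)."""
    # A lookahead is used so that overlapping motifs would also be reported.
    return [m.start() for m in re.finditer(r"(?=F.FG)", seq)]


def charged_fraction(seq):
    """Fraction of residues that are D, E, K or R (assumed charged set)."""
    return sum(seq.count(aa) for aa in "DEKR") / len(seq)


def cysteine_positions(seq):
    """0-based positions of cysteine residues."""
    return [i for i, aa in enumerate(seq) if aa == "C"]


# Fragments copied from the coding sequence on this page.
arm_fragment = "SRFKFGEKFRFGKNEFRFGK"  # start of the USP peptide
core_fragment = "FKFGYWYCYWFKFGRGRAFRSEPDGGFRFGYYCYWYFRFG"  # region with both cysteines

print("FxFG motifs in arm fragment:", fxfg_positions(arm_fragment))
print("Charged fraction of arm fragment:", charged_fraction(arm_fragment))
print("Cysteines in core fragment:", cysteine_positions(core_fragment))
```

Counts like these only describe the sequence; they say nothing about whether the arms are actually disordered or whether the disulfide bond forms.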
The ultimate utility of this peptide lies in its ability to be functionalised with other biomacromolecules. For example, Native Chemical Ligation (NCL) can be used to join peptides, proteins, and other ligands to the arms (P. E. Dawson, T. W. Muir, I. Clark-Lewis and S. Kent, Science, 1994, 266, 776-779).
The full sequence of the protein coding region is: MHHHHHHSDSEVNQEAKPEVKPEVKPETHINLKVSDGSSEIFFKIKKTTPLRRLMEAFAKRQGKEMDSLRFLYDGIRIQADQTPEDLDMDDNDIIEAHREQTGGSRFKFGEKFRFGKNEFRFGKTDFKFGDNEFRFGKDFRFGKEKFRFGRDDFKFGYWYCYWFKFGRGRAFRSEPDGGFRFGYYCYWYFRFGKENFKFGEQDKFRFGNDFRFGERFKFGNDKFKFGDREFRFGKTKFRFGF. The SUMO fusion ends at residues EQTGG.
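To make the domain boundaries explicit, the sequence can be split at the HIS tag and at the EQTGG motif that ends the SUMO fusion. This is a minimal sketch: the function name `split_fusion` is hypothetical, and it assumes that the first occurrence of each marker is the relevant one, which holds for the sequence given here.

```python
# Hypothetical helper: split the fusion protein into tag, SUMO and USP parts.

FULL_SEQUENCE = (
    "MHHHHHHSDSEVNQEAKPEVKPEVKPETHINLKVSDGSSEIFFKIKKTTPLRRLMEAFAKRQGKEMDSLRF"
    "LYDGIRIQADQTPEDLDMDDNDIIEAHREQTGG"
    "SRFKFGEKFRFGKNEFRFGKTDFKFGDNEFRFGKDFRFGKEKFRFGRDDFKFGYWYCYWFKFGRGRAFRSE"
    "PDGGFRFGYYCYWYFRFGKENFKFGEQDKFRFGN"
    "DFRFGERFKFGNDKFKFGDREFRFGKTKFRFGF"
)


def split_fusion(seq, his_tag="HHHHHH", sumo_end="EQTGG"):
    """Return (leader incl. HIS tag, SUMO domain, downstream peptide)."""
    seq = "".join(seq.split())  # drop any whitespace left over from copy/paste
    tag_start = seq.find(his_tag)
    sumo_stop = seq.find(sumo_end)
    if tag_start == -1 or sumo_stop == -1:
        raise ValueError("HIS tag or SUMO end motif not found")
    tag_end = tag_start + len(his_tag)
    boundary = sumo_stop + len(sumo_end)
    return seq[:tag_end], seq[tag_end:boundary], seq[boundary:]


tag, sumo, usp = split_fusion(FULL_SEQUENCE)
print("Leader + HIS tag:", tag)
print("SUMO length:", len(sumo))
print("USP peptide length:", len(usp))
```

The third element returned is the USP peptide that follows the SUMO fusion.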
The plasmid was transformed into BL21(DE3) cells, and a single colony was cultured and induced overnight at 17 °C. Whole-cell samples taken before IPTG induction (-IPTG) and after the induction period (+IPTG) were boiled in SDS-PAGE sample buffer and loaded on a 15% tris-glycine gel. The Coomassie-stained whole-cell gel is shown below alongside the NEB P7712S molecular weight marker (see the rightmost lanes, labelled USP, for this BioBrick).
A small-scale purification was carried out using the HIS-tag at the N-terminus.
After purification, a Western blot and a Coomassie stain were run concurrently on both the pre-induction samples from above and the purified protein (namely, the first elution from the batch purification).
The Western Blot used mouse monoclonal antibodies against the N-terminal HIS-tag as primary antibodies and anti-mouse secondary antibodies (again see the USP lanes).
The corresponding Coomassie stain is shown below (see USP lanes):
The sequence was probed using an in-gel digestion of the bands corresponding to the protein, followed by LC-MS/MS. For the construct in BL21(DE3) cells, the following tryptic fragments were detected:
For the construct in SHuffle T7 cells, the following tryptic fragments were detected:
Recall that the full sequence was: MHHHHHHSDSEVNQEAKPEVKPEVKPETHINLKVSDGSSEIFFKIKKTTPLRRLMEAFAKRQGKEMDSLRFLYDGIRIQADQTPEDLDMDDNDIIEAHREQTGGSRFKFGEKFRFGKNEFRFGKTDFKFGDNEFRFGKDFRFGKEKFRFGRDDFKFGYWYCYWFKFGRGRAFRSEPDGGFRFGYYCYWYFRFGKENFKFGEQDKFRFGNDFRFGERFKFGNDKFKFGDREFRFGKTKFRFGF
Thus, parts of the sequence have been detected using tandem mass spectrometry.
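Detected fragments can be compared with the expected sequence through an in-silico tryptic digest and a simple coverage calculation. The sketch below is an illustration only: the function names are hypothetical, the cleavage rule (cut after K or R unless the next residue is P) is the commonly used simplification of trypsin specificity, and the `detected` list is a placeholder, not the measured LC-MS/MS data.

```python
# Hypothetical helpers: theoretical tryptic digest and sequence coverage.


def tryptic_digest(seq):
    """Cleave C-terminal to K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        next_aa = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])  # C-terminal peptide without K/R end
    return peptides


def coverage(seq, detected):
    """Fraction of residues in seq covered by any detected peptide."""
    covered = [False] * len(seq)
    for pep in detected:
        if not pep:
            continue
        pos = seq.find(pep)
        while pos != -1:  # mark every occurrence of the peptide
            for i in range(pos, pos + len(pep)):
                covered[i] = True
            pos = seq.find(pep, pos + 1)
    return sum(covered) / len(seq)


# Start of the USP peptide, copied from the full sequence above.
fragment = "SRFKFGEKFRFGK"
theoretical = tryptic_digest(fragment)
print("Theoretical peptides:", theoretical)

# Placeholder list (NOT measured data): pretend the longest peptide was seen.
detected = [max(theoretical, key=len)]
print("Coverage of fragment:", coverage(fragment, detected))
```

Replacing `fragment` with the full sequence and `detected` with the peptides reported by the LC-MS/MS search would give the fraction of the construct that was observed.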