Part:BBa_K5406000
Bi-functional Glutathione Synthetase
Basic Information: This part contains the coding sequence of the bi-functional glutathione synthetase (GshF) from the bacterium Streptococcus sanguinis SK36 (S. sanguinis), referred to below as GshFss. The sequence has been adapted for expression in Escherichia coli (E. coli) and complies with the Type IIS Assembly Standard.
General Information: The basis of the iGEM Athens 2024 Project is glutathione (GSH), a tripeptide composed of L-glutamic acid, L-cysteine and glycine. It is biosynthesized through the following two reactions, with γ-glutamylcysteine (γ-GC) as the intermediate:
(1) L-Glu + L-Cys --> γ-GC
(2) γ-GC + Gly --> GSH
GshFss can catalyze both of these reactions. It contains an N-terminal domain similar to γ-glutamylcysteine synthetase (γ-GCS, which catalyzes reaction (1)) and a C-terminal ATP-grasp domain. As the presence of the ATP-grasp domain implies, GshFss requires ATP for catalysis. For this reason, our system couples GshFss with PPK (see part BBa_K5406001).
Theoretical Biochemical Protein Parameters Information: Before the structural study of GshFss, we used an online bioinformatics tool to estimate its key biochemical parameters from the amino acid sequence. These estimates give a clear view of the enzyme's characteristics and also help with the analysis of our experimental data. The calculations were performed with the ProtParam tool of Expasy (Swiss Institute of Bioinformatics, SIB). As input we used the amino acid sequence of the GshF enzyme from Streptococcus sanguinis SK36, taken from the Protein database of the National Center for Biotechnology Information (NCBI) in FASTA format:
>ABN45551.1 Glutathione biosynthesis bifunctional protein gshAB (Gamma-GCS-GS) (GCS-GS), putative [Streptococcus sanguinis SK36]
MMTINQLLQKLDTASPILQATFGLERENLRVTTDGHLAQTAHPSQLGSRNFHPTIQTDFSEQQLELITPIAHSTKEARRLLGAISDVAGRSIDQNERLWPMSMPPQLTE EEIAIAHLENDYERHYREGLAKKYGKKLQAISGIHYNMELGKDLVTSLFQVSSYHSLKDFKNDLYLKLARNFLRFRWILTYLYGAAPWAEAGFYSQEISQPIRSFRNSD YGYVNDENIQVSYASLEQYITDIENYVQSGELSAEKEFYSAVRFRGQKHNHAYLEQGITYLEFRCFDLNPFDHLGISQETLDTVHLFLLSLLWLDDVENVDTALKAAHD LNQKIACSHPLTALPDEADSSALLQAMEELIQHFELPTYYQTLLQQLKEALLNPQLTLSGQLLPHIQQDSLMAFGLEKAEEYHRYAWTAPYALKGYENMELSTQMLLFD AIQKGLNVDILDENDQFLKLWHGHHVEYVKNGNMTSKDNYVIPLAMANKTVTKKILAEADFPVPAGAEFSSLEEGLAYYPLIRDRQIVVKPKSTNFGLGISIFQEPASL ESYRKALEIAFSEDAAVLVEEFIAGTEYRFFVLDGQCEAVLLRVAANVVGDGQHTVRELVAIKNDNPLRGRDHRSPLEIIELGDIELLMLDQQGYGPDDILPDGVKVDL RRNSNISTGGDSIDVTDSMHPSYKELAADMAKAMGAWACGVDLIIPDSSAISTKENPNCTCIELNFNPSMYMHTYCAEGPGQSITPKILAKLFPEMD
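The sequence as displayed above contains spaces left over from line wrapping, so it may need cleaning before it is pasted into a tool such as ProtParam. A minimal sketch of that step is given below. The function name parse_fasta and the truncated example record are our own illustrative assumptions; only the first 30 residues of the entry are used.

```python
def parse_fasta(text):
    """Return (header, sequence) for a single-record FASTA string."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("expected a FASTA header line starting with '>'")
    header = lines[0][1:]
    # Join the sequence lines and drop internal whitespace left by wrapping.
    sequence = "".join("".join(ln.split()) for ln in lines[1:]).upper()
    return header, sequence


# Illustrative, truncated record (first 30 residues of the entry only).
example = """>ABN45551.1 example record (truncated)
MMTINQLLQK LDTASPILQA
TFGLERENLR
"""

header, seq = parse_fasta(example)
print(header.split()[0], len(seq))
```

Applied to the full record, the length of the cleaned sequence can be compared with the residue count reported by ProtParam.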
From this sequence, ProtParam calculated the basic physicochemical parameters listed below: the total number and composition of amino acids in GshFss, the calculated molecular weight and the theoretical pI, and indices related to the stability of the enzyme. Such measures give a broader picture of the biomolecule and can guide future experiments, for example targeted mutagenesis to enhance enzyme activity or thermostability.
Number of Amino Acids: 751
Molecular Weight: 84784.96 Da (84.8 kDa)
Theoretical pI: 4.93
Estimated Half-Life: > 10 h (E. coli, in vivo), > 20 h (yeast, in vivo), 30 h (mammalian reticulocytes, in vitro)
Instability Index: 42.88 (above 40, so the protein is classified as unstable)
Aliphatic Index: 92.28
GRAVY (Grand average of hydropathicity): -0.285
Amino Acid Composition
Amino Acid / Number of Residues / Percentage of Sequence (%)
Alanine / 62 / 8.3%
Arginine / 29 / 3.9%
Asparagine / 34 / 4.5%
Aspartic Acid / 47 / 6.3%
Cysteine / 7 / 0.9%
Glutamine / 40 / 5.3%
Glutamic Acid / 58 / 7.7%
Glycine / 41 / 5.5%
Histidine / 24 / 3.2%
Isoleucine / 45 / 6.0%
Leucine / 93 / 12.4%
Lysine / 34 / 4.5%
Methionine / 18 / 2.4%
Phenylalanine / 29 / 3.9%
Proline / 33 / 4.4%
Serine / 49 / 6.5%
Threonine / 36 / 4.8%
Tryptophan / 7 / 0.9%
Tyrosine / 33 / 4.4%
Valine / 32 / 4.3%
Total number of negatively charged residues (Asp + Glu): 105
Total number of positively charged residues (Arg + Lys): 63
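Several of the reported values follow directly from the composition table, so they can be cross-checked by hand. The sketch below does this for the residue total, the charged-residue totals, the aliphatic index, GRAVY and the molecular weight. The residue counts are copied from the table; the Kyte-Doolittle hydropathy values, the average residue masses and the aliphatic index coefficients (2.9 and 3.9) are standard published values supplied by us, and we assume they match the ones ProtParam uses.

```python
# one-letter code -> (count from the table above,
#                     Kyte-Doolittle hydropathy (assumed standard values),
#                     average residue mass in Da (assumed standard values))
RESIDUES = {
    "A": (62, 1.8, 71.0788),   "R": (29, -4.5, 156.1875),
    "N": (34, -3.5, 114.1038), "D": (47, -3.5, 115.0886),
    "C": (7, 2.5, 103.1388),   "Q": (40, -3.5, 128.1307),
    "E": (58, -3.5, 129.1155), "G": (41, -0.4, 57.0519),
    "H": (24, -3.2, 137.1411), "I": (45, 4.5, 113.1594),
    "L": (93, 3.8, 113.1594),  "K": (34, -3.9, 128.1741),
    "M": (18, 1.9, 131.1926),  "F": (29, 2.8, 147.1766),
    "P": (33, -1.6, 97.1167),  "S": (49, -0.8, 87.0782),
    "T": (36, -0.7, 101.1051), "W": (7, -0.9, 186.2132),
    "Y": (33, -1.3, 163.1760), "V": (32, 4.2, 99.1326),
}
WATER = 18.01524  # Da, added once for the free N- and C-termini

count = {aa: c for aa, (c, _, _) in RESIDUES.items()}
total = sum(count.values())
percent = {aa: 100.0 * c / total for aa, c in count.items()}

negative = count["D"] + count["E"]
positive = count["R"] + count["K"]

# Aliphatic index: mole percent of Ala + 2.9 * Val + 3.9 * (Ile + Leu)
aliphatic_index = percent["A"] + 2.9 * percent["V"] + 3.9 * (percent["I"] + percent["L"])

# GRAVY: sum of hydropathy values over all residues, divided by the length
gravy = sum(c * h for c, h, _ in RESIDUES.values()) / total

# Molecular weight: sum of residue masses plus one water molecule
mol_weight = sum(c * m for c, _, m in RESIDUES.values()) + WATER

print(total, negative, positive)   # 751 105 63
print(round(aliphatic_index, 2))   # 92.28
print(round(gravy, 3))             # -0.285
print(round(mol_weight, 2))        # 84784.96
```

The instability index and the half-life estimate depend on the residue order (dipeptide weights and the N-terminal residue), so they cannot be recomputed from the composition alone.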
Structural Information:
To obtain a representative structural model of GshFss from Streptococcus sanguinis SK36, we used online bioinformatics tools for homology modeling, specifically the SWISS-MODEL protein structure homology-modeling server of Expasy (Swiss Institute of Bioinformatics, SIB). The amino acid sequence of GshFss served as input, and the server returned a structural model based on homology.
After aligning the input sequence with the templates available on the server, SWISS-MODEL identified the structure of the bi-functional glutathione synthetase from Streptococcus agalactiae, listed as 3LN6 in the Protein Data Bank, as the most suitable template. The GshFss model was built on this template; both structures are homo-dimers, and the sequence identity between them is 64.22%. The homo-dimer model of GshFss and the template from Streptococcus agalactiae are shown in the respective pictures below.
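For readers unfamiliar with the term, a sequence identity figure of this kind is the share of aligned positions that carry the same residue in both sequences. The sketch below shows one common way to compute it from a pairwise alignment. The function name percent_identity and the two short aligned strings are hypothetical, and the choice to skip gapped columns is an assumption: SWISS-MODEL may define the denominator differently.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where both sequences have a residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gapped columns are not counted (assumed convention)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment, not taken from GshFss or 3LN6.
model = "MKT-LLQAG"
template = "MRTALL-AG"
print(round(percent_identity(model, template), 2))  # 85.71
```

In this toy case 7 columns are compared and 6 match, which gives 85.71%.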
To further support the homology results from both the blastp amino acid sequence alignments and SWISS-MODEL, we also consulted the AlphaFold Protein Structure Database (Google DeepMind and EMBL-EBI), an online resource of AI-predicted 3D protein structures derived from amino acid sequences. Using the UniProt accession number of GshFss from Streptococcus sanguinis SK36 (A3CQU6), we retrieved the AlphaFold prediction, which depicts GshFss as a monomer.
Finally, for a first 3D visualization of GshFss from Streptococcus sanguinis SK36, we used the PyMOL molecular visualization system (Schrödinger). We loaded the .pdb file obtained from AlphaFold, so the following picture shows the predicted 3D structure of GshFss as a monomer.
To make the comparison between the model and its homolog clearer, we present the GshFss monomer (colored in bright orange) aligned with the monomer of GshF from Streptococcus agalactiae, listed as 3LN6 in the PDB (colored in sky blue), using the align tool of PyMOL.
PyMOL was used strictly for educational and scientific purposes under the fair use provisions of US and EU law.
Sequence and Features
- RFC[10]: INCOMPATIBLE. Illegal EcoRI site found at 763. Illegal PstI sites found at 22, 52, 409, 1051 and 1111.
- RFC[12]: INCOMPATIBLE. Illegal EcoRI site found at 763. Illegal PstI sites found at 22, 52, 409, 1051 and 1111.
- RFC[21]: INCOMPATIBLE. Illegal EcoRI site found at 763. Illegal BglII site found at 453.
- RFC[23]: INCOMPATIBLE. Illegal EcoRI site found at 763. Illegal PstI sites found at 22, 52, 409, 1051 and 1111.
- RFC[25]: INCOMPATIBLE. Illegal EcoRI site found at 763. Illegal PstI sites found at 22, 52, 409, 1051 and 1111. Illegal NgoMIV sites found at 262, 1497 and 1705.
- RFC[1000]: COMPATIBLE.
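A compatibility report of this kind can be reproduced by scanning the part's DNA sequence for the recognition sites of the relevant enzymes. The sketch below is a minimal version of such a scan. It covers only the enzymes named in the report plus BsaI and SapI, which we assume to be the Type IIS enzymes checked for RFC[1000]; it is not the full rule set of any standard. The function names and the toy DNA string are hypothetical, and positions are reported as the 1-based start of each site, which is assumed to match the registry's convention.

```python
# Recognition sequences (5'->3'); the choice of enzymes is an assumption.
SITES = {
    "EcoRI": "GAATTC",
    "PstI": "CTGCAG",
    "BglII": "AGATCT",
    "NgoMIV": "GCCGGC",
    "BsaI": "GGTCTC",   # Type IIS, not palindromic
    "SapI": "GCTCTTC",  # Type IIS, not palindromic
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(dna, sites=SITES):
    """Map enzyme name -> sorted 1-based start positions on either strand."""
    dna = dna.upper()
    hits = {}
    for enzyme, site in sites.items():
        # Non-palindromic sites must also be searched as reverse complement.
        patterns = {site, reverse_complement(site)}
        positions = set()
        for pattern in patterns:
            start = dna.find(pattern)
            while start != -1:
                positions.add(start + 1)
                start = dna.find(pattern, start + 1)
        if positions:
            hits[enzyme] = sorted(positions)
    return hits


# Hypothetical toy sequence, not the sequence of this part.
toy = "AAGAATTCTTCTGCAGGGGAGACCAA"
print(find_sites(toy))  # {'EcoRI': [3], 'PstI': [11], 'BsaI': [19]}
```

An empty result for BsaI and SapI on the real part sequence would be consistent with the RFC[1000] entry above.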