Difference between revisions of "Part:BBa K1080001:Design"

(Design Notes)
(Design Notes)
Line 12: Line 12:
 
Note 8 PstI sites CTGCAG (785, 1031, 1385, 1652, 2681, 2825, 3458, 3569) No EcoRI or XbaI sites. 1 SpeI (ACTAGT) (4226) site 23 bp after stop codon.
 
Note 8 PstI sites CTGCAG (785, 1031, 1385, 1652, 2681, 2825, 3458, 3569) No EcoRI or XbaI sites. 1 SpeI (ACTAGT) (4226) site 23 bp after stop codon.
  
 +
<b>Amino Acid Sequence</b>
  
<FONT FACE="courier">ATGCGGGGTTCTCATCATCATCATCATCATGGTATGGCTAGCATGACTGGTGGACAGCAAATGGGTCGGGATCTGT
+
<FONT FACE="courier">GPAKQQVSLD LRDDGAGMFT STSPEMRRVV PDDVKGRVKV KVVYVVLEAQ YQSAISAAVK NINAKNSKVC FEVVGYLLEE LRDQKNLDML KEDVASANIF<br> IGSLIFIEEL AEKIVEAVSP LREKLDACLI FPSMPAVMKL NKLGTFSMAQ LGQSKSVFSE FIKSARKNND NFEEGLLKLV RTLPKVLKYL PSDKAQDAKN<br> FVNSLQYWLG GNSDNLENLL LNTVSNYVPA LKGVDFSVAE PTAYPDVGIW HPLASGMYED LKEYLNWYDT RKDMVFAKDA PVIGLVLQRS HLVTGDEGHY<br> SGVVAELESR GAKVIPVFAG GLDFSAPVKK FFYDPLGSGR TFVDTVVSLT GFALVGGPAR QDAPKAIEAL KNLNVPYLVS LPLVFQTTEE WLDSELGVHP<br> VQVALQVALP ELDGAMEPIV FAGRDSNTGK SHSLPDRIAS LCARAVNWAN LRKKRNAEKK LAVTVFSFPP DKGNVGTAAY LNVFGSIYRV LKNLQREGYD<br> VGALPPSEED LIQSVLTQKE AKFNSTDLHI AYKMKVDEYQ LCPYAEALE ENWGKPPGTL NTNGQELLVY GRQYGNVFIG VQPTFGYEGD PMRLLFSKSA<br> SPHHGFAAYY TFLEKIFKAD AVLHFGTHGS LEFMPGKQVG MSGVCYPDSL IGTIPNLYYY AANNPSEATI AKRRSYANTI SYLTPPAENA GLYKGLKELK<br> ELISSYQGMR ESGRAEQICA TIIETAKLCN LDRDVTLPDA DAKDLTMDMR DSVVGQVYRK LMEIESRLLP CGLHVVGCPP TAEEAVATLV NIAELDRPDN<br> NPPIKGMPGI LARAIGRDIE SIYSGNNKGV LADVDQLQRI TEASRTCVRE FVKDRTGLNG RIGTNWITNL LKFTGFYVDP WVRGLQNGEF ASANREELIT<br> LFNYLEFCLT QVVKDNELGA LVEALNGQYV EPGPGGDPIR NPNVLPTGKN IHALDPQSIP TQAALKSARL VVDRLLDRER DNNGGKYPET IALVLWGTDN<br> IKTYGESLAQ VMMMVGVKPV ADALGRVNKL EVIPLEELGR PRVDVVVNCS GVFRDLFVNQ MLLLDRAIKL AAEQDEPDEM  NFVRKHAKQQ AAELGLQSLR<br> DAATRVFSNS SGSYSSNVNL AVENSSWSDE SQLQEMYLKR KSYAFNSDRP GAGGEMQRDV FETAMKTVDV TFQNLDSSEI SLTDVSHYFD SDPTKLVASL<br> RNDGRTPNAY IADTTTANAQ VRTLGETVRL DARTKLLNPK WYEGMLASGY EGVREIQKRM TNTMGWSATS GMVDNWVYDE ANSTFIEDAA MAERLMNTNP<br> NSFRKLVATF LEANGRGYWD AKPEQLERLR QLYMDVEDKI EGVE 
 
+
ACGACGATGACGATAAGGATCATCCCTTCACCGCGTGCAATGTGGCGACTGGACCCCGGCCGCCCATGACCACCTT
+
 
+
CACCGGTGGCAACAAGGGCCCTGCTAAGCAGCAGGTGTCGCTGGATCTGCGCGACGAGGGCGCTGGCATGTTCACC
+
 
+
AGCACCAGCCCGGAGATGCGCCGTGTCGTCCCTGACGATGTGAAGGGTCGCGTTAAGGTGAAGGTTGTGTACGTGG
+
 
+
TGCTGGAGGCCCAGTACCAGTCGGCCATCAGCGCTGCGGTGAAGAACATCAACGCCAAGAACTCCAAGGTGTGCTT
+
 
+
CGAGGTGGTGGGCTACCTGCTGGAGGAGCTGCGTGACCAGAAGAACCTCGATATGCTCAAGGAGGATGTGGCCTCT
+
 
+
GCCAACATCTTCATCGGCTCGCTCATCTTCATTGAGGAGCTTGCCGAGAAGATTGTGGAGGCGGTGAGCCCCCTGC
+
 
+
GCGAGAAGCTGGACGCGTGCCTGATCTTCCCGTCCATGCCGGCGGTCATGAAGCTGAACAAGCTGGGCACGTTTTC
+
 
+
GATGGCTCAGCTGGGCCAGTCGAAGTCGGTGTTCTCGGAGTTCATCAAGTCTGCTCGCAAGAACAACGACAACTTC
+
 
+
GAGGAGGGCTTGCTGAAGCTGGTGCGCACCCTGCCTAAGGTGCTGAAGTATCTGCCCTCGGACAAGGCGCAGGACG
+
 
+
CCAAGAACTTCGTGAACAGCCTGCAGTACTGGCTGGGCGGTAACTCGGACAACCTGGAGAACCTGCTGCTGAACAC
+
 
+
CGTCAGCAACTACGTGCCCGCTCTGAAGGGCGTGGACTTCAGCGTGGCTGAGCCCACCGCCTACCCCGATGTGGGT
+
 
+
ATCTGGCACCCTCTGGCCTCGGGCATGTACGAGGACCTGAAGGAGTACCTGAACTGGTACGACACCCGCAAGGACA
+
 
+
TGGTCTTCGCCAAGGACGCCCCCGTCATTGGCCTGGTGCTGCAGCGCTCGCACCTGGTGACTGGCGATGAGGGCCA
+
 
+
CTACAGCGGCGTGGTCGCTGAGCTGGAGAGCCGCGGTGCTAAGGTCATCCCCGTCTTTGCCGGTGGCCTGGACTTC
+
 
+
TCCGCCCCCGTCAAGAAGTTCTTCTACGACCCCCTGGGCTCTGGCCGCACGTTCGTGGACACCGTTGTGTCGCTGA
+
 
+
CCGGCTTCGCGCTGGTGGGCGGCCCCGCGCGCCAGGACGCGCCGAAGGCCATTGAGGCGCTGAAGAACCTGAACGT
+
 
+
GCCCTACCTGGTGTCGCTGCCGCTGGTGTTCCAGACCACTGAGGAGTGGCTGGACAGCGAGCTGGGCGTGCACCCC
+
 
+
GTCCAGGTGGCTCTGCAGGTTGCCCTGCCCGAGCTGGATGGTGCCATGGAGCCCATCGTGTTCGCTGGCCGTGACT
+
 
+
CGAACACCGGCAAGTCGCACTCGCTGCCCGACCGCATCGCTTCGCTGTGCGCTCGCGCCGTGAACTGGGCCAACCT
+
 
+
GCGCAAGAAGCGCAACGCCGAGAAGAAGCTGGCCGTCACCGTGTTCAGCTTCCCCCCTGACAAGGGCAACGTCGGC
+
 
+
ACTGCCGCCTACCTGAACGTGTTCGGCTCCATCTACCGCGTGCTGAAGAACCTGCAGCGCGAGGGCTACGACGTGG
+
 
+
GCGCCCTGCCGCCCTCGGAGGAGGATCTGATCCAGTCGGTGCTGACCCAGAAGGAGGCCAAGTTCAACTCGACCGA
+
 
+
CCTGCACATCGCCTACAAGATGAAGGTGGACGAGTACCAGAAGCTGTGCCCTTACGCCGAGGCGCTGGAGGAGAAC
+
 
+
TGGGGCAAGCCCCCCGGCACCCTGAACACCAACGGCCAGGAGCTGCTGGTGTACGGCCGCCAGTACGGCAACGTCT
+
 
+
TCATCGGCGTGCAGCCCACCTTCGGCTACGAGGGCGACCCGATGCGCCTGCTGTTCTCGAAGTCGGCCAGCCCCCA
+
 
+
CCACGGCTTCGCCGCCTACTACACCTTCCTGGAGAAGATCTTCAAGGCCGACGCCGTGCTGCACTTCGGCACCCAC
+
 
+
GGCTCGCTGGAGTTCATGCCCGGCAAGCAGGTCGGCATGTCGGGTGTGTGCTACCCCGACTCGCTGATCGGCACCA
+
 
+
TCCCCAACCTCTACTACTACGCCGCCAACAACCCGTCTGAGGCCACCATCGCCAAGCGCCGCTCGTACGCCAACAC
+
 
+
CATTTCGTACCTGACGCCGCCTGCCGAGAACGCCGGCCTGTACAAGGGCCTGAAGGAGCTGAAGGAGCTGATCAGC
+
 
+
TCGTACCAGGGCATGCGTGAGTCTGGCCGCGCCGAGCAGATCTGCGCCACCATCATTGAGACCGCCAAGCTGTGCA
+
 
+
ACCTGGACCGCGACGTGACCCTGCCCGACGCTGACGCCAAGGACCTGACCATGGACATGCGCGACAGCGTTGTGGG
+
 
+
CCAGGTGTACCGCAAGCTGATGGAGATTGAGTCCCGCCTGCTGCCCTGCGGCCTGCACGTGGTGGGCTGCCCGCCC
+
 
+
ACCGCCGAGGAGGCCGTGGCCACCCTGGTCAACATCGCTGAGCTGGACCGCCCGGACAACAACCCCCCCATCAAGG
+
 
+
GCATGCCCGGCATCCTGGCCCGCGCCATTGGTCGCGACATCGAGTCGATTTACAGCGGCAACAACAAGGGCGTCCT
+
 
+
GGCTGACGTTGACCAGCTGCAGCGCATCACCGAGGCCTCCCGCACCTGCGTGCGCGAGTTCGTGAAGGACCGCACC
+
 
+
GGCCTGAACGGCCGCATCGGCACCAACTGGATCACCAACCTGCTCAAGTTCACCGGCTTCTACGTGGACCCCTGGG
+
 
+
TGCGCGGCCTGCAGAACGGCGAGTTCGCCAGCGCCAACCGCGAGGAGCTGATCACCCTGTTCAACTACCTGGAGTT
+
 
+
CTGCCTGACCCAGGTGGTCAAGGACAACGAGCTGGGCGCCCTGGTAGAGGCGCTGAACGGCCAGTACGTCGAGCCC
+
 
+
GGCCCCGGCGGTGACCCCATCCGCAACCCCAACGTGCTGCCCACCGGCAAGAACATCCACGCCCTGGACCCTCAGT
+
 
+
CGATTCCCACTCAGGCCGCGCTGAAGAGCGCCCGCCTGGTGGTGGACCGCCTGCTGGACCGCGAGCGCGACAACAA
+
 
+
CGGCGGCAAGTACCCCGAGACCATCGCGCTGGTGCTGTGGGGCACTGACAACATCAAGACCTACGGCGAGTCGCTG
+
 
+
GCCCAGGTCATGATGATGGTCGGTGTCAAGCCCGTGGCCGACGCCCTGGGCCGCGTGAACAAGCTGGAGGTGATCC
+
 
+
CTCTGGAGGAGCTGGGCCGCCCCCGCGTGGACGTGGTTGTCAACTGCTCGGGTGTGTTCCGCGACCTGTTCGTGAA
+
 
+
CCAGATGCTGCTGCTGGACCGCGCCATCAAGCTGGCGGCCGAGCAGGACGAGCCCGATGAGATGAACTTCGTGCGC
+
 
+
AAGCACGCCAAGCAGCAGGCGGCGGAGCTGGGCCTGCAGAGCCTGCGCGACGCGGCCACCCGTGTGTTCTCCAACA
+
 
+
GCTCGGGCTCCTACTCGTCCAACGTCAACCTGGCGGTGGAGAACAGCAGCTGGAGCGACGAGTCGCAGCTGCAGGA
+
 
+
GATGTACCTGAAGCGCAAGTCGTACGCCTTCAACTCGGACCGCCCCGGCGCCGGTGGCGAGATGCAGCGCGACGTG
+
 
+
TTCGAGACGGCCATGAAGACCGTGGACGTGACCTTCCAGAACCTGGACTCGTCCGAGATCTCGCTGACCGATGTGT
+
 
+
CGCACTACTTCGACTCCGACCCCACCAAGCTGGTGGCGTCGCTGCGCAACGACGGCCGCACCCCCAACGCCTACAT
+
 
+
CGCCGACACCACCACCGCCAACGCGCAGGTCCGCACTCTGGGTGAGACCGTGCGCCTGGACGCCCGCACCAAGCTG
+
 
+
CTCAACCCCAAGTGGTACGAGGGCATGCTTGCCTCGGGCTACGAGGGCGTGCGCGAGATCCAGAAGCGCATGACCA
+
 
+
ACACCATGGGCTGGTCGGCCACCTCGGGCATGGTGGACAACTGGGTGTACGACGAGGCCAACTCGACCTTCATCGA
+
 
+
GGATGCGGCCATGGCCGAGCGCCTGATGAACACCAACCCCAACAGCTTCCGCAAGCTGGTGGCCACCTTCCTGGAG
+
 
+
GCCAACGGCCGCGGCTACTGGGACGCCAAGCCCGAGCAGCTGGAGCGCCTGCGCCAGCTGTACATGGACGTGGAGG
+
 
+
ACAAGATTGAGGGCGTCGAATAAGCGGCCTCCCCTTCATGGTAGCACTAGTTGGCGGGTTGTGGTTGGACTAGGCG
+
 
+
GCTAGGGTATATACCTAGTAGCGGCGGCTGCGGAGTGGAGGGCTGGCGCCCAGCGCGAGGGCGTGGCCTTTCCTCC
+
 
+
TGGACCCGAGAGCGCTCCGCGAGGGACGGCGAGTGAGATAGGCAGCAGCG<br></FONT>
+
 
+
<b>Amino acid sequence</b>
+
+
<FONT FACE="courier">MRGSHHHHHH GMASMTGGQQ MGRDLYDDDD KDHPFTACNV ATGPRPPMTT FTGGNKGPAK
+
 
+
QQVSLDLRDE GAGMFTSTSP EMRRVVPDDV KGRVKVKVVY VVLEAQYQSA ISAAVKNINA
+
 
+
KNSKVCFEVV GYLLEELRDQ KNLDMLKEDV ASANIFIGSL IFIEELAEKI VEAVSPLREK
+
 
+
LDACLIFPSM PAVMKLNKLG TFSMAQLGQS KSVFSEFIKS ARKNNDNFEE GLLKLVRTLP
+
 
+
KVLKYLPSDK AQDAKNFVNS LQYWLGGNSD NLENLLLNTV SNYVPALKGV DFSVAEPTAY
+
 
+
PDVGIWHPLA SGMYEDLKEY LNWYDTRKDM VFAKDAPVIG LVLQRSHLVT GDEGHYSGVV
+
 
+
AELESRGAKV IPVFAGGLDF SAPVKKFFYD PLGSGRTFVD TVVSLTGFAL VGGPARQDAP
+
 
+
KAIEALKNLN VPYLVSLPLV FQTTEEWLDS ELGVHPVQVA LQVALPELDG AMEPIVFAGR
+
 
+
DSNTGKSHSL PDRIASLCAR AVNWANLRKK RNAEKKLAVT VFSFPPDKGN VGTAAYLNVF
+
 
+
GSIYRVLKNL QREGYDVGAL PPSEEDLIQS VLTQKEAKFN STDLHIAYKM KVDEYQKLCP
+
 
+
YAEALEENWG KPPGTLNTNG QELLVYGRQY GNVFIGVQPT FGYEGDPMRL LFSKSASPHH
+
 
+
GFAAYYTFLE KIFKADAVLH FGTHGSLEFM PGKQVGMSGV CYPDSLIGTI PNLYYYAANN
+
 
+
PSEATIAKRR SYANTISYLT PPAENAGLYK GLKELKELIS SYQGMRESGR AEQICATIIE
+
 
+
TAKLCNLDRD VTLPDADAKD LTMDMRDSVV GQVYRKLMEI ESRLLPCGLH VVGCPPTAEE
+
 
+
AVATLVNIAE LDRPDNNPPI KGMPGILARA IGRDIESIYS GNNKGVLADV DQLQRITEAS
+
 
+
RTCVREFVKD RTGLNGRIGT NWITNLLKFT GFYVDPWVRG LQNGEFASAN REELITLFNY
+
 
+
LEFCLTQVVK DNELGALVEA LNGQYVEPGP GGDPIRNPNV LPTGKNIHAL DPQSIPTQAA
+
 
+
LKSARLVVDR LLDRERDNNG GKYPETIALV LWGTDNIKTY GESLAQVMMM VGVKPVADAL
+
 
+
GRVNKLEVIP LEELGRPRVD VVVNCSGVFR DLFVNQMLLL DRAIKLAAEQ DEPDEMNFVR
+
 
+
KHAKQQAAEL GLQSLRDAAT RVFSNSSGSY SSNVNLAVEN SSWSDESQLQ EMYLKRKSYA
+
 
+
FNSDRPGAGG EMQRDVFETA MKTVDVTFQN LDSSEISLTD VSHYFDSDPT KLVASLRNDG
+
 
+
RTPNAYIADT TTANAQVRTL GETVRLDART KLLNPKWYEG MLASGYEGVR EIQKRMTNTM
+
 
+
GWSATSGMVD NWVYDEANST FIEDAAMAER LMNTNPNSFR KLVATFLEAN GRGYWDAKPE
+
 
+
QLERLRQLYM DVEDKIEGVE </FONT>
+
 
+
 
+
References and documentation are available.
+
  
 +
References and documentation are available.
 
Please note the modified algorithm for extinction coefficient.
 
Please note the modified algorithm for extinction coefficient.
  
 +
--------------------------------------------------------------------------------
 +
Number of amino acids: 1344
  
Number of amino acids: 1400
+
Molecular weight: 148676.1
  
Molecular weight: 154817.9
+
Theoretical pI: 5.33
  
Theoretical pI: 5.48
+
Amino acid composition: Ala (A) 113   8.4%
 +
Arg (R)  68   5.1%
 +
Asn (N)  77   5.7%
 +
Asp (D)  81   6.0%
 +
Cys (C)  12   0.9%
 +
Gln (Q)  46   3.4%
 +
Glu (E)  91   6.8%
 +
Gly (G)  97   7.2%
 +
His (H)  14   1.0%
 +
Ile (I)  51   3.8%
 +
Leu (L) 141 10.5%
 +
Lys (K)  78   5.8%
 +
Met (M)  33   2.5%
 +
Phe (F)  49   3.6%
 +
Pro (P)  67   5.0%
 +
Ser (S)  84   6.2%
 +
Thr (T)  65   4.8%
 +
Trp (W)  14   1.0%
 +
Tyr (Y)  49   3.6%
 +
Val (V) 114   8.5%
 +
Pyl (O)  0   0.0%
 +
Sec (U)  0   0.0%
  
Amino acid composition:
+
(B)   0   0.0%
Ala (A) 116 8.3% Arg (R) 71 5.1% Asn (N) 79 5.6% Asp (D) 86 6.1% Cys (C) 13 0.9% Gln (Q) 48 3.4% Glu (E) 92 6.6% Gly (G) 105 7.5% His (H) 21 1.5% Ile (I) 51 3.6% Leu (L) 142 10.1% Lys (K) 80 5.7% Met (M) 38 2.7% Phe (F) 51 3.6% Pro (P) 71 5.1% Ser (S) 86 6.1% Thr (T) 71 5.1% Trp (W) 14 1.0% Tyr (Y) 50 3.6% Val (V) 115 8.2% Pyl (O) 0 0.0% Sec (U) 0 0.0% (B) 0 0.0% (Z) 0 0.0% (X) 0 0.0%
+
(Z)   0   0.0%
 +
(X)   0   0.0%
  
  
Total number of negatively charged residues (Asp + Glu): 178
+
Total number of negatively charged residues (Asp + Glu): 172
Total number of positively charged residues (Arg + Lys): 151
+
Total number of positively charged residues (Arg + Lys): 146
  
 
Atomic composition:
 
Atomic composition:
  
Carbon      C       6872
+
Carbon      C       6616
Hydrogen    H     10826
+
Hydrogen    H     10441
Nitrogen    N       1876
+
Nitrogen    N       1791
Oxygen      O       2091
+
Oxygen      O       2010
Sulfur      S         51
+
Sulfur      S         45
  
Formula: C6872H10826N1876O2091S51
+
Formula: C6616H10441N1791O2010S45
Total number of atoms: 21716
+
Total number of atoms: 20903
  
 
Extinction coefficients:
 
Extinction coefficients:
Line 213: Line 72:
 
Extinction coefficients are in units of  M-1 cm-1, at 280 nm measured in water.
 
Extinction coefficients are in units of  M-1 cm-1, at 280 nm measured in water.
  
Ext. coefficient  152250
+
Ext. coefficient  150760
Abs 0.1% (=1 g/l)  0.983, assuming all pairs of Cys residues form cystines
+
Abs 0.1% (=1 g/l)  1.014, assuming all pairs of Cys residues form cystines
  
  
Ext. coefficient  151500
+
Ext. coefficient  150010
Abs 0.1% (=1 g/l)  0.979, assuming all Cys residues are reduced
+
Abs 0.1% (=1 g/l)  1.009, assuming all Cys residues are reduced
  
 
Estimated half-life:
 
Estimated half-life:
  
The N-terminal of the sequence considered is M (Met).
+
The N-terminal of the sequence considered is G (Gly).
  
 
The estimated half-life is:  
 
The estimated half-life is:  
Line 232: Line 91:
 
Instability index:
 
Instability index:
  
The instability index (II) is computed to be 32.82
+
The instability index (II) is computed to be 32.91
 
This classifies the protein as stable.
 
This classifies the protein as stable.
  
  
  
Aliphatic index: 85.87
+
Aliphatic index: 88.72
  
Grand average of hydropathicity (GRAVY): -0.292
+
Grand average of hydropathicity (GRAVY): -0.257
  
 
===Source===
 
===Source===

Revision as of 02:36, 25 September 2013


ChlH


Assembly Compatibility:
  • 10
    COMPATIBLE WITH RFC[10]
  • 12
    COMPATIBLE WITH RFC[12]
  • 21
    INCOMPATIBLE WITH RFC[21]
    Illegal BglII site found at 1928
    Illegal BglII site found at 2234
    Illegal BglII site found at 3620
  • 23
    COMPATIBLE WITH RFC[23]
  • 25
    INCOMPATIBLE WITH RFC[25]
    Illegal NgoMIV site found at 486
    Illegal NgoMIV site found at 2152
    Illegal AgeI site found at 1132
    Illegal AgeI site found at 2650
    Illegal AgeI site found at 2704
    Illegal AgeI site found at 2923
  • 1000
    INCOMPATIBLE WITH RFC[1000]
    Illegal BsaI.rc site found at 2254
    Illegal BsaI.rc site found at 3049
    Illegal SapI.rc site found at 2979
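
The report above can be approximated with a plain string scan. This is a minimal sketch, not the Registry's actual checker: the recognition sequences are the standard ones for the enzymes named in the report, the function names are hypothetical, and reading the ".rc" suffix as "hit on the reverse-complement strand" is an assumption.

```python
# Recognition sequences for the enzymes named in the compatibility report.
SITES = {
    "BglII": "AGATCT",
    "NgoMIV": "GCCGGC",
    "AgeI": "ACCGGT",
    "BsaI": "GGTCTC",
    "SapI": "GCTCTTC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_illegal_sites(dna, sites=SITES):
    """Return (label, 1-based position) pairs for every site occurrence.

    Non-palindromic sites are also searched as their reverse complement
    and labelled with a '.rc' suffix (assumed meaning of the report's '.rc').
    """
    dna = dna.upper()
    hits = []
    for enzyme, site in sites.items():
        patterns = [(enzyme, site)]
        rc = reverse_complement(site)
        if rc != site:  # palindromic sites read the same on both strands
            patterns.append((enzyme + ".rc", rc))
        for label, pattern in patterns:
            start = dna.find(pattern)
            while start != -1:
                hits.append((label, start + 1))
                start = dna.find(pattern, start + 1)
    return sorted(hits, key=lambda hit: hit[1])


# Toy sequence (not the part sequence): one BglII site, one reversed BsaI site.
print(find_illegal_sites("AAAGATCTTTGAGACCAA"))
```

Running the same function on the full part sequence should list the positions flagged above, provided the numbering starts at the same base as the Registry's.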


Design Notes

CR-ChlH sequence with a His tag in pET15b. The sequence is given from the translation start site.

Note: the sequence contains 8 PstI sites (CTGCAG) at positions 785, 1031, 1385, 1652, 2681, 2825, 3458, and 3569. There are no EcoRI or XbaI sites. There is 1 SpeI site (ACTAGT) at position 4226, 23 bp after the stop codon.
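
Site positions such as these can be checked with a short scan. The sketch below assumes 1-based numbering; `site_positions` is a hypothetical helper name, and the fragment is the 3' end of the coding sequence copied from the nucleotide sequence shown in the revision diff (ending in the codons for E G V E, then the TAA stop).

```python
import re


def site_positions(dna, site):
    """1-based start positions of every (possibly overlapping) occurrence."""
    pattern = f"(?={re.escape(site)})"
    return [m.start() + 1 for m in re.finditer(pattern, dna.upper())]


# 3' end of the coding sequence: ...GGC GTC GAA TAA, then downstream bases.
fragment = "GGCGTCGAATAAGCGGCCTCCCCTTCATGGTAGCACTAGTTGGCGG"

# 1-based position of the last base of the stop codon (first TAA in this fragment).
stop_end = fragment.find("TAA") + 3
spei = site_positions(fragment, "ACTAGT")  # SpeI

# Distance from the stop codon to the first base of the SpeI site.
print(spei[0] - stop_end)
```

For this fragment the script prints 23, which agrees with the note. The same helper can be called with "CTGCAG", "GAATTC", and "TCTAGA" on the full sequence to check the PstI, EcoRI, and XbaI statements.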

Amino Acid Sequence

GPAKQQVSLD LRDDGAGMFT STSPEMRRVV PDDVKGRVKV KVVYVVLEAQ YQSAISAAVK NINAKNSKVC FEVVGYLLEE LRDQKNLDML KEDVASANIF
IGSLIFIEEL AEKIVEAVSP LREKLDACLI FPSMPAVMKL NKLGTFSMAQ LGQSKSVFSE FIKSARKNND NFEEGLLKLV RTLPKVLKYL PSDKAQDAKN
FVNSLQYWLG GNSDNLENLL LNTVSNYVPA LKGVDFSVAE PTAYPDVGIW HPLASGMYED LKEYLNWYDT RKDMVFAKDA PVIGLVLQRS HLVTGDEGHY
SGVVAELESR GAKVIPVFAG GLDFSAPVKK FFYDPLGSGR TFVDTVVSLT GFALVGGPAR QDAPKAIEAL KNLNVPYLVS LPLVFQTTEE WLDSELGVHP
VQVALQVALP ELDGAMEPIV FAGRDSNTGK SHSLPDRIAS LCARAVNWAN LRKKRNAEKK LAVTVFSFPP DKGNVGTAAY LNVFGSIYRV LKNLQREGYD
VGALPPSEED LIQSVLTQKE AKFNSTDLHI AYKMKVDEYQ KLCPYAEALE ENWGKPPGTL NTNGQELLVY GRQYGNVFIG VQPTFGYEGD PMRLLFSKSA
SPHHGFAAYY TFLEKIFKAD AVLHFGTHGS LEFMPGKQVG MSGVCYPDSL IGTIPNLYYY AANNPSEATI AKRRSYANTI SYLTPPAENA GLYKGLKELK
ELISSYQGMR ESGRAEQICA TIIETAKLCN LDRDVTLPDA DAKDLTMDMR DSVVGQVYRK LMEIESRLLP CGLHVVGCPP TAEEAVATLV NIAELDRPDN
NPPIKGMPGI LARAIGRDIE SIYSGNNKGV LADVDQLQRI TEASRTCVRE FVKDRTGLNG RIGTNWITNL LKFTGFYVDP WVRGLQNGEF ASANREELIT
LFNYLEFCLT QVVKDNELGA LVEALNGQYV EPGPGGDPIR NPNVLPTGKN IHALDPQSIP TQAALKSARL VVDRLLDRER DNNGGKYPET IALVLWGTDN
IKTYGESLAQ VMMMVGVKPV ADALGRVNKL EVIPLEELGR PRVDVVVNCS GVFRDLFVNQ MLLLDRAIKL AAEQDEPDEM NFVRKHAKQQ AAELGLQSLR
DAATRVFSNS SGSYSSNVNL AVENSSWSDE SQLQEMYLKR KSYAFNSDRP GAGGEMQRDV FETAMKTVDV TFQNLDSSEI SLTDVSHYFD SDPTKLVASL
RNDGRTPNAY IADTTTANAQ VRTLGETVRL DARTKLLNPK WYEGMLASGY EGVREIQKRM TNTMGWSATS GMVDNWVYDE ANSTFIEDAA MAERLMNTNP
NSFRKLVATF LEANGRGYWD AKPEQLERLR QLYMDVEDKI EGVE

References and documentation are available. Please note the modified algorithm for extinction coefficient.


Number of amino acids: 1344

Molecular weight: 148676.1
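
The molecular weight can be cross-checked from the residue counts listed under "Amino acid composition". This is a sketch under stated assumptions: the average residue masses are commonly tabulated values (amino acid minus one water) and may differ in the last digit from those used by the tool that produced this report, and `molecular_weight` is a hypothetical helper name.

```python
# Average residue masses in Da (amino acid minus one water); assumed values.
RESIDUE_MASS = {
    "A": 71.0788, "R": 156.1875, "N": 114.1038, "D": 115.0886, "C": 103.1388,
    "Q": 128.1307, "E": 129.1155, "G": 57.0519, "H": 137.1411, "I": 113.1594,
    "L": 113.1594, "K": 128.1741, "M": 131.1926, "F": 147.1766, "P": 97.1167,
    "S": 87.0782, "T": 101.1051, "W": 186.2132, "Y": 163.1760, "V": 99.1326,
}
WATER = 18.01524  # one water is added back for the free N- and C-termini

# Residue counts copied from the amino acid composition of this revision.
COUNTS = {
    "A": 113, "R": 68, "N": 77, "D": 81, "C": 12, "Q": 46, "E": 91, "G": 97,
    "H": 14, "I": 51, "L": 141, "K": 78, "M": 33, "F": 49, "P": 67, "S": 84,
    "T": 65, "W": 14, "Y": 49, "V": 114,
}


def molecular_weight(counts):
    """Average molecular weight of an unmodified chain with these counts."""
    return sum(RESIDUE_MASS[aa] * n for aa, n in counts.items()) + WATER


print(sum(COUNTS.values()))                 # number of residues
print(round(molecular_weight(COUNTS), 1))   # compare with the value above
```

With these masses the result agrees with the reported 148676.1 to within about 0.1 Da, and the counts sum to the reported 1344 residues.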

Theoretical pI: 5.33

Amino acid composition:
Ala (A) 113   8.4%
Arg (R)  68   5.1%
Asn (N)  77   5.7%
Asp (D)  81   6.0%
Cys (C)  12   0.9%
Gln (Q)  46   3.4%
Glu (E)  91   6.8%
Gly (G)  97   7.2%
His (H)  14   1.0%
Ile (I)  51   3.8%
Leu (L) 141  10.5%
Lys (K)  78   5.8%
Met (M)  33   2.5%
Phe (F)  49   3.6%
Pro (P)  67   5.0%
Ser (S)  84   6.2%
Thr (T)  65   4.8%
Trp (W)  14   1.0%
Tyr (Y)  49   3.6%
Val (V) 114   8.5%
Pyl (O)   0   0.0%
Sec (U)   0   0.0%

(B)   0   0.0%
(Z)   0   0.0%
(X)   0   0.0%
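
A composition table of this kind can be regenerated from a pasted sequence. The sketch below is illustrative only: `composition` is a hypothetical helper, and the example input is just the first two 10-residue blocks of the sequence above, not the full protein.

```python
from collections import Counter


def composition(seq):
    """Map each residue letter to (count, percent), ignoring whitespace."""
    residues = "".join(seq.split()).upper()
    counts = Counter(residues)
    total = len(residues)
    return {aa: (n, round(100 * n / total, 1)) for aa, n in sorted(counts.items())}


# First two blocks of the amino acid sequence given above.
example = "GPAKQQVSLD LRDDGAGMFT"
for aa, (n, pct) in composition(example).items():
    print(f"{aa} {n:3d} {pct:5.1f}%")
```

Applied to the full sequence, the counts for Asp and Glu, and for Arg and Lys, can then be summed to check the charged-residue totals reported below.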


Total number of negatively charged residues (Asp + Glu): 172
Total number of positively charged residues (Arg + Lys): 146

Atomic composition:

Carbon      C       6616
Hydrogen    H      10441
Nitrogen    N       1791
Oxygen      O       2010
Sulfur      S         45

Formula: C6616H10441N1791O2010S45
Total number of atoms: 20903
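
The atom total follows directly from the formula. A minimal sketch, assuming a simple formula with no parentheses or charges; `parse_formula` is a hypothetical helper name.

```python
import re


def parse_formula(formula):
    """Split a flat formula such as 'C6H12O6' into {element: count}."""
    return {el: int(n) for el, n in re.findall(r"([A-Z][a-z]?)(\d+)", formula)}


atoms = parse_formula("C6616H10441N1791O2010S45")
total_atoms = sum(atoms.values())
print(atoms)
print(total_atoms)
```

The sum of the five element counts reproduces the reported total of 20903 atoms.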

Extinction coefficients:

Extinction coefficients are in units of M^-1 cm^-1, at 280 nm measured in water.

Ext. coefficient 150760
Abs 0.1% (=1 g/l) 1.014, assuming all pairs of Cys residues form cystines

Ext. coefficient 150010
Abs 0.1% (=1 g/l) 1.009, assuming all Cys residues are reduced
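
Both extinction coefficients can be reproduced from the Tyr, Trp, and Cys counts in the composition. The sketch assumes the usual per-residue molar contributions at 280 nm (1490 for Tyr, 5500 for Trp, 125 per cystine); the function names are hypothetical.

```python
EXT_TYR = 1490      # M^-1 cm^-1 per tyrosine at 280 nm
EXT_TRP = 5500      # M^-1 cm^-1 per tryptophan at 280 nm
EXT_CYSTINE = 125   # M^-1 cm^-1 per disulfide-bonded Cys pair at 280 nm


def extinction_coefficient(n_tyr, n_trp, n_cys, cystines=True):
    """Molar extinction coefficient at 280 nm from residue counts."""
    eps = EXT_TYR * n_tyr + EXT_TRP * n_trp
    if cystines:
        eps += EXT_CYSTINE * (n_cys // 2)  # every Cys pair counted as one cystine
    return eps


def abs_01_percent(eps, mol_weight):
    """Absorbance of a 1 g/l solution in a 1 cm path."""
    return eps / mol_weight


# Counts and molecular weight from this revision: Tyr 49, Trp 14, Cys 12.
MW = 148676.1
oxidized = extinction_coefficient(49, 14, 12, cystines=True)
reduced = extinction_coefficient(49, 14, 12, cystines=False)
print(oxidized, round(abs_01_percent(oxidized, MW), 3))
print(reduced, round(abs_01_percent(reduced, MW), 3))
```

The two printed pairs match the values reported above (150760 and 1.014 with cystines, 150010 and 1.009 with reduced Cys).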

Estimated half-life:

The N-terminal of the sequence considered is G (Gly).

The estimated half-life is:

                            30 hours (mammalian reticulocytes, in vitro).
                           >20 hours (yeast, in vivo).
                           >10 hours (Escherichia coli, in vivo).


Instability index:

The instability index (II) is computed to be 32.91. This classifies the protein as stable.


Aliphatic index: 88.72
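
The aliphatic index can be recomputed from the composition. The sketch assumes the standard definition, the mole percent of Ala plus 2.9 times that of Val plus 3.9 times that of Ile and Leu combined; `aliphatic_index` is a hypothetical helper name.

```python
def aliphatic_index(n_ala, n_val, n_ile, n_leu, n_total):
    """Aliphatic index from residue counts, using mole percentages."""
    def mole_percent(n):
        return 100.0 * n / n_total

    return (mole_percent(n_ala)
            + 2.9 * mole_percent(n_val)
            + 3.9 * (mole_percent(n_ile) + mole_percent(n_leu)))


# Counts from this revision: Ala 113, Val 114, Ile 51, Leu 141, 1344 residues.
print(round(aliphatic_index(113, 114, 51, 141, 1344), 2))
```

With the counts from this revision the result rounds to the reported 88.72.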

Grand average of hydropathicity (GRAVY): -0.257
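
GRAVY is the mean hydropathy over all residues. The sketch below assumes the Kyte-Doolittle hydropathy scale and uses the residue counts from this revision; `gravy` is a hypothetical helper name.

```python
# Kyte-Doolittle hydropathy values.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Residue counts copied from the amino acid composition of this revision.
RESIDUE_COUNTS = {
    "A": 113, "R": 68, "N": 77, "D": 81, "C": 12, "Q": 46, "E": 91, "G": 97,
    "H": 14, "I": 51, "L": 141, "K": 78, "M": 33, "F": 49, "P": 67, "S": 84,
    "T": 65, "W": 14, "Y": 49, "V": 114,
}


def gravy(counts):
    """Sum of hydropathy values divided by the number of residues."""
    total = sum(counts.values())
    return sum(HYDROPATHY[aa] * n for aa, n in counts.items()) / total


print(round(gravy(RESIDUE_COUNTS), 3))
```

The negative value, which rounds to the reported -0.257, indicates a slightly hydrophilic protein overall.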

Source

Chlamydomonas reinhardtii

References