Coding
"H"

Part:BBa_K1080001:Design

Designed by: Macquarie University   Group: iGEM13_Macquarie_Australia   (2013-09-22)
Revision as of 03:00, 24 September 2013 by DMC (Design Notes)


ChlH


Assembly Compatibility:
  • 10
    COMPATIBLE WITH RFC[10]
  • 12
    COMPATIBLE WITH RFC[12]
  • 21
    INCOMPATIBLE WITH RFC[21]
    Illegal BglII site found at 1928
    Illegal BglII site found at 2234
    Illegal BglII site found at 3620
  • 23
    COMPATIBLE WITH RFC[23]
  • 25
    INCOMPATIBLE WITH RFC[25]
    Illegal NgoMIV site found at 486
    Illegal NgoMIV site found at 2152
    Illegal AgeI site found at 1132
    Illegal AgeI site found at 2650
    Illegal AgeI site found at 2704
    Illegal AgeI site found at 2923
  • 1000
    INCOMPATIBLE WITH RFC[1000]
    Illegal BsaI.rc site found at 2254
    Illegal BsaI.rc site found at 3049
    Illegal SapI.rc site found at 2979
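
The report above lists some hits with an ".rc" suffix. A minimal sketch of such a scan follows, assuming ".rc" means the site was found as its reverse complement on the listed strand. The recognition sequences are the standard ones for the enzymes named above; the function names and the demo fragment (a short stretch taken from the part sequence) are illustrative only, and the positions printed are relative to the fragment, not to the full part.

```python
# Recognition sequences for the enzymes named in the compatibility report.
SITES = {
    "BglII": "AGATCT",
    "NgoMIV": "GCCGGC",
    "AgeI": "ACCGGT",
    "BsaI": "GGTCTC",
    "SapI": "GCTCTTC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_all(seq, motif):
    """Return 1-based start positions of every match, overlaps included."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start + 1)
        start = seq.find(motif, start + 1)
    return positions


def scan(seq, sites=SITES):
    """List (name, position) hits; non-palindromic sites are also
    searched as their reverse complement and tagged with '.rc'."""
    seq = seq.upper()
    hits = []
    for name, motif in sites.items():
        hits.extend((name, pos) for pos in find_all(seq, motif))
        rc = reverse_complement(motif)
        if rc != motif:  # palindromic sites would otherwise be counted twice
            hits.extend((name + ".rc", pos) for pos in find_all(seq, rc))
    return sorted(hits, key=lambda hit: hit[1])


# Short fragment copied from the part sequence (assumed demo input).
fragment = "CTCTGGGTGAGACCGTGCGCCTGG"
print(scan(fragment))
```

BglII, NgoMIV and AgeI sites are palindromic, so only the type IIS enzymes (BsaI, SapI) can produce ".rc" hits in this sketch.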


Design Notes

CR-ChlH sequence with His tag in pET15b. The sequence is given from the translation start site.

Note: the sequence contains 8 PstI sites (CTGCAG) at positions 785, 1031, 1385, 1652, 2681, 2825, 3458 and 3569. There are no EcoRI or XbaI sites. There is 1 SpeI site (ACTAGT) at position 4226, 23 bp after the stop codon.
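
As a rough check of the SpeI note, the sketch below finds the first in-frame stop codon in the 3' end of the sequence and measures the distance to the SpeI site. The fragment is copied from the end of the coding region and is assumed to start in frame; the function names are illustrative, and positions are relative to the fragment rather than to the full part.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_end(seq, frame_start=0):
    """1-based position of the last base of the first in-frame stop codon,
    or None if no stop codon is found in that frame."""
    for i in range(frame_start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i + 3
    return None


def site_start(seq, site):
    """1-based position of the first base of the first match, or None."""
    idx = seq.find(site)
    return idx + 1 if idx != -1 else None


# Last codons of the ORF (...KIEGVE*) plus downstream sequence.
tail = "AAGATTGAGGGCGTCGAATAAGCGGCCTCCCCTTCATGGTAGCACTAGTTGGCGG"

stop_end = first_stop_end(tail)
spei = site_start(tail, "ACTAGT")
print(stop_end, spei, spei - stop_end)  # 21 44 23
```

In this fragment the SpeI site starts 23 bases after the last base of the TAA stop codon, in line with the note above.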


ATGCGGGGTTCTCATCATCATCATCATCATGGTATGGCTAGCATGACTGGTGGACAGCAAATGGGTCGGGATCTGT

ACGACGATGACGATAAGGATCATCCCTTCACCGCGTGCAATGTGGCGACTGGACCCCGGCCGCCCATGACCACCTT

CACCGGTGGCAACAAGGGCCCTGCTAAGCAGCAGGTGTCGCTGGATCTGCGCGACGAGGGCGCTGGCATGTTCACC

AGCACCAGCCCGGAGATGCGCCGTGTCGTCCCTGACGATGTGAAGGGTCGCGTTAAGGTGAAGGTTGTGTACGTGG

TGCTGGAGGCCCAGTACCAGTCGGCCATCAGCGCTGCGGTGAAGAACATCAACGCCAAGAACTCCAAGGTGTGCTT

CGAGGTGGTGGGCTACCTGCTGGAGGAGCTGCGTGACCAGAAGAACCTCGATATGCTCAAGGAGGATGTGGCCTCT

GCCAACATCTTCATCGGCTCGCTCATCTTCATTGAGGAGCTTGCCGAGAAGATTGTGGAGGCGGTGAGCCCCCTGC

GCGAGAAGCTGGACGCGTGCCTGATCTTCCCGTCCATGCCGGCGGTCATGAAGCTGAACAAGCTGGGCACGTTTTC

GATGGCTCAGCTGGGCCAGTCGAAGTCGGTGTTCTCGGAGTTCATCAAGTCTGCTCGCAAGAACAACGACAACTTC

GAGGAGGGCTTGCTGAAGCTGGTGCGCACCCTGCCTAAGGTGCTGAAGTATCTGCCCTCGGACAAGGCGCAGGACG

CCAAGAACTTCGTGAACAGCCTGCAGTACTGGCTGGGCGGTAACTCGGACAACCTGGAGAACCTGCTGCTGAACAC

CGTCAGCAACTACGTGCCCGCTCTGAAGGGCGTGGACTTCAGCGTGGCTGAGCCCACCGCCTACCCCGATGTGGGT

ATCTGGCACCCTCTGGCCTCGGGCATGTACGAGGACCTGAAGGAGTACCTGAACTGGTACGACACCCGCAAGGACA

TGGTCTTCGCCAAGGACGCCCCCGTCATTGGCCTGGTGCTGCAGCGCTCGCACCTGGTGACTGGCGATGAGGGCCA

CTACAGCGGCGTGGTCGCTGAGCTGGAGAGCCGCGGTGCTAAGGTCATCCCCGTCTTTGCCGGTGGCCTGGACTTC

TCCGCCCCCGTCAAGAAGTTCTTCTACGACCCCCTGGGCTCTGGCCGCACGTTCGTGGACACCGTTGTGTCGCTGA

CCGGCTTCGCGCTGGTGGGCGGCCCCGCGCGCCAGGACGCGCCGAAGGCCATTGAGGCGCTGAAGAACCTGAACGT

GCCCTACCTGGTGTCGCTGCCGCTGGTGTTCCAGACCACTGAGGAGTGGCTGGACAGCGAGCTGGGCGTGCACCCC

GTCCAGGTGGCTCTGCAGGTTGCCCTGCCCGAGCTGGATGGTGCCATGGAGCCCATCGTGTTCGCTGGCCGTGACT

CGAACACCGGCAAGTCGCACTCGCTGCCCGACCGCATCGCTTCGCTGTGCGCTCGCGCCGTGAACTGGGCCAACCT

GCGCAAGAAGCGCAACGCCGAGAAGAAGCTGGCCGTCACCGTGTTCAGCTTCCCCCCTGACAAGGGCAACGTCGGC

ACTGCCGCCTACCTGAACGTGTTCGGCTCCATCTACCGCGTGCTGAAGAACCTGCAGCGCGAGGGCTACGACGTGG

GCGCCCTGCCGCCCTCGGAGGAGGATCTGATCCAGTCGGTGCTGACCCAGAAGGAGGCCAAGTTCAACTCGACCGA

CCTGCACATCGCCTACAAGATGAAGGTGGACGAGTACCAGAAGCTGTGCCCTTACGCCGAGGCGCTGGAGGAGAAC

TGGGGCAAGCCCCCCGGCACCCTGAACACCAACGGCCAGGAGCTGCTGGTGTACGGCCGCCAGTACGGCAACGTCT

TCATCGGCGTGCAGCCCACCTTCGGCTACGAGGGCGACCCGATGCGCCTGCTGTTCTCGAAGTCGGCCAGCCCCCA

CCACGGCTTCGCCGCCTACTACACCTTCCTGGAGAAGATCTTCAAGGCCGACGCCGTGCTGCACTTCGGCACCCAC

GGCTCGCTGGAGTTCATGCCCGGCAAGCAGGTCGGCATGTCGGGTGTGTGCTACCCCGACTCGCTGATCGGCACCA

TCCCCAACCTCTACTACTACGCCGCCAACAACCCGTCTGAGGCCACCATCGCCAAGCGCCGCTCGTACGCCAACAC

CATTTCGTACCTGACGCCGCCTGCCGAGAACGCCGGCCTGTACAAGGGCCTGAAGGAGCTGAAGGAGCTGATCAGC

TCGTACCAGGGCATGCGTGAGTCTGGCCGCGCCGAGCAGATCTGCGCCACCATCATTGAGACCGCCAAGCTGTGCA

ACCTGGACCGCGACGTGACCCTGCCCGACGCTGACGCCAAGGACCTGACCATGGACATGCGCGACAGCGTTGTGGG

CCAGGTGTACCGCAAGCTGATGGAGATTGAGTCCCGCCTGCTGCCCTGCGGCCTGCACGTGGTGGGCTGCCCGCCC

ACCGCCGAGGAGGCCGTGGCCACCCTGGTCAACATCGCTGAGCTGGACCGCCCGGACAACAACCCCCCCATCAAGG

GCATGCCCGGCATCCTGGCCCGCGCCATTGGTCGCGACATCGAGTCGATTTACAGCGGCAACAACAAGGGCGTCCT

GGCTGACGTTGACCAGCTGCAGCGCATCACCGAGGCCTCCCGCACCTGCGTGCGCGAGTTCGTGAAGGACCGCACC

GGCCTGAACGGCCGCATCGGCACCAACTGGATCACCAACCTGCTCAAGTTCACCGGCTTCTACGTGGACCCCTGGG

TGCGCGGCCTGCAGAACGGCGAGTTCGCCAGCGCCAACCGCGAGGAGCTGATCACCCTGTTCAACTACCTGGAGTT

CTGCCTGACCCAGGTGGTCAAGGACAACGAGCTGGGCGCCCTGGTAGAGGCGCTGAACGGCCAGTACGTCGAGCCC

GGCCCCGGCGGTGACCCCATCCGCAACCCCAACGTGCTGCCCACCGGCAAGAACATCCACGCCCTGGACCCTCAGT

CGATTCCCACTCAGGCCGCGCTGAAGAGCGCCCGCCTGGTGGTGGACCGCCTGCTGGACCGCGAGCGCGACAACAA

CGGCGGCAAGTACCCCGAGACCATCGCGCTGGTGCTGTGGGGCACTGACAACATCAAGACCTACGGCGAGTCGCTG

GCCCAGGTCATGATGATGGTCGGTGTCAAGCCCGTGGCCGACGCCCTGGGCCGCGTGAACAAGCTGGAGGTGATCC

CTCTGGAGGAGCTGGGCCGCCCCCGCGTGGACGTGGTTGTCAACTGCTCGGGTGTGTTCCGCGACCTGTTCGTGAA

CCAGATGCTGCTGCTGGACCGCGCCATCAAGCTGGCGGCCGAGCAGGACGAGCCCGATGAGATGAACTTCGTGCGC

AAGCACGCCAAGCAGCAGGCGGCGGAGCTGGGCCTGCAGAGCCTGCGCGACGCGGCCACCCGTGTGTTCTCCAACA

GCTCGGGCTCCTACTCGTCCAACGTCAACCTGGCGGTGGAGAACAGCAGCTGGAGCGACGAGTCGCAGCTGCAGGA

GATGTACCTGAAGCGCAAGTCGTACGCCTTCAACTCGGACCGCCCCGGCGCCGGTGGCGAGATGCAGCGCGACGTG

TTCGAGACGGCCATGAAGACCGTGGACGTGACCTTCCAGAACCTGGACTCGTCCGAGATCTCGCTGACCGATGTGT

CGCACTACTTCGACTCCGACCCCACCAAGCTGGTGGCGTCGCTGCGCAACGACGGCCGCACCCCCAACGCCTACAT

CGCCGACACCACCACCGCCAACGCGCAGGTCCGCACTCTGGGTGAGACCGTGCGCCTGGACGCCCGCACCAAGCTG

CTCAACCCCAAGTGGTACGAGGGCATGCTTGCCTCGGGCTACGAGGGCGTGCGCGAGATCCAGAAGCGCATGACCA

ACACCATGGGCTGGTCGGCCACCTCGGGCATGGTGGACAACTGGGTGTACGACGAGGCCAACTCGACCTTCATCGA

GGATGCGGCCATGGCCGAGCGCCTGATGAACACCAACCCCAACAGCTTCCGCAAGCTGGTGGCCACCTTCCTGGAG

GCCAACGGCCGCGGCTACTGGGACGCCAAGCCCGAGCAGCTGGAGCGCCTGCGCCAGCTGTACATGGACGTGGAGG

ACAAGATTGAGGGCGTCGAATAAGCGGCCTCCCCTTCATGGTAGCACTAGTTGGCGGGTTGTGGTTGGACTAGGCG

GCTAGGGTATATACCTAGTAGCGGCGGCTGCGGAGTGGAGGGCTGGCGCCCAGCGCGAGGGCGTGGCCTTTCCTCC

TGGACCCGAGAGCGCTCCGCGAGGGACGGCGAGTGAGATAGGCAGCAGCG

Amino acid sequence

MRGSHHHHHH GMASMTGGQQ MGRDLYDDDD KDHPFTACNV ATGPRPPMTT FTGGNKGPAK

QQVSLDLRDE GAGMFTSTSP EMRRVVPDDV KGRVKVKVVY VVLEAQYQSA ISAAVKNINA

KNSKVCFEVV GYLLEELRDQ KNLDMLKEDV ASANIFIGSL IFIEELAEKI VEAVSPLREK

LDACLIFPSM PAVMKLNKLG TFSMAQLGQS KSVFSEFIKS ARKNNDNFEE GLLKLVRTLP

KVLKYLPSDK AQDAKNFVNS LQYWLGGNSD NLENLLLNTV SNYVPALKGV DFSVAEPTAY

PDVGIWHPLA SGMYEDLKEY LNWYDTRKDM VFAKDAPVIG LVLQRSHLVT GDEGHYSGVV

AELESRGAKV IPVFAGGLDF SAPVKKFFYD PLGSGRTFVD TVVSLTGFAL VGGPARQDAP

KAIEALKNLN VPYLVSLPLV FQTTEEWLDS ELGVHPVQVA LQVALPELDG AMEPIVFAGR

DSNTGKSHSL PDRIASLCAR AVNWANLRKK RNAEKKLAVT VFSFPPDKGN VGTAAYLNVF

GSIYRVLKNL QREGYDVGAL PPSEEDLIQS VLTQKEAKFN STDLHIAYKM KVDEYQKLCP

YAEALEENWG KPPGTLNTNG QELLVYGRQY GNVFIGVQPT FGYEGDPMRL LFSKSASPHH

GFAAYYTFLE KIFKADAVLH FGTHGSLEFM PGKQVGMSGV CYPDSLIGTI PNLYYYAANN

PSEATIAKRR SYANTISYLT PPAENAGLYK GLKELKELIS SYQGMRESGR AEQICATIIE

TAKLCNLDRD VTLPDADAKD LTMDMRDSVV GQVYRKLMEI ESRLLPCGLH VVGCPPTAEE

AVATLVNIAE LDRPDNNPPI KGMPGILARA IGRDIESIYS GNNKGVLADV DQLQRITEAS

RTCVREFVKD RTGLNGRIGT NWITNLLKFT GFYVDPWVRG LQNGEFASAN REELITLFNY

LEFCLTQVVK DNELGALVEA LNGQYVEPGP GGDPIRNPNV LPTGKNIHAL DPQSIPTQAA

LKSARLVVDR LLDRERDNNG GKYPETIALV LWGTDNIKTY GESLAQVMMM VGVKPVADAL

GRVNKLEVIP LEELGRPRVD VVVNCSGVFR DLFVNQMLLL DRAIKLAAEQ DEPDEMNFVR

KHAKQQAAEL GLQSLRDAAT RVFSNSSGSY SSNVNLAVEN SSWSDESQLQ EMYLKRKSYA

FNSDRPGAGG EMQRDVFETA MKTVDVTFQN LDSSEISLTD VSHYFDSDPT KLVASLRNDG

RTPNAYIADT TTANAQVRTL GETVRLDART KLLNPKWYEG MLASGYEGVR EIQKRMTNTM

GWSATSGMVD NWVYDEANST FIEDAAMAER LMNTNPNSFR KLVATFLEAN GRGYWDAKPE

QLERLRQLYM DVEDKIEGVE
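
The amino acid sequence above can be spot-checked against the DNA sequence by translating it with the standard genetic code. The following is a minimal sketch, not the tool that was used for this page; it assumes the standard codon table and an input containing only A, C, G and T, and the function name is illustrative.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate from the first base, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 42 bases of the part sequence.
print(translate("ATGCGGGGTTCTCATCATCATCATCATCATGGTATGGCTAGC"))
# MRGSHHHHHHGMAS
```

The output matches the start of the listed protein, including the six-histidine tag.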


Please note the modified algorithm for extinction coefficient.


Number of amino acids: 1400

Molecular weight: 154817.9
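
The molecular weight can be reproduced from the residue counts reported in the amino acid composition on this page. The sketch below sums average residue masses and adds one water molecule for the free termini; the mass table uses commonly tabulated average values, which is an assumption about what the original tool used, so small differences in the last digit are possible.

```python
# Average residue masses in Da (amino acid minus water); assumed values.
RESIDUE_MASS = {
    "A": 71.0788, "R": 156.1875, "N": 114.1038, "D": 115.0886,
    "C": 103.1388, "Q": 128.1307, "E": 129.1155, "G": 57.0519,
    "H": 137.1411, "I": 113.1594, "L": 113.1594, "K": 128.1741,
    "M": 131.1926, "F": 147.1766, "P": 97.1167, "S": 87.0782,
    "T": 101.1051, "W": 186.2132, "Y": 163.1760, "V": 99.1326,
}
WATER = 18.01528

# Residue counts as reported in the amino acid composition on this page.
COUNTS = {
    "A": 116, "R": 71, "N": 79, "D": 86, "C": 13, "Q": 48, "E": 92,
    "G": 105, "H": 21, "I": 51, "L": 142, "K": 80, "M": 38, "F": 51,
    "P": 71, "S": 86, "T": 71, "W": 14, "Y": 50, "V": 115,
}


def molecular_weight(counts):
    """Average molecular weight of a linear, unmodified polypeptide."""
    return sum(RESIDUE_MASS[aa] * n for aa, n in counts.items()) + WATER


mw = molecular_weight(COUNTS)
print(round(mw, 1))
```

With these inputs the result agrees with the reported 154817.9 to within rounding.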

Theoretical pI: 5.48

Amino acid composition:

Ala (A) 116   8.3%
Arg (R)  71   5.1%
Asn (N)  79   5.6%
Asp (D)  86   6.1%
Cys (C)  13   0.9%
Gln (Q)  48   3.4%
Glu (E)  92   6.6%
Gly (G) 105   7.5%
His (H)  21   1.5%
Ile (I)  51   3.6%
Leu (L) 142  10.1%
Lys (K)  80   5.7%
Met (M)  38   2.7%
Phe (F)  51   3.6%
Pro (P)  71   5.1%
Ser (S)  86   6.1%
Thr (T)  71   5.1%
Trp (W)  14   1.0%
Tyr (Y)  50   3.6%
Val (V) 115   8.2%
Pyl (O)   0   0.0%
Sec (U)   0   0.0%

 (B)   0   0.0%
 (Z)   0   0.0%
 (X)   0   0.0%


Total number of negatively charged residues (Asp + Glu): 178
Total number of positively charged residues (Arg + Lys): 151
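
Composition and charged-residue totals of this kind are simple counts over the protein sequence. A minimal sketch follows, using only the first 60 residues of the sequence as demo input, so its numbers differ from the full-length totals above; the function names are illustrative.

```python
from collections import Counter


def composition(protein):
    """Return {residue: (count, percent)} for a one-letter protein string."""
    seq = protein.replace(" ", "").upper()
    counts = Counter(seq)
    return {aa: (n, round(100.0 * n / len(seq), 1)) for aa, n in counts.items()}


def charged_residues(protein):
    """Return (Asp + Glu, Arg + Lys) counts."""
    counts = Counter(protein.replace(" ", "").upper())
    return counts["D"] + counts["E"], counts["R"] + counts["K"]


# First 60 residues of the sequence listed on this page.
first_block = "MRGSHHHHHH GMASMTGGQQ MGRDLYDDDD KDHPFTACNV ATGPRPPMTT FTGGNKGPAK"

negative, positive = charged_residues(first_block)
print(composition(first_block)["H"])  # (7, 11.7)
print(negative, positive)             # 6 6
```

Running the same functions over the full 1400-residue sequence would be the way to cross-check the table and the totals.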

Atomic composition:

Carbon    C   6872
Hydrogen  H  10826
Nitrogen  N   1876
Oxygen    O   2091
Sulfur    S     51

Formula: C6872H10826N1876O2091S51
Total number of atoms: 21716

Extinction coefficients:

Extinction coefficients are in units of M^-1 cm^-1, at 280 nm, measured in water.

Ext. coefficient    152250
Abs 0.1% (=1 g/l)   0.983, assuming all pairs of Cys residues form cystines

Ext. coefficient    151500
Abs 0.1% (=1 g/l)   0.979, assuming all Cys residues are reduced
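
Both extinction coefficients can be reproduced from the Trp, Tyr and Cys counts. The sketch below assumes the commonly used per-residue values at 280 nm (5500 for Trp, 1490 for Tyr, 125 per cystine); these constants are not stated on this page, so treat them as an assumption about how the figures were obtained.

```python
# Assumed molar extinction contributions at 280 nm, in M^-1 cm^-1.
EXT_TRP = 5500
EXT_TYR = 1490
EXT_CYSTINE = 125


def extinction_coefficient(n_trp, n_tyr, n_cys, cystines_formed):
    """Estimated molar extinction coefficient at 280 nm."""
    n_cystine = n_cys // 2 if cystines_formed else 0
    return EXT_TRP * n_trp + EXT_TYR * n_tyr + EXT_CYSTINE * n_cystine


def abs_01_percent(ext, molecular_weight):
    """Absorbance of a 1 g/l solution over a 1 cm path."""
    return ext / molecular_weight


# Counts and molecular weight as reported on this page.
N_TRP, N_TYR, N_CYS, MW = 14, 50, 13, 154817.9

oxidized = extinction_coefficient(N_TRP, N_TYR, N_CYS, cystines_formed=True)
reduced = extinction_coefficient(N_TRP, N_TYR, N_CYS, cystines_formed=False)
print(oxidized, round(abs_01_percent(oxidized, MW), 3))  # 152250 0.983
print(reduced, round(abs_01_percent(reduced, MW), 3))    # 151500 0.979
```

With 13 Cys residues, at most 6 cystines can form, which accounts for the difference of 750 between the two values.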

Estimated half-life:

The N-terminal of the sequence considered is M (Met).

The estimated half-life is:

                            30 hours (mammalian reticulocytes, in vitro).
                           >20 hours (yeast, in vivo).
                           >10 hours (Escherichia coli, in vivo).


Instability index:

The instability index (II) is computed to be 32.82.
This classifies the protein as stable.


Aliphatic index: 85.87
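
The aliphatic index follows from the mole percentages of Ala, Val, Ile and Leu. A minimal sketch, assuming the usual weighting (2.9 for Val, 3.9 for Ile and Leu) and the residue counts reported on this page:

```python
def aliphatic_index(n_ala, n_val, n_ile, n_leu, length):
    """Aliphatic index = X(Ala) + 2.9 * X(Val) + 3.9 * (X(Ile) + X(Leu)),
    where X is the mole percent of each residue."""
    def mole_percent(n):
        return 100.0 * n / length

    return (
        mole_percent(n_ala)
        + 2.9 * mole_percent(n_val)
        + 3.9 * (mole_percent(n_ile) + mole_percent(n_leu))
    )


# Ala 116, Val 115, Ile 51, Leu 142 out of 1400 residues.
ai = aliphatic_index(116, 115, 51, 142, 1400)
print(round(ai, 2))  # 85.87
```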

Grand average of hydropathicity (GRAVY): -0.292
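
GRAVY is the mean hydropathy value per residue. The sketch below assumes the Kyte-Doolittle hydropathy scale and the residue counts reported on this page; with those inputs it comes out at about -0.2925, consistent with the reported value.

```python
# Kyte-Doolittle hydropathy values (assumed scale).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
    "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
    "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
    "Y": -1.3, "V": 4.2,
}

# Residue counts as reported in the amino acid composition on this page.
COUNTS = {
    "A": 116, "R": 71, "N": 79, "D": 86, "C": 13, "Q": 48, "E": 92,
    "G": 105, "H": 21, "I": 51, "L": 142, "K": 80, "M": 38, "F": 51,
    "P": 71, "S": 86, "T": 71, "W": 14, "Y": 50, "V": 115,
}


def gravy(counts):
    """Sum of hydropathy values divided by the number of residues."""
    total = sum(HYDROPATHY[aa] * n for aa, n in counts.items())
    return total / sum(counts.values())


print(gravy(COUNTS))
```

A negative value indicates a protein that is, on average, slightly hydrophilic.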

Source

Chlamydomonas reinhardtii
