Difference between revisions of "Part:BBa K1080001:Design"
Revision as of 03:00, 24 September 2013
ChlH
- 10: COMPATIBLE WITH RFC[10]
- 12: COMPATIBLE WITH RFC[12]
- 21: INCOMPATIBLE WITH RFC[21]
  Illegal BglII site found at 1928
  Illegal BglII site found at 2234
  Illegal BglII site found at 3620
- 23: COMPATIBLE WITH RFC[23]
- 25: INCOMPATIBLE WITH RFC[25]
  Illegal NgoMIV site found at 486
  Illegal NgoMIV site found at 2152
  Illegal AgeI site found at 1132
  Illegal AgeI site found at 2650
  Illegal AgeI site found at 2704
  Illegal AgeI site found at 2923
- 1000: INCOMPATIBLE WITH RFC[1000]
  Illegal BsaI.rc site found at 2254
  Illegal BsaI.rc site found at 3049
  Illegal SapI.rc site found at 2979
Design Notes
CR-ChlH sequence with His tag in pET15b. The sequence is given from the translation start site.
Note: 8 PstI sites (CTGCAG) at 785, 1031, 1385, 1652, 2681, 2825, 3458 and 3569. No EcoRI or XbaI sites. 1 SpeI site (ACTAGT) at 4226, 23 bp after the stop codon.
ATGCGGGGTTCTCATCATCATCATCATCATGGTATGGCTAGCATGACTGGTGGACAGCAAATGGGTCGGGATCTGT
ACGACGATGACGATAAGGATCATCCCTTCACCGCGTGCAATGTGGCGACTGGACCCCGGCCGCCCATGACCACCTT
CACCGGTGGCAACAAGGGCCCTGCTAAGCAGCAGGTGTCGCTGGATCTGCGCGACGAGGGCGCTGGCATGTTCACC
AGCACCAGCCCGGAGATGCGCCGTGTCGTCCCTGACGATGTGAAGGGTCGCGTTAAGGTGAAGGTTGTGTACGTGG
TGCTGGAGGCCCAGTACCAGTCGGCCATCAGCGCTGCGGTGAAGAACATCAACGCCAAGAACTCCAAGGTGTGCTT
CGAGGTGGTGGGCTACCTGCTGGAGGAGCTGCGTGACCAGAAGAACCTCGATATGCTCAAGGAGGATGTGGCCTCT
GCCAACATCTTCATCGGCTCGCTCATCTTCATTGAGGAGCTTGCCGAGAAGATTGTGGAGGCGGTGAGCCCCCTGC
GCGAGAAGCTGGACGCGTGCCTGATCTTCCCGTCCATGCCGGCGGTCATGAAGCTGAACAAGCTGGGCACGTTTTC
GATGGCTCAGCTGGGCCAGTCGAAGTCGGTGTTCTCGGAGTTCATCAAGTCTGCTCGCAAGAACAACGACAACTTC
GAGGAGGGCTTGCTGAAGCTGGTGCGCACCCTGCCTAAGGTGCTGAAGTATCTGCCCTCGGACAAGGCGCAGGACG
CCAAGAACTTCGTGAACAGCCTGCAGTACTGGCTGGGCGGTAACTCGGACAACCTGGAGAACCTGCTGCTGAACAC
CGTCAGCAACTACGTGCCCGCTCTGAAGGGCGTGGACTTCAGCGTGGCTGAGCCCACCGCCTACCCCGATGTGGGT
ATCTGGCACCCTCTGGCCTCGGGCATGTACGAGGACCTGAAGGAGTACCTGAACTGGTACGACACCCGCAAGGACA
TGGTCTTCGCCAAGGACGCCCCCGTCATTGGCCTGGTGCTGCAGCGCTCGCACCTGGTGACTGGCGATGAGGGCCA
CTACAGCGGCGTGGTCGCTGAGCTGGAGAGCCGCGGTGCTAAGGTCATCCCCGTCTTTGCCGGTGGCCTGGACTTC
TCCGCCCCCGTCAAGAAGTTCTTCTACGACCCCCTGGGCTCTGGCCGCACGTTCGTGGACACCGTTGTGTCGCTGA
CCGGCTTCGCGCTGGTGGGCGGCCCCGCGCGCCAGGACGCGCCGAAGGCCATTGAGGCGCTGAAGAACCTGAACGT
GCCCTACCTGGTGTCGCTGCCGCTGGTGTTCCAGACCACTGAGGAGTGGCTGGACAGCGAGCTGGGCGTGCACCCC
GTCCAGGTGGCTCTGCAGGTTGCCCTGCCCGAGCTGGATGGTGCCATGGAGCCCATCGTGTTCGCTGGCCGTGACT
CGAACACCGGCAAGTCGCACTCGCTGCCCGACCGCATCGCTTCGCTGTGCGCTCGCGCCGTGAACTGGGCCAACCT
GCGCAAGAAGCGCAACGCCGAGAAGAAGCTGGCCGTCACCGTGTTCAGCTTCCCCCCTGACAAGGGCAACGTCGGC
ACTGCCGCCTACCTGAACGTGTTCGGCTCCATCTACCGCGTGCTGAAGAACCTGCAGCGCGAGGGCTACGACGTGG
GCGCCCTGCCGCCCTCGGAGGAGGATCTGATCCAGTCGGTGCTGACCCAGAAGGAGGCCAAGTTCAACTCGACCGA
CCTGCACATCGCCTACAAGATGAAGGTGGACGAGTACCAGAAGCTGTGCCCTTACGCCGAGGCGCTGGAGGAGAAC
TGGGGCAAGCCCCCCGGCACCCTGAACACCAACGGCCAGGAGCTGCTGGTGTACGGCCGCCAGTACGGCAACGTCT
TCATCGGCGTGCAGCCCACCTTCGGCTACGAGGGCGACCCGATGCGCCTGCTGTTCTCGAAGTCGGCCAGCCCCCA
CCACGGCTTCGCCGCCTACTACACCTTCCTGGAGAAGATCTTCAAGGCCGACGCCGTGCTGCACTTCGGCACCCAC
GGCTCGCTGGAGTTCATGCCCGGCAAGCAGGTCGGCATGTCGGGTGTGTGCTACCCCGACTCGCTGATCGGCACCA
TCCCCAACCTCTACTACTACGCCGCCAACAACCCGTCTGAGGCCACCATCGCCAAGCGCCGCTCGTACGCCAACAC
CATTTCGTACCTGACGCCGCCTGCCGAGAACGCCGGCCTGTACAAGGGCCTGAAGGAGCTGAAGGAGCTGATCAGC
TCGTACCAGGGCATGCGTGAGTCTGGCCGCGCCGAGCAGATCTGCGCCACCATCATTGAGACCGCCAAGCTGTGCA
ACCTGGACCGCGACGTGACCCTGCCCGACGCTGACGCCAAGGACCTGACCATGGACATGCGCGACAGCGTTGTGGG
CCAGGTGTACCGCAAGCTGATGGAGATTGAGTCCCGCCTGCTGCCCTGCGGCCTGCACGTGGTGGGCTGCCCGCCC
ACCGCCGAGGAGGCCGTGGCCACCCTGGTCAACATCGCTGAGCTGGACCGCCCGGACAACAACCCCCCCATCAAGG
GCATGCCCGGCATCCTGGCCCGCGCCATTGGTCGCGACATCGAGTCGATTTACAGCGGCAACAACAAGGGCGTCCT
GGCTGACGTTGACCAGCTGCAGCGCATCACCGAGGCCTCCCGCACCTGCGTGCGCGAGTTCGTGAAGGACCGCACC
GGCCTGAACGGCCGCATCGGCACCAACTGGATCACCAACCTGCTCAAGTTCACCGGCTTCTACGTGGACCCCTGGG
TGCGCGGCCTGCAGAACGGCGAGTTCGCCAGCGCCAACCGCGAGGAGCTGATCACCCTGTTCAACTACCTGGAGTT
CTGCCTGACCCAGGTGGTCAAGGACAACGAGCTGGGCGCCCTGGTAGAGGCGCTGAACGGCCAGTACGTCGAGCCC
GGCCCCGGCGGTGACCCCATCCGCAACCCCAACGTGCTGCCCACCGGCAAGAACATCCACGCCCTGGACCCTCAGT
CGATTCCCACTCAGGCCGCGCTGAAGAGCGCCCGCCTGGTGGTGGACCGCCTGCTGGACCGCGAGCGCGACAACAA
CGGCGGCAAGTACCCCGAGACCATCGCGCTGGTGCTGTGGGGCACTGACAACATCAAGACCTACGGCGAGTCGCTG
GCCCAGGTCATGATGATGGTCGGTGTCAAGCCCGTGGCCGACGCCCTGGGCCGCGTGAACAAGCTGGAGGTGATCC
CTCTGGAGGAGCTGGGCCGCCCCCGCGTGGACGTGGTTGTCAACTGCTCGGGTGTGTTCCGCGACCTGTTCGTGAA
CCAGATGCTGCTGCTGGACCGCGCCATCAAGCTGGCGGCCGAGCAGGACGAGCCCGATGAGATGAACTTCGTGCGC
AAGCACGCCAAGCAGCAGGCGGCGGAGCTGGGCCTGCAGAGCCTGCGCGACGCGGCCACCCGTGTGTTCTCCAACA
GCTCGGGCTCCTACTCGTCCAACGTCAACCTGGCGGTGGAGAACAGCAGCTGGAGCGACGAGTCGCAGCTGCAGGA
GATGTACCTGAAGCGCAAGTCGTACGCCTTCAACTCGGACCGCCCCGGCGCCGGTGGCGAGATGCAGCGCGACGTG
TTCGAGACGGCCATGAAGACCGTGGACGTGACCTTCCAGAACCTGGACTCGTCCGAGATCTCGCTGACCGATGTGT
CGCACTACTTCGACTCCGACCCCACCAAGCTGGTGGCGTCGCTGCGCAACGACGGCCGCACCCCCAACGCCTACAT
CGCCGACACCACCACCGCCAACGCGCAGGTCCGCACTCTGGGTGAGACCGTGCGCCTGGACGCCCGCACCAAGCTG
CTCAACCCCAAGTGGTACGAGGGCATGCTTGCCTCGGGCTACGAGGGCGTGCGCGAGATCCAGAAGCGCATGACCA
ACACCATGGGCTGGTCGGCCACCTCGGGCATGGTGGACAACTGGGTGTACGACGAGGCCAACTCGACCTTCATCGA
GGATGCGGCCATGGCCGAGCGCCTGATGAACACCAACCCCAACAGCTTCCGCAAGCTGGTGGCCACCTTCCTGGAG
GCCAACGGCCGCGGCTACTGGGACGCCAAGCCCGAGCAGCTGGAGCGCCTGCGCCAGCTGTACATGGACGTGGAGG
ACAAGATTGAGGGCGTCGAATAAGCGGCCTCCCCTTCATGGTAGCACTAGTTGGCGGGTTGTGGTTGGACTAGGCG
GCTAGGGTATATACCTAGTAGCGGCGGCTGCGGAGTGGAGGGCTGGCGCCCAGCGCGAGGGCGTGGCCTTTCCTCC
TGGACCCGAGAGCGCTCCGCGAGGGACGGCGAGTGAGATAGGCAGCAGCG
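The restriction-site positions quoted in the design notes and the RFC check can be rechecked with a short scan of the sequence. The following is a minimal sketch, not the Registry's own checker: the function names (`reverse_complement`, `find_all`, `scan`) and the `ENZYMES` table are illustrative assumptions, positions are reported 1-based, and a hit on the reverse complement of a non-palindromic site corresponds to what the RFC list labels ".rc".

```python
# Recognition sequences for the enzymes mentioned on this page.
ENZYMES = {
    "EcoRI": "GAATTC",
    "XbaI": "TCTAGA",
    "SpeI": "ACTAGT",
    "PstI": "CTGCAG",
    "BglII": "AGATCT",
    "NgoMIV": "GCCGGC",
    "AgeI": "ACCGGT",
    "BsaI": "GGTCTC",
    "SapI": "GCTCTTC",
}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_all(seq, motif):
    """Return 1-based start positions of every (overlapping) match."""
    positions = []
    i = seq.find(motif)
    while i != -1:
        positions.append(i + 1)
        i = seq.find(motif, i + 1)
    return positions


def scan(seq, enzymes=ENZYMES):
    """Map enzyme name to forward hits and reverse-complement hits."""
    seq = "".join(seq.split()).upper()  # drop line breaks and spaces
    hits = {}
    for name, site in enzymes.items():
        rc = reverse_complement(site)
        hits[name] = {
            "forward": find_all(seq, site),
            # Palindromic sites read the same on both strands.
            "reverse": [] if rc == site else find_all(seq, rc),
        }
    return hits


# Toy sequence (hypothetical): one PstI site and one BsaI site on the
# opposite strand.
demo = scan("AACTGCAGTTGAGACC")
print(demo["PstI"], demo["BsaI"])
```

Pasting the full sequence above into `scan` gives the positions relative to the translation start site; numbering in the Registry's RFC report may use a different origin.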
Amino acid sequence
MRGSHHHHHH GMASMTGGQQ MGRDLYDDDD KDHPFTACNV ATGPRPPMTT FTGGNKGPAK
QQVSLDLRDE GAGMFTSTSP EMRRVVPDDV KGRVKVKVVY VVLEAQYQSA ISAAVKNINA
KNSKVCFEVV GYLLEELRDQ KNLDMLKEDV ASANIFIGSL IFIEELAEKI VEAVSPLREK
LDACLIFPSM PAVMKLNKLG TFSMAQLGQS KSVFSEFIKS ARKNNDNFEE GLLKLVRTLP
KVLKYLPSDK AQDAKNFVNS LQYWLGGNSD NLENLLLNTV SNYVPALKGV DFSVAEPTAY
PDVGIWHPLA SGMYEDLKEY LNWYDTRKDM VFAKDAPVIG LVLQRSHLVT GDEGHYSGVV
AELESRGAKV IPVFAGGLDF SAPVKKFFYD PLGSGRTFVD TVVSLTGFAL VGGPARQDAP
KAIEALKNLN VPYLVSLPLV FQTTEEWLDS ELGVHPVQVA LQVALPELDG AMEPIVFAGR
DSNTGKSHSL PDRIASLCAR AVNWANLRKK RNAEKKLAVT VFSFPPDKGN VGTAAYLNVF
GSIYRVLKNL QREGYDVGAL PPSEEDLIQS VLTQKEAKFN STDLHIAYKM KVDEYQKLCP
YAEALEENWG KPPGTLNTNG QELLVYGRQY GNVFIGVQPT FGYEGDPMRL LFSKSASPHH
GFAAYYTFLE KIFKADAVLH FGTHGSLEFM PGKQVGMSGV CYPDSLIGTI PNLYYYAANN
PSEATIAKRR SYANTISYLT PPAENAGLYK GLKELKELIS SYQGMRESGR AEQICATIIE
TAKLCNLDRD VTLPDADAKD LTMDMRDSVV GQVYRKLMEI ESRLLPCGLH VVGCPPTAEE
AVATLVNIAE LDRPDNNPPI KGMPGILARA IGRDIESIYS GNNKGVLADV DQLQRITEAS
RTCVREFVKD RTGLNGRIGT NWITNLLKFT GFYVDPWVRG LQNGEFASAN REELITLFNY
LEFCLTQVVK DNELGALVEA LNGQYVEPGP GGDPIRNPNV LPTGKNIHAL DPQSIPTQAA
LKSARLVVDR LLDRERDNNG GKYPETIALV LWGTDNIKTY GESLAQVMMM VGVKPVADAL
GRVNKLEVIP LEELGRPRVD VVVNCSGVFR DLFVNQMLLL DRAIKLAAEQ DEPDEMNFVR
KHAKQQAAEL GLQSLRDAAT RVFSNSSGSY SSNVNLAVEN SSWSDESQLQ EMYLKRKSYA
FNSDRPGAGG EMQRDVFETA MKTVDVTFQN LDSSEISLTD VSHYFDSDPT KLVASLRNDG
RTPNAYIADT TTANAQVRTL GETVRLDART KLLNPKWYEG MLASGYEGVR EIQKRMTNTM
GWSATSGMVD NWVYDEANST FIEDAAMAER LMNTNPNSFR KLVATFLEAN GRGYWDAKPE
QLERLRQLYM DVEDKIEGVE
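The amino acid sequence above is the translation of the DNA sequence from its start codon. A minimal sketch of that translation with the standard genetic code is shown here; the helper name `translate` is an assumption for illustration, and the two fragments used in the example are copied from the start and the end of the coding sequence.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate from the first base and stop at the first stop codon."""
    dna = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 30 nt of the sequence: start codon plus His tag region.
print(translate("ATGCGGGGTTCTCATCATCATCATCATCAT"))
# Last codons before the stop codon (TAA).
print(translate("GTCGAATAAGCG"))
```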
Number of amino acids: 1400
Molecular weight: 154817.9
Theoretical pI: 5.48
Amino acid composition:
Ala (A)  116   8.3%
Arg (R)   71   5.1%
Asn (N)   79   5.6%
Asp (D)   86   6.1%
Cys (C)   13   0.9%
Gln (Q)   48   3.4%
Glu (E)   92   6.6%
Gly (G)  105   7.5%
His (H)   21   1.5%
Ile (I)   51   3.6%
Leu (L)  142  10.1%
Lys (K)   80   5.7%
Met (M)   38   2.7%
Phe (F)   51   3.6%
Pro (P)   71   5.1%
Ser (S)   86   6.1%
Thr (T)   71   5.1%
Trp (W)   14   1.0%
Tyr (Y)   50   3.6%
Val (V)  115   8.2%
Pyl (O)    0   0.0%
Sec (U)    0   0.0%
(B)        0   0.0%
(Z)        0   0.0%
(X)        0   0.0%
Total number of negatively charged residues (Asp + Glu): 178
Total number of positively charged residues (Arg + Lys): 151
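The two charged-residue totals are plain counts over the protein sequence: Asp plus Glu for the negative residues, Arg plus Lys for the positive ones (86 + 92 = 178 and 71 + 80 = 151 in the composition above). A minimal sketch follows; the function name `count_charged` is an assumption, and the example uses only the first 60 residues of the sequence, not the full protein.

```python
from collections import Counter


def count_charged(protein):
    """Return (negative, positive) residue counts: (D + E, R + K)."""
    counts = Counter("".join(protein.split()).upper())
    negative = counts["D"] + counts["E"]
    positive = counts["R"] + counts["K"]
    return negative, positive


# First line of the amino acid sequence (60 residues).
first_line = (
    "MRGSHHHHHH GMASMTGGQQ MGRDLYDDDD "
    "KDHPFTACNV ATGPRPPMTT FTGGNKGPAK"
)
print(count_charged(first_line))
```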
Atomic composition:
Carbon C 6872
Hydrogen H 10826
Nitrogen N 1876
Oxygen O 2091
Sulfur S 51
Formula: C6872H10826N1876O2091S51
Total number of atoms: 21716
Extinction coefficients:
Extinction coefficients are in units of M-1 cm-1, at 280 nm measured in water.
Ext. coefficient 152250
Abs 0.1% (=1 g/l) 0.983, assuming all pairs of Cys residues form cystines
Ext. coefficient 151500
Abs 0.1% (=1 g/l) 0.979, assuming all Cys residues are reduced
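Both extinction coefficients can be reproduced from the composition table. The sketch below assumes the usual ProtParam-style formula, with molar extinction values at 280 nm of 5500 for Trp, 1490 for Tyr and 125 per cystine, and takes Abs 0.1% as the coefficient divided by the molecular weight. The function name `extinction_280` is illustrative.

```python
def extinction_280(n_trp, n_tyr, n_cys, cystines=True):
    """Extinction coefficient at 280 nm in M^-1 cm^-1.

    With cystines=True every pair of Cys residues is assumed to form
    a cystine; with cystines=False all Cys residues are reduced.
    """
    n_cystine = n_cys // 2 if cystines else 0
    return 5500 * n_trp + 1490 * n_tyr + 125 * n_cystine


# Counts and molecular weight as listed on this page.
N_TRP, N_TYR, N_CYS = 14, 50, 13
MOL_WEIGHT = 154817.9

for paired in (True, False):
    ext = extinction_280(N_TRP, N_TYR, N_CYS, cystines=paired)
    print(paired, ext, round(ext / MOL_WEIGHT, 3))
```

With 13 Cys residues at most 6 cystines can form, which accounts for the difference of 750 between the two values.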
Estimated half-life:
The N-terminal of the sequence considered is M (Met).
The estimated half-life is:
30 hours (mammalian reticulocytes, in vitro).
>20 hours (yeast, in vivo).
>10 hours (Escherichia coli, in vivo).
Instability index:
The instability index (II) is computed to be 32.82. This classifies the protein as stable.
Aliphatic index: 85.87
Grand average of hydropathicity (GRAVY): -0.292
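The aliphatic index and GRAVY can likewise be recomputed from the composition table. This is a sketch under stated assumptions: the aliphatic index uses Ikai's formula (mole percent Ala + 2.9 x Val + 3.9 x (Ile + Leu)), GRAVY is the mean Kyte-Doolittle hydropathy over all residues, and the names `COUNTS`, `HYDROPATHY`, `aliphatic_index` and `gravy` are illustrative.

```python
# Residue counts from the composition table on this page.
COUNTS = {
    "A": 116, "R": 71, "N": 79, "D": 86, "C": 13, "Q": 48, "E": 92,
    "G": 105, "H": 21, "I": 51, "L": 142, "K": 80, "M": 38, "F": 51,
    "P": 71, "S": 86, "T": 71, "W": 14, "Y": 50, "V": 115,
}

# Kyte-Doolittle hydropathy scale.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
    "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
    "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
    "Y": -1.3, "V": 4.2,
}


def aliphatic_index(counts):
    """Ikai's aliphatic index from residue counts (mole percents)."""
    total = sum(counts.values())
    pct = {aa: 100.0 * n / total for aa, n in counts.items()}
    return pct["A"] + 2.9 * pct["V"] + 3.9 * (pct["I"] + pct["L"])


def gravy(counts):
    """Grand average of hydropathicity: mean hydropathy per residue."""
    total = sum(counts.values())
    return sum(HYDROPATHY[aa] * n for aa, n in counts.items()) / total


print(sum(COUNTS.values()))            # number of residues
print(round(aliphatic_index(COUNTS), 2))
print(gravy(COUNTS))
```

For these counts the GRAVY value comes out at about -0.2925, which is consistent with the reported -0.292.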
Source
Chlamydomonas reinhardtii