Part:BBa K1080002:Design

Revision as of 03:06, 24 September 2013

ChlD


Assembly Compatibility:
  • 10
    COMPATIBLE WITH RFC[10]
  • 12
    INCOMPATIBLE WITH RFC[12]
    Illegal NotI site found at 1334
  • 21
    INCOMPATIBLE WITH RFC[21]
    Illegal BglII site found at 2039
  • 23
    COMPATIBLE WITH RFC[23]
  • 25
    INCOMPATIBLE WITH RFC[25]
    Illegal NgoMIV site found at 614
    Illegal NgoMIV site found at 1433
    Illegal NgoMIV site found at 1975
  • 1000
    COMPATIBLE WITH RFC[1000]


Design Notes

Sequence overlaps for Gibson assembly were incorporated. The sequence contains no GC-rich region or restriction site.


ChlD clone: DNA sequence from the translation start site. Regions in bold are the sequence of the leader region in the pET100 plasmid. The DNA sequence was translated into a protein sequence using "Translate" at http://au.expasy.org/tools. The translated protein sequence was then analysed using "ProtParam" at http://au.expasy.org/tools.
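The translation step described above can be illustrated with a minimal Python sketch. It is not the ExPASy "Translate" tool; it assumes the standard genetic code, reading frame 1 from the start codon, and the function name `translate` is chosen here for illustration only.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate DNA in frame 1 and stop at the first stop codon.

    Whitespace is ignored, so codon-spaced input is accepted.
    Ambiguous bases (e.g. N) are not handled and raise KeyError.
    """
    seq = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":  # stop codon
            break
        protein.append(aa)
    return "".join(protein)


# First 18 codons of the clone, as written on this page.
leader_start = "ATG CGG GGT TCT CAT CAT CAT CAT CAT CAT GGT ATG GCT AGC ATG ACT GGT GGA"
print(translate(leader_start))  # MRGSHHHHHHGMASMTGG
```

The output matches the first 18 residues of the amino acid sequence listed on this page.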

Note: No XbaI, EcoRI, PstI or SpeI sites
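A note like the one above can be checked with a simple string search. The sketch below is illustrative only: the recognition sequences are the standard ones for these four enzymes, while the function name `find_sites` and the 1-based position convention are assumptions made here, not part of the original workflow.

```python
# Recognition sequences of the four enzymes named in the note.
# All four are palindromic, so scanning one strand is sufficient.
SITES = {
    "EcoRI": "GAATTC",
    "XbaI": "TCTAGA",
    "SpeI": "ACTAGT",
    "PstI": "CTGCAG",
}


def find_sites(dna, sites=SITES):
    """Return {enzyme: [1-based start positions]} for each recognition site."""
    seq = "".join(dna.split()).upper()  # drop spaces and line breaks
    hits = {}
    for enzyme, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start + 1)
            start = seq.find(motif, start + 1)  # allow overlapping hits
        hits[enzyme] = positions
    return hits


# Example on the first 18 codons of the clone; paste the full sequence to check it all.
leader_start = "ATG CGG GGT TCT CAT CAT CAT CAT CAT CAT GGT ATG GCT AGC ATG ACT GGT GGA"
print(find_sites(leader_start))
```

An empty list for every enzyme means that no site was found in the sequence supplied.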


ATG CGG GGT TCT CAT CAT CAT CAT CAT CAT GGT ATG GCT AGC ATG ACT GGT GGA

CAG CAA ATG GGT CGG GAT CTG TAC GAC GAT GAC GAT AAG GAT CAT CCC TTC ACC


CTGCGCGCCATGAAGGTGTCTGAGGAGGACTCCAAGGGCTTCGATGCGGATGTGTCGACCCGCCTGGCCCGCTCG TACCCTCTGGCGGCCGTGGTGGGCCAGGACAACATCAAGCAGGCGCTGCTGCTGGGCGCCGTGGACACCGGGCTG GGCGGCATCGCCATCGCCGGTCGCCGCGGTACCGCCAAGTCCATCATGGCTCGCGGCCTGCACGCTCTGCTGCCG CCCATTGAGGTGGTGGAGGGCAGCATCTGCAACGCCGACCCCGAGGACCCCCGCTCCTGGGAGGCTGGCCTGGCT GAGAAGTATGCGGGCGGCCCTGTGAAGACCAAGATGCGCTCGGCGCCGTTTGTGCAGATCCCTCTGGGTGTGACT GAGGACCGCTTGGTGGGCACTGTGGACATTGAGGCGTCCATGAAGGAGGGCAAGACTGTGTTCCAGCCCGGCCTG CTGGCTGAGGCGCACCGCGGCATCCTGTACGTGGACGAGATCAACCTGCTGGATGACGGCATTGCCAACCTGCTG CTGTCCATCCTGTCGGACGGAGTCAACGTGGTGGAGCGCGAGGGCATCTCCATCAGCCACCCCTGCCGGCCGCTG CTGATTGCCACCTACAACCCCGAGGAGGGCCCTCTGCGTGAGCACCTGCTGGACCGCATCGCCATTGGCCTCAGC GCCGACGTCCCCAGCACCAGCGACGAGCGCGTCAAGGCCATTGACGCAGCCATCCGCTTCCAGGACAAGCCGCAG GACACTATTGACGACACCGCGGAGCTCACCGACGCCCTGCGCACCTCGGTCATCCTGGCTCGCGAGTACCTGAAG GACGTGACCATCGCGCCGGAGCAGGTGACCTACATTGTGGAGGAGGCGCGCCGCGGCGGAGTCCAGGGGCACCGC GCGGAGCTGTACGCGGTCAAGTGTGCCAAGGCGTGTGCGGCTCTGGAGGGCCGTGAGCGTGTGAACAAGGATGAC CTGCGCCAGGCCGTGCAGCTGGTCATCCTGCCGCGCGCCACCATCCTGGACCAGCCCCCGCCCGAGCAGGAGCAG CCCCCGCCGCCGCCCCCGCCCCCTCCCCCGCCGCCGCCGCAGGACCAAATGGAGGACGAGGACCAGGAGGAGAAG GAGGACGAGAAGGAGGAGGAGGAGAAGGAGAACGAGGACCAGGACGAGCCCGAGATCCCTCAGGAGTTCATGTTT GAGTCCGAGGGCGTCATCATGGACCCCTCCATCCTCATGTTCGCGCAGCAGCAGCAGCGCGCGCAGGGCCGCTCC GGCCGCGCCAAGACGCTCATCTTCAGCGACGACCGCGGCCGCTACATCAAGCCCATGCTGCCCAAGGGTGACAAG GTCAAGCGCCTGGCAGTGGACGCCACGCTTCGCGCCGCCGCGCCCTACCAGAAGATTCGCCGGCAGCAGGCCATC AGCGAGGGCAAGGTGCAGCGCAAGGTGTACGTGGACAAGCCAGACATGCGCTCCAAGAAGCTGGCCCGCAAGGCC GGTGCGCTGGTGATTTTTGTTGTGGACGCGTCCGGCTCCATGGCTCTGAACCGCATGAGCGCCGCCAAGGGCGCC TGCATGCGCCTGCTGGCTGAGTCGTACACCAGCCGCGACCAGGTGTGCCTCATCCCCTTCTACGGCGACAAGGCC GAGGTGCTGCTGCCGCCCTCCAAGTCCATCGCCATGGCCCGCCGCCGCCTGGACTCGCTGCCCTGCGGCGGCGGC TCGCCCCTTGCGCACGGCCTGTCCACGGCGGTACGTGTGGGCATGCAGGCCAGCCAGGCGGGCGAGGTGGGCCGC GTCATGATGGTGCTCATCACGGACGGCCGCGCCAACGTCAGCCTGGCCAAGTCCAACGAGGACCCCGAGGCGCTC AAGCCCGACGCGCCCAAGCCCACCGCCGACTCGCTGAAGGACGAGGTGCGCGACATGGCCAAGAAGGCCGCGTCC 
GCCGGCATCAACGTGCTTGTCATTGACACGGAGAACAAGTTCGTGAGCACCGGCTTTGCGGAGGAGATCTCCAAG GCAGCGCAGGGCAAGTACTACTACCTGCCCAACGCCAGCGACGCCGCCATCGCGGCGGCCGCGTCCGGCGCCATG GCCGCGGCCAAGGGCGGCTACTAGGTGCCGAGTGACTGAGGTGGCAAGGTGCAGTGGCGGCGGAGGCAGTTGTGC TGGGGTGGCAAGGCGGACAGGCGAAGCTGGTGGGTTGCGACGAGGAGGAGGTGCACGTGCACGCGTAACATAAGA AGAACAGTGGGAGGACAGGTAGCGTGACTTGACTGGGACGAGGAGCGTACTGATGTGTGGCGTGTGTTGGTATGT
GAGCGTTACCCCTCC


Amino acid sequence:

MRGSHHHHHH GMASMTGGQQ MGRDLYDDDD KDHPFTLRAM KVSEEDSKGF DADVSTRLAR

SYPLAAVVGQ DNIKQALLLG AVDTGLGGIA IAGRRGTAKS IMARGLHALL PPIEVVEGSI

CNADPEDPRS WEAGLAEKYA GGPVKTKMRS APFVQIPLGV TEDRLVGTVD IEASMKEGKT

VFQPGLLAEA HRGILYVDEI NLLDDGIANL LLSILSDGVN VVEREGISIS HPCRPLLIAT

YNPEEGPLRE HLLDRIAIGL SADVPSTSDE RVKAIDAAIR FQDKPQDTID DTAELTDALR

TSVILAREYL KDVTIAPEQV TYIVEEARRG GVQGHRAELY AVKCAKACAA LEGRERVNKD

DLRQAVQLVI LPRATILDQP PPEQEQPPPP PPPPPPPPPQ DQMEDEDQEE KEDEKEEEEK

ENEDQDEPEI PQEFMFESEG VIMDPSILMF AQQQQRAQGR SGRAKTLIFS DDRGRYIKPM

LPKGDKVKRL AVDATLRAAA PYQKIRRQQA ISEGKVQRKV YVDKPDMRSK KLARKAGALV

IFVVDASGSM ALNRMSAAKG ACMRLLAESY TSRDQVCLIP FYGDKAEVLL PPSKSIAMAR

RRLDSLPCGG GSPLAHGLST AVRVGMQASQ AGEVGRVMMV LITDGRANVS LAKSNEDPEA

LKPDAPKPTA DSLKDEVRDM AKKAASAGIN VLVIDTENKF VSTGFAEEIS KAAQGKYYYL

PNASDAAIAA AASGAMAAAK GGY



Number of amino acids: 743

Molecular weight: 80527.5 Da

Theoretical pI: 5.37

Amino acid composition:

  Ala (A)  89  12.0%
  Arg (R)  49   6.6%
  Asn (N)  14   1.9%
  Asp (D)  57   7.7%
  Cys (C)   7   0.9%
  Gln (Q)  33   4.4%
  Glu (E)  55   7.4%
  Gly (G)  56   7.5%
  His (H)  13   1.7%
  Ile (I)  40   5.4%
  Leu (L)  62   8.3%
  Lys (K)  44   5.9%
  Met (M)  23   3.1%
  Phe (F)  13   1.7%
  Pro (P)  51   6.9%
  Ser (S)  44   5.9%
  Thr (T)  26   3.5%
  Trp (W)   1   0.1%
  Tyr (Y)  17   2.3%
  Val (V)  49   6.6%
  Pyl (O)   0   0.0%
  Sec (U)   0   0.0%
      (B)   0   0.0%
      (Z)   0   0.0%
      (X)   0   0.0%


Total number of negatively charged residues (Asp + Glu): 112

Total number of positively charged residues (Arg + Lys): 93

Atomic composition:

  Carbon    C  3504
  Hydrogen  H  5684
  Nitrogen  N  1008
  Oxygen    O  1102
  Sulfur    S    30

Formula: C3504H5684N1008O1102S30

Total number of atoms: 11328
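As a cross-check, the atom total and an approximate molecular weight can be recomputed from the formula. This is a minimal sketch, assuming standard average atomic masses rounded to three decimals; ProtParam's own mass table may differ slightly, so only approximate agreement with the reported 80527.5 is expected. The helper names are hypothetical.

```python
import re

# Average atomic masses (assumed values, rounded).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}


def parse_formula(formula):
    """Parse a flat formula such as 'C3504H5684N1008O1102S30' into {element: count}.

    Assumes each element appears once and that there are no brackets.
    """
    pairs = re.findall(r"([A-Z][a-z]?)(\d*)", formula)
    return {element: int(count) if count else 1 for element, count in pairs}


def formula_mass(formula):
    """Sum average atomic masses over all atoms in the formula."""
    counts = parse_formula(formula)
    return sum(ATOMIC_MASS[element] * n for element, n in counts.items())


counts = parse_formula("C3504H5684N1008O1102S30")
total_atoms = sum(counts.values())
mass = formula_mass("C3504H5684N1008O1102S30")
print(total_atoms)       # 11328
print(round(mass, 1))    # close to the reported 80527.5
```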

Extinction coefficients:

Extinction coefficients are in units of M⁻¹ cm⁻¹, at 280 nm, measured in water.

  Ext. coefficient   31205
  Abs 0.1% (=1 g/l)  0.388, assuming all pairs of Cys residues form cystines

  Ext. coefficient   30830
  Abs 0.1% (=1 g/l)  0.383, assuming all Cys residues are reduced
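Both values can be reproduced from the residue counts in the composition table. The sketch below assumes the molar extinction values commonly used for this calculation at 280 nm (Tyr 1490, Trp 5500, cystine 125 M⁻¹ cm⁻¹); the function names are illustrative.

```python
# Assumed molar extinction values at 280 nm, in M^-1 cm^-1.
EXT_TYR = 1490
EXT_TRP = 5500
EXT_CYSTINE = 125


def extinction_coefficients(n_tyr, n_trp, n_cys):
    """Return (all Cys paired as cystines, all Cys reduced)."""
    reduced = n_tyr * EXT_TYR + n_trp * EXT_TRP
    n_cystine = n_cys // 2  # an odd Cys is left unpaired
    return reduced + n_cystine * EXT_CYSTINE, reduced


def abs_01_percent(ext_coefficient, molecular_weight):
    """Absorbance of a 1 g/l solution in a 1 cm path."""
    return ext_coefficient / molecular_weight


# Counts and molecular weight taken from the values reported on this page.
oxidized, reduced = extinction_coefficients(n_tyr=17, n_trp=1, n_cys=7)
print(oxidized, round(abs_01_percent(oxidized, 80527.5), 3))  # 31205 0.388
print(reduced, round(abs_01_percent(reduced, 80527.5), 3))    # 30830 0.383
```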

Estimated half-life:

The N-terminal of the sequence considered is M (Met).

The estimated half-life is:

                            30 hours (mammalian reticulocytes, in vitro).
                           >20 hours (yeast, in vivo).
                           >10 hours (Escherichia coli, in vivo).


Instability index:

The instability index (II) is computed to be 44.71. This classifies the protein as unstable (ProtParam treats values above 40 as unstable).


Aliphatic index: 84.64

Grand average of hydropathicity (GRAVY): -0.392
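The aliphatic index and GRAVY can both be recomputed from the amino acid composition alone. The sketch below assumes the usual definitions: aliphatic index = X(Ala) + 2.9 X(Val) + 3.9 (X(Ile) + X(Leu)) with X in mole percent, and GRAVY as the mean Kyte-Doolittle hydropathy per residue. The dictionary and function names are chosen here for illustration.

```python
# Residue counts from the composition table on this page.
COMPOSITION = {
    "A": 89, "R": 49, "N": 14, "D": 57, "C": 7, "Q": 33, "E": 55,
    "G": 56, "H": 13, "I": 40, "L": 62, "K": 44, "M": 23, "F": 13,
    "P": 51, "S": 44, "T": 26, "W": 1, "Y": 17, "V": 49,
}

# Kyte-Doolittle hydropathy scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
    "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
    "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
    "Y": -1.3, "V": 4.2,
}


def aliphatic_index(comp):
    """X(Ala) + 2.9*X(Val) + 3.9*(X(Ile) + X(Leu)), X in mole percent."""
    total = sum(comp.values())

    def mole_percent(aa):
        return 100.0 * comp.get(aa, 0) / total

    return (mole_percent("A") + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


def gravy(comp):
    """Sum of hydropathy values divided by the number of residues."""
    total = sum(comp.values())
    return sum(KYTE_DOOLITTLE[aa] * n for aa, n in comp.items()) / total


print(sum(COMPOSITION.values()))                # 743
print(round(aliphatic_index(COMPOSITION), 2))   # 84.64
print(round(gravy(COMPOSITION), 3))             # -0.392
```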

Source

Chlamydomonas reinhardtii
