Difference between revisions of "Help:Protein coding"

m
 
(38 intermediate revisions by 3 users not shown)
Line 1: Line 1:
 +
[[Category:Protein Coding Sequences]]
 
[[Image:Part icon cds.png]]  
 
[[Image:Part icon cds.png]]  
<small>Browse [https://parts.igem.org/cgi/partsdb/pgroup.cgi?pgroup=Coding protein coding parts]!</small>
+
<small>Browse [https://parts.igem.org/cgi/partsdb/pgroup.cgi?pgroup=Coding protein-coding parts]!</small>
  
 
<hr>
 
<hr>
Protein coding parts are parts which create functional proteins.  Most of the parts presented on this page are protein coding regions only--however a few of them currently also contain [[Help:Ribosome Binding Site|RBS]] sites.
+
"Protein-coding" parts [also called "coding sequences" (cds) or "open reading frames" (ORFs)]  contain the sequence information needed to create functional protein (polypeptide) chains .  Most of the parts presented on this page are protein-coding sequences only -- a few of them, however, currently also contain [[Help:Ribosome Binding Site|RBS]] sites.
  
==Regulations for Protein Coding Parts==
+
==Regulations for Protein-Coding Parts==
Every biobrick coding region consists of the following structure:
+
Every BioBrick coding sequence consists of the following structure:
 
*It begins with a standard start codon: "<b>ATG</b>"  
 
*It begins with a standard start codon: "<b>ATG</b>"  
*It ends with two "stop" codons: "<b>TAA","TAA</b>".  
+
*It ends with two "stop" codons: "<b>TAA","TAA</b>", ''i.e.,'' "<b>TAATAA</b>".
  
The actual protein coding sequence is sandwiched between the start codon and two stop codons. Thus the structure of a biobricks protein coding region sequence looks like: "<b>ATG[your coding region]TAATAA</b>"
+
The actual protein-coding sequence begins with the start codon and ends immediately before the two stop codons. Thus the structure of a BioBrick protein-coding region sequence looks like: "<b>ATG-[inserted coding region]-TAATAA </b>" (Note that '''ATG''' ---> '''AUG''' in the mRNA transcript, and '''AUG''' codes for methionine.  Depending on which protein you have, the methionine may or may not remain as the first residue in the expressed amino acid sequence.)
  
 
==Direction==
 
==Direction==
A coding region can point RNA polymerase in either the ''forward'' or ''reverse'' directions depending on which strand of the double stranded DNA molecule it decides to bind to.  Currently most Biobrick parts transcribe DNA in the forward direction.<br>
+
A coding region can point RNA polymerase in either the ''forward'' or ''reverse'' direction depending on which strand of the double-stranded DNA molecule binds the polymerase.  Currently most BioBrick parts get transcribed in the forward direction. By convention, in the scientific literature (and in textbooks) sequences are always presented with the forward direction to the right.  In the case of BioBricks, however, the location of the cloning sites determines the "forward" direction.  The BioBrick prefix contains the EcoRI and XbaI restriction sites, and when a coding sequence starts (with its '''ATG''' initiation codon) at the ''prefix,''  it's said to run in the forward direction.  However, nothing prevents us from inserting the coding sequence the other way around with the '''ATG''' adjacent to the BioBrick ''suffix'' and the coding sequence running from right to left.  We define this as the "reverse" orientation.
<br>
+
  
==References and Sources for Protein Coding Information==
+
==References and Sources for Protein-Coding Information==
There are a number of excellent sources for protein coding sequence.  Some of them include:
+
There are a number of excellent sources for protein-coding sequence information. They include:
<small>
+
 
*Pubmed (Entrez Protein) [http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Protein&itool=toolbar link]
+
*PubMed (Entrez Protein) [http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Protein&itool=toolbar link]
 
*UniProt [http://www.pir.uniprot.org/ link]
 
*UniProt [http://www.pir.uniprot.org/ link]
*UniProt-SwissProt/TrEMBL [http://www.ebi.ac.uk/swissprot/ link]  
+
*UniProt-SwissProt/TrEMBL [http://www.ebi.ac.uk/swissprot/ link]
*Protein Data Bank (PDB) [http://www.rcsb.org/pdb/Welcome.do link] </small>
+
  
 +
For 3-dimensional structural information, go to:
 +
 +
*Protein Data Bank (PDB) [http://www.rcsb.org/pdb/Welcome.do link]
  
 
==Protein Barcodes==
 
==Protein Barcodes==
Many protein coding parts in the Registry are trackable through the use of pieces of DNA which are:
+
{|
 +
|[[Image:feature_barcode.png|frame|left|a BioBrick [[Help:Barcode|Barcode]], shown under Part Design:Features, ssDNA viewing mode]]
 +
|Many [https://parts.igem.org/cgi/partsdb/pgroup.cgi?pgroup=Coding protein-coding parts] in the Registry are trackable through the use of pieces of DNA which are:
 
non-coding (no "start" codon), rare, and about 25 base pairs in length.  
 
non-coding (no "start" codon), rare, and about 25 base pairs in length.  
These sequences are known as [[BioBrick Barcodes]].
+
These sequences are known as [[Help:Barcode|Barcodes]].
 
+
|}
  
 
==Tags==
 
==Tags==
Often, [https://parts.igem.org/cgi/partsdb/pgroup.cgi?pgroup=Coding protein coding parts] have [[Help:Tag|tags]] attached to their ends in order to add functionalities such as fast degradation.  The lengths of these tags depend on their function (the predominant LVA, AAV, ASV, etc. degradation tags are 11 amino acids or 33 base pairs in length).   
+
Often, [https://parts.igem.org/cgi/partsdb/pgroup.cgi?pgroup=Coding protein-coding parts] have [[Help:Tag|tags]] attached to their ends in order to add functionalities such as fast degradation.  The lengths of these tags depend on their function (the predominant LVA, AAV, ASV, etc. degradation tags are 11 amino acids or 33 base pairs in length).   
  
Because [https://parts.igem.org/cgi/partsdb/pgroup.cgi?pgroup=Coding protein coding parts] currently all end in the double stop codon "<b>TAATAA</b>", we currently cannot add on [https://parts.igem.org/cgi/partsdb/pgroup.cgi?pgroup=Tag Tag parts] in the manner of [[Assembly:Standard assembly|Biobricks Standard Assembly method]].
+
Because [https://parts.igem.org/cgi/partsdb/pgroup.cgi?pgroup=Coding protein-coding parts] currently all end in the double stop codon "<b>TAATAA</b>", you cannot add on [https://parts.igem.org/cgi/partsdb/pgroup.cgi?pgroup=Tag tag parts] in the manner of [[Assembly:Standard assembly|Biobricks standard assembly]].
  
 
<small>For more information, visit [[Image:Part_icon_tag.png]][[Help:Tag]].</small>
 
<small>For more information, visit [[Image:Part_icon_tag.png]][[Help:Tag]].</small>

Latest revision as of 21:55, 20 July 2017

Browse protein-coding parts!


"Protein-coding" parts [also called "coding sequences" (CDS) or "open reading frames" (ORFs)] contain the sequence information needed to create functional protein (polypeptide) chains. Most of the parts presented on this page are protein-coding sequences only; a few of them, however, currently also contain a ribosome binding site (RBS).

Regulations for Protein-Coding Parts

Every BioBrick coding sequence consists of the following structure:

  • It begins with a standard start codon: "ATG"
  • It ends with two "stop" codons: "TAA","TAA", i.e., "TAATAA".

The actual protein-coding sequence begins with the start codon and ends immediately before the two stop codons. Thus the structure of a BioBrick protein-coding sequence looks like: "ATG-[inserted coding region]-TAATAA". (Note that ATG becomes AUG in the mRNA transcript, and AUG codes for methionine. Depending on the protein, the methionine may or may not remain as the first residue of the expressed amino acid sequence.)
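The structure described above can be checked mechanically. The following is a minimal sketch, not a Registry tool: the function names and the example sequence are made up for illustration, and the length check is an extra sanity test (a coding sequence read in triplets should have a length divisible by 3) rather than a rule stated on this page.

```python
START_CODON = "ATG"
DOUBLE_STOP = "TAATAA"  # two TAA stop codons back to back


def check_biobrick_cds(seq):
    """Return a list of problems; an empty list means all checks passed."""
    seq = seq.upper()
    problems = []
    if not seq.startswith(START_CODON):
        problems.append("does not begin with ATG")
    if not seq.endswith(DOUBLE_STOP):
        problems.append("does not end with TAATAA")
    # Extra sanity check (an assumption, not a rule from this page):
    # codons are triplets, so the total length should be divisible by 3.
    if len(seq) % 3 != 0:
        problems.append("length is not a multiple of 3")
    return problems


def to_mrna(seq):
    """Write the coding strand as mRNA by replacing T with U."""
    return seq.upper().replace("T", "U")


# Hypothetical example: ATG + made-up inner region + TAATAA
example = "ATG" + "GCTGGT" + "TAATAA"
print(check_biobrick_cds(example))  # []
print(to_mrna(example)[:3])         # AUG
```

A sequence that fails any check comes back with a readable list of reasons, which is usually more helpful than a bare true/false answer when screening many parts.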

Direction

A coding region can point RNA polymerase in either the forward or reverse direction depending on which strand of the double-stranded DNA molecule binds the polymerase. Currently, most BioBrick parts are transcribed in the forward direction. By convention, in the scientific literature (and in textbooks) sequences are always presented with the forward direction to the right. In the case of BioBricks, however, the location of the cloning sites determines the "forward" direction. The BioBrick prefix contains the EcoRI and XbaI restriction sites, and when a coding sequence starts (with its ATG initiation codon) at the prefix, it is said to run in the forward direction. However, nothing prevents us from inserting the coding sequence the other way around, with the ATG adjacent to the BioBrick suffix and the coding sequence running from right to left. We define this as the "reverse" orientation.
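When a reverse-oriented coding sequence is read left to right on the top strand (from prefix to suffix), it appears as the reverse complement of the forward sequence, so it begins with "TTATTA" and ends with "CAT". The sketch below uses that fact to guess the orientation of an insert. It is a simple pattern check under that assumption, with hypothetical function names, and not the Registry's own method.

```python
# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def looks_like_cds(seq):
    """True if the sequence has the ATG...TAATAA structure."""
    return seq.startswith("ATG") and seq.endswith("TAATAA")


def orientation(insert):
    """Classify an insert, read from prefix to suffix on the top strand."""
    seq = insert.upper()
    if looks_like_cds(seq):
        return "forward"
    if looks_like_cds(reverse_complement(seq)):
        return "reverse"
    return "unknown"


forward = "ATGGCTTAATAA"               # made-up example sequence
flipped = reverse_complement(forward)  # "TTATTAAGCCAT"
print(orientation(forward))  # forward
print(orientation(flipped))  # reverse
```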

References and Sources for Protein-Coding Information

There are a number of excellent sources for protein-coding sequence information. They include:

  • PubMed (Entrez Protein) [http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Protein&itool=toolbar link]
  • UniProt [http://www.pir.uniprot.org/ link]
  • UniProt-SwissProt/TrEMBL [http://www.ebi.ac.uk/swissprot/ link]

For 3-dimensional structural information, go to:

  • Protein Data Bank (PDB) [http://www.rcsb.org/pdb/Welcome.do link]

Protein Barcodes

a BioBrick Barcode, shown under Part Design:Features, ssDNA viewing mode
Many protein-coding parts in the Registry are trackable through the use of pieces of DNA that are non-coding (no "start" codon), rare, and about 25 base pairs in length. These sequences are known as Barcodes.

Tags

Often, protein-coding parts have tags attached to their ends in order to add functionalities such as fast degradation. The lengths of these tags depend on their function (the predominant LVA, AAV, ASV, etc. degradation tags are 11 amino acids or 33 base pairs in length).

Because protein-coding parts currently all end in the double stop codon "TAATAA", you cannot add on tag parts in the manner of BioBrick standard assembly: a tag placed after the stop codons would not be translated.
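To illustrate why the double stop codon gets in the way, the sketch below reads a sequence codon by codon and stops at the first stop codon. The tag used here is a 33-base placeholder (11 codons), not a real degradation tag sequence, and the short scar that standard assembly leaves between parts is omitted for simplicity. All names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def translated_codons(seq):
    """Return the in-frame codons read before the first stop codon."""
    seq = seq.upper()
    codons = []
    for i in range(0, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            break  # translation ends here; nothing downstream is read
        codons.append(codon)
    return codons


cds = "ATGGCTTAATAA"        # made-up part ending in the double stop
placeholder_tag = "GCT" * 11  # 11 codons = 33 bases, NOT a real tag

print(len(placeholder_tag))                      # 33
print(translated_codons(cds + placeholder_tag))  # ['ATG', 'GCT']
```

Only the two codons before the first "TAA" are read, so anything joined after the stop codons never becomes part of the protein.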

For more information, visit Help:Tag.