Protein domains/Overview

Every protein coding sequence in the Registry consists of at least three protein domains: a Head Domain, one or more Internal Domains (which may include Special Internal Domains), and a Tail Domain.

Head Domain: The Head Domain consists of the ribosome binding site (RBS) and a start codon, followed immediately by zero or more triplets specifying an N-terminal tag, such as a protein export tag or lipoprotein binding tag. Examples of Head Domains include
- RBS plus start codon
- RBS, start codon and codons 2-3
- RBS, start codon and signal sequence
- RBS, start codon and affinity tag
Internal Domains: An Internal Domain consists of a series of codon triplets coding for an amino acid sequence, without a start codon or stop codon. Multiple Internal Domains can be fused.
Special Internal Domains: Short domains with a specific function may be categorized separately, but they obey the same composition rules as normal Internal Domains. Examples of Special Internal Domains include
- DNA binding domains
- Dimerization domains
- Kinase domains
- Linkers
- Cleavage sites
- Inteins
Tail Domain: The C-terminus of a coding region consists of zero or more triplet codons, followed by a pair of TAA stop codons. In the simplest case, the Tail consists only of the stop codons, which terminate the protein. More complex Tails may include degradation tags appropriate to the organism (for example, tags with different degradation rates). Examples of Tail Domains include
- Stop codon
- A degradation tag followed by a stop codon
- An affinity tag followed by a stop codon
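
The composition rules above can be sketched in Python. This is a minimal illustration under stated assumptions, not Registry tooling: the function names (`codons`, `is_head`, `is_internal`, `is_tail`, `assemble`) are hypothetical, the RBS is treated as an opaque non-empty string, and the stop codon set is that of the standard genetic code. An internal ATG is not rejected, because ATG also encodes methionine inside a domain.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumption)


def codons(seq):
    """Split a DNA string into triplets; reject partial codons."""
    seq = seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def has_stop(triplets):
    """True if any triplet in the list is a stop codon."""
    return any(c in STOP_CODONS for c in triplets)


def is_head(rbs, coding):
    """Head: a non-empty RBS, then ATG, then zero or more tag codons."""
    cs = codons(coding)
    return bool(rbs) and bool(cs) and cs[0] == "ATG" and not has_stop(cs)


def is_internal(seq):
    """Internal Domain: one or more whole codons, none of them a stop."""
    cs = codons(seq)
    return bool(cs) and not has_stop(cs)


def is_tail(seq):
    """Tail: zero or more non-stop codons followed by TAA TAA."""
    cs = codons(seq)
    return len(cs) >= 2 and cs[-2:] == ["TAA", "TAA"] and not has_stop(cs[:-2])


def assemble(rbs, head_coding, internals, tail):
    """Concatenate Head, Internal Domains and Tail after checking each one."""
    if not is_head(rbs, head_coding):
        raise ValueError("invalid Head Domain")
    if not internals or not all(is_internal(d) for d in internals):
        raise ValueError("need one or more valid Internal Domains")
    if not is_tail(tail):
        raise ValueError("invalid Tail Domain")
    return (rbs + head_coding + "".join(internals) + tail).upper()
```

For example, `assemble("AAAGAGGAGAAA", "ATG", ["GCTGCA"], "TAATAA")` joins a placeholder RBS string, a bare start codon, one two-codon Internal Domain and the simplest Tail into a single sequence.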

Translational units begin with the RBS, the site of translational initiation, and end with a stop codon, the site of translational termination. Thus a translational unit can, in some sense, be thought of as a composite part made up of three or more protein domains. Protein coding sequences begin with a start codon and end with a stop codon.

Unfortunately, the original BioBrick assembly standard, Assembly standard 10, does not support in-frame assembly of protein domains. (Assembly standard 10 creates an 8 bp scar between adjacent parts; because 8 is not a multiple of 3, the scar shifts the reading frame of the downstream domain.) Therefore, it is recommended that you use an alternative approach to assemble protein domains into a translational unit. There are several possible approaches, including direct synthesis (preferred because it creates no scars) and various other assembly standards. Regardless of which approach you choose, we suggest that the resulting protein coding sequence or translational unit comply with the original BioBrick assembly standard so that your parts can be assembled with most of the parts in the Registry.
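
The frame problem with an 8 bp scar can be shown with a little arithmetic. The sketch below is illustrative only: `frame_shift` and `read_frame` are hypothetical helper names, the two domain sequences are made up, and the scar is modeled as eight placeholder bases (`N`) rather than its real sequence.

```python
def frame_shift(scar_length):
    """Bases by which a scar of this length offsets the downstream frame."""
    return scar_length % 3


def read_frame(seq):
    """Read whole triplets from the first base, dropping any trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


upstream = "ATGGCT"           # intended codons: ATG GCT
downstream = "GGTTCTTAATAA"   # intended codons: GGT TCT TAA TAA
scar = "N" * 8                # placeholder for an 8 bp scar (bases not modeled)

fused = upstream + scar + downstream
print(frame_shift(len(scar)))  # 2
print(read_frame(fused))
# ['ATG', 'GCT', 'NNN', 'NNN', 'NNG', 'GTT', 'CTT', 'AAT']
```

Read from the start codon, the downstream domain is decoded as GTT CTT AAT rather than the intended GGT TCT TAA TAA, so both the amino acids and the stop codons are lost. A junction whose length is a multiple of three would leave `frame_shift` at 0.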

Protein coding sequences should be flanked as follows, where the bracketed region is the coding sequence itself:

GAATTC GCGGCCGC T TCTAG [ATG ... TAA TAA] T ACTAGT A GCGGCCG CTGCAG

Translational units should be flanked as follows, where the bracketed regions are the RBS and the coding sequence:

GAATTC GCGGCCGC T TCTAGA G [RBS] [ATG ... TAA TAA] T ACTAGT A GCGGCCG CTGCAG
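
The two layouts above can be written as a short Python sketch. The flanking strings are copied from the layouts with the spaces removed; the function names (`check_cds`, `flank_coding_sequence`, `flank_translational_unit`) are hypothetical, and the RBS argument is assumed to already include any spacer wanted between the RBS and the start codon.

```python
CDS_PREFIX = "GAATTCGCGGCCGCTTCTAG"    # prefix used directly before ATG
TU_PREFIX = "GAATTCGCGGCCGCTTCTAGAG"   # prefix used before an RBS
SUFFIX = "TACTAGTAGCGGCCGCTGCAG"       # same suffix in both layouts


def check_cds(cds):
    """Require ATG ... TAA TAA in whole codons, as in the layouts above."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence is not a whole number of codons")
    if not cds.startswith("ATG"):
        raise ValueError("coding sequence must start with ATG")
    if not cds.endswith("TAATAA"):
        raise ValueError("coding sequence must end with TAA TAA")
    return cds


def flank_coding_sequence(cds):
    """Wrap a bare coding sequence in the coding-sequence prefix and suffix."""
    return CDS_PREFIX + check_cds(cds) + SUFFIX


def flank_translational_unit(rbs, cds):
    """Wrap an RBS plus coding sequence in the translational-unit prefix and suffix."""
    if not rbs:
        raise ValueError("a translational unit needs an RBS")
    return TU_PREFIX + rbs.upper() + check_cds(cds) + SUFFIX
```

Note that in the coding-sequence layout the prefix ends in TCTAG, and the A of the start codon completes the TCTAGA site that the translational-unit prefix spells out in full.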

Although most RBSs are currently specified as separate parts in the Registry, we are now moving to a new design in which the RBS is included within the Head domain as a single part. The new design has the advantage of encapsulating both ribosome binding and translational initiation within a single part. Our working hypothesis is that the new design will reduce the likelihood of unexpected functional composition problems between the RBS and coding sequence.