Assembly standard 21


Researchers at UC Berkeley have developed the BglBrick assembly standard, or Assembly standard 21, based on idempotent assembly with BamHI and BglII restriction enzymes. In a nutshell, most parts look like this:

        Prefix                        Suffix
5' GAATTC atg AGATCT ...part... GGATCC taa CTCGAG 3'
   EcoRI      BglII             BamHI   *   XhoI 

Fusing two parts leaves the following scar:

5' [part A] GGATCT [part B] 3'
             G  S
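The scar can be reproduced with a minimal sketch. It assumes each part is a plain DNA string flanked by a 5' BglII site and a 3' BamHI site, and it models only the top strand; the function name `fuse` and the example part bodies are illustrative, not part of the standard.

```python
BAMHI_SITE = "GGATCC"  # BamHI cuts G^GATCC
BGLII_SITE = "AGATCT"  # BglII cuts A^GATCT

# Only the two codons needed to read the scar.
SCAR_CODONS = {"GGA": "G", "TCT": "S"}


def fuse(part_a: str, part_b: str) -> str:
    """Join the BamHI end of part A to the BglII end of part B (top strand only)."""
    a, b = part_a.upper(), part_b.upper()
    if not a.endswith(BAMHI_SITE):
        raise ValueError("part A must end with a BamHI site")
    if not b.startswith(BGLII_SITE):
        raise ValueError("part B must start with a BglII site")
    # After BamHI digestion the upstream fragment keeps only the first G of the site.
    upstream = a[: -len(BAMHI_SITE)] + "G"
    # After BglII digestion the downstream fragment starts with GATCT.
    downstream = b[1:]
    return upstream + downstream


# Hypothetical part bodies (AAACCC and TTTGGG), each flanked by BglII and BamHI sites.
part_a = "AGATCT" + "AAACCC" + "GGATCC"
part_b = "AGATCT" + "TTTGGG" + "GGATCC"

fused = fuse(part_a, part_b)
scar = fused[12:18]
scar_peptide = "".join(SCAR_CODONS[scar[i:i + 3]] for i in range(0, 6, 3))
print(fused, scar, scar_peptide)
```

The fused product is again flanked by one BglII site and one BamHI site, and the GGATCT scar is recognized by neither enzyme, which is what makes the assembly idempotent.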

Note, however, that Assembly standard 21 is intended as a minimal physical assembly standard: only those features needed for interconversion of BglBrick assembly standard plasmids are formally defined. Therefore, the atg and taa spacers are not core definitions of the standard.

See [http://openwetware.org/wiki/The_BioBricks_Foundation:Standards/Technical/Formats The BioBricks Foundation wiki] for a discussion and comparison of different technical standards.


Plasmid backbones

Plasmids are circular, double-stranded DNA molecules typically containing a few thousand base pairs that replicate within the cell independently of the chromosomal DNA. Plasmid DNA is easily purified from cells, manipulated using common lab techniques and incorporated into cells. Most BioBrick parts in the Registry are maintained and propagated on plasmids. Thus, construction of BioBrick parts, devices and systems usually requires working with plasmids.

Note: In the Registry, plasmids are made up of two distinct components:

  1. the BioBrick part, device or system that is located in the cloning site, between (and excluding) the Assembly standard 21 prefix and suffix.
  2. the plasmid backbone which propagates the BioBrick part. The plasmid backbone is defined as the sequence beginning with the Assembly standard 21 suffix, including the replication origin and antibiotic resistance marker, and ending with the Assembly standard 21 prefix. [Note that the plasmid backbone itself can be composed of BioBrick parts.]

Many BioBrick parts in the Registry are maintained on more than one plasmid backbone!
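The part/backbone split defined above can be sketched in a few lines. This is an illustration under assumptions: the prefix and suffix strings are taken from the layout shown at the top of this page (including the atg and taa spacers, which are not core to the standard), the plasmid is given as a single top-strand string, and `split_plasmid` is a hypothetical helper name.

```python
# Assumed Assembly standard 21 prefix and suffix, copied from the layout above.
PREFIX = "GAATTC" + "ATG" + "AGATCT"
SUFFIX = "GGATCC" + "TAA" + "CTCGAG"


def split_plasmid(plasmid: str) -> tuple[str, str]:
    """Return (part, backbone) for a circular plasmid given as a linear string."""
    seq = plasmid.upper()
    # Doubling the sequence lets us find a prefix that spans the linear end.
    doubled = seq + seq
    p = doubled.find(PREFIX)
    if p == -1 or p >= len(seq):
        raise ValueError("prefix not found")
    # Rotate the circle so that it starts at the prefix.
    rotated = doubled[p:p + len(seq)]
    s = rotated.find(SUFFIX, len(PREFIX))
    if s == -1:
        raise ValueError("suffix not found")
    part = rotated[len(PREFIX):s]
    # The backbone runs from the suffix, around the circle, to the end of the prefix.
    backbone = rotated[s:] + rotated[:len(PREFIX)]
    return part, backbone


# Toy plasmid: placeholder backbone filler around a placeholder 8 bp part.
plasmid = "AATTTT" + PREFIX + "ACGTACGT" + SUFFIX + "CCCCAA"
part, backbone = split_plasmid(plasmid)
print(part)
print(backbone)
```

Because the plasmid is circular, the same part and backbone are returned regardless of where the sequence file happens to start.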




Ribosome binding sites

A typical RBS sequence is located about 6 nucleotides upstream of a start codon in an mRNA. The ribosomal holoenzyme binds to both the RBS and the start codon. The start codon and everything downstream are translated by the ribosome.

A Ribosome Binding Site (RBS) is an RNA sequence found in mRNA to which ribosomes can bind and initiate translation. Translation initiation in bacteria almost always requires both an RBS sequence and a start codon. In the Registry, protein coding sequences begin with the start codon. So, if you want to build a BioBrick system that produces a protein, you need to pick an RBS part and put it upstream of the protein coding sequence you want to translate. Note that an RBS is often defined as just the part of the mRNA sequence that binds to the ribosome; however, the surrounding sequence can also affect the translation initiation rate. Consequently, a BioBrick™ RBS part contains the classic RBS and sometimes some of the surrounding sequence as well.
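As a rough illustration of the spacing between an RBS and the start codon, the sketch below measures the number of nucleotides between the two. It assumes the RBS core is the Shine-Dalgarno consensus AGGAGG and that the first ATG downstream of it is the start codon; both are simplifications, and `spacer_length` and the example sequence are hypothetical.

```python
def spacer_length(seq: str, rbs_core: str = "AGGAGG", start_codon: str = "ATG") -> int:
    """Count the nucleotides between the end of the RBS core and the next start codon."""
    # Accept either DNA or mRNA notation.
    s = seq.upper().replace("U", "T")
    rbs_at = s.find(rbs_core)
    if rbs_at == -1:
        raise ValueError("RBS core not found")
    rbs_end = rbs_at + len(rbs_core)
    # Assumption: the first ATG after the RBS core is the start codon.
    start_at = s.find(start_codon, rbs_end)
    if start_at == -1:
        raise ValueError("no start codon downstream of the RBS core")
    return start_at - rbs_end


# Hypothetical sequence: leader, RBS core, 6 nt spacer, then a short coding sequence.
example = "TTT" + "AGGAGG" + "ACAGCT" + "ATG" + "AAATAA"
print(spacer_length(example))
```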




Protein domains

Protein domains encode portions of proteins and can be assembled together to form translational units, a genetic part spanning from translational initiation (the RBS) to translational termination (the stop codon).

[Figure: ProteinDomains.png]

There are several different types of protein domains.

  1. Head Domain: The Head domain consists of the start codon followed immediately by zero or more triplets specifying an N-terminal tag, such as a protein export tag or lipoprotein binding tag. Head domains should begin with an ATG start codon and include codons 2 and 3 of the protein at a minimum. Examples of head domains include
    • ATG start codon
    • ATG start codon and codons 2-3
    • ATG start codon and signal sequence
    • ATG start codon and affinity tag
  2. Internal Domains: Internal domains consist of a series of codon triplets coding for an amino acid sequence without a start codon or stop codon. Multiple Internal Domains can be fused. Examples of internal domains include
    • DNA binding domains
    • Dimerization domains
    • Kinase domains
  3. Special Internal Domains: Short Domains with specific function may be separately categorized, but obey the same composition rules as normal internal domains. Examples of special internal domains include
    • Linkers
    • Cleavage sites
    • Inteins
  4. Tail Domain: The C-terminus of a coding region consists of zero or more triplet codons, followed by a pair of TAA stop codons. In the simplest case, the Tail domain is just the pair of stop codons that terminates the protein. More complex Tails may include degradation tags appropriate to the organism (i.e., with different degradation rates). Examples of Tail domains include
    • TAATAA stop codons
    • A degradation tag followed by TAATAA stop codons
    • An affinity tag followed by TAATAA stop codons
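The composition rules for the domain types listed above can be expressed as simple checks. The following is a sketch rather than a Registry tool: the function names are hypothetical, the "no start codon" rule for internal domains is not enforced (an internal ATG simply encodes methionine), and the example sequences are placeholders.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq: str) -> list[str]:
    """Split a sequence into consecutive triplets."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def in_frame_without_stops(seq: str) -> bool:
    """True if the length is a non-zero multiple of 3 and no triplet is a stop codon."""
    return len(seq) > 0 and len(seq) % 3 == 0 and not any(c in STOP_CODONS for c in codons(seq))


def is_head(seq: str) -> bool:
    # Head: ATG followed by zero or more in-frame triplets.
    return seq.startswith("ATG") and in_frame_without_stops(seq)


def is_internal(seq: str) -> bool:
    # Internal (and special internal): in-frame triplets with no stop codon.
    return in_frame_without_stops(seq)


def is_tail(seq: str) -> bool:
    # Tail: zero or more in-frame triplets, then the TAA TAA double stop.
    if len(seq) % 3 != 0 or not seq.endswith("TAATAA"):
        return False
    body = seq[:-6]
    return body == "" or in_frame_without_stops(body)


def assemble(head: str, internals: list[str], tail: str) -> str:
    """Concatenate validated domains into a protein coding sequence (no scars)."""
    if not is_head(head) or not all(is_internal(d) for d in internals) or not is_tail(tail):
        raise ValueError("a domain violates the composition rules")
    return head + "".join(internals) + tail


# Placeholder domains: a head with codons 2-3, one short internal domain, a bare tail.
cds = assemble("ATGGCTAGC", ["GGTTCTGGT"], "TAATAA")
print(cds)
```

Direct concatenation corresponds to the scarless case (for example, direct synthesis); an assembly standard would add its scar between the domains.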

Unfortunately, the original BioBrick assembly standard, Assembly standard 10, does not support in-frame assembly of protein domains. (Assembly standard 10 creates an 8 bp scar between adjacent parts.) Therefore, it is recommended that you use an alternate approach to assemble protein domains together to make a translational unit. There are several possible approaches to assembling protein domains including direct synthesis (preferred because it creates no scars) as well as various assembly standards. Regardless of which standard you choose, we suggest that the resulting protein coding sequence or translational unit comply with the original BioBrick assembly standard so that your parts can be assembled with most of the parts in the Registry.
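A short sketch shows why an 8 bp scar breaks in-frame assembly while a 6 bp scar does not. The Assembly standard 10 scar sequence used here (TACTAGAG) is the commonly cited one and is stated as an assumption; GGATCT is the Assembly standard 21 scar shown at the top of this page. The function names are illustrative.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def frame_shift(scar: str) -> int:
    """Number of bases by which the downstream part is shifted out of frame."""
    return len(scar) % 3


def scar_is_in_frame(scar: str) -> bool:
    """True if the scar preserves the reading frame and encodes no stop codon."""
    s = scar.upper()
    if len(s) % 3 != 0:
        return False
    return all(s[i:i + 3] not in STOP_CODONS for i in range(0, len(s), 3))


for name, scar in [("Assembly standard 10", "TACTAGAG"), ("Assembly standard 21", "GGATCT")]:
    print(name, scar, "shift:", frame_shift(scar), "in frame:", scar_is_in_frame(scar))
```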


Protein coding sequences should be as follows

GAATTC GCGGCCGC T TCTAG [ATG ... TAA TAA] T ACTAGT A GCGGCCG CTGCAG


Note: Although most RBSs are currently specified as separate parts in the Registry, we are now moving to a new design in which the RBS and Head domain are combined into a single part termed a Translational start. The new design has the advantage of encapsulating both ribosome binding and translational initiation within a single part. Our working hypothesis is that the new design will reduce the likelihood of unexpected functional composition problems between the RBS and coding sequence.




Protein coding sequences

Protein coding sequences are DNA sequences that are transcribed into mRNA and in which the corresponding mRNA molecules are translated into a polypeptide chain. Every three nucleotides, termed a codon, in a protein coding sequence encode one amino acid in the polypeptide chain. In some cases, different chassis may either map a given codon to a different amino acid or may use different codons more or less frequently. Therefore some protein coding sequences may be optimized for use in a particular chassis.
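Codon usage in a given coding sequence can be tallied with a few lines of code. This sketch only counts codons; comparing the counts against the preferences of a particular chassis would need a usage table that is not provided here. The function name and the example sequence are hypothetical.

```python
from collections import Counter


def codon_usage(cds: str) -> Counter:
    """Count how often each codon occurs in an in-frame coding sequence."""
    s = cds.upper()
    if len(s) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return Counter(s[i:i + 3] for i in range(0, len(s), 3))


# AAA and AAG both encode lysine; this toy sequence uses AAA twice and AAG once.
usage = codon_usage("ATGAAAAAGAAATAA")
print(usage)
```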

In the Registry, protein coding sequences begin with a start codon (usually ATG) and end with a stop codon (usually the double stop codon TAA TAA). Protein coding sequences are often abbreviated with the acronym CDS.

Although protein coding sequences are often considered to be basic parts, in fact protein coding sequences can themselves be composed of one or more regions, called protein domains. Thus, a protein coding sequence could either be entered as a basic part or as a composite part of two or more protein domains.

  1. The N-terminal domain of a protein coding sequence is special in a number of ways. First, it always contains a start codon, spaced at an appropriate distance from a ribosomal binding site. Second, many coding regions have special features at the N terminus, such as protein export tags and lipoprotein cleavage and attachment tags. These occur at the beginning of a coding region, and therefore are termed Head domains.
  2. A protein domain is a sequence of amino acids that folds relatively independently and that is evolutionarily shuffled as a unit among different protein coding regions. The DNA sequence of such a domain must maintain in-frame translation, and thus its length is a multiple of three bases. Since these protein domains are within a protein coding sequence, they are called Internal domains. Certain Internal domains have particular functions in protein cleavage or splicing and are termed Special Internal domains.
  3. Similarly, the C-terminal domain of a protein is special, containing at least a stop codon. Other special features, such as degradation tags, are also required to be at the extreme C-terminus. Again, these domains cannot function when internal to a coding region, and are termed Tail domains.

For more details on protein domains including how to assemble protein domains into protein coding sequences, please see Protein domains.


Protein coding sequences should be as follows

GAATTC GCGGCCGC T TCTAG [ATG ... TAA TAA] T ACTAGT A GCGGCCG CTGCAG
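The layout above can be applied to a concrete coding sequence with a small helper. This is a sketch under assumptions: the flanking strings are copied from the line above, the check insists on an ATG start and a TAA TAA double stop (the text only says "usually"), and `wrap_cds` and the toy sequence are hypothetical.

```python
# Flanking sequences copied from the layout above, with the spaces removed.
CDS_PREFIX = "GAATTC GCGGCCGC T TCTAG".replace(" ", "")
CDS_SUFFIX = "T ACTAGT A GCGGCCG CTGCAG".replace(" ", "")


def wrap_cds(cds: str) -> str:
    """Validate a protein coding sequence and place it between the prefix and suffix."""
    s = cds.upper()
    if len(s) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if not s.startswith("ATG"):
        raise ValueError("this sketch expects an ATG start codon")
    if not s.endswith("TAATAA"):
        raise ValueError("this sketch expects a TAA TAA double stop")
    return CDS_PREFIX + s + CDS_SUFFIX


# Toy coding sequence: start codon, one lysine codon, double stop.
construct = wrap_cds("ATGAAATAATAA")
print(construct)
```

Note that the shortened prefix ends in TCTAG, so the A of the start codon completes the TCTAGA site.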




Translational units

Translational units begin with the RBS, the site of ribosome binding and translational initiation, and end with a stop codon, the site of translational termination. Every translational unit in the Registry consists of at least three parts, a Translational start, one or more Internal Domains including Special Internal Domains, and a Tail Domain. Thus translational units can, in some sense, be thought of as a composite part made up of three or more parts. Protein coding sequences, in contrast, begin with a start codon and end with a stop codon.

[Figure: ProteinDomains.png]

For more information on protein domains, see protein domains. Unfortunately, the original BioBrick assembly standard, Assembly standard 10, does not support in-frame assembly of protein domains. (Assembly standard 10 creates an 8 bp scar between adjacent parts.) Therefore, it is recommended that you use an alternate approach to assemble protein domains together to make a translational unit. There are several possible approaches to assembling protein domains including direct synthesis (preferred because it creates no scars) as well as various assembly standards. Regardless of which standard you choose, we suggest that the resulting translational unit comply with the original BioBrick assembly standard so that your parts can be assembled with most of the parts in the Registry.


Translational units should be as follows

GAATTC GCGGCCGC T TCTAGA G [RBS] [ATG ... TAA TAA] T ACTAGT A GCGGCCG CTGCAG
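A translational unit in this layout can be composed from an RBS part and a coding sequence as follows. The sketch assumes the flanking strings copied from the line above; the RBS sequence in the example is only an illustrative placeholder, and `build_translational_unit` is a hypothetical helper name.

```python
# Flanking sequences copied from the layout above, with the spaces removed.
TU_PREFIX = "GAATTC GCGGCCGC T TCTAGA G".replace(" ", "")
TU_SUFFIX = "T ACTAGT A GCGGCCG CTGCAG".replace(" ", "")


def build_translational_unit(rbs: str, cds: str) -> str:
    """Place an RBS directly upstream of a coding sequence, between prefix and suffix."""
    r, c = rbs.upper(), cds.upper()
    if not r:
        raise ValueError("an RBS sequence is required")
    if len(c) % 3 != 0 or not c.startswith("ATG") or not c.endswith("TAATAA"):
        raise ValueError("this sketch expects an in-frame ATG ... TAA TAA coding sequence")
    # Any spacer between the RBS core and the start codon is assumed to be part of the RBS part.
    return TU_PREFIX + r + c + TU_SUFFIX


# Placeholder RBS and toy coding sequence.
unit = build_translational_unit("AAAGAGGAGAAA", "ATGAAATAATAA")
print(unit)
```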


Although most RBSs are currently specified as separate parts in the Registry, we are now moving to a new design in which the RBS and Head domain are combined into a single part termed a Translational start. The new design has the advantage of encapsulating both ribosome binding and translational initiation within a single part. Our working hypothesis is that the new design will reduce the likelihood of unexpected functional composition problems between the RBS and coding sequence.




References

Researchers at UC Berkeley have developed a new assembly standard. See [http://openwetware.org/wiki/The_BioBricks_Foundation:Standards/Technical/Formats#The_Berkeley_.28BBb.29_Format the BioBricks Foundation wiki] for more details.