Help:Proteins/FAQ

Why are you splitting up protein coding sequences into multiple protein domain parts?

Proteins are often composed of multiple protein domains, each of which has a distinct biological function. For example, the lac repressor has one protein domain whose function is to bind DNA and a separate domain whose function is to enable the lac repressor to oligomerize. In synthetic biology, a part encodes a basic biological function as a nucleic acid sequence. Thus, since protein domains often have a particular function, it is logical to specify protein domains as separate basic parts that are assembled together.

What's this new part type that you call a translational unit?

A translational unit is a part that includes both translational initiation and translational termination. Thus, it begins with a ribosome binding site (or equivalent in other organisms) and ends with a stop codon. In other words, in its simplest form, a translational unit is an RBS followed by a protein coding sequence.

Natural systems can be quite complicated. In some cases, the stop codon of one protein coding sequence overlaps the start codon of the next. In such situations, the translational unit encompasses everything from the RBS to the final stop codon (and thus includes multiple protein coding sequences). Note, however, that operons are different! Most operons have multiple ribosome binding sites, and thus multiple translational units.
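
To make the definition concrete, here is a minimal Python sketch of a translational unit as "one RBS plus everything translated from it". The class name, field names, and sequences are toy values we made up for illustration; they are not real Registry parts or any Registry data format.

```python
from dataclasses import dataclass

STOP_CODONS = {"TAA", "TAG", "TGA"}


@dataclass
class TranslationalUnit:
    """Hypothetical model: one RBS plus the region translated from it."""
    rbs: str
    coding_region: str  # one or more coding sequences, possibly overlapping

    def sequence(self) -> str:
        # A translational unit runs from the RBS to the final stop codon.
        return self.rbs + self.coding_region

    def ends_with_stop(self) -> bool:
        return self.coding_region[-3:].upper() in STOP_CODONS


RBS = "AGGAGG"  # toy RBS, an assumption for this example

# Simplest form: an RBS followed by a single protein coding sequence.
simple_unit = TranslationalUnit(RBS, "ATGAAATAA")

# Overlapping case: the stop codon TGA of the first coding sequence shares
# bases with the start codon ATG of the second (the four bases ATGA).
coupled_unit = TranslationalUnit(RBS, "ATGAAATGAAATAA")
first_cds = coupled_unit.coding_region[0:9]    # ATG AAA TGA
second_cds = coupled_unit.coding_region[5:14]  # ATG AAA TAA

# An operon with two ribosome binding sites holds two translational units,
# even though it encodes three proteins here.
operon = [simple_unit, coupled_unit]
```

In this sketch the coupled unit is still a single translational unit because it has a single RBS, while the operon counts one unit per RBS.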

Tom's BioBrick standard, Assembly standard 10, doesn't support in-frame assembly of parts. What do I do?

You're right, it doesn't. After giving this problem a lot of thought, the Registry has adopted the policy that you can construct a protein coding sequence from protein domains any way you want. You can use direct synthesis, PCR, or any of assembly standards 21, 23, 25, and 28, which support in-frame assembly. The only rule is that, after construction, the resulting protein coding sequence or translational unit must comply with Assembly standard 10, the original BioBrick standard. The advantages of this approach are:

  1. Your protein coding sequence or translational unit is compatible with the large majority of parts in the Registry.
  2. You can use whichever construction method is most suitable for your protein coding sequence or translational unit. If you have a choice, we recommend direct synthesis so that you avoid any scars altogether.
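
Whichever construction method you choose, it is worth checking the finished sequence against Assembly standard 10 before submitting it. The sketch below assumes the usual reading of that standard, namely that a part must not contain internal EcoRI, XbaI, SpeI, PstI, or NotI recognition sites. The function name is our own, and the check covers only these sites, not the prefix and suffix that flank a part.

```python
# Recognition sites that, as we understand Assembly standard 10, must not
# occur inside a part. All five are palindromic, so scanning one strand
# is enough.
FORBIDDEN_SITES = {
    "EcoRI": "GAATTC",
    "XbaI": "TCTAGA",
    "SpeI": "ACTAGT",
    "PstI": "CTGCAG",
    "NotI": "GCGGCCGC",
}


def find_forbidden_sites(sequence):
    """Return (enzyme, 0-based position) for every forbidden site found."""
    sequence = sequence.upper()
    hits = []
    for enzyme, site in FORBIDDEN_SITES.items():
        start = sequence.find(site)
        while start != -1:
            hits.append((enzyme, start))
            # Step forward by one base so overlapping sites are also found.
            start = sequence.find(site, start + 1)
    return hits


# Toy coding sequence with an EcoRI site right after the start codon.
print(find_forbidden_sites("ATGGAATTCTAA"))  # [('EcoRI', 3)]
```

An empty result means none of these sites were found. Any hit would typically need to be removed, for example with a silent mutation, before the part complies.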

What are head domains, internal domains, and tail domains?

Proteins are not just arbitrary sequences of amino acids; they have structure. Proteins have a beginning, a middle, and an end. Certain domains belong only at the beginning of a protein, others only in the middle, and still others only at the end. To better describe the underlying structure of protein coding sequences, Tom wrote BBF RFC 13, [http://openwetware.org/wiki/BBRFC13 "Rethinking the boundaries and composition of coding regions"]. Head domains occur at the beginning of protein coding sequences, internal domains in the middle, and tail domains at the end.

  • In its simplest form, a head domain is composed of a ribosome binding site and a start codon. A head domain spans both the RBS and the start codon because both sequences affect the rate of translational initiation, so it is appropriate to include both in a single part. In fact, studies have shown that even the second and third codons can affect translational initiation, so we suggest that head domains include the RBS and the first three codons of the coding sequence. Because they include the RBS, head domains really occur at the beginning of translational units.
  • Internal domains occur in the middle of protein coding sequences. They include neither a start codon nor a stop codon. Multiple internal domains can be strung together, often separated by linkers, which are a special type of internal domain.
  • Tail domains occur at the end of protein coding sequences. In its simplest form, a tail domain is just a stop codon or, as in most Registry parts, a double stop codon.
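
The ordering rules above can be sketched as a small Python function that joins a head domain, any number of internal domains, and a tail domain into one translational unit. The function name, its arguments, and the example sequences are hypothetical and chosen only for illustration. The checks are deliberately minimal: the head's coding part must start with ATG, every piece must be a whole number of codons, internal domains must contain no in-frame stop codon, and the tail must end with a stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a sequence into consecutive three-base codons."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def compose(head_rbs, head_codons, internal_domains, tail):
    """Join head, internal, and tail domains into one translational unit.

    head_rbs: the RBS part of the head domain (any length).
    head_codons: the start codon plus any following codons in the head.
    internal_domains: list of in-frame sequences without stop codons.
    tail: one or more codons ending in a stop codon.
    """
    if len(head_codons) % 3 != 0 or not head_codons.startswith("ATG"):
        raise ValueError("head must carry whole codons starting with ATG")
    for domain in internal_domains:
        if len(domain) % 3 != 0:
            raise ValueError("internal domain would shift the reading frame")
        if any(c in STOP_CODONS for c in codons(domain)):
            raise ValueError("internal domain contains an in-frame stop codon")
    if len(tail) % 3 != 0 or codons(tail)[-1] not in STOP_CODONS:
        raise ValueError("tail must end with an in-frame stop codon")
    return head_rbs + head_codons + "".join(internal_domains) + tail


# Toy example: RBS plus first three codons, two internal domains,
# and a double stop codon as the tail.
unit = compose("AGGAGG", "ATGAAACGT", ["GGTTCT", "GGTGGT"], "TAATAA")
```

Note that an ATG codon inside an internal domain simply encodes methionine, so this sketch does not reject it; only stop codons are treated as errors there.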