Difference between revisions of "Part:BBa K1431401:Design"

 
<partinfo>BBa_K1431401 SequenceAndFeatures</partinfo>
 
  
 
=== Introduction ===
 
 
'''CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats)/Cas system''' is an active topic in biology research. Dozens of papers on this interesting field have recently been published in top journals. For readers who are not familiar with it, here is the jargon-heavy summary from Wikipedia:
 
<blockquote>CRISPRs are DNA loci containing short repetitions of base sequences. Each repetition is followed by short segments of &quot;spacer DNA&quot; from previous exposures to a virus.<br>
 
The CRISPR/Cas system is a prokaryotic immune system that confers resistance to foreign genetic elements such as plasmids and phages and provides a form of acquired immunity. CRISPR spacers recognize and silence these exogenous genetic elements like RNAi in eukaryotic organisms.<br>
 
Since 2012, the CRISPR/Cas system has been used for gene editing (silencing, enhancing or changing specific genes) that even works in eukaryotes like mice and primates. By inserting a plasmid containing cas genes and specifically designed CRISPRs, an organism's genome can be cut at any desired location.<br>
 
'''-Wikipedia'''
 
</blockquote>
 
 
In short, the CRISPR/Cas system is a tool for editing genes in '''live''' cells. Similar tools include TALENs (transcription activator-like effector nucleases) and ZFNs (zinc finger nucleases). CRISPR/Cas has an advantage over those methods in that it is guided by a short RNA chain (~23 bp target site), which is far easier to synthesize.<br>
 
Moreover, TALENs require a significantly longer time to construct<sup>[http://indepth.systembio.com/cas9-crispr-faq/what-is-the-difference-between-cas9-crispr-and-talen SystemBio]</sup>.
 
 
=== CRISPR gRNA Basics ===
 
 
As mentioned above, CRISPR/Cas9 systems need a gRNA (guide RNA) sequence to identify the target<sup>[http://www.nature.com/nprot/journal/v8/n11/full/nprot.2013.143.html ZhangFCas]</sup>. A target site is 23 bp long: a 20 bp protospacer followed at its 3′ end by a 3 bp PAM (Protospacer Adjacent Motif), which is NGG for the commonly used ''Streptococcus pyogenes'' Cas9. The PAM belongs to the target DNA and is not part of the gRNA itself. To target a gene effectively and specifically, the 20 nt guide portion of the gRNA has to match the protospacer strictly. According to [http://crispr.mit.edu/about ZhangTool], the approximate quality of a gRNA can be expressed as 
 
https://static.igem.org/mediawiki/parts/2/23/Equation-crispr.png .
 
 
Note that this equation is based only on an approximation of experimental data, so its predictions may differ from actual behavior.
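
The target-site layout described in this section can be sketched in Python. This is only a minimal illustration, assuming the NGG PAM of ''Streptococcus pyogenes'' Cas9 at the 3′ end of each 23 bp site (the layout of the candidate sequences listed on this page) and scanning a single strand. The function name find_target_sites is hypothetical and is not part of any published tool.

```python
import re

# Assumption: S. pyogenes Cas9, whose PAM is NGG at the 3' end of the site.
# The lookahead lets overlapping 23 bp sites be reported as well.
SITE_PATTERN = re.compile(r"(?=([ACGT]{20}[ACGT]GG))")

def find_target_sites(dna):
    """Return (start, protospacer, pam) for every 23 bp site on the given strand."""
    dna = dna.upper()
    sites = []
    for match in SITE_PATTERN.finditer(dna):
        site = match.group(1)
        sites.append((match.start(), site[:20], site[20:]))
    return sites

# One of the candidate sequences from this page
print(find_target_sites("GTGTGGAAAATCTCTAGCAGTGG"))
# [(0, 'GTGTGGAAAATCTCTAGCAG', 'TGG')]
```

A complete search would also scan the reverse complement, which this sketch leaves out for brevity.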
 
  
 
=== Designing the Sequence ===
 

Revision as of 18:34, 17 October 2014

One gRNA Sequence for HIV-1


Assembly Compatibility:
  • Compatible with RFC[10]
  • Compatible with RFC[12]
  • Compatible with RFC[21]
  • Compatible with RFC[23]
  • Compatible with RFC[25]
  • Compatible with RFC[1000]


Designing the Sequence

Our method is derived from the one described in the paper by Feng Zhang [http://www.nature.com/nbt/journal/v31/n9/abs/nbt.2647.html ZhangFgRNA].

The whole process can be divided into the following steps:

Conserved Sequence Analysis

We first extracted all conserved regions from the NIH HIV-1 Reference Genome. This step yielded around 10 candidates for the next step. All screening is done on a per-strain basis because of the high mutability of the HIV-1 virus.
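
The conservation screen can be sketched as follows. This is a minimal illustration and not our actual pipeline: it assumes the input is a list of equal-length aligned sequences with "-" marking gaps, and the function names, the window length, and the 95% threshold are hypothetical choices.

```python
from collections import Counter

def column_conservation(alignment):
    """For each column, return (most common base, its share among non-gap bases)."""
    result = []
    for column in zip(*alignment):
        bases = [b for b in column if b != "-"]
        if not bases:
            result.append((None, 0.0))  # column consists only of gaps
            continue
        base, count = Counter(bases).most_common(1)[0]
        result.append((base, count / len(bases)))
    return result

def conserved_windows(alignment, length=23, threshold=0.95):
    """Yield (start, consensus) for windows in which every column meets the threshold."""
    cons = column_conservation(alignment)
    for start in range(len(cons) - length + 1):
        window = cons[start:start + length]
        if all(share >= threshold for _, share in window):
            yield start, "".join(base for base, _ in window)

# Toy alignment, not real HIV-1 data
toy = ["ACGTAC", "ACGTAC", "ACCTAC", "AC-TAC"]
print(list(conserved_windows(toy, length=3)))  # [(3, 'TAC')]
```

Gaps are ignored when computing each column's share, so a column counts as conserved when the sequences that do cover it agree.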

Strip out sequences without PAM

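{| class="wikitable"
|+ Supplementary Table 1 - Base Percentage of HIV-1 Aligned Genome 730bp-752bp
! Position !! A % !! G % !! C % !! T % !! Empty % !! Non-Empty % !! A (Corrected) !! G (Corrected) !! C (Corrected) !! T (Corrected)
|-
| 730 || 0 || 0 || 0 || 56.47 || 43.53 || 56.47 || 0.00% || 0.00% || 0.00% || 100.00%
|-
| 731 || 0 || 55.88 || 0 || 0.59 || 43.53 || 56.47 || 0.00% || 98.96% || 0.00% || 1.04%
|-
| 732 || 0 || 0 || 0 || 56.47 || 43.53 || 56.47 || 0.00% || 0.00% || 0.00% || 100.00%
|-
| 733 || 0 || 54.71 || 0 || 1.18 || 43.53 || 55.89 || 0.00% || 97.89% || 0.00% || 2.11%
|-
| 734 || 0 || 0 || 0 || 58.24 || 41.76 || 58.24 || 0.00% || 0.00% || 0.00% || 100.00%
|-
| 735 || 56.47 || 0.59 || 0.59 || 0.59 || 41.76 || 58.24 || 96.96% || 1.01% || 1.01% || 1.01%
|-
| 736 || 0 || 1.18 || 57.06 || 0 || 41.76 || 58.24 || 0.00% || 2.03% || 97.97% || 0.00%
|-
| 737 || 1.18 || 57.06 || 0 || 0.59 || 41.18 || 58.83 || 2.01% || 96.99% || 0.00% || 1.00%
|-
| 738 || 60 || 0 || 0 || 0 || 40 || 60 || 100.00% || 0.00% || 0.00% || 0.00%
|-
| 739 || 0.59 || 0 || 58.82 || 0 || 40 || 59.41 || 0.99% || 0.00% || 99.01% || 0.00%
|-
| 740 || 0 || 0 || 0 || 0 || 100 || 0 || || || ||
|-
| 741 || 0 || 0 || 0 || 0 || 100 || 0 || || || ||
|-
| 742 || 0.59 || 0 || 1.18 || 58.24 || 40 || 60.01 || 0.98% || 0.00% || 1.97% || 97.05%
|-
| 743 || 0 || 0 || 60 || 0 || 40 || 60 || 0.00% || 0.00% || 100.00% || 0.00%
|-
| 744 || 0 || 1.18 || 58.82 || 0 || 40 || 60 || 0.00% || 1.97% || 98.03% || 0.00%
|-
| 745 || 0 || 58.82 || 1.18 || 0 || 40 || 60 || 0.00% || 98.03% || 1.97% || 0.00%
|-
| 746 || 0.59 || 0 || 59.41 || 0 || 40 || 60 || 0.98% || 0.00% || 99.02% || 0.00%
|-
| 747 || 0.59 || 59.41 || 0 || 0 || 40 || 60 || 0.98% || 99.02% || 0.00% || 0.00%
|-
| 748 || 0.59 || 59.41 || 0 || 0 || 40 || 60 || 0.98% || 99.02% || 0.00% || 0.00%
|-
| 749 || 0 || 58.82 || 0.59 || 0.59 || 40 || 60 || 0.00% || 98.03% || 0.98% || 0.98%
|-
| 750 || 0.59 || 0.59 || 58.24 || 0.59 || 40 || 60.01 || 0.98% || 0.98% || 97.05% || 0.98%
|-
| 751 || 60 || 0 || 0 || 0 || 40 || 60 || 100.00% || 0.00% || 0.00% || 0.00%
|-
| 752 || 59.41 || 0.59 || 0 || 0 || 40 || 60 || 99.02% || 0.98% || 0.00% || 0.00%
|}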
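
The "Corrected" columns of Supplementary Table 1 are consistent with dividing each base percentage by the non-empty percentage, that is, ignoring sequences that have a gap at that position. The sketch below reproduces that arithmetic. The normalization rule is inferred from the table values rather than stated by the original analysis, and the function name corrected_percentages is illustrative.

```python
def corrected_percentages(a, g, c, t):
    """Renormalize base percentages over non-gap sequences.

    Returns None when the column contains only gaps (as at positions 740 and 741).
    """
    non_empty = a + g + c + t
    if non_empty == 0:
        return None
    return tuple(round(100.0 * x / non_empty, 2) for x in (a, g, c, t))

# Position 731 from Supplementary Table 1: A=0, G=55.88, C=0, T=0.59
print(corrected_percentages(0, 55.88, 0, 0.59))  # (0.0, 98.96, 0.0, 1.04)
```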

Select gRNA sequences with the best theoretical quality

{| class="wikitable"
|+ HIV-1 Quasi-Conservative gRNAs (Useful)
! Sequence !! Rating (Zhang) !! Rank (Church) !! Free Energy (Approx.) !!
|-
| GTGTGGAAAATCTCTAGCAGTGG || 71 || - || -1.4 || HIV1_REF_2010
|-
| TCTAGCAGTGGCGCCCGAACAGG || 97 || - || -1.3 ||
|}
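
The selection step can be illustrated with a short ranking sketch. It assumes that a higher Zhang rating indicates a better candidate and ranks on that column alone. The helper name rank_by_rating is hypothetical, and the free energy values are carried along without being used, since this page does not say how they were weighed.

```python
# (sequence, Zhang rating, approximate free energy), copied from the table above
candidates = [
    ("GTGTGGAAAATCTCTAGCAGTGG", 71, -1.4),
    ("TCTAGCAGTGGCGCCCGAACAGG", 97, -1.3),
]

def rank_by_rating(rows):
    """Sort candidates by rating, highest first (assumes higher is better)."""
    return sorted(rows, key=lambda row: row[1], reverse=True)

best = rank_by_rating(candidates)[0]
print(best[0])  # TCTAGCAGTGGCGCCCGAACAGG
```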

Source

Conserved Region of the HIV-1 Genome from the NIH HIV-1 Sequence Database
