Difference between revisions of "Part:BBa K1431402:Design"

 
__NOTOC__
<partinfo>BBa_K1431401 short</partinfo>

<partinfo>BBa_K1431401 SequenceAndFeatures</partinfo>


=== Introduction ===

'''CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats)/Cas system''' is a hot topic in biology research these days, and dozens of papers on this interesting field have recently been published in top journals. In case you are not familiar with it, here is the jargon-heavy summary from Wikipedia:
<blockquote>CRISPRs are DNA loci containing short repetitions of base sequences. Each repetition is followed by short segments of "spacer DNA" from previous exposures to a virus.<br>
The CRISPR/Cas system is a prokaryotic immune system that confers resistance to foreign genetic elements such as plasmids and phages and provides a form of acquired immunity. CRISPR spacers recognize and silence these exogenous genetic elements like RNAi in eukaryotic organisms.<br>
Since 2012, the CRISPR/Cas system has been used for gene editing (silencing, enhancing or changing specific genes) that even works in eukaryotes like mice and primates. By inserting a plasmid containing cas genes and specifically designed CRISPRs, an organism's genome can be cut at any desired location.<br>
'''-Wikipedia'''
</blockquote>

In short, the CRISPR/Cas system is a tool for editing genes in '''live''' cells. Similar tools include TALENs (transcription activator-like effector nucleases) and ZFNs (zinc finger nucleases). CRISPR/Cas has an advantage over those methods: it is directed to its target by a short RNA guide (covering a target site of about 23 bp), which is much easier to synthesize than a custom DNA-binding protein.<br>
Moreover, TALENs take significantly longer to construct<sup>[http://indepth.systembio.com/cas9-crispr-faq/what-is-the-difference-between-cas9-crispr-and-talen SystemBio]</sup>.

=== CRISPR gRNA Basics ===

As mentioned above, the CRISPR/Cas9 system needs a gRNA (guide RNA) sequence to identify the target<sup>[http://www.nature.com/nprot/journal/v8/n11/full/nprot.2013.143.html ZhangFCas]</sup>. Each target site is 23 bp long: a 20 bp protospacer, which the gRNA matches, followed at its 3' end by a 3 bp PAM (Protospacer Adjacent Motif), which is NGG for the commonly used ''Streptococcus pyogenes'' Cas9. The PAM must be present in the target DNA but is not part of the gRNA itself. To target a gene effectively and specifically, the 20 bp guide portion has to match the target sequence strictly. According to [http://crispr.mit.edu/about ZhangTool], the approximate quality of a gRNA can be expressed as
https://static.igem.org/mediawiki/parts/2/23/Equation-crispr.png .

Keep in mind that this equation is based only on an approximation of experimental data, and may differ from the actual situation.
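
The layout of a target site can be made concrete with a short sketch. It assumes the ''S. pyogenes'' Cas9 convention (a 20 nt protospacer followed by an NGG PAM); the function names <code>split_target_site</code> and <code>guide_rna</code> are hypothetical and only serve this illustration. The example site is the first sequence listed in the gRNA table on this page.

```python
import re
from typing import Tuple

# Assumption: SpCas9-style PAM, i.e. any base followed by two guanines (NGG).
PAM_PATTERN = re.compile(r"^[ACGT]GG$")


def split_target_site(site: str) -> Tuple[str, str]:
    """Split a 23-nt DNA target site into (20-nt protospacer, 3-nt PAM)."""
    site = site.upper()
    if len(site) != 23:
        raise ValueError(f"expected 23 nt, got {len(site)}")
    protospacer, pam = site[:20], site[20:]
    if not PAM_PATTERN.match(pam):
        raise ValueError(f"{pam!r} is not an NGG PAM")
    return protospacer, pam


def guide_rna(protospacer: str) -> str:
    """Write the protospacer as RNA (T -> U); the PAM is not included."""
    return protospacer.upper().replace("T", "U")


protospacer, pam = split_target_site("GTGTGGAAAATCTCTAGCAGTGG")
print(protospacer, pam, guide_rna(protospacer))
```

Only the 20 nt protospacer is carried into the guide; the PAM is checked on the DNA side and then dropped.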

=== Designing the Sequence ===

Our method is derived from the one described in the paper by Feng Zhang<sup>[http://www.nature.com/nbt/journal/v31/n9/abs/nbt.2647.html ZhangFgRNA]</sup>.

The whole process can be divided into the following steps:
==== Conserved Sequence Analysis ====
  
We first extracted all conserved regions from the NIH HIV-1 Reference Genome. This step yielded around 10 candidate regions for the next step. All screening is done on a per-strain basis because the HIV-1 virus mutates so readily.
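
One way to quantify conservation at a single alignment position is sketched below. This is an illustration under stated assumptions, not necessarily the procedure used for this part: the alignment is assumed to be a list of equal-length strings with <code>-</code> marking a gap, symbols other than A, G, C, T and the gap are ignored, and the name <code>column_composition</code> is hypothetical. The "corrected" value divides each base percentage by the non-empty percentage, which appears consistent with Supplementary Table 1 (for example, at position 731, G: 55.88 / 56.47 ≈ 98.96%).

```python
from collections import Counter

BASES = "AGCT"


def column_composition(alignment, col, gap="-"):
    """Return (raw, empty, corrected) percentages for one alignment column.

    raw:       share of each base among all sequences (gaps stay in the denominator)
    empty:     share of sequences that have a gap at this column
    corrected: share of each base among non-gap sequences, or None if all are gaps
    """
    counts = Counter(seq[col].upper() for seq in alignment)
    total = len(alignment)
    raw = {b: 100.0 * counts[b] / total for b in BASES}
    empty = 100.0 * counts[gap] / total
    non_empty = sum(raw.values())
    if non_empty == 0:
        corrected = None
    else:
        corrected = {b: 100.0 * raw[b] / non_empty for b in BASES}
    return raw, empty, corrected


# Toy alignment (made up for illustration): 5 sequences, 3 columns.
toy = ["ATG", "A-G", "ATG", "-TC", "A-G"]
for col in range(3):
    print(col, column_composition(toy, col))
```

A column counts as conserved when one corrected value is close to 100%; where to set that threshold is a design choice that this page does not specify.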
  
==== Strip out sequences without PAM ====

{| class="wikitable"
! colspan="11" | Supplementary Table 1 - Base Percentage of HIV-1 Aligned Genome 730bp-752bp
|-
! Position !! A % !! G % !! C % !! T % !! Empty % !! Non-empty % !! A (corrected) !! G (corrected) !! C (corrected) !! T (corrected)
|-
| 730 || 0 || 0 || 0 || 56.47 || 43.53 || 56.47 || 0.00% || 0.00% || 0.00% || 100.00%
|-
| 731 || 0 || 55.88 || 0 || 0.59 || 43.53 || 56.47 || 0.00% || 98.96% || 0.00% || 1.04%
|-
| 732 || 0 || 0 || 0 || 56.47 || 43.53 || 56.47 || 0.00% || 0.00% || 0.00% || 100.00%
|-
| 733 || 0 || 54.71 || 0 || 1.18 || 43.53 || 55.89 || 0.00% || 97.89% || 0.00% || 2.11%
|-
| 734 || 0 || 0 || 0 || 58.24 || 41.76 || 58.24 || 0.00% || 0.00% || 0.00% || 100.00%
|-
| 735 || 56.47 || 0.59 || 0.59 || 0.59 || 41.76 || 58.24 || 96.96% || 1.01% || 1.01% || 1.01%
|-
| 736 || 0 || 1.18 || 57.06 || 0 || 41.76 || 58.24 || 0.00% || 2.03% || 97.97% || 0.00%
|-
| 737 || 1.18 || 57.06 || 0 || 0.59 || 41.18 || 58.83 || 2.01% || 96.99% || 0.00% || 1.00%
|-
| 738 || 60 || 0 || 0 || 0 || 40 || 60 || 100.00% || 0.00% || 0.00% || 0.00%
|-
| 739 || 0.59 || 0 || 58.82 || 0 || 40 || 59.41 || 0.99% || 0.00% || 99.01% || 0.00%
|-
| 740 || 0 || 0 || 0 || 0 || 100 || 0 || || || ||
|-
| 741 || 0 || 0 || 0 || 0 || 100 || 0 || || || ||
|-
| 742 || 0.59 || 0 || 1.18 || 58.24 || 40 || 60.01 || 0.98% || 0.00% || 1.97% || 97.05%
|-
| 743 || 0 || 0 || 60 || 0 || 40 || 60 || 0.00% || 0.00% || 100.00% || 0.00%
|-
| 744 || 0 || 1.18 || 58.82 || 0 || 40 || 60 || 0.00% || 1.97% || 98.03% || 0.00%
|-
| 745 || 0 || 58.82 || 1.18 || 0 || 40 || 60 || 0.00% || 98.03% || 1.97% || 0.00%
|-
| 746 || 0.59 || 0 || 59.41 || 0 || 40 || 60 || 0.98% || 0.00% || 99.02% || 0.00%
|-
| 747 || 0.59 || 59.41 || 0 || 0 || 40 || 60 || 0.98% || 99.02% || 0.00% || 0.00%
|-
| 748 || 0.59 || 59.41 || 0 || 0 || 40 || 60 || 0.98% || 99.02% || 0.00% || 0.00%
|-
| 749 || 0 || 58.82 || 0.59 || 0.59 || 40 || 60 || 0.00% || 98.03% || 0.98% || 0.98%
|-
| 750 || 0.59 || 0.59 || 58.24 || 0.59 || 40 || 60.01 || 0.98% || 0.98% || 97.05% || 0.98%
|-
| 751 || 60 || 0 || 0 || 0 || 40 || 60 || 100.00% || 0.00% || 0.00% || 0.00%
|-
| 752 || 59.41 || 0.59 || 0 || 0 || 40 || 60 || 99.02% || 0.98% || 0.00% || 0.00%
|}
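
The PAM filter itself can be sketched as a sliding-window scan. This is a minimal illustration assuming an NGG PAM and a forward-strand search only (a full search would also scan the reverse complement); the function name <code>find_pam_sites</code> is hypothetical. The test region is an assumption as well: it was built by joining the two 23 nt sequences listed in the gRNA table on this page at their shared 11 nt overlap (TCTAGCAGTGG).

```python
def find_pam_sites(seq, site_len=23):
    """Return (start, site) for each forward-strand window of site_len nt ending in GG."""
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - site_len + 1):
        site = seq[start:start + site_len]
        # NGG PAM: the last two bases of the window must both be G.
        if site.endswith("GG"):
            hits.append((start, site))
    return hits


# Assumed region: the two listed candidates merged at their overlap.
region = "GTGTGGAAAATCTCTAGCAGTGG" + "CGCCCGAACAGG"
for start, site in find_pam_sites(region):
    print(start, site)
```

On this region the scan keeps exactly two windows, at offsets 0 and 12, and discards every window that does not end in GG.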
  
==== Select gRNA sequences with the best theoretical quality ====

{| class="wikitable"
! colspan="5" | HIV-1 Quasi-Conservative gRNAs (Useful)
|-
! Sequence !! Rating (Zhang) !! Rank (Church) !! Free Energy (Approx.) !!
|-
| GTGTGGAAAATCTCTAGCAGTGG || 71 || - || -1.4
| rowspan="2" | HIV1_REF_2010
|-
| TCTAGCAGTGGCGCCCGAACAGG || 97 || - || -1.3
|}
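
As a small illustration of the final selection, the candidates above can be ordered by their rating. Sorting on the Zhang rating alone is an assumption made for this sketch; this page does not state how the rating, the Church rank and the free energy were weighed against each other. The names <code>Candidate</code> and <code>rank_by_rating</code> are hypothetical.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    sequence: str        # 23-nt target site (protospacer + PAM)
    zhang_rating: int    # "Rating (Zhang)" column
    free_energy: float   # "Free Energy (Approx.)" column, units as given in the table


# Values copied from the table above.
candidates = [
    Candidate("GTGTGGAAAATCTCTAGCAGTGG", 71, -1.4),
    Candidate("TCTAGCAGTGGCGCCCGAACAGG", 97, -1.3),
]


def rank_by_rating(cands: List[Candidate]) -> List[Candidate]:
    """Order candidates from highest to lowest Zhang rating."""
    return sorted(cands, key=lambda c: c.zhang_rating, reverse=True)


best = rank_by_rating(candidates)[0]
print(best.sequence, best.zhang_rating)
```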
  
 
===Source===
 
Conserved Region of the HIV-1 Genome from the NIH HIV-1 Sequence Database
  
 
===References===
 

Revision as of 15:59, 16 October 2014

One gRNA Sequence for HIV-1


Assembly Compatibility:
  • 10
    COMPATIBLE WITH RFC[10]
  • 12
    COMPATIBLE WITH RFC[12]
  • 21
    COMPATIBLE WITH RFC[21]
  • 23
    COMPATIBLE WITH RFC[23]
  • 25
    COMPATIBLE WITH RFC[25]
  • 1000
    COMPATIBLE WITH RFC[1000]

