Difference between revisions of "Part:BBa K1431402:Design"

Line 31: Line 31:
 
==== Conserved Sequence Analysis ====
 
==== Conserved Sequence Analysis ====
  
We first extracted all conserved regions from the NIH HIV-1 Reference Genome. In this step, we found around 10 alternatives for the next process. Here all screening processes are done in a per-strain basis because of the high mutability of the HIV-1 virus.
+
We first extracted all conserved regions from the NIH HBV Reference Genome. In this step, we found around 10 alternatives for the next process. Here all screening processes are done in a per-strain basis because of the high mutability of the HBV virus.
  
 
==== Strip out sequences without PAM ====
 
==== Strip out sequences without PAM ====
  
 
{| class="wikitable"
 
{| class="wikitable"
! colspan="11" | Supplementary Table 1 - Base Percentage of HIV-1 Aligned Genome 730bp-752bp
+
! colspan="11" | Supplementary Table 1 - Base Percentage of HBV Aligned Genome 730bp-752bp
 
|-
 
|-
 
|  
 
|  
Line 330: Line 330:
  
 
{| class="wikitable"
 
{| class="wikitable"
! colspan="5" | HIV-1 Quasi-Conservative gRNAs(Useful)
+
! colspan="5" | HBV Quasi-Conservative gRNAs(Useful)
 
|-
 
|-
 
| Sequence
 
| Sequence
Line 352: Line 352:
 
===Source===
 
===Source===
  
Conserved Region of the HIV-1 Genome from the NIH HIV-1 Sequence Database
+
Conserved Region of the HBV Genome from the NIH HBV Sequence Database
  
 
===References===
 
===References===

Revision as of 16:12, 16 October 2014

One gRNA Sequence for HIV-1


Assembly Compatibility:
  • 10
    COMPATIBLE WITH RFC[10]
  • 12
    COMPATIBLE WITH RFC[12]
  • 21
    COMPATIBLE WITH RFC[21]
  • 23
    COMPATIBLE WITH RFC[23]
  • 25
    COMPATIBLE WITH RFC[25]
  • 1000
    COMPATIBLE WITH RFC[1000]


Introduction

The CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats)/Cas system is a hot topic in biology research these days. Dozens of recent papers in top journals address this interesting field. In case you are not familiar with it, here is a jargon-heavy description quoted from Wikipedia:

CRISPRs are DNA loci containing short repetitions of base sequences. Each repetition is followed by short segments of "spacer DNA" from previous exposures to a virus.

The CRISPR/Cas system is a prokaryotic immune system that confers resistance to foreign genetic elements such as plasmids and phages and provides a form of acquired immunity. CRISPR spacers recognize and silence these exogenous genetic elements like RNAi in eukaryotic organisms.
Since 2012, the CRISPR/Cas system has been used for gene editing (silencing, enhancing or changing specific genes) that even works in eukaryotes like mice and primates. By inserting a plasmid containing cas genes and specifically designed CRISPRs, an organism's genome can be cut at any desired location.
-Wikipedia

In short, the CRISPR/Cas system is a tool for editing genes in live cells. Similar tools include TALENs (transcription activator-like effector nucleases) and ZFNs (zinc finger nucleases). CRISPR/Cas has an advantage over those methods: it is guided by a short RNA chain (~23 bp target site), which is easier to synthesize than an engineered protein.
Moreover, TALENs require a significantly longer time to construct[http://indepth.systembio.com/cas9-crispr-faq/what-is-the-difference-between-cas9-crispr-and-talen SystemBio].

CRISPR gRNA Basics

As mentioned above, CRISPR/Cas9 systems need a gRNA (guide RNA) sequence to identify the target[http://www.nature.com/nprot/journal/v8/n11/full/nprot.2013.143.html ZhangFCas]. The target site is 23 bp long: a 20 bp sequence matched by the gRNA, followed at its 3' end by a 3 bp PAM (Protospacer Adjacent Motif), which has the form NGG in both sequences listed on this page. To target a gene effectively and specifically, the 20 bp guide portion has to match the target sequence strictly. According to [http://crispr.mit.edu/about ZhangTool], the approximate quality of a gRNA can be expressed by a scoring equation [Figure: Equation-crispr.png].

Note that this equation is based only on an approximation of experimental data and may differ from actual behavior.
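The 20 bp + PAM layout can be illustrated with a short sketch. This is not the team's actual pipeline; the function names are hypothetical, it assumes an NGG PAM, and it scans the forward strand only.

```python
import re


def find_target_sites(dna: str) -> list[tuple[int, str]]:
    """Return (start, site) for every 23-nt window of the form 20 nt + NGG.

    Assumption: NGG PAM, forward strand only (no reverse complement).
    """
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{20}[ACGT]GG))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(dna)]


def has_ngg_pam(site: str) -> bool:
    """True if a 23-nt candidate ends with an NGG PAM."""
    return len(site) == 23 and site.upper().endswith("GG")
```

Both sequences in the gRNA table on this page (ending in TGG and AGG) pass `has_ngg_pam`.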

Designing the Sequence

Our method is derived from the one described in the paper by Feng Zhang[http://www.nature.com/nbt/journal/v31/n9/abs/nbt.2647.html ZhangFgRNA].

The whole process can be divided into the following steps:

Conserved Sequence Analysis

We first extracted all conserved regions from the NIH HBV Reference Genome. This step yielded around 10 candidates for the next step. All screening is done on a per-strain basis because of the high mutability of HBV.
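One way to picture this step is the sketch below. It is an assumption, not the method actually used: it scores each column of a gapped alignment by how often the most common base occurs, then reports runs of highly conserved columns. The function names, the 0.95 threshold, and the 23-column minimum are all hypothetical.

```python
from collections import Counter


def column_conservation(alignment: list[str]) -> list[float]:
    """Per column: fraction of non-gap sequences sharing the most common base."""
    scores = []
    for column in zip(*alignment):
        bases = [b for b in column if b != "-"]
        if not bases:
            scores.append(0.0)  # all-gap column counts as not conserved
            continue
        top_count = Counter(bases).most_common(1)[0][1]
        scores.append(top_count / len(bases))
    return scores


def conserved_windows(scores: list[float], min_len: int = 23,
                      threshold: float = 0.95) -> list[tuple[int, int]]:
    """Return half-open (start, end) runs where every column >= threshold."""
    runs, start = [], None
    # The trailing 0.0 is a sentinel that closes a run ending at the last column.
    for i, score in enumerate(list(scores) + [0.0]):
        if score >= threshold and start is None:
            start = i
        elif score < threshold and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    return runs
```

Note that this sketch breaks a run at all-gap columns; a real pipeline might instead skip such columns before scanning.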

Strip out sequences without PAM

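{| class="wikitable"
! colspan="11" | Supplementary Table 1 - Base Percentage of HBV Aligned Genome 730bp-752bp
|-
! Position !! A % !! G % !! C % !! T % !! Empty % !! Non Empty % !! A(Corrected) !! G(Corrected) !! C(Corrected) !! T(Corrected)
|-
| 730 || 0 || 0 || 0 || 56.47 || 43.53 || 56.47 || 0.00% || 0.00% || 0.00% || 100.00%
|-
| 731 || 0 || 55.88 || 0 || 0.59 || 43.53 || 56.47 || 0.00% || 98.96% || 0.00% || 1.04%
|-
| 732 || 0 || 0 || 0 || 56.47 || 43.53 || 56.47 || 0.00% || 0.00% || 0.00% || 100.00%
|-
| 733 || 0 || 54.71 || 0 || 1.18 || 43.53 || 55.89 || 0.00% || 97.89% || 0.00% || 2.11%
|-
| 734 || 0 || 0 || 0 || 58.24 || 41.76 || 58.24 || 0.00% || 0.00% || 0.00% || 100.00%
|-
| 735 || 56.47 || 0.59 || 0.59 || 0.59 || 41.76 || 58.24 || 96.96% || 1.01% || 1.01% || 1.01%
|-
| 736 || 0 || 1.18 || 57.06 || 0 || 41.76 || 58.24 || 0.00% || 2.03% || 97.97% || 0.00%
|-
| 737 || 1.18 || 57.06 || 0 || 0.59 || 41.18 || 58.83 || 2.01% || 96.99% || 0.00% || 1.00%
|-
| 738 || 60 || 0 || 0 || 0 || 40 || 60 || 100.00% || 0.00% || 0.00% || 0.00%
|-
| 739 || 0.59 || 0 || 58.82 || 0 || 40 || 59.41 || 0.99% || 0.00% || 99.01% || 0.00%
|-
| 740 || 0 || 0 || 0 || 0 || 100 || 0 || || || ||
|-
| 741 || 0 || 0 || 0 || 0 || 100 || 0 || || || ||
|-
| 742 || 0.59 || 0 || 1.18 || 58.24 || 40 || 60.01 || 0.98% || 0.00% || 1.97% || 97.05%
|-
| 743 || 0 || 0 || 60 || 0 || 40 || 60 || 0.00% || 0.00% || 100.00% || 0.00%
|-
| 744 || 0 || 1.18 || 58.82 || 0 || 40 || 60 || 0.00% || 1.97% || 98.03% || 0.00%
|-
| 745 || 0 || 58.82 || 1.18 || 0 || 40 || 60 || 0.00% || 98.03% || 1.97% || 0.00%
|-
| 746 || 0.59 || 0 || 59.41 || 0 || 40 || 60 || 0.98% || 0.00% || 99.02% || 0.00%
|-
| 747 || 0.59 || 59.41 || 0 || 0 || 40 || 60 || 0.98% || 99.02% || 0.00% || 0.00%
|-
| 748 || 0.59 || 59.41 || 0 || 0 || 40 || 60 || 0.98% || 99.02% || 0.00% || 0.00%
|-
| 749 || 0 || 58.82 || 0.59 || 0.59 || 40 || 60 || 0.00% || 98.03% || 0.98% || 0.98%
|-
| 750 || 0.59 || 0.59 || 58.24 || 0.59 || 40 || 60.01 || 0.98% || 0.98% || 97.05% || 0.98%
|-
| 751 || 60 || 0 || 0 || 0 || 40 || 60 || 100.00% || 0.00% || 0.00% || 0.00%
|-
| 752 || 59.41 || 0.59 || 0 || 0 || 40 || 60 || 99.02% || 0.98% || 0.00% || 0.00%
|}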
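The "Corrected" columns in Supplementary Table 1 appear to be each base's share of the non-gap sequences at that position. The sketch below reproduces that arithmetic under this assumption; the function name is hypothetical.

```python
def corrected_percentages(a: float, g: float, c: float, t: float):
    """Renormalise raw base percentages so that gaps ("Empty") are excluded.

    Returns a dict of percentages rounded to 2 decimals, or None when the
    position is a gap in every sequence (as for positions 740 and 741).
    """
    non_empty = a + g + c + t
    if non_empty == 0:
        return None
    raw = {"A": a, "G": g, "C": c, "T": t}
    return {base: round(100 * value / non_empty, 2) for base, value in raw.items()}
```

For position 731 (G 55.88 %, T 0.59 %), this gives G ≈ 98.96 % and T ≈ 1.04 %, matching the table row.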

Select gRNA sequences with the best theoretical quality

{| class="wikitable"
! colspan="5" | HBV Quasi-Conservative gRNAs(Useful)
|-
! Sequence !! Rating(Zhang) !! Rank(Church) !! Free Energy(Approx.) !!
|-
| GTGTGGAAAATCTCTAGCAGTGG || 71 || - || -1.4 || HIV1_REF_2010
|-
| TCTAGCAGTGGCGCCCGAACAGG || 97 || - || -1.3 ||
|}

Source

Conserved Region of the HBV Genome from the NIH HBV Sequence Database

References