Difference between revisions of "Part:BBa K1431402:Design"
Line 1: | Line 1: | ||
− | |||
__NOTOC__ | __NOTOC__ | ||
− | <partinfo> | + | <partinfo>BBa_K1431401 short</partinfo> |
+ | |||
+ | <partinfo>BBa_K1431401 SequenceAndFeatures</partinfo> | ||
+ | |||
+ | |||
+ | === Introduction === | ||
+ | |||
+ | '''CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats)/Cas system''' is a hot topic in biology research these days. Dozens of papers addressing this interesting field have recently been published in top journals. In case you are not familiar with it, here is a jargon-heavy summary quoted from Wikipedia: | ||
+ | <blockquote>CRISPRs are DNA loci containing short repetitions of base sequences. Each repetition is followed by short segments of "spacer DNA" from previous exposures to a virus.<br> | ||
+ | The CRISPR/Cas system is a prokaryotic immune system that confers resistance to foreign genetic elements such as plasmids and phages and provides a form of acquired immunity. CRISPR spacers recognize and silence these exogenous genetic elements like RNAi in eukaryotic organisms.<br> | ||
+ | Since 2012, the CRISPR/Cas system has been used for gene editing (silencing, enhancing or changing specific genes) that even works in eukaryotes like mice and primates. By inserting a plasmid containing cas genes and specifically designed CRISPRs, an organism's genome can be cut at any desired location.<br> | ||
+ | '''-Wikipedia''' | ||
+ | </blockquote> | ||
+ | |||
+ | In short, the CRISPR/Cas system is a tool for editing genes in '''live''' cells. Similar tools include TALENs (transcription activator-like effector nucleases) and ZFNs (zinc finger nucleases). CRISPR/Cas has an advantage over those methods in that it is guided by a short RNA matched to a target site of only about 23 nt, which is easier to synthesize than the custom DNA-binding proteins that TALENs and ZFNs require.<br> | ||
+ | Moreover, TALENs require a significantly longer time to construct<sup>[http://indepth.systembio.com/cas9-crispr-faq/what-is-the-difference-between-cas9-crispr-and-talen SystemBio]</sup>. | ||
+ | |||
+ | === CRISPR gRNA Basics === | ||
+ | |||
+ | As mentioned above, CRISPR/Cas9 systems need a gRNA (guide RNA) sequence to identify the target<sup>[http://www.nature.com/nprot/journal/v8/n11/full/nprot.2013.143.html ZhangFCas]</sup>. A target site is 23 nt long: 20 nt that the gRNA must match, followed at the 3′ end by a 3 nt PAM (protospacer adjacent motif), which is NGG in the candidates listed below. To target a gene effectively and specifically, the 20 nt of the gRNA have to match the target sequence strictly. According to [http://crispr.mit.edu/about ZhangTool], the approximate quality of a gRNA can be expressed as | ||
+ | https://static.igem.org/mediawiki/parts/2/23/Equation-crispr.png . | ||
+ | |||
+ | Note that this equation is based only on an approximation of experimental data and may differ from actual behavior. | ||
+ | |||
+ | === Designing the Sequence === | ||
+ | |||
+ | Our method is derived from the one described in the paper by Feng Zhang<sup>[http://www.nature.com/nbt/journal/v31/n9/abs/nbt.2647.html ZhangFgRNA]</sup>. | ||
+ | |||
+ | The whole process can be divided into the following steps: | ||
+ | ==== Conserved Sequence Analysis ==== | ||
− | + | We first extracted all conserved regions from the NIH HIV-1 Reference Genome. This step yielded around 10 candidates for the next step. All screening is done on a per-strain basis because of the high mutability of HIV-1. | |
+ | ==== Strip out sequences without PAM ==== | ||
− | == | + | {| class="wikitable" |
− | + | ! colspan="11" | Supplementary Table 1 - Base Percentage of HIV-1 Aligned Genome 730bp-752bp | |
+ | |- | ||
+ | | | ||
+ | | A % | ||
+ | | G % | ||
+ | | C % | ||
+ | | T % | ||
+ | | Empty % | ||
+ | | Non Empty % | ||
+ | | A(Corrected) | ||
+ | | G(Corrected) | ||
+ | | C(Corrected) | ||
+ | | T(Corrected) | ||
+ | |- | ||
+ | | 730 | ||
+ | | 0 | ||
+ | | 0 | ||
+ | | 0 | ||
+ | | 56.47 | ||
+ | | 43.53 | ||
+ | | 56.47 | ||
+ | | 0.00% | ||
+ | | 0.00% | ||
+ | | 0.00% | ||
+ | | 100.00% | ||
+ | |- | ||
+ | | 731 | ||
+ | | 0 | ||
+ | | 55.88 | ||
+ | | 0 | ||
+ | | 0.59 | ||
+ | | 43.53 | ||
+ | | 56.47 | ||
+ | | 0.00% | ||
+ | | 98.96% | ||
+ | | 0.00% | ||
+ | | 1.04% | ||
+ | |- | ||
+ | | 732 | ||
+ | | 0 | ||
+ | | 0 | ||
+ | | 0 | ||
+ | | 56.47 | ||
+ | | 43.53 | ||
+ | | 56.47 | ||
+ | | 0.00% | ||
+ | | 0.00% | ||
+ | | 0.00% | ||
+ | | 100.00% | ||
+ | |- | ||
+ | | 733 | ||
+ | | 0 | ||
+ | | 54.71 | ||
+ | | 0 | ||
+ | | 1.18 | ||
+ | | 43.53 | ||
+ | | 55.89 | ||
+ | | 0.00% | ||
+ | | 97.89% | ||
+ | | 0.00% | ||
+ | | 2.11% | ||
+ | |- | ||
+ | | 734 | ||
+ | | 0 | ||
+ | | 0 | ||
+ | | 0 | ||
+ | | 58.24 | ||
+ | | 41.76 | ||
+ | | 58.24 | ||
+ | | 0.00% | ||
+ | | 0.00% | ||
+ | | 0.00% | ||
+ | | 100.00% | ||
+ | |- | ||
+ | | 735 | ||
+ | | 56.47 | ||
+ | | 0.59 | ||
+ | | 0.59 | ||
+ | | 0.59 | ||
+ | | 41.76 | ||
+ | | 58.24 | ||
+ | | 96.96% | ||
+ | | 1.01% | ||
+ | | 1.01% | ||
+ | | 1.01% | ||
+ | |- | ||
+ | | 736 | ||
+ | | 0 | ||
+ | | 1.18 | ||
+ | | 57.06 | ||
+ | | 0 | ||
+ | | 41.76 | ||
+ | | 58.24 | ||
+ | | 0.00% | ||
+ | | 2.03% | ||
+ | | 97.97% | ||
+ | | 0.00% | ||
+ | |- | ||
+ | | 737 | ||
+ | | 1.18 | ||
+ | | 57.06 | ||
+ | | 0 | ||
+ | | 0.59 | ||
+ | | 41.18 | ||
+ | | 58.83 | ||
+ | | 2.01% | ||
+ | | 96.99% | ||
+ | | 0.00% | ||
+ | | 1.00% | ||
+ | |- | ||
+ | | 738 | ||
+ | | 60 | ||
+ | | 0 | ||
+ | | 0 | ||
+ | | 0 | ||
+ | | 40 | ||
+ | | 60 | ||
+ | | 100.00% | ||
+ | | 0.00% | ||
+ | | 0.00% | ||
+ | | 0.00% | ||
+ | |- | ||
+ | | 739 | ||
+ | | 0.59 | ||
+ | | 0 | ||
+ | | 58.82 | ||
+ | | 0 | ||
+ | | 40 | ||
+ | | 59.41 | ||
+ | | 0.99% | ||
+ | | 0.00% | ||
+ | | 99.01% | ||
+ | | 0.00% | ||
+ | |- | ||
+ | | 740 | ||
+ | | 0 | ||
+ | | 0 | ||
+ | | 0 | ||
+ | | 0 | ||
+ | | 100 | ||
+ | | 0 | ||
+ | | | ||
+ | | | ||
+ | | | ||
+ | | | ||
+ | |- | ||
+ | | 741 | ||
+ | | 0 | ||
+ | | 0 | ||
+ | | 0 | ||
+ | | 0 | ||
+ | | 100 | ||
+ | | 0 | ||
+ | | | ||
+ | | | ||
+ | | | ||
+ | | | ||
+ | |- | ||
+ | | 742 | ||
+ | | 0.59 | ||
+ | | 0 | ||
+ | | 1.18 | ||
+ | | 58.24 | ||
+ | | 40 | ||
+ | | 60.01 | ||
+ | | 0.98% | ||
+ | | 0.00% | ||
+ | | 1.97% | ||
+ | | 97.05% | ||
+ | |- | ||
+ | | 743 | ||
+ | | 0 | ||
+ | | 0 | ||
+ | | 60 | ||
+ | | 0 | ||
+ | | 40 | ||
+ | | 60 | ||
+ | | 0.00% | ||
+ | | 0.00% | ||
+ | | 100.00% | ||
+ | | 0.00% | ||
+ | |- | ||
+ | | 744 | ||
+ | | 0 | ||
+ | | 1.18 | ||
+ | | 58.82 | ||
+ | | 0 | ||
+ | | 40 | ||
+ | | 60 | ||
+ | | 0.00% | ||
+ | | 1.97% | ||
+ | | 98.03% | ||
+ | | 0.00% | ||
+ | |- | ||
+ | | 745 | ||
+ | | 0 | ||
+ | | 58.82 | ||
+ | | 1.18 | ||
+ | | 0 | ||
+ | | 40 | ||
+ | | 60 | ||
+ | | 0.00% | ||
+ | | 98.03% | ||
+ | | 1.97% | ||
+ | | 0.00% | ||
+ | |- | ||
+ | | 746 | ||
+ | | 0.59 | ||
+ | | 0 | ||
+ | | 59.41 | ||
+ | | 0 | ||
+ | | 40 | ||
+ | | 60 | ||
+ | | 0.98% | ||
+ | | 0.00% | ||
+ | | 99.02% | ||
+ | | 0.00% | ||
+ | |- | ||
+ | | 747 | ||
+ | | 0.59 | ||
+ | | 59.41 | ||
+ | | 0 | ||
+ | | 0 | ||
+ | | 40 | ||
+ | | 60 | ||
+ | | 0.98% | ||
+ | | 99.02% | ||
+ | | 0.00% | ||
+ | | 0.00% | ||
+ | |- | ||
+ | | 748 | ||
+ | | 0.59 | ||
+ | | 59.41 | ||
+ | | 0 | ||
+ | | 0 | ||
+ | | 40 | ||
+ | | 60 | ||
+ | | 0.98% | ||
+ | | 99.02% | ||
+ | | 0.00% | ||
+ | | 0.00% | ||
+ | |- | ||
+ | | 749 | ||
+ | | 0 | ||
+ | | 58.82 | ||
+ | | 0.59 | ||
+ | | 0.59 | ||
+ | | 40 | ||
+ | | 60 | ||
+ | | 0.00% | ||
+ | | 98.03% | ||
+ | | 0.98% | ||
+ | | 0.98% | ||
+ | |- | ||
+ | | 750 | ||
+ | | 0.59 | ||
+ | | 0.59 | ||
+ | | 58.24 | ||
+ | | 0.59 | ||
+ | | 40 | ||
+ | | 60.01 | ||
+ | | 0.98% | ||
+ | | 0.98% | ||
+ | | 97.05% | ||
+ | | 0.98% | ||
+ | |- | ||
+ | | 751 | ||
+ | | 60 | ||
+ | | 0 | ||
+ | | 0 | ||
+ | | 0 | ||
+ | | 40 | ||
+ | | 60 | ||
+ | | 100.00% | ||
+ | | 0.00% | ||
+ | | 0.00% | ||
+ | | 0.00% | ||
+ | |- | ||
+ | | 752 | ||
+ | | 59.41 | ||
+ | | 0.59 | ||
+ | | 0 | ||
+ | | 0 | ||
+ | | 40 | ||
+ | | 60 | ||
+ | | 99.02% | ||
+ | | 0.98% | ||
+ | | 0.00% | ||
+ | | 0.00% | ||
+ | |} | ||
+ | ==== Select gRNA sequences with the best theoretical quality ==== | ||
+ | {| class="wikitable" | ||
+ | ! colspan="5" | HIV-1 Quasi-Conservative gRNAs(Useful) | ||
+ | |- | ||
+ | | Sequence | ||
+ | | Rating(Zhang) | ||
+ | | Rank(Church) | ||
+ | | Free Energy(Approx.) | ||
+ | | | ||
+ | |- | ||
+ | | GTGTGGAAAATCTCTAGCAGTGG | ||
+ | | 71 | ||
+ | | - | ||
+ | | -1.4 | ||
+ | | rowspan="2" | HIV1_REF_2010 | ||
+ | |- | ||
+ | | TCTAGCAGTGGCGCCCGAACAGG | ||
+ | | 97 | ||
+ | | - | ||
+ | | -1.3 | ||
+ | |} | ||
===Source=== | ===Source=== | ||
− | + | Conserved Region of the HIV-1 Genome from the NIH HIV-1 Sequence Database | |
===References=== | ===References=== |
Revision as of 15:59, 16 October 2014
One gRNA Sequence for HIV-1
- Compatible with RFC[10]
- Compatible with RFC[12]
- Compatible with RFC[21]
- Compatible with RFC[23]
- Compatible with RFC[25]
- Compatible with RFC[1000]
Introduction
CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats)/Cas system is a hot topic in biology research these days. Dozens of papers addressing this interesting field have recently been published in top journals. In case you are not familiar with it, here is a jargon-heavy summary quoted from Wikipedia:
CRISPRs are DNA loci containing short repetitions of base sequences. Each repetition is followed by short segments of "spacer DNA" from previous exposures to a virus.
The CRISPR/Cas system is a prokaryotic immune system that confers resistance to foreign genetic elements such as plasmids and phages and provides a form of acquired immunity. CRISPR spacers recognize and silence these exogenous genetic elements like RNAi in eukaryotic organisms.
Since 2012, the CRISPR/Cas system has been used for gene editing (silencing, enhancing or changing specific genes) that even works in eukaryotes like mice and primates. By inserting a plasmid containing cas genes and specifically designed CRISPRs, an organism's genome can be cut at any desired location.
-Wikipedia
In short, the CRISPR/Cas system is a tool for editing genes in live cells. Similar tools include TALENs (transcription activator-like effector nucleases) and ZFNs (zinc finger nucleases). CRISPR/Cas has an advantage over those methods in that it is guided by a short RNA matched to a target site of only about 23 nt, which is easier to synthesize than the custom DNA-binding proteins that TALENs and ZFNs require.
Moreover, TALENs require a significantly longer time to construct[http://indepth.systembio.com/cas9-crispr-faq/what-is-the-difference-between-cas9-crispr-and-talen SystemBio].
CRISPR gRNA Basics
As mentioned above, CRISPR/Cas9 systems need a gRNA (guide RNA) sequence to identify the target[http://www.nature.com/nprot/journal/v8/n11/full/nprot.2013.143.html ZhangFCas]. A target site is 23 nt long: 20 nt that the gRNA must match, followed at the 3′ end by a 3 nt PAM (protospacer adjacent motif), which is NGG in the candidates listed below. To target a gene effectively and specifically, the 20 nt of the gRNA have to match the target sequence strictly. According to [http://crispr.mit.edu/about ZhangTool], the approximate quality of a gRNA can be expressed by the equation shown at https://static.igem.org/mediawiki/parts/2/23/Equation-crispr.png .
Note that this equation is based only on an approximation of experimental data and may differ from actual behavior.
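As a rough illustration of the site layout, the sketch below scans a DNA sequence for every 20 nt stretch that is immediately followed by an NGG PAM, on both strands. The NGG motif is an assumption read off the candidate sequences listed on this page (both end in GG), and the function names `reverse_complement` and `find_target_sites` are hypothetical helpers, not part of any tool cited here.

```python
import re

# Translation table used to complement a DNA string.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_target_sites(seq):
    """Return (strand, start, protospacer, pam) for every 20 nt + NGG site.

    For the "-" strand, `start` is an index into the reverse complement,
    not into the original sequence. NGG as the PAM is an assumption.
    """
    seq = seq.upper()
    sites = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        # A lookahead is used so that overlapping sites are all reported.
        for m in re.finditer(r"(?=([ACGT]{20})([ACGT]GG))", s):
            sites.append((strand, m.start(), m.group(1), m.group(2)))
    return sites


# Example: the first 23 nt candidate from the table on this page.
for site in find_target_sites("GTGTGGAAAATCTCTAGCAGTGG"):
    print(site)
```

Scanning the reverse complement as well matters because a usable site can lie on either strand of the target DNA.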
Designing the Sequence
Our method is derived from the one described in the paper by Feng Zhang[http://www.nature.com/nbt/journal/v31/n9/abs/nbt.2647.html ZhangFgRNA].
The whole process can be divided into the following steps:
Conserved Sequence Analysis
We first extracted all conserved regions from the NIH HIV-1 Reference Genome. This step yielded around 10 candidates for the next step. All screening is done on a per-strain basis because of the high mutability of HIV-1.
Strip out sequences without PAM
Supplementary Table 1 - Base Percentage of HIV-1 Aligned Genome, 730bp-752bp
Position | A % | G % | C % | T % | Empty % | Non Empty % | A(Corrected) | G(Corrected) | C(Corrected) | T(Corrected) |
---|---|---|---|---|---|---|---|---|---|---|
730 | 0 | 0 | 0 | 56.47 | 43.53 | 56.47 | 0.00% | 0.00% | 0.00% | 100.00% |
731 | 0 | 55.88 | 0 | 0.59 | 43.53 | 56.47 | 0.00% | 98.96% | 0.00% | 1.04% |
732 | 0 | 0 | 0 | 56.47 | 43.53 | 56.47 | 0.00% | 0.00% | 0.00% | 100.00% |
733 | 0 | 54.71 | 0 | 1.18 | 43.53 | 55.89 | 0.00% | 97.89% | 0.00% | 2.11% |
734 | 0 | 0 | 0 | 58.24 | 41.76 | 58.24 | 0.00% | 0.00% | 0.00% | 100.00% |
735 | 56.47 | 0.59 | 0.59 | 0.59 | 41.76 | 58.24 | 96.96% | 1.01% | 1.01% | 1.01% |
736 | 0 | 1.18 | 57.06 | 0 | 41.76 | 58.24 | 0.00% | 2.03% | 97.97% | 0.00% |
737 | 1.18 | 57.06 | 0 | 0.59 | 41.18 | 58.83 | 2.01% | 96.99% | 0.00% | 1.00% |
738 | 60 | 0 | 0 | 0 | 40 | 60 | 100.00% | 0.00% | 0.00% | 0.00% |
739 | 0.59 | 0 | 58.82 | 0 | 40 | 59.41 | 0.99% | 0.00% | 99.01% | 0.00% |
740 | 0 | 0 | 0 | 0 | 100 | 0 | | | | |
741 | 0 | 0 | 0 | 0 | 100 | 0 | | | | |
742 | 0.59 | 0 | 1.18 | 58.24 | 40 | 60.01 | 0.98% | 0.00% | 1.97% | 97.05% |
743 | 0 | 0 | 60 | 0 | 40 | 60 | 0.00% | 0.00% | 100.00% | 0.00% |
744 | 0 | 1.18 | 58.82 | 0 | 40 | 60 | 0.00% | 1.97% | 98.03% | 0.00% |
745 | 0 | 58.82 | 1.18 | 0 | 40 | 60 | 0.00% | 98.03% | 1.97% | 0.00% |
746 | 0.59 | 0 | 59.41 | 0 | 40 | 60 | 0.98% | 0.00% | 99.02% | 0.00% |
747 | 0.59 | 59.41 | 0 | 0 | 40 | 60 | 0.98% | 99.02% | 0.00% | 0.00% |
748 | 0.59 | 59.41 | 0 | 0 | 40 | 60 | 0.98% | 99.02% | 0.00% | 0.00% |
749 | 0 | 58.82 | 0.59 | 0.59 | 40 | 60 | 0.00% | 98.03% | 0.98% | 0.98% |
750 | 0.59 | 0.59 | 58.24 | 0.59 | 40 | 60.01 | 0.98% | 0.98% | 97.05% | 0.98% |
751 | 60 | 0 | 0 | 0 | 40 | 60 | 100.00% | 0.00% | 0.00% | 0.00% |
752 | 59.41 | 0.59 | 0 | 0 | 40 | 60 | 99.02% | 0.98% | 0.00% | 0.00% |
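The "Corrected" columns in Supplementary Table 1 appear to be the raw base percentages rescaled by the non-empty fraction, so that alignment gaps no longer dilute the figures. The sketch below reproduces that arithmetic; `gap_corrected` is a hypothetical helper name, and treating the non-empty fraction as the sum of the four base percentages is an inference from the table values rather than something the page states.

```python
def gap_corrected(a, g, c, t):
    """Rescale raw base percentages so they sum to 100 over non-gap strains.

    Returns None when every strain has a gap at the position
    (as at positions 740 and 741 in the table).
    """
    non_empty = a + g + c + t
    if non_empty == 0:
        return None
    return {
        base: round(100 * value / non_empty, 2)
        for base, value in zip("AGCT", (a, g, c, t))
    }


# Position 731 from the table: G 55.88 %, T 0.59 %, the rest gaps.
print(gap_corrected(0, 55.88, 0, 0.59))
```

For position 731 this yields 98.96 % G and 1.04 % T, matching the corrected columns of that row.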
Select gRNA sequences with the best theoretical quality
HIV-1 Quasi-Conservative gRNAs (Useful)
Sequence | Rating(Zhang) | Rank(Church) | Free Energy(Approx.) | Source |
---|---|---|---|---|
GTGTGGAAAATCTCTAGCAGTGG | 71 | - | -1.4 | HIV1_REF_2010 |
TCTAGCAGTGGCGCCCGAACAGG | 97 | - | -1.3 | HIV1_REF_2010 |
Source
Conserved Region of the HIV-1 Genome from the NIH HIV-1 Sequence Database