(8 intermediate revisions by the same user not shown) |
Line 1: |
Line 1: |
| __NOTOC__ | | __NOTOC__ |
− | <partinfo>BBa_K1431401 short</partinfo> | + | <partinfo>BBa_K1431402 short</partinfo> |
| | | |
− | <partinfo>BBa_K1431401 SequenceAndFeatures</partinfo> | + | <partinfo>BBa_K1431402 SequenceAndFeatures</partinfo> |
− | | + | |
− | | + | |
− | === Introduction ===
| + | |
− | | + | |
− | '''CRISPR (Clustered Regularly Interspaced Short Palindromic Repeat)/Cas System''' is a hot topic in biology research these days. Dozens of papers addressing this interesting field have recently been published in top journals. In case you are not familiar with it, here is a jargon-heavy summary quoted from Wikipedia:
| + | |
− | <blockquote>CRISPRs are DNA loci containing short repetitions of base sequences. Each repetition is followed by short segments of "spacer DNA" from previous exposures to a virus.<br>
| + | |
− | The CRISPR/Cas system is a prokaryotic immune system that confers resistance to foreign genetic elements such as plasmids and phages and provides a form of acquired immunity. CRISPR spacers recognize and silence these exogenous genetic elements like RNAi in eukaryotic organisms.<br>
| + | |
− | Since 2012, the CRISPR/Cas system has been used for gene editing (silencing, enhancing or changing specific genes) that even works in eukaryotes like mice and primates. By inserting a plasmid containing cas genes and specifically designed CRISPRs, an organism's genome can be cut at any desired location.<br>
| + | |
− | '''-Wikipedia'''
| + | |
− | </blockquote>
| + | |
− | | + | |
− | In short, the CRISPR/Cas system is a tool for editing genes in '''live''' cells. Similar tools include TALEN (Transcription Activator-Like Effector Nuclease) and ZFN (Zinc Finger Nuclease). CRISPR/Cas has an advantage over those methods in that it is guided by a short RNA chain (~23bp), which is easier to synthesize than the engineered DNA-binding proteins those methods rely on.<br>
| + | |
− | Moreover, TALENs require a significantly longer time to construct<sup>[http://indepth.systembio.com/cas9-crispr-faq/what-is-the-difference-between-cas9-crispr-and-talen SystemBio]</sup>.
| + | |
− | | + | |
− | === CRISPR gRNA Basics ===
| + | |
− | | + | |
− | As mentioned above, CRISPR/Cas9 systems need a gRNA (guide RNA) sequence to identify the target<sup>[http://www.nature.com/nprot/journal/v8/n11/full/nprot.2013.143.html ZhangFCas]</sup>. A target site is 23bp long: a 20bp sequence that the gRNA matches, followed at its 3' end by a 3bp PAM (Protospacer Adjacent Motif), which is NGG for the commonly used ''S. pyogenes'' Cas9. The PAM belongs to the target DNA and is not part of the gRNA itself. To target a gene effectively and specifically, the 20bp guide portion has to match the target sequence closely. According to [http://crispr.mit.edu/about ZhangTool], the approximate quality of a gRNA can be expressed as
| + | |
− | https://static.igem.org/mediawiki/parts/2/23/Equation-crispr.png .
| + | |
− | | + | |
− | Keep in mind that this equation is based only on an approximation of experimental data and may differ from actual behavior.
| + | |
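The 23bp layout described in this section can be checked mechanically. The following is a minimal sketch, assuming the ''S. pyogenes'' Cas9 convention of an NGG PAM at the 3' end of the site; the function name <code>split_target</code> and the third candidate sequence are hypothetical and only serve as an illustration. The first two candidates are the sequences listed in the gRNA table on this page.

```python
import re

# Assumption: S. pyogenes Cas9, whose PAM is NGG at the 3' end of the target site.
PAM_PATTERN = re.compile(r"[ACGT]GG")


def split_target(site):
    """Split a 23-nt target site into (guide_part, pam).

    Returns None if the site is not 23 nt of A/C/G/T or lacks an NGG PAM.
    """
    site = site.upper()
    if len(site) != 23 or set(site) - set("ACGT"):
        return None
    guide_part, pam = site[:20], site[20:]
    if PAM_PATTERN.fullmatch(pam) is None:
        return None
    return guide_part, pam


candidates = [
    "GTGTGGAAAATCTCTAGCAGTGG",  # from the gRNA table on this page
    "TCTAGCAGTGGCGCCCGAACAGG",  # from the gRNA table on this page
    "TCTAGCAGTGGCGCCCGAACATT",  # hypothetical: same site with the PAM replaced
]

# Keep only candidates that carry a valid PAM.
kept = [c for c in candidates if split_target(c) is not None]
```

Filtering on the last three bases first keeps the later, more expensive scoring steps from being run on sites that Cas9 could not cut under this assumption.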
| | | |
| === Designing the Sequence === | | === Designing the Sequence === |
Line 29: |
Line 9: |
| | | |
| The whole process can be divided into the following steps: | | The whole process can be divided into the following steps: |
| + | |
| ==== Conserved Sequence Analysis ==== | | ==== Conserved Sequence Analysis ==== |
| | | |
Line 35: |
Line 16: |
| ==== Strip out sequences without PAM ==== | | ==== Strip out sequences without PAM ==== |
| | | |
− | {| class="wikitable"
| |
− | ! colspan="11" | Supplementary Table 1 - Base Percentage of HIV-1 Aligned Genome 730bp-752bp
| |
− | |-
| |
− | |
| |
− | | A %
| |
− | | G %
| |
− | | C %
| |
− | | T %
| |
− | | Empty %
| |
− | | Non Empty %
| |
− | | A(Corrected)
| |
− | | G(Corrected)
| |
− | | C(Corrected)
| |
− | | T(Corrected)
| |
− | |-
| |
− | | 730
| |
− | | 0
| |
− | | 0
| |
− | | 0
| |
− | | 56.47
| |
− | | 43.53
| |
− | | 56.47
| |
− | | 0.00%
| |
− | | 0.00%
| |
− | | 0.00%
| |
− | | 100.00%
| |
− | |-
| |
− | | 731
| |
− | | 0
| |
− | | 55.88
| |
− | | 0
| |
− | | 0.59
| |
− | | 43.53
| |
− | | 56.47
| |
− | | 0.00%
| |
− | | 98.96%
| |
− | | 0.00%
| |
− | | 1.04%
| |
− | |-
| |
− | | 732
| |
− | | 0
| |
− | | 0
| |
− | | 0
| |
− | | 56.47
| |
− | | 43.53
| |
− | | 56.47
| |
− | | 0.00%
| |
− | | 0.00%
| |
− | | 0.00%
| |
− | | 100.00%
| |
− | |-
| |
− | | 733
| |
− | | 0
| |
− | | 54.71
| |
− | | 0
| |
− | | 1.18
| |
− | | 43.53
| |
− | | 55.89
| |
− | | 0.00%
| |
− | | 97.89%
| |
− | | 0.00%
| |
− | | 2.11%
| |
− | |-
| |
− | | 734
| |
− | | 0
| |
− | | 0
| |
− | | 0
| |
− | | 58.24
| |
− | | 41.76
| |
− | | 58.24
| |
− | | 0.00%
| |
− | | 0.00%
| |
− | | 0.00%
| |
− | | 100.00%
| |
− | |-
| |
− | | 735
| |
− | | 56.47
| |
− | | 0.59
| |
− | | 0.59
| |
− | | 0.59
| |
− | | 41.76
| |
− | | 58.24
| |
− | | 96.96%
| |
− | | 1.01%
| |
− | | 1.01%
| |
− | | 1.01%
| |
− | |-
| |
− | | 736
| |
− | | 0
| |
− | | 1.18
| |
− | | 57.06
| |
− | | 0
| |
− | | 41.76
| |
− | | 58.24
| |
− | | 0.00%
| |
− | | 2.03%
| |
− | | 97.97%
| |
− | | 0.00%
| |
− | |-
| |
− | | 737
| |
− | | 1.18
| |
− | | 57.06
| |
− | | 0
| |
− | | 0.59
| |
− | | 41.18
| |
− | | 58.83
| |
− | | 2.01%
| |
− | | 96.99%
| |
− | | 0.00%
| |
− | | 1.00%
| |
− | |-
| |
− | | 738
| |
− | | 60
| |
− | | 0
| |
− | | 0
| |
− | | 0
| |
− | | 40
| |
− | | 60
| |
− | | 100.00%
| |
− | | 0.00%
| |
− | | 0.00%
| |
− | | 0.00%
| |
− | |-
| |
− | | 739
| |
− | | 0.59
| |
− | | 0
| |
− | | 58.82
| |
− | | 0
| |
− | | 40
| |
− | | 59.41
| |
− | | 0.99%
| |
− | | 0.00%
| |
− | | 99.01%
| |
− | | 0.00%
| |
− | |-
| |
− | | 740
| |
− | | 0
| |
− | | 0
| |
− | | 0
| |
− | | 0
| |
− | | 100
| |
− | | 0
| |
− | |
| |
− | |
| |
− | |
| |
− | |
| |
− | |-
| |
− | | 741
| |
− | | 0
| |
− | | 0
| |
− | | 0
| |
− | | 0
| |
− | | 100
| |
− | | 0
| |
− | |
| |
− | |
| |
− | |
| |
− | |
| |
− | |-
| |
− | | 742
| |
− | | 0.59
| |
− | | 0
| |
− | | 1.18
| |
− | | 58.24
| |
− | | 40
| |
− | | 60.01
| |
− | | 0.98%
| |
− | | 0.00%
| |
− | | 1.97%
| |
− | | 97.05%
| |
− | |-
| |
− | | 743
| |
− | | 0
| |
− | | 0
| |
− | | 60
| |
− | | 0
| |
− | | 40
| |
− | | 60
| |
− | | 0.00%
| |
− | | 0.00%
| |
− | | 100.00%
| |
− | | 0.00%
| |
− | |-
| |
− | | 744
| |
− | | 0
| |
− | | 1.18
| |
− | | 58.82
| |
− | | 0
| |
− | | 40
| |
− | | 60
| |
− | | 0.00%
| |
− | | 1.97%
| |
− | | 98.03%
| |
− | | 0.00%
| |
− | |-
| |
− | | 745
| |
− | | 0
| |
− | | 58.82
| |
− | | 1.18
| |
− | | 0
| |
− | | 40
| |
− | | 60
| |
− | | 0.00%
| |
− | | 98.03%
| |
− | | 1.97%
| |
− | | 0.00%
| |
− | |-
| |
− | | 746
| |
− | | 0.59
| |
− | | 0
| |
− | | 59.41
| |
− | | 0
| |
− | | 40
| |
− | | 60
| |
− | | 0.98%
| |
− | | 0.00%
| |
− | | 99.02%
| |
− | | 0.00%
| |
− | |-
| |
− | | 747
| |
− | | 0.59
| |
− | | 59.41
| |
− | | 0
| |
− | | 0
| |
− | | 40
| |
− | | 60
| |
− | | 0.98%
| |
− | | 99.02%
| |
− | | 0.00%
| |
− | | 0.00%
| |
− | |-
| |
− | | 748
| |
− | | 0.59
| |
− | | 59.41
| |
− | | 0
| |
− | | 0
| |
− | | 40
| |
− | | 60
| |
− | | 0.98%
| |
− | | 99.02%
| |
− | | 0.00%
| |
− | | 0.00%
| |
− | |-
| |
− | | 749
| |
− | | 0
| |
− | | 58.82
| |
− | | 0.59
| |
− | | 0.59
| |
− | | 40
| |
− | | 60
| |
− | | 0.00%
| |
− | | 98.03%
| |
− | | 0.98%
| |
− | | 0.98%
| |
− | |-
| |
− | | 750
| |
− | | 0.59
| |
− | | 0.59
| |
− | | 58.24
| |
− | | 0.59
| |
− | | 40
| |
− | | 60.01
| |
− | | 0.98%
| |
− | | 0.98%
| |
− | | 97.05%
| |
− | | 0.98%
| |
− | |-
| |
− | | 751
| |
− | | 60
| |
− | | 0
| |
− | | 0
| |
− | | 0
| |
− | | 40
| |
− | | 60
| |
− | | 100.00%
| |
− | | 0.00%
| |
− | | 0.00%
| |
− | | 0.00%
| |
− | |-
| |
− | | 752
| |
− | | 59.41
| |
− | | 0.59
| |
− | | 0
| |
− | | 0
| |
− | | 40
| |
− | | 60
| |
− | | 99.02%
| |
− | | 0.98%
| |
− | | 0.00%
| |
− | | 0.00%
| |
− | |}
| |
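The "Corrected" columns in Supplementary Table 1 appear to be each base percentage divided by the "Non Empty" percentage of the same position, so that gaps in the alignment do not dilute the result (for position 731, 55.88 / 56.47 gives about 98.96%). The following is a minimal sketch of that calculation, assuming each alignment column is given as a string with <code>-</code> marking a gap. The function name <code>column_stats</code> is hypothetical, and the 170-entry example column is an illustrative reconstruction, not the actual alignment data.

```python
from collections import Counter

BASES = "AGCT"


def column_stats(column):
    """Return (raw, non_empty_pct, corrected) for one alignment column.

    column: one character per aligned strain, '-' marks a gap.
    raw: base percentages over all strains.
    corrected: base percentages over non-gap strains only
               (None for each base when the column is all gaps).
    """
    total = len(column)
    counts = Counter(column.upper())
    non_empty = total - counts["-"]
    raw = {b: 100.0 * counts[b] / total for b in BASES}
    corrected = {
        b: (100.0 * counts[b] / non_empty if non_empty else None)
        for b in BASES
    }
    return raw, 100.0 * non_empty / total, corrected


# Illustrative column (assumed counts): 95 G, 1 T and 74 gaps.
example_column = "G" * 95 + "T" * 1 + "-" * 74
raw, non_empty_pct, corrected = column_stats(example_column)
```

An all-gap column, such as positions 740 and 741 in the table, yields <code>None</code> for every corrected value instead of dividing by zero, which matches the empty cells in those rows.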
| | | |
− | ==== Select gRNA sequences with the best theoretical quality ==== | + | ====Select gRNA sequences with the best theoretical quality==== |
| + | |
| + | |
| + | |
| + | |
| | | |
− | {| class="wikitable"
| |
− | ! colspan="5" | HIV-1 Quasi-Conservative gRNAs(Useful)
| |
− | |-
| |
− | | Sequence
| |
− | | Rating(Zhang)
| |
− | | Rank(Church)
| |
− | | Free Energy(Approx.)
| |
− | |
| |
− | |-
| |
− | | GTGTGGAAAATCTCTAGCAGTGG
| |
− | | 71
| |
− | | -
| |
− | | -1.4
| |
− | | rowspan="2" | HIV1_REF_2010
| |
− | |-
| |
− | | TCTAGCAGTGGCGCCCGAACAGG
| |
− | | 97
| |
− | | -
| |
− | | -1.3
| |
− | |}
| |
| | | |
| ===Source=== | | ===Source=== |
| | | |
− | Conserved Region of the HIV-1 Genome from the NIH HIV-1 Sequence Database
| + | AB010289-AB078031 in HBV_aligned Database |
| | | |
| ===References=== | | ===References=== |
We first extracted all conserved regions from the NIH HIV-1 Reference Genome. This step yielded around 10 candidates for the next step. All screening is done on a per-strain basis because of the high mutability of the HIV-1 virus.
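One way such a conserved-region scan could be sketched is shown below. This is only an illustration under stated assumptions, not the procedure actually used here: the window width of 23 (one target site), the 0.95 threshold, and the function names <code>majority_fraction</code> and <code>conserved_windows</code> are all hypothetical, and the input is assumed to be a list of equal-length aligned sequences with <code>-</code> marking gaps.

```python
from collections import Counter


def majority_fraction(column):
    """Fraction of non-gap entries in a column that share the most common base."""
    bases = [b for b in column if b != "-"]
    if not bases:
        return 0.0  # an all-gap column is treated as not conserved
    return Counter(bases).most_common(1)[0][1] / len(bases)


def conserved_windows(alignment, width=23, threshold=0.95):
    """Return start indices of windows in which every column is conserved.

    alignment: list of equal-length aligned sequences.
    width, threshold: assumed values, chosen only for illustration.
    """
    n_cols = len(alignment[0])
    scores = [
        majority_fraction([seq[i] for seq in alignment])
        for i in range(n_cols)
    ]
    return [
        start
        for start in range(n_cols - width + 1)
        if all(s >= threshold for s in scores[start:start + width])
    ]


# Toy alignment (hypothetical data): column 4 is variable, the others are identical.
toy = ["ACGTAC", "ACGTTC", "ACGT-C"]
starts = conserved_windows(toy, width=4, threshold=1.0)
```

Scoring each column once and then sliding the window over the scores keeps the scan linear in the number of columns per strain set, which matters when it is repeated for every strain.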