Minimizing off-target effects in CRISPR experiments


Image taken from Addgene’s blogpost on CRISPR software tools

CRISPR is taking the world by storm. This gene editing technique surpasses earlier methods thanks to its ease of application, low cost, and ability to be used in almost any system. The main concern, however, is the extent of its off-target effects. Several studies have tried to characterize this, and the general consensus is that off-target effects vary with the guide RNA (gRNA) used. In one study, the number of off-target effects for a single gRNA ranged from 0 to 150! gRNAs can also tolerate bulges in DNA and not just single-base mismatches, increasing their propensity for non-specific binding. Unfortunately, current computational tools cannot readily predict off-target events based on gRNA sequences.

The Cas9 enzyme bears some influence on off-target activity as well. A recent study found that the extent of off-target effects could be reduced by using a modified Cas9 with lower binding energies towards DNA. This supposedly prevents non-specific associations with non-target DNA sites.

So yes, off-targets are a matter of concern when doing CRISPR experiments, which means there’s still a lot to be done before this gene editing technique sees the light of clinical trials. But for the sake of the ordinary scientist, here’s a list of ways you could reduce off-targets in your own CRISPR experiment (hopefully using cells/animals):

1. Use recombinant Cas9 protein rather than a plasmid. Studies have found this reduces off-target effects, presumably because Cas9 is present more transiently. Expressing Cas9 from a plasmid lets the enzyme accumulate over days, which may lead to more non-specific Cas9-mediated cleavage.

2. Use gRNAs that begin with GG at the 5′ end. Reasons are unknown but these gRNAs tend to be more specific.

3. Use shorter gRNAs. Specifically, gRNAs 17 to 18 nucleotides long were found to be more specific than their 20-nucleotide counterparts. This might again be due to their lower binding energy. However, shorter sequences limit the ability to choose specific sequences, so it is useful to note that mismatch tolerance on the gRNA is also position dependent: bases 8-14 at the 3′ end of the gRNA are less tolerant of mismatches than the 5′ end. Also, stick to 40-60% GC content.

4. Use D10A Cas9 nickase. This is a modified version of Cas9 that nicks the DNA (i.e. cuts only one strand) rather than cutting both strands. One therefore has to design two gRNAs that target both strands of DNA at the specific site in order for Cas9 D10A nickase to produce a complete double-strand break. As you can imagine, the chances of this happening non-specifically are really low. In the same vein, a dead mutant Cas9 (dCas9) fused to FokI has also been designed, where two dCas9-FokI molecules are required to bind both strands at the target site before FokI-mediated cleavage can occur.

5. Use a different Cas9 ortholog with longer PAM requirements. The longer PAM sequences enforce an additional sequence requirement for cleavage, and Cas9 orthologs from Staphylococcus aureus, Streptococcus thermophilus and Neisseria meningitidis have all been found to have reduced off-target activity.

6. Target the promoter/transcription start site region. These regions tend to have a more open chromatin structure that allows better gRNA-Cas9 access. Further studies found that seed sequences that are more likely to form secondary structures tended to have a greater chance of being cut by gRNA-Cas9.

7. Don’t use too much DNA during transfection. This parallels RNAi experiments, where higher siRNA concentrations lead to more off-target effects. Large amounts of transfection reagent/DNA could trigger an immune/toxic response within cells that results in random DNA strand breaks.
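For the sequence-based rules above (points 2 and 3), here is a minimal Python sketch of how one might screen candidate gRNA sequences. The function names and the simple pass/fail scoring are my own illustrative assumptions and not part of any published tool; the thresholds (GG at the 5′ end, 17-18 nucleotides, 40-60% GC) are just the rules of thumb from the list.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 to 1.0)."""
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def check_guide(seq: str) -> dict:
    """Check one gRNA (written 5' to 3') against the rules of thumb above.

    Hypothetical helper: returns one True/False flag per rule.
    """
    seq = seq.upper()
    return {
        "starts_with_GG": seq.startswith("GG"),           # point 2
        "length_17_to_18": 17 <= len(seq) <= 18,          # point 3
        "gc_40_to_60": 0.40 <= gc_fraction(seq) <= 0.60,  # point 3
    }


def rank_guides(candidates):
    """Sort candidates so that guides passing the most rules come first."""
    return sorted(
        candidates,
        key=lambda s: sum(check_guide(s).values()),
        reverse=True,
    )


if __name__ == "__main__":
    # Made-up sequences, for illustration only.
    for guide in rank_guides(["ATATATATATATATATATAT", "GGACTGATCGATCAGTCA"]):
        print(guide, check_guide(guide))
```

This only covers the rules that can be read off the sequence itself; it says nothing about chromatin context, Cas9 variant or delivery method, which the other points address.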

So you’ve followed all the above rules. Now comes the question: just how do you make sure there really have been no off-target events?

Many genome-scale monitoring methods have been devised. Many tried to detect Cas9 binding sites (via ChIP-Seq or SELEX) but then realized that Cas9 binding does not necessarily correlate with Cas9 cleavage. Some performed Cas9 cleavages in vitro followed by sequencing to characterize cleavage sites (Digenome-Seq) but could only detect a small subset of actual off-target events, indicating an in vivo context may be required. The GUIDE-seq approach detected sites of incorporation of a short double-stranded DNA fragment into gRNA-Cas9-mediated cuts that can occur during non-homologous end joining (NHEJ), and identified more off-target sites than any other method. However, these read-outs are also likely underestimations of actual off-target frequencies, since cleavage events may occur without fragment incorporation.

For a more targeted approach, one could come up with a list of possible off-target sites based on 1-3 tolerated mismatches near the 5′ end, and target your sequencing or endonuclease assays towards those regions. You might be restricting yourself but at least it provides some assurance that your gRNA is showing some respectable specificity. I hope I have covered enough ground, but if you would like to read more, try here, here and here. And for a refresher on how CRISPR works, you can read this old blog post.
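As a rough illustration of building such a candidate list, the Python sketch below slides a gRNA along a reference sequence and keeps sites with 1-3 mismatches that all fall near the 5′ end. It is a toy under stated assumptions: the function names and the `protected_3prime` parameter (how many bases at the 3′ end must match exactly) are hypothetical choices of mine, and the scan ignores PAM requirements, the reverse strand and bulges, all of which a real search would need to handle.

```python
def mismatch_positions(guide: str, site: str) -> list:
    """0-based positions (counted from the 5' end) where guide and site differ."""
    return [i for i, (g, s) in enumerate(zip(guide, site)) if g != s]


def candidate_off_targets(guide, reference, max_mismatches=3, protected_3prime=12):
    """List sites in `reference` that differ from `guide` by 1..max_mismatches
    bases, with every mismatch in the 5' portion of the guide.

    Assumptions (not from any published tool): both sequences are written
    5' to 3' on the same strand, and the last `protected_3prime` bases
    must match exactly.
    """
    guide = guide.upper()
    reference = reference.upper()
    n = len(guide)
    five_prime_limit = n - protected_3prime  # mismatches allowed only below this index
    hits = []
    for start in range(len(reference) - n + 1):
        site = reference[start:start + n]
        mismatches = mismatch_positions(guide, site)
        if (1 <= len(mismatches) <= max_mismatches
                and all(p < five_prime_limit for p in mismatches)):
            hits.append((start, site, mismatches))
    return hits


if __name__ == "__main__":
    # Made-up sequences, for illustration only.
    guide = "GGACTGATCGATCAGTCA"
    reference = "TTT" + "GCACTGATCGATCAGTCA" + "AAA"
    for start, site, mismatches in candidate_off_targets(guide, reference):
        print(start, site, mismatches)
```

Each hit gives the start position, the site sequence and the mismatch positions, which is enough to design primers around those regions for targeted sequencing or an endonuclease assay.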


What you need to know about non-coding RNA

Having just returned from an EMBO-organised conference on non-coding RNA (ncRNA), I am bursting with facts about ncRNA which may (or may not) be of interest to some of you. ncRNAs are RNAs that have been transcribed from the genome but do not get translated into protein. There is a burgeoning interest in them, mainly because they represent a substantial proportion of the genome compared to coding RNA. Specifically, ncRNAs make up ~80% of the genome and ~95% of the transcriptome (source: this paper). Here’s what you need to know about ncRNAs:

1. How many types of ncRNAs are there? So far, 8, maybe 9. These include transfer RNAs (1) and ribosomal RNAs (2), which were well characterized alongside messenger/coding RNA in the 1970s and partake in protein synthesis. Small ncRNAs, which include microRNAs (3), siRNAs (4) and germline-specific piwi-RNAs (5), range from 21 to 35 nucleotides. Their discovery revealed a novel system termed RNA interference, used by organisms to regulate gene expression. In essence, the small ncRNAs bind to longer protein-coding RNAs, leading to their subsequent degradation or translation inhibition. Other small nuclear-based RNAs termed snoRNAs or U-RNAs (6) are also classified as ncRNAs and are involved in RNA editing or splicing. The more recent forms of ncRNAs include long ncRNAs (7), which are 200 nucleotides or longer, and enhancer RNAs or promoter-associated RNAs (8), which range from 100-9000 nucleotides and are transcribed from enhancer regions on DNA. There is also a new kid on the block called circular RNAs (9), but it’s a little uncertain whether they are non-coding, as some do get translated into protein.


Graphical representation of transcription in mammals from Mattick and Makunin, Hum Mol Genet, 2006. Excerpt – “The area of the box represents the genome. The area of large green circle is equivalent to the documented extent of transcription, with the darker green area corresponding to that on both strands. It should be noted that these estimates may and probably will increase as more information comes to hand. The function of most of these transcripts is unknown. CDSs are protein-coding sequences, and UTRs are 5′- and 3′-untranslated sequences in mRNAs. The dots indicate (and in fact overstate) the proportion of the genome occupied by known snoRNAs and miRNAs.”

2. How did we not notice them before? This is mainly in relation to long ncRNAs (lncRNAs), whose presence was only confirmed in 2005. They were formerly thought to be junk or “transcriptional noise” because of their low abundance and low sequence conservation. Furthermore, many of them were missed in sequencing analyses due to the use of polyA-tail-based isolation techniques, since a good proportion of them (~40%) are non-polyadenylated.

3. Are they really that important? Well, yes. The smaller ncRNAs such as miRNAs, piRNAs and snoRNAs already have established functions in gene expression regulation and RNA editing. What’s less clear is the role of lncRNAs. Some well-studied lncRNAs such as XIST, H19 and HOTAIR induce epigenetic silencing of genomic loci, namely through recruitment of chromatin modifiers that affect access to transcriptional machinery. However, the vast majority of lncRNAs have undefined functions, and some groups argue that as much as 90% of transcription by RNA polymerase II can be random. Furthermore, there is some evidence that transcription events tend to “ripple” into neighbouring regions, causing “leaky transcription” that may give rise to ncRNAs.

However, based on numerous reports and from what I gathered at the conference, the highly regulated expression of lncRNAs, both spatially and temporally, argues that at least some of them are important for specific functions. This is supported by observations that their expression correlates closely with, and in some cases is required for, certain phenotypes including embryonic development, pluripotency, cell cycle progression, cell proliferation and death, and even motor function.

From an evolutionary perspective, despite the low sequence conservation of lncRNAs across organisms, sequence conservation of lncRNA promoters is comparable to, if not higher than, that of protein-coding RNAs. ncRNAs are thought to have less stringent restraints on sequence than coding RNAs, as they do not have to maintain correct reading frames and rely more on their secondary structures (which depend on short stretches of sequence) for normal function. Indeed, high sequence conservation was seen for short defined regions of lncRNAs. Xist, for example, has a well-defined function across species and possesses short domains with high sequence conservation, yet its overall sequence conservation is low. lncRNAs may also be under rapid evolution, where, due to the aforementioned less stringent sequence requirements, they accumulate sequence mutations a lot more rapidly. It’s interesting that organism complexity shows little correlation with genome size but a strong correlation with levels of lncRNAs.

4. How do people study them? RNA-sequencing is commonly used to identify lncRNAs, but this is hampered by the low abundance of lncRNAs, which often leads to erroneous transcript reconstruction. A recent method, RNA CaptureSeq, developed by the Mattick lab in Sydney, uses DNA probes to enrich RNA derived from certain genomic regions of interest (i.e. containing sites where ncRNAs are derived), followed by RNA sequencing, which provides greater depth and coverage. You can read more about it here.

To demonstrate the functional importance of ncRNAs, many are using CRISPR or RNAi technologies, which I have already spoken about at length.

That’s about the lowdown for now. There is substantial evidence that ncRNAs play important roles in disease but the field is still pretty new. So one has to wait a while before they start to provide new avenues for therapeutic intervention.

Bioshot News

Some weekly industry news updates:

New Gilead CEO John Milligan takes over from John Martin 

  • Both men joined at the same time in 1990. John Milligan became COO in 2007 then President in 2008.
  • Takeover comes at a crucial time as Gilead is facing increasing pressure in the Hepatitis C field from competitors Merck and AbbVie.
  • Rumor has it they might be scouting for a new acquisition.

Theranos’ Wellness center based in Walgreens at Palo Alto closed

  • Theranos received a warning letter from the US Centers for Medicare and Medicaid Services which stated that testing at the center posed “immediate jeopardy to patient safety”. The letter gives Theranos 10 days to provide “acceptable evidence of correction”.
  • Walgreens has informed Theranos that all samples collected in their stores must only be sent to Theranos’ certified lab in Phoenix.
  • Their disruptive technology has been called into question over the last few months, as former employees report that their minute blood sampling technology has only been applied to a small percentage of patients.
  • Theranos also has yet to publish details of their technology, inciting skepticism and criticism from the scientific community.

Zika virus outbreak

  • A worldwide alert was issued by the Pan American Health Organization (PAHO) after an infection was reported in Brazil in May 2015 (there is speculation that it was triggered by the massive traveler influx during the World Cup).
  • Zika is transmitted by the Aedes mosquito; 80% of infected people do not display symptoms, while the other 20% show symptoms such as fever, rashes, joint pain, and conjunctivitis. Consequences for pregnant women are more severe, resulting in poor outcomes and babies born with abnormally small heads, i.e. microcephaly.
  • No cure or vaccine for now. People are currently being advised to cover their arms and use insect repellent or to avoid travel to infected areas.

Pfizer-Allergan partners with AstraZeneca to develop a new antibiotic with the help of the U.S. government

  • The Biomedical Advanced Research and Development Authority (BARDA), part of the US Department of Health and Human Services, is offering $50 million for drug discovery, with a potential $170 million over five years, in a plan to build a portfolio of drug candidates for the treatment of illnesses caused by bioterrorism agents and antibiotic-resistant infections.
  • Pharmas often claim that making antibiotics is not cost-effective due to low returns on investment. However, the threat of microbial resistance has seen governments around the world seeking new ways to work with pharma to increase antibiotic development.

New sweat monitoring technology

  • Published in Nature, the device monitors glucose, sodium, lactate and potassium levels in sweat emitted on the skin’s surface.

Eisai acquires Liaoning TianYi Biological Pharmaceutical Co Ltd

  • Japanese pharma Eisai aims to tap into China’s pharmaceutical market, valued at US$109.3 billion, of which generics are said to make up 80%.

Takeda teams up with universities to develop clinical applications of induced pluripotent stem (iPS) cells

  • Takeda will provide collaborative funding of 20 billion yen (~US$166 million), and jointly run multiple projects led by researchers invited from The Center for iPS Cell Research and Application (CiRA) at Kyoto University and other universities.

US-based Acorda buys Finnish Biotie for 321 million

  • The deal serves to boost Acorda’s neurodegenerative disease pipeline, as Biotie’s adenosine 2a receptor inhibitor is already in PIII for Parkinson’s disease.

Treeway wins orphan drug status for its investigational drug for ALS

  • Started by two ALS patients, Bernard Muller and Robbert Jan Stuit.
  • Orphan drug status for TW001 allows for reduced clinical trial costs, a $2M waiver in fees for NDA submission and seven-year market exclusivity.
  • ALS only has one approved drug – Riluzole (Sanofi)
  • Biogen’s recent ALS candidate drug dexpramipexole crashed out during PIII.

Dementia Discovery Fund

  • $100 million has already been raised for a new fund set up to invest in drugs for dementia.
  • The fund gains monetary and scientific support from the UK Department of Health, Alzheimer’s Research UK (charity) and major global pharmas including Biogen, Pfizer, Eli Lilly, Takeda and J&J.
  • Headed by SV Life Sciences, a VC with an established track record of investing in life science companies since the 1980s.