Machine learning-based end-to-end CRISPR/Cas9 guide design: Introducing CRISPR ML

Jan. 28, 2018, 3:56 a.m. By: Kirti Bakshi

CRISPR ML

DNA is the building block of life, but DNA can also contain glitches that contribute to serious and unavoidable health issues that affect billions of people. What if there were a way to change your DNA to eliminate the glitches before they caused problems?

What exactly is CRISPR, and what is its nature:

“The CRISPR system was not designed, it evolved,” said John Doench, an associate director at the Broad Institute who leads the biological portions of the research collaboration with Microsoft.

CRISPR is essentially a nano-sized sewing kit that can be designed to cut and alter DNA at a specific point in a specific gene.

The name stands for “clustered regularly interspaced short palindromic repeats,” which describes a pattern of repeating DNA sequences in the genomes of bacteria, separated by short, non-repeating spacer DNA sequences.

The CRISPR gene-editing system was adapted from a natural virus-fighting mechanism. Scientists discovered it in the DNA of bacteria in the late 1980s and, over the next several decades, figured out how it works.

Elevation and off-target effects:

The newest tool released by the team, called Elevation, uses machine learning, a branch of artificial intelligence, to predict the so-called off-target effects when editing genes with the CRISPR system.

Although CRISPR shows great promise in a number of fields, one challenge is that many genomic regions are similar. As a result, the nano-sized sewing kit can accidentally go to work on the wrong gene and cause unintended consequences. These are the off-target effects.
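
To make the similarity problem concrete, here is a minimal Python sketch that slides a guide along a stretch of DNA and reports every site that differs from it in only a few positions. The sequences, the function names `count_mismatches` and `near_matches`, and the mismatch threshold are all hypothetical and chosen for illustration. This is not how Elevation finds or scores off-targets.

```python
def count_mismatches(guide: str, site: str) -> int:
    """Return the number of positions where two equal-length sequences differ."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return sum(1 for g, s in zip(guide, site) if g != s)


def near_matches(guide: str, genome: str, max_mismatches: int = 3):
    """Yield (position, site, mismatches) for every window of the genome
    that differs from the guide in at most max_mismatches positions."""
    n = len(guide)
    for i in range(len(genome) - n + 1):
        site = genome[i:i + n]
        mm = count_mismatches(guide, site)
        if mm <= max_mismatches:
            yield i, site, mm


# Toy data (made up): the intended target plus a look-alike region.
guide = "GACGTTACCGGATTCAGCTA"
look_alike = "GACGTTACCGGTTTCAGCTT"  # differs from the guide at 2 positions
genome = "TTT" + guide + "CCCCC" + look_alike + "AAA"

for pos, site, mm in near_matches(guide, genome):
    print(pos, site, mm)
```

In this toy genome, the look-alike region is reported alongside the intended target. That is the situation in which a guide could end up cutting the wrong place.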

“Off-target effects are something that one would really want to avoid as you want to make sure that your experiment doesn’t mess up something else,” said Nicolo Fusi, a researcher at Microsoft’s research lab in Cambridge, Massachusetts.

Off-target scores:

For every guide, Elevation provides researchers with two kinds of off-target scores:

  • Individual scores, one for each potential off-target region.

  • A single overall summary score for that guide.

The individual target scores are machine-learning-based probabilities, provided for every region of the genome where something negative could happen.

For every single guide, Elevation returns hundreds to thousands of these off-target scores.

These individual off-target scores alone can be cumbersome for researchers trying to determine which of potentially hundreds of guides to use for a given experiment, noted Jennifer Listgarten, a researcher on the project.

A single number called the summary score lumps the off-target scores together to provide an overview of how likely the guide is to cause disruption to the cell over all its potential off-targets.

“Instead of a probability for each point in the genome, it is what’s the probability I am going to mess up this cell because of all of the off-target activities of the guide?” said Listgarten.
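
One way to picture what a summary score does is a naive aggregate that treats each off-target score as an independent probability and asks how likely it is that at least one off-target event occurs. This is only an illustrative assumption: the article does not describe how Elevation combines its scores, and the function name `naive_summary_score` and the example numbers below are hypothetical.

```python
import math


def naive_summary_score(off_target_probs):
    """Probability that at least one off-target event occurs,
    ASSUMING the per-site probabilities are independent.
    (Illustrative only; not Elevation's actual aggregation.)"""
    probs = list(off_target_probs)
    for p in probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"not a probability: {p}")
    # P(no off-target event at any site) is the product of (1 - p_i).
    p_none = math.prod(1.0 - p for p in probs)
    return 1.0 - p_none


# Made-up per-site scores for two hypothetical guides.
guide_a = [0.01, 0.02, 0.005]   # several weak off-targets
guide_b = [0.2, 0.001]          # one strong off-target

print("guide A:", naive_summary_score(guide_a))
print("guide B:", naive_summary_score(guide_b))
```

Under this simplification, a guide with one strong off-target can score worse than a guide with several weak ones. A list of per-site scores alone does not make that comparison easy, which is why a single number helps.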

“Our job,” said Fusi, “is to get people who work in molecular biology the best tools that we can.”

Modern adaptations:

In 2012, molecular biologists figured out how to adapt the bacterial virus-fighting system to edit genes in organisms that range from plants to mice and even humans. The result is the CRISPR-Cas9 gene-editing technique.

The basic system works like this: scientists first design a synthetic guide RNA to match a DNA sequence in the gene that they want to cut or edit, and then set it loose in a cell with the CRISPR-associated protein scissors, Cas9.

Today, this technique is widely used as a precise and efficient way to understand the role of individual genes in everything from people to trees, and to change genes to do everything from fighting diseases to growing more food.

CRISPR has been a complete game changer for understanding how gene dysfunction leads to disease, which requires knowing, for example, how the gene normally functions, said Doench.

Another challenge and why CRISPR ML:

Another challenge for researchers is deciding which guide RNA to choose for a given experiment: each guide is roughly 20 nucleotides long, and hundreds of potential guides exist for each target gene in a knockout experiment.
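
To see why a single gene can yield so many candidates, the following Python sketch enumerates 20-nucleotide windows in a sequence. It relies on an assumption the article does not state: the commonly used Cas9 from Streptococcus pyogenes requires a short "NGG" motif (the PAM) immediately after the targeted site. The function name `candidate_guides`, the forward-strand-only scan, and the toy sequence are hypothetical simplifications, not the method used by CRISPR ML.

```python
def candidate_guides(gene_seq: str, guide_len: int = 20):
    """Return (start, guide, pam) for each window of guide_len bases on the
    forward strand that is immediately followed by an NGG motif.
    ASSUMPTION: an NGG PAM is required; the reverse strand is ignored."""
    seq = gene_seq.upper()
    pam_len = 3
    guides = []
    for start in range(len(seq) - guide_len - pam_len + 1):
        pam = seq[start + guide_len:start + guide_len + pam_len]
        if pam[1:] == "GG":  # "N" can be any base, so only check the last two
            guides.append((start, seq[start:start + guide_len], pam))
    return guides


# Toy sequence (made up): 20 bases, then an AGG motif, then 2 more bases.
toy_gene = "ATGC" * 5 + "AGG" + "TT"

for start, guide, pam in candidate_guides(toy_gene):
    print(start, guide, pam)
```

A real gene is thousands of bases long and is scanned on both strands, so the list of candidates grows quickly. Each candidate then has to be ranked by on-target efficiency and off-target risk.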

In general, each guide also comes with a different degree of off-target activity and a different on-target efficiency. The collaboration between the computer scientists and biologists therefore focused on building tools that help researchers search through the guide choices and find the best one for their experiments.

Several research teams have designed rules that help determine where the off-targets are for any given gene-editing experiment and how to avoid them. “The rules are very hand-made and very hand-tailored,” said Fusi. “And so, we decided to tackle this problem with machine learning.” Hence came CRISPR ML, where machine learning meets gene editing.


CRISPR.ML - Machine learning meets gene editing:

Video Source: Microsoft Research