A Historical Viewpoint

The Explosion Onto The Public Stage

For those wondering how history will judge the discovery of CRISPR and the developments that have sprung, and will yet spring, from it, the first thing to note is that research into the technology started years before it burst into public consciousness. But there is one very specific point in time, one publication to be exact, that can fairly be called the spark that set everything in motion.

That was on August 17th, 2012, when a paper titled A Programmable Dual-RNA–Guided DNA Endonuclease in Adaptive Bacterial Immunity was published in the prestigious journal Science. Now, gene editing wasn’t anything new by this point. Scientists had been using Agrobacterium for decades to introduce plasmids with desired genes into plants, and other techniques like TALENs (transcription activator-like effector nucleases) and ZFNs (zinc finger nucleases) had been in use for a number of years besides.

So, gene editing as a technology was known quite well and it’s hard to say whether the paper at the time was really understood to be as seminal as it would soon become. It was certainly well regarded, but as one of the biggest breakthroughs in biotechnology in decades? Likely not. Though it admittedly didn’t take long.

Jennifer Doudna and her co-authors on the study would go on to win a number of prizes for this discovery from 2014 on, which shows how little time it took for CRISPR to truly take off, even within the scientific community itself.

What happened next? In January of 2013, the same group went further than just precisely cutting a piece of DNA with CRISPR, as Doudna et al. did originally. This time they cut out a part of the human genome itself and replaced it with another sequence, showing that insertions of DNA were also possible with the tool. Other scientists stepped in as well, of course.

Not long after that, the Broad Institute, one of the parties in the high-profile legal dispute over the CRISPR patents, entered the ring. With two of their own versions of CRISPR derived from separate bacteria, S. thermophilus and S. pyogenes, they also carried out targeted DNA cleavage and were in addition able to show homology-directed repair being activated, one of the two main DNA repair mechanisms.

Early Efforts Into Deciphering Bacteria

Now you know how CRISPR was first used to directly edit the human genome and how the scientific world, and eventually the public, came to be in a furor over its capabilities. But what has been left out thus far is how CRISPR was even noticed in the first place. And it would be criminal to do a history of it without covering that topic.

So let’s go from 2012 and take a jump backward in time, quite far indeed, all the way back to Japan in 1987. This was an early era of DNA sequencing, when researchers were largely free to tackle any genome they liked. So little was known at this point, with the very first sequences having been completed only the decade before, that basically any research in the field was likely to reveal some important insight.

And that was definitely the case for Yoshizumi Ishino when he sequenced just a small stretch of the Escherichia coli (E. coli) genome, though what he found wouldn’t be truly understood until years later. He was looking into a gene called iap, and what made it interesting wasn’t so much the gene itself as what he and his colleagues found when they tried to determine what the gene did.

Their method was to sequence some of the region around the gene and look for the places where proteins act on the genome to turn the iap gene on or off. What they found instead, in the sequence flanking the gene, was something they could not explain at the time: five repeated sequences, each 29 bases long, with spacers of DNA in between them.

These direct repeats, separated by spacers that each had a unique nucleotide sequence, didn’t mean anything to the scientists at the time. But a decade later, when sequencing became easier and faster, these repeated sections with spacers in between were found all across the bacterial tree of life. Yet their purpose continued to elude the people who found them.

Eventually, in 2002, the repeats were at least given a name. They were dubbed “clustered regularly interspaced short palindromic repeats”, or CRISPRs for short, and, well, you can figure out the rest yourself.
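To make that repeat-and-spacer layout concrete, here is a minimal Python sketch. The sequences are toy placeholders, not Ishino’s actual E. coli data, and the names build_array and extract_spacers are made up for this illustration. It also assumes the repeat is known in advance and copied perfectly every time, which real genomes don’t guarantee.

```python
def build_array(repeat, spacers):
    """Lay out repeat, spacer, repeat, spacer, ..., repeat as one string."""
    parts = [repeat]
    for spacer in spacers:
        parts.append(spacer)
        parts.append(repeat)
    return "".join(parts)


def extract_spacers(array, repeat):
    """Return the stretches of sequence that sit between copies of the repeat."""
    return [chunk for chunk in array.split(repeat) if chunk]


# Toy sequences for illustration only (assumed, not real data).
REPEAT = "GTTCACTGCCGTACAGGCAGCTTAGAAAT"  # 29 bases, like the repeats in the text
SPACERS = [
    "ATGGCGTACCTTAGCAATCG",
    "TTGACCGGATACGTTAGCCA",
    "CCATAGGTTTACGGACTAGA",
    "GGATCCTTAACGCGTATTGC",
]

array = build_array(REPEAT, SPACERS)
recovered = extract_spacers(array, REPEAT)
print(array.count(REPEAT), "repeats,", len(recovered), "unique spacers")
```

Five copies of the repeat leave room for four spacers in between, and splitting on the repeat hands back exactly the unique pieces that puzzled researchers for so long.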

There’s a lot more to the history of CRISPR, discoveries that were made that led up to Doudna’s famous publication, but this is enough to know the basics of how it was first discovered and how it became big. Textbooks will likely go into detail about all of that in the future, but there’s been enough stalling in this article. See the references at the bottom if you’d like to know more.

But now, on to an actual discussion of CRISPR, what it does, and all the varieties of it that have been found.

An Overview of CRISPR

The most immediate thing to note is just how widespread CRISPR is as a process used by cellular life. About 40% of all studied bacteria have the system in their genome, but that’s practically nothing compared to Archaea, where 90% of all those studied have it. Archaea only lose out in the overall count because fewer of them have been studied.

It’s also possible for an organism to have more than a single CRISPR locus in its genome. This adds redundancy to its functions, and life loves protective redundancy. So, what does CRISPR do? What exactly do these clusters of repeated and palindromic sequences of genetic material accomplish? The answer comes in three parts.

Accumulating Spacers

The first is the accumulation of additional spacers. Remember those in-between spacers, always different and unique, that seemed to just separate the repeated sequences? Those are actually sequences of DNA obtained from attacking viruses. Upon surviving an attack, a bacterium’s CRISPR system cuts out a segment of the viral genome that helps identify that virus in particular and inserts it in between the repeated sections.

This growing library allows the bacterium to immediately identify the same viral genome if it tries to worm its way in again and to chop it into pieces instead. But it also means the system can be exploited by scientists, who can supply their own chosen sequence as the target and have the system cut matching DNA, relying on the cell whose genome is cut to repair the break naturally.

The particular fragments each species of bacteria retains are related to the types of viruses that attack it in its local environment, and it passes these on to ensuing generations to protect them as well. Tests have shown that introducing bacteria with CRISPR systems into new environments results in them dropping almost all of the viral fragments they were keeping and acquiring new ones for the different threats they then have to face.
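One way to picture this memory is as a simple list that grows with each survived attack and is copied to offspring. The sketch below is a toy model only: the class name CrisprLocus, its methods, the 20-base fragment length, and the phage sequences are all assumptions made for illustration, and real spacer acquisition is far more selective about which fragment it takes.

```python
from dataclasses import dataclass, field


@dataclass
class CrisprLocus:
    """Toy model of a CRISPR locus: an ordered list of stored spacers."""
    spacers: list = field(default_factory=list)

    def acquire(self, viral_genome, start, length=20):
        """Copy a fragment of the viral genome into the locus as a new spacer."""
        fragment = viral_genome[start:start + length]
        if fragment and fragment not in self.spacers:
            self.spacers.append(fragment)
        return fragment

    def recognizes(self, genome):
        """True if any stored spacer occurs somewhere in the given genome."""
        return any(spacer in genome for spacer in self.spacers)

    def divide(self):
        """Return a daughter locus that inherits a copy of every spacer."""
        return CrisprLocus(spacers=list(self.spacers))


# Made-up phage genomes for the demonstration.
phage_a = "ATGCCGTTAGGCTAACGGATCCGTTAACGGCTA"
phage_b = "TTTTGGGGCCCCAAAATTTTGGGGCCCCAAAA"

parent = CrisprLocus()
seen_before = parent.recognizes(phage_a)   # no memory yet
parent.acquire(phage_a, start=5)           # survive an attack, store a fragment
daughter = parent.divide()                 # the memory is inherited
print(seen_before, parent.recognizes(phage_a), daughter.recognizes(phage_a))
```

The daughter recognizes phage_a without ever having met it, while phage_b, which left no fragment behind, still goes unnoticed.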

Production of crRNAs and Interrogation

The second step involves the production of CRISPR RNAs (crRNAs). These short RNA molecules act as interference guides and target identifiers when a viral attack is under way. To obtain the individual small pieces of crRNA that each correspond to one particular viral sequence, the long transcript copied from the whole array must be chopped into pieces. This is one of the tasks of the Cas proteins, during the processing phase of making crRNA.

Once these crRNAs have been produced, it is time to move on to the third and final function, active interference and destruction of the invading virus. Each crRNA combines with Cas proteins to form a ribonucleoprotein complex (crRNP), also just called a Cas endonuclease, and begins interrogating the foreign DNA to see if it is just a picked-up plasmid or an attack. This process is extremely precise and thus can be easily broken, as the crRNA needs to match the tested DNA nucleotide for nucleotide, or very nearly so, to trigger the scissor-snipping process. So bacteria are very protective of their CRISPR sequences to make sure no errors are introduced. This is what aids in their precision.

The crRNAs guide the resulting complex toward matching (or seemingly matching) sequences. Targeting is gated by what is called the protospacer adjacent motif (PAM), a short stretch of DNA that essentially acts as an activator for cleaving the invading DNA. The PAM sits in the invader’s DNA, right next to the sequence that matches the spacer, and is not part of the crRNA or of the bacterium’s own CRISPR array. If the CRISPR and Cas complex does not read a PAM next to a matching sequence, it will not activate its cleavage. Though, it should be noted, this isn’t true for all Cas systems.
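The two conditions for cutting, a matching sequence plus a PAM beside it, can be sketched in a few lines of Python. Several things here are assumptions that go beyond this article: the function name find_cleavable_sites is invented, the default PAM pattern is the NGG motif (any base followed by two Gs) associated with the S. pyogenes Cas9 system and read directly after the match, only one DNA strand is scanned, and matching is treated as all-or-nothing. All sequences are made up.

```python
import re


def find_cleavable_sites(spacer, dna, pam_regex="[ACGT]GG"):
    """Return start positions where `spacer` matches `dna` exactly AND the
    bases right after the match form a PAM."""
    pam = re.compile(pam_regex)
    sites = []
    start = dna.find(spacer)
    while start != -1:
        pam_start = start + len(spacer)
        # Pattern.match(string, pos) only succeeds if the PAM begins at pos.
        if pam.match(dna, pam_start):
            sites.append(start)
        start = dna.find(spacer, start + 1)
    return sites


spacer = "GACGTTAGCCATGCAAGTCT"                   # toy 20-base spacer
invader = "CCC" + spacer + "TGG" + "AAA"          # match followed by a PAM
no_pam = "CCC" + spacer + "TTT" + "AAA"           # same match, no PAM
changed = spacer[:9] + "T" + spacer[10:]          # one base swapped (C to T)
mutated = "CCC" + changed + "TGG" + "AAA"         # PAM present, match broken

print(find_cleavable_sites(spacer, invader))  # a site is reported
print(find_cleavable_sites(spacer, no_pam))   # nothing to cut
print(find_cleavable_sites(spacer, mutated))  # nothing to cut
```

The no_pam case is the interesting one: the sequence matches perfectly, yet nothing is cut. That is roughly the situation of the spacer stored in the bacterium’s own array, which has no PAM beside it.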

Targeting RNA

It was originally believed that CRISPR systems in bacteria only targeted viral DNA and that any RNA-based phages would be free to attack bacteria. Given the comparative rarity of such phages, this would make sense, since the bacteria would not encounter them often enough to form an appropriate defense. Archaea, by comparison, have RNA-targeting systems that seem to match the viruses they fight.

But that isn’t entirely true for all bacteria. In 2016, a new CRISPR system named C2c2, from the bacterium Leptotrichia shahii, was shown to specifically target RNA, indicating that there are indeed some bacteria that develop those sorts of defenses when they need them. And this was an important find, as being able to target RNA for editing as well opens up even more options for the future.

That should be enough to help explain the basics of how CRISPR functions and how it does what it does. There’s plenty of more specific technical and chemical information involved, but this is enough detail for an overview article. Now, on to the specific types of CRISPR systems that have been discovered and the differences between them.

Cas9 And Other Systems

Discussing the specific Cas systems inevitably means breaking down the Cas endonuclease, because this complex, which the CRISPR RNAs attach to, is what ultimately does the active work of reading and cleaving any offending DNA (or RNA) segments. But let’s break down the types first.

There are five primary types of Cas systems, numbered with the Roman numerals I to V (the RNA-based C2c2 system mentioned before might actually count as a new Type VI), though they are largely sorted into two functional classes. Types I, III, and IV are grouped together as Class 1 because they all use what are called multiprotein effector complexes. What are those? Let’s talk about them for a moment.

Cascade (Type I)

Type I was originally characterized in E. coli, where its effector complex was dubbed Cascade, short for CRISPR-associated complex for antiviral defense. The Cascade complex is what the CRISPR RNAs bind to in E. coli and is made of five different Cas proteins (labeled CasA to CasE). The system as a whole is fairly basic, but sufficient to protect the bacteria from phage attacks. And this simplicity was what allowed scientists to properly unravel how it functioned.

What they found was that the most important part of the process was Cas3, a separate protein that the Cascade complex recruits once it has bound its target and that carries out the actual destruction of the foreign DNA.

CRISPR-Cmr (Type III-B) and CRISPR-Csm (Type III-A)

These systems are rather similar to the previously mentioned Cascade process, implying perhaps that divergent evolution resulted in these subtypes splitting off sometime in early history. The primary difference is that Types I and II focus on double-stranded DNA, as you’d normally find in most organisms and a wide variety of viruses, while Type III is also able to target single-stranded RNA. Furthermore, these types are found more often in Archaea than in bacteria.

The partner subtype, CRISPR-Csm, is very similar to CRISPR-Cmr and also shares a number of crRNAs. The main differences between the Type III subtypes and other Cas systems are structural changes within the complex, such as swapping out the Cas8 subunit found in Type I for a different subunit, Cas10. Also, there is no need for a PAM sequence to begin the process of reading and cleaving; these systems instead rely on Cas10 for that purpose. Other biological and chemical differences exist, but are too detailed to relay here.

That’s basically it for this system. It functions the same as the CRISPR-complex combination does in all the other Cas types.

CRISPR-Cas Type IV

This is the loneliest of the types of systems, because so little is known about it. It is largely considered a provisional group and is only tentatively placed in Class 1 with the Type I and Type III systems. We know that its members form multi-protein complexes just like the other two types, but they don’t associate with the usual, common Cas proteins or with a CRISPR array itself. Type IV has the proteins that build the effector complex, but none of the spacer-acquisition proteins like Cas1 and Cas2.

There is currently little understanding of what the mode of action for Type IV is or what it really does.

Class 2 (Types II and V)

Finally, on to the second category of Cas systems, which are instead characterized by a single protein making up the effector complex, rather than multi-protein units as in the other types. These proteins are large and contain multiple functional domains that handle reading, cleaving, and regulating the process, jobs that need multiple proteins in other types.

This difference results in what has been called a more “streamlined” CRISPR system that is easier to use without as many parts, but retaining the same amount of functionality as the other types. It’s due to this that Type II and Type V are the most popular of the types to use in active gene editing in science.

CRISPR-Cas9 (Type II)

And, of course, the biggest star of them all is the Cas9 endonuclease that makes up Type II. It uses a dual guidance system involving crRNA and trans-activating crRNA (tracrRNA), which together guide the complex to create very precise blunt DNA breaks at target sequences that sit next to a PAM sequence. The focus on double-stranded DNA, as is most common, is also helpful, because that means this easiest-to-use system can be applied to the genomes of all multi-cellular life.

The Cas9 protein covers multiple roles: it plays a small part in processing crRNAs, handles the entire target binding and reading system and the entire cleavage system, and plays a small part in spacer insertion as well. It’s because of this multi-tiered functionality that Cas9 is thought of as the most important figure in all the CRISPR systems, since it doesn’t require any of the other Cas proteins, except for Cas1 and Cas2 for spacer insertion.

Type II Subtypes (Csn2 and Cas4)

Similar to Type III, there are also two primary subtypes that make up Type II Cas9 systems. The first, dubbed Type II-A, uses the protein Csn2, which helps in integrating new spacers into the CRISPR locus. What exactly it does in this process is still somewhat unknown, though it is suspected that it helps bind the new DNA fragments into place and protects them from degradation or errors, since CRISPR spacers have to be precise to be useful. It may also recruit DNA repair molecules to seal the breaks around the spacers during insertion.

Type II-B, by comparison, uses the Cas4 family of proteins and, as with its sibling subtype, the actual function isn’t quite clear. Because Cas4 is not directly linked to the CRISPR system, unlike Csn2, it is believed that it may instead be involved in regulating the immune defense portion of the Cas complex.

CRISPR-Cpf1 (Type V)

The final type of system to discuss is much rarer and took longer to discover and properly characterize. In many ways, it is similar to Cas9, which is why they ended up being classed together. The Cpf1 protein takes the same position as Cas9 and most of its functionality, acting as a single-protein effector that binds crRNAs, singles out the target viral DNA, and cleaves it. It does not take part in spacer insertion, though, which is covered by other proteins.

The primary distinction between the two systems is that Cpf1 creates different types of ends on DNA fragments when cutting open a genome for insertion. Rather than the blunt ends that Cas9 makes, Cpf1 creates sticky ends, where one strand sticks out past the other as an overhang of several unpaired nucleotides. In many ways, these sorts of cuts are similar to those made by restriction endonucleases, which are enzymes that cut DNA at or near particular nucleotide sequences, depending on which endonuclease it is.

For a lot of cases in molecular biology, a sticky-end cutting process is preferable, because the overhanging nucleotides on a fragment to be inserted can be made complementary to the overhangs left on the genome, meaning they will naturally pair up with each other. As a result, the insert settles into place, and in the right orientation, largely by itself once the cut has occurred.

Blunt ends are more complicated. Since they are straight up-and-down cuts with every nucleotide already paired, nothing holds the inserted fragment in position or fixes its orientation, and a ligase alone must join its ends to the rest of the DNA strands. This makes the process less efficient and gives it, however small, a higher chance of an error occurring than with sticky ends.
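The geometric difference between the two kinds of cut is easy to show in code. The sketch below models a DNA duplex by its top strand and cuts each strand at a chosen position: equal positions give blunt ends, offset positions give sticky ones. The function names, the toy sequence, and the cut positions are all assumptions chosen for illustration, not the actual positions at which Cas9 or Cpf1 cut.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def cut(top, top_cut, bottom_cut):
    """Cut a duplex given by its top strand.

    The bottom strand is the base-by-base complement, written so that it
    lines up under the top strand. Each strand is broken at its own position
    (both in top-strand coordinates). Returns the left and right fragments as
    (top, bottom) pairs.
    """
    bottom = top.translate(COMPLEMENT)
    left = (top[:top_cut], bottom[:bottom_cut])
    right = (top[top_cut:], bottom[bottom_cut:])
    return left, right


def overhangs(top, top_cut, bottom_cut):
    """Return the unpaired top-strand and bottom-strand bases left by the cut."""
    bottom = top.translate(COMPLEMENT)
    low, high = sorted((top_cut, bottom_cut))
    return top[low:high], bottom[low:high]


duplex = "AAGCTTGGATCCAA"  # toy sequence

blunt = cut(duplex, 6, 6)      # both strands cut at the same place
sticky = cut(duplex, 4, 8)     # cuts offset by four bases

print("blunt: ", blunt, overhangs(duplex, 6, 6))
print("sticky:", sticky, overhangs(duplex, 4, 8))
```

With the offset cut, each fragment carries a four-base single-stranded tail, and the two tails are complementary to one another, which is exactly what lets sticky ends find each other again. The blunt cut leaves no tails at all.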

An additional factor in Cpf1’s favor compared to Cas9 is that it is smaller: it lacks some of the domains Cas9 has, does not need the extra machinery Cas9 relies on for crRNA processing, and has one fewer component for cleavage, none of which are necessary for scientific uses of the systems. This smaller size then means that Cpf1 can be inserted into more types of cells and also packaged into viral vectors or plasmids alongside extra guide crRNAs that allow for more precision and additional fragments to target.

It’s quite possible that in the near future, Cpf1 will eclipse Cas9 in popularity due to its many advantages. But only time will tell in that regard.
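To pull the classification from this section together, here is a small lookup table in Python. It only restates what was described above. The dictionary layout, the field names, and the helper types_in_class are invented for this summary, and placing the tentative Type VI in Class 2 is an assumption based on C2c2 being described as a single-component effector.

```python
# Summary of the Cas system types as described in this article.
CAS_TYPES = {
    "I": {"cas_class": 1, "effector": "Cascade", "tentative": False},
    "II": {"cas_class": 2, "effector": "Cas9", "tentative": False},
    "III": {"cas_class": 1, "effector": "Cmr / Csm", "tentative": False},
    "IV": {"cas_class": 1, "effector": "poorly understood", "tentative": False},
    "V": {"cas_class": 2, "effector": "Cpf1", "tentative": False},
    # Assumed placement: the article only says C2c2 "might" form a Type VI.
    "VI": {"cas_class": 2, "effector": "C2c2", "tentative": True},
}


def types_in_class(cas_class, include_tentative=False):
    """List the type names belonging to a class, in numeral order."""
    return [
        name
        for name, info in CAS_TYPES.items()
        if info["cas_class"] == cas_class
        and (include_tentative or not info["tentative"])
    ]


print("Class 1 (multi-protein effectors):", types_in_class(1))
print("Class 2 (single-protein effectors):", types_in_class(2))
```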

The End (For Now)

And that’s the end of this primer on CRISPR and the main Cas systems available for it. There are still plenty of other discoveries within the CRISPR field to discuss and new types of systems being discovered practically daily, but as an overview of the topic, this feels comprehensive enough.

I hope this guide through the history of CRISPR, how it functions, and what the different types of Cas systems accomplish has been helpful to your understanding of CRISPR and its place of importance within the field of biology.

Further Reading

1. The New Smallest Version of CRISPR-Cas9 Has Been Discovered – A look into the specifics on Cas9 system sizes and the search for smaller variants that allow greater functionality, specifically the discovery of CjCas9.

2. A New Class of CRISPR Detailed and Its Variants Discovered – An overview of the new Type VI class that C2c2 (Now known as Cas13) has been assigned to.

3. RNA-Targeting CRISPR-Cas9 Cures Neurological Disease in Living Cells – A better explanation of how the PAM sequence functions, along with a discussion on how Cas9 has been expanded to work with RNA.

4. The Relaxed System of CRISPR-Cas10 Is Deadly To Mutated Viruses – A look at CRISPR-Cas10 (discussed above as Csm) and its ability to target even mutated gene sequences.

References

1. Zimmer, C. (2015, February 6). Breakthrough DNA Editor Born of Bacteria. Quanta Magazine. Retrieved March 10, 2017, from https://www.quantamagazine.org/20150206-crispr-dna-editor-bacteria/

2. Jinek, M., Chylinski, K., Fonfara, I., Hauer, M., Doudna, J. A., & Charpentier, E. (2012). A Programmable Dual-RNA–Guided DNA Endonuclease in Adaptive Bacterial Immunity [Abstract]. Science, 337(6096), 816-821. doi:10.1126/science.1225829

3. Marraffini, L. A., & Sontheimer, E. J. (2010, March). CRISPR interference: RNA-directed adaptive immunity in bacteria and archaea. Nature Reviews: Genetics, 11(3), 181-190. doi:10.1038/nrg2749

4. Abudayyeh, O. O., Gootenberg, J. S., Konermann, S., Joung, J., Slaymaker, I. M., Cox, D. B., . . . Zhang, F. (2016, June). C2c2 is a single-component programmable RNA-guided RNA-targeting CRISPR effector [Abstract]. Science, 353(6299). doi:10.1126/science.aaf5573

5. Barrangou, R. (2015). Diversity of CRISPR-Cas immune systems and molecular machines. Genome Biology, 16, 247. doi:10.1186/s13059-015-0816-9

6. Pečnerová, P. (2016, September 8). The almighty CRISPR-Cas9 technology: How does it work? The Molecular Ecologist. Retrieved March 20, 2017, from http://www.molecularecologist.com/2016/09/the-almighty-crispr-cas9-technology-how-does-it-work/

7. Karginov, F. V., & Hannon, G. J. (2010, January). The CRISPR system: small RNA-guided defense in bacteria and archaea. Molecular Cell, 37(1), 7. doi:10.1016/j.molcel.2009.12.033

8. Taylor, D. W., Zhu, Y., Staals, R. H., Kornfeld, J. E., Shinkai, A., Van der Oost, J., . . . Doudna, J. A. (2015, May). Structures of the CRISPR-Cmr complex reveal mode of RNA target positioning [Abstract]. Science, 348(6234), 581-585. doi:10.1126/science.aaa4535

9. Staals, R. H. et al. (2014, November). RNA Targeting by the Type III-A CRISPR-Cas Csm Complex of Thermus thermophilus. Molecular Cell, 56(4), 518-530. doi:10.1016/j.molcel.2014.10.005

10. Makarova, K. S. et al. (2015, September). Figure 1: Functional classification of Cas proteins (from “An updated evolutionary classification of CRISPR–Cas systems”). Nature Reviews: Microbiology, 13(11), 722. doi:10.1038/nrmicro3569

11. Chylinski, K. et al. (2014, June). Classification and evolution of type II CRISPR-Cas systems. Nucleic Acids Research, 42(10), 6091-6105. doi:10.1093/nar/gku241

12. Ledford, H. (2015, September 25). Alternative CRISPR system could improve genome editing. Nature, 526(7571), 17. doi:10.1038/nature.2015.18432