The CRISPR recipe has two main parts: a “scissor” Cas protein that cuts or nicks the genome and a “bloodhound” RNA guide that tethers the scissor protein to the target gene.
The team then trained their ProGen2 language model, which was fine-tuned for protein discovery, using the CRISPR atlas.
The AI eventually generated four million protein sequences with potential Cas activity.
The AI also designed proteins resembling Cas13, which targets RNA, and Cas12a, which is more compact than Cas9.
They trained the AI on roughly 240,000 different Cas9 protein sequences from a wide range of microbes, with the goal of generating similar proteins that could replace natural ones but with higher efficacy or precision.
With the CRISPR-Cas atlas, the team also trained AI to generate an RNA guide when given a protein sequence.
The result is a CRISPR gene editor with both components, the Cas protein and the RNA guide, designed by AI. Dubbed OpenCRISPR-1, the editor showed gene editing activity similar to that of classic CRISPR-Cas9 systems when tested in cultured human kidney cells.