Harnessing Generative AI to Treat Undruggable Diseases
A new platform can design and match small peptides with complex, tangled proteins previously considered unreachable
Biomedical engineers at Duke University have developed an AI-based platform that designs short proteins, termed peptides, capable of binding and destroying previously undruggable disease-causing proteins. Inspired by OpenAI’s CLIP model, which matches images with their captions, their new algorithm can rapidly prioritize peptides for experimental testing.
The work appeared Jan. 22 in the journal Science Advances.
One approach to treating disease is to develop therapeutics that can specifically target and destroy the proteins driving it. Sometimes these key proteins have well-defined structures, like a neatly folded origami crane, so conventional small molecule therapies can easily bind to them. But more than 80% of disease-causing proteins instead resemble a messy ball of yarn, disordered and tangled, making it incredibly difficult for standard therapies to find a pocket on the surface to latch onto and do their job.
To circumvent this issue, researchers have explored how peptides can be used to bind to and degrade disease-causing proteins. Because peptides are small versions of proteins, they don’t require surface pockets for binding. Instead, they can bind to various amino acid sequences throughout the protein. But even these approaches have their limits, as existing “off-the-shelf” binders have not been designed to attach to unstable or overly tangled protein structures. While scientists have been working on developing new binding proteins, these approaches still rely on mapping out the 3D structural information of the target protein, which is not available for disordered targets.
Rather than try to map out the structures of the disease-causing proteins, Pranam Chatterjee, an assistant professor of biomedical engineering at Duke, and his team took inspiration from generative large language models (LLMs) to create a solution. The result is PepPrCLIP, or Peptide Prioritization via CLIP. The first component of their tool, PepPr, uses a generative algorithm trained on a vast library of natural protein sequences to design new ‘guide’ proteins with specified characteristics. CLIP, the second component of their platform, uses an algorithmic framework initially developed by OpenAI to match images with their corresponding captions; here, it screens which of these peptides match their target proteins. Unlike the structure-based approaches, the CLIP model requires only the target’s sequence.
“OpenAI’s CLIP algorithm connects language with an image,” said Chatterjee. “If you have text that says ‘dog,’ you should get an image of a dog. Instead of language and image, we trained it to match peptides and proteins. PepPr makes the peptides, and our adapted CLIP algorithm will screen those peptides and tell us which ones will make a good match.”
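The generate-then-screen workflow described above can be illustrated with a small Python sketch. This is not the team's implementation: the function names, the random candidate generator, and the residue-frequency "embedding" are all hypothetical placeholders standing in for the trained generative model and the learned peptide and protein encoders. The only point it aims to show is the CLIP-style idea of placing peptides and a target in the same vector space and ranking candidates by similarity, using nothing but the target's sequence.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def embed(sequence):
    """Toy stand-in for a learned encoder: unit-length residue-frequency vector."""
    counts = [sequence.count(aa) for aa in AMINO_ACIDS]
    norm = math.sqrt(sum(c * c for c in counts)) or 1.0  # avoid dividing by zero
    return [c / norm for c in counts]


def similarity(a, b):
    """Dot product of two unit vectors, which equals their cosine similarity."""
    return sum(x * y for x, y in zip(a, b))


def generate_candidates(n, length, seed=0):
    """Placeholder for the generative step: random peptides, not designed ones."""
    rng = random.Random(seed)
    return ["".join(rng.choice(AMINO_ACIDS) for _ in range(length)) for _ in range(n)]


def rank_peptides(target_sequence, candidates, top_k=5):
    """Score each candidate against the target sequence alone; keep the top_k."""
    target_vec = embed(target_sequence)
    scored = [(similarity(embed(p), target_vec), p) for p in candidates]
    scored.sort(reverse=True)  # highest similarity first
    return scored[:top_k]


if __name__ == "__main__":
    # Made-up sequence for illustration only, not a real disease target.
    target = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
    for score, peptide in rank_peptides(target, generate_candidates(100, 15)):
        print(f"{peptide}  {score:.3f}")
```

In a real system of this kind, the two encoders would be trained so that known peptide-protein binding pairs score highly and mismatched pairs score poorly, in the same way CLIP is trained on matching image-caption pairs. The toy embedding here carries no such binding information, so its rankings are meaningless beyond demonstrating the mechanics.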
In a comparison against RFdiffusion, an existing platform for generating peptides using the 3D structure of a target, PepPrCLIP was faster and was able to create peptides that were almost always a better match for their targeted protein. To gauge how well PepPrCLIP could work with both ordered and disordered protein targets, Chatterjee and his lab partnered with researchers from Duke University Medical School, Cornell University, and Sanford Burnham Prebys Medical Discovery Institute to experimentally test the platform.
In the first test, the team showed that PepPrCLIP-generated peptides could effectively bind to and inhibit the activity of UltraID, a relatively simple and stable enzyme protein. Next, they used PepPrCLIP to design peptides that could attach to beta-catenin, a disordered, complex protein involved in signaling for several different types of cancer. The team generated six peptides that CLIP indicated could bind to the protein and showed that four could effectively bind to and degrade their target. By destroying the protein, they can slow down cancer cell signaling.
In their most complicated test, the team designed peptides that could bind to a highly disordered protein associated with synovial sarcoma, a rare, aggressive cancer that can develop in soft tissue and mostly affects children and young adults. According to Chatterjee, “It’s like a bowl of spaghetti. It’s the most disordered protein in the world.”
The team tested 10 designs by putting their peptides into synovial sarcoma cells. They observed that the PepPrCLIP-designed peptides could both bind and degrade the protein, just as they had with simpler targets. And if they can destroy the protein, they have an opportunity to develop a therapy for a previously undruggable cancer.
Beyond continuing to improve their platform, Chatterjee and his team plan to partner with medical and industry professionals to begin creating peptides that could eventually be used in new therapies for diseases caused by unstable proteins, such as Alexander’s Disease, a fatal neurological disease that primarily affects children, and different types of cancers.
“These complex, disordered proteins have made a lot of cancers and diseases practically undruggable because we couldn’t design molecules that bind to them,” said Chatterjee. “But PepPrCLIP showed that it could work on even the most complicated protein, and that opens up a lot of exciting clinical possibilities.”