13 May 2024

Fragments in cells, writ large

Earlier this year we highlighted work in which a dozen fragments were screened against cells to look for noncovalent binders across the proteome. A new paper in Science by Georg Winter and collaborators at the Austrian Academy of Sciences, Pfizer, and several other organizations ups the game by more than an order of magnitude, and uses machine learning to make predictions about fragments’ cellular destinations and binding partners. (See also Derek Lowe’s post here.)
 
The researchers started with 407 diverse fully functionalized fragments (FFFs), which as we previously discussed consist of a variable fragment coupled to a photoreactive group and an alkyne moiety that can be used to pull down any bound proteins using click chemistry. These were selected from a larger set of ~6000 FFFs available from Enamine. The FFFs were incubated at 50 µM with intact HEK293T cells, followed by ultraviolet crosslinking.
 
Next, cells were lysed and treated with a biotin-azide probe that reacts with the alkyne on the FFFs. Covalently modified proteins were captured on streptavidin resin and proteolytically digested. Tandem mass tag (TMT) proteomics, which we wrote about here, was used to identify captured proteins. Unlike earlier methods, the researchers did not pinpoint the specific fragment binding sites on proteins.
 
In total the researchers found 2667 proteins bound to one or more fragments, of which ~86% had no reported ligands. Both proteins and ligands varied considerably in promiscuity: some proteins bound to more than half of the FFFs, and some fragments bound to hundreds of proteins, while others bound only a few, or none. To look for specific interactions, the researchers focused on proteins bound by fewer than 10 different ligands.
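
As a rough illustration of that filtering step, here is a minimal Python sketch. It assumes the screen results have been flattened into (fragment, protein) pairs. The identifiers, the toy pairs, and the function names are all made up for this example and do not come from the paper or the Ligand Discovery site.

```python
from collections import defaultdict

def partners(pairs, key_index):
    """Group (fragment, protein) pairs by one side and collect the other.

    key_index=0 groups by fragment, key_index=1 groups by protein.
    """
    grouped = defaultdict(set)
    for pair in pairs:
        grouped[pair[key_index]].add(pair[1 - key_index])
    return dict(grouped)

def selective_proteins(pairs, max_ligands=10):
    """Return proteins bound by fewer than max_ligands distinct fragments."""
    ligands_by_protein = partners(pairs, 1)
    return sorted(
        protein
        for protein, fragments in ligands_by_protein.items()
        if len(fragments) < max_ligands
    )

# Toy pairs invented for illustration; not data from the paper.
TOY_PAIRS = [
    ("F001", "PROT_A"),
    ("F002", "PROT_A"),
    ("F001", "PROT_B"),
] + [(f"F{i:03d}", "PROT_C") for i in range(1, 13)]  # PROT_C binds 12 fragments

print(selective_proteins(TOY_PAIRS))       # proteins with fewer than 10 ligands
print(len(partners(TOY_PAIRS, 0)["F001"])) # number of proteins bound by F001
```

The same grouping run in the other direction gives a per-fragment count, which is one simple way to see how promiscuous each fragment is.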
 
Three protein-ligand interactions were analyzed in some detail: the kinase CDK2 (and other CDK family members), the adapter protein DDB1, and the solute carrier protein SLC29A1. In each case the researchers confirmed the results from their chemoproteomic screens. Follow-up studies with related molecules led to more potent derivatives, with a CDK2 inhibitor showing low micromolar activity in a biochemical assay and an SLC29A1 inhibitor showing micromolar activity in a cell-based assay.
 
The researchers also found patterns in their larger data set. Armed with 47,658 protein-ligand interactions, they used machine learning to start predicting which molecular features were associated with binding. They classified fragments as promiscuous or nonpromiscuous and built a promiscuity model. Molecules with higher lipophilicity and a greater fraction of aromatic carbon atoms tended to be more promiscuous, but the model could correctly categorize compounds as promiscuous even if they had lower ClogP values, or nonpromiscuous even if they had higher ClogP values.
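
To make the idea of a promiscuity classifier concrete, here is a small Python sketch. It is not the authors' model: the summary above does not say which algorithm they used, so plain logistic regression stands in, and it is trained on just two descriptors (ClogP and fraction of aromatic carbons). The eight data points are invented so that one promiscuous fragment has a low ClogP and one nonpromiscuous fragment has a high ClogP. All names are hypothetical.

```python
import math

# Toy descriptors invented for illustration: (ClogP, fraction of aromatic carbons).
# Label 1 = promiscuous, 0 = nonpromiscuous. These are NOT values from the paper.
TOY_FRAGMENTS = [
    ((4.0, 0.8), 1),
    ((3.5, 0.7), 1),
    ((2.0, 0.9), 1),  # low ClogP but labeled promiscuous
    ((4.5, 0.6), 1),
    ((1.0, 0.2), 0),
    ((1.5, 0.3), 0),
    ((3.8, 0.1), 0),  # high ClogP but labeled nonpromiscuous
    ((0.5, 0.4), 0),
]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def score(weights, bias, x):
    """Predicted probability that a fragment with descriptors x is promiscuous."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

def train_logistic(data, lr=0.3, epochs=50_000):
    """Fit logistic regression by full-batch gradient descent."""
    n_features = len(data[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in data:
            err = score(weights, bias, x) - y  # gradient of the log loss
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def predict_promiscuous(model, x):
    weights, bias = model
    return score(weights, bias, x) >= 0.5

MODEL = train_logistic(TOY_FRAGMENTS)
for x, y in TOY_FRAGMENTS:
    print(x, "label:", y, "predicted:", int(predict_promiscuous(MODEL, x)))
```

No single ClogP cutoff separates the two toy classes, but adding the second descriptor does, which is the flavor of the result described above. A real model would of course use many more descriptors and be judged on held-out fragments.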
 
Beyond promiscuity, the researchers used machine learning to predict other behavior, such as subcellular localization. A relatively easy case was to predict which molecules would accumulate in lysosomes; these tended to be hydrophobic basic amines. More impressively, the researchers could predict fragments likely to bind to transmembrane transporters, RNA binding proteins, and even intrinsically disordered proteins. And this is just the start: they hope one day to predict “target proteins from an input chemical structure alone.”
 
Perhaps most exciting, all of the data and models are available for free at Ligand Discovery. You can explore the proteins bound across all 407 fragments, input one or more proteins and find ligands, predict whether any given FFF is likely to be promiscuous or not, and even “build a machine learning model on the fly to predict potential interactions.” 
 
Check it out and let us know your experience.
