Showing posts with label artifact. Show all posts
Showing posts with label artifact. Show all posts

18 August 2025

Hundreds of crystallographic ligands for FABP4 – many not as expected

The ten human fatty-acid binding proteins (FABPs) shuttle lipids around cells. As we noted several years ago, FABP4 and FABP5 are potential drug targets for diabetes and atherosclerosis, but selectivity over FABP3 is needed to avoid cardiotoxicity. Markus Rudolph and colleagues at Hoffmann-La Roche describe progress towards selective molecules in three consecutive open-access Acta. Cryst. D papers. Perhaps more importantly, they gift a massive high quality data set to the scientific community – along with some important caveats about data for protein-ligand structures.
 
The first paper focuses on purification and NMR characterization of FABP4. Recombinant FABPs are normally expressed in E. coli, and they always contain natural fatty acids that copurify with the protein. This can complicate ligand binding studies, since the endogenous fatty acids act as competitors. Indeed, the researchers highlight two structures in the protein data bank (PDB) whose supposed ligands are probably fatty acids.
 
To solve this problem, the researchers denature FABP4, separate the fatty acid, and then refold the protein. This truly apo form of the protein was studied by NMR, revealing that the protein becomes more rigid upon ligand-binding.
 
The second paper is of more general interest. It reports a set of 229 crystal structures of various FABPs, of which 216 have a bound ligand. Of these, 75 have associated IC50 values for at least one FABP, and 50 compounds have IC50 values reported for FABP3, FABP4, and FABP5. Importantly, the structures are solved to high resolution, with a median of 1.12 Å. Two crystal forms are particularly suitable for soaking, and compounds were typically soaked at 60 mM in 30% DMSO overnight.
 
All the crystal structures are deposited in the PDB, and all the binding data are provided in the supporting information. Given FABPs’ predilection for carboxylic acids, the ligands contain a variety of carboxylic acid mimetics. This wealth of high-quality data should be valuable for constructing machine-learning binding models, and the researchers conclude by calling “on other industrial organizations to also make their legacy data available such that prediction models with broader applicability may be developed more quickly.”
 
But it was the third paper that really caught my attention: the researchers summarized it as “what is written on the bottle is not what is in the crystal.” In fact, of the 216 ligands reported, a whopping 33 (15%) do not match the compound registered. These are grouped into several categories and described in detail.
 
Human error is the simplest to explain: the researchers show an example where a 1,2-benzoxazole was registered as a 1,3-benzoxazole. Because the molecules have the same molecular weight, mass spectrometry could not distinguish them. Similarly, the researchers find several cases where the wrong enantiomer or diastereomer was registered. In another case, a racemic mixture led to a single enantiomer bound to FABP4, with the protein acting as a “chiral sponge.”
 
Other cases are more unusual, and include ring closing, ring opening, acyl shifts, hydrolysis, and instances of ligand decomposition or incomplete reactions. The researchers note that small amounts of impurities could be particularly problematic at the high ligand concentrations used for soaking; they calculate that just 0.06% impurity would be equivalent to the total amount of FABP in a crystal. Some fragment screens are done at even higher concentrations, further increasing the risk of enriching impurities.
 
A 15% rate of unexpected ligands is comparable to the numbers we blogged about here, but those were commercial libraries, whereas this set is from Roche, which likely has better internal quality control. One factor that led to the recognition of the problem is the high resolution, where a single atom change could be readily seen. Another is the buried nature of the ligands; ligands bound on the surface of a protein may have more dynamically disordered bits, which would be difficult to distinguish from missing moieties caused by decomposition.
 
Indeed, the researchers examine two other proteins, PDE10 and ATX, for which they have also released ~200 ligand-bound structures but at lower average resolutions. There are some unexpected ligands for these proteins too, but many fewer than for the FABPs – or perhaps we just can’t observe some of them.
 
As we noted back in 2014, up to a quarter of ligand-containing crystal structures in the PDB may contain serious errors, and the researchers cite a study suggesting that 12% are “just bad.” These could have obvious negative consequences for training computational models, and the researchers call on the community to set standards to create a rigorously chosen training set. Perhaps this discussion could be held in parallel with the discussion on how to house fragment screening data, which we wrote about last month.

04 August 2025

The Chemical Probes Portal turns ten. Use it!

Last week we highlighted a new tool to computationally predict whether a molecule might aggregate, thereby causing false positives. This doesn’t necessarily mean the molecules are bad (after all, some approved drugs aggregate), but it’s all too easy to screen molecules under inappropriate conditions. This brings up the topic of chemical probes, and as it happens the Chemical Probes Portal turns ten years old this year, as celebrated in a Cancer Cell Commentary by Susanne Müller, Domenico Sanfelice, and Paul Workman and a blog post by Ben Kolbington at the Institute of Cancer Research.
 
We first wrote about the Chemical Probes Portal in July 2015, when it contained just 7 compounds. When we returned in 2023 it contained more than 500 compounds, and by the end of last year the number was up to 803. As of today it lists 1174 probes for 622 targets. Nearly a third of the probes also have chemically related inactive controls. These seem like large numbers, but the the human genome conservatively encodes for some 20,000 proteins, and the ambitious Target 2035 initiative seeks chemical probes for all of them.
 
The new paper emphasizes that the standards are in some ways higher for chemical probes than for approved drugs: “whereas probes principally require a high degree of selectivity, drugs need ‘only’ to be safe and effective and may often hit several targets.” Dimethyl fumarate comes to mind as a highly promiscuous covalent modifier that is nonetheless a useful drug for multiple sclerosis and psoriasis.
 
Even when a compound hits a target of interest, that doesn’t mean any biological effects observed are due to the target, particularly when the readout is cell death. The researchers note that TH588 was originally reported as a potent inhibitor of MTH1, but it actually kills cancer cells by binding to tubulin, a fact not always mentioned by chemical suppliers. Another study found that ten clinical compounds were still active in cells even when their putative target was knocked out using CRISPR.
 
The tone of the Commentary is pragmatic, emphasizing that for new or difficult targets, it may be difficult to find good chemical probes. For example, LY294002 is mentioned as a “pathfinder tool” that was useful to explore the biology around the PI3 kinase family but has now been superseded by more selective molecules.
 
Unfortunately, not everyone seems to have gotten the message. Curcumin, which as we noted can aggregate, form nonselective covalent adducts, fluoresce, and generate reactive oxygen species, appears in >2600 PubMed publicationsjust in the past year. What a waste.
 
If you’re exploring the biology of a target, please check the Portal to see whether there are good probes. If you’re reading (or reviewing!) a paper that reports small molecule studies, please check to see whether the probe has been assessed - especially to see if it shows up as one of more than 250 Unsuitables. And if you’re interested in participating, please consider reviewing or even hosting a Probe Hackathon.

28 July 2025

Can machine learning help you avoid SCAMs?

Among the many types of artifacts that can fool screens and derail efforts to find leads, small colloidally aggregating molecules (SCAMs) are particularly pernicious. As we discussed way back in 2009, these molecules can form aggregates in aqueous buffer that interfere with a variety of assays, leading to wasted resources and embarrassing publications.
 
The problem is that there isn’t necessarily anything wrong with the molecules per se, and even many approved drugs can form aggregates. Thus, it is difficult to predict whether any given molecule will be a troublemaker. In a new (open-access) Angew. Chem. Int. Ed. paper, Pascal Friederich, Rebecca Davis, and collaborators at Karlsruhe Institute of Technology and University of Manitoba Winnipeg explore whether machine learning can help.
 
The researchers built a Multi-Explanation Graph Attention Network, or MEGAN, which is accessible through a simple web interface. Rather than a homicidal doll, this MEGAN represents atoms as nodes and bonds as edges in a graph, similar to the Fragment Network we wrote about here. MEGAN was trained on a set of 12,338 aggregators and 177,048 non-aggregating molecules. Importantly, the researchers used explainable AI (xAI), which colors portions of the molecule according to their importance for (non)aggregation.
 
Testing MEGAN on a set of 1500 aggregators and 1500 non-aggregators, none of which were included in the training set, yielded an accuracy of 82%. Given that most molecules don’t aggregate, a model biased towards non-aggregators would be expected to have a high accuracy, and to account for this the researchers assessed the “F1” score, which was similarly impressive.
 
 
The researchers provide several examples in which subtle variations transform a molecule from a non-aggregator to an aggregator, and show that MEGAN correctly predicts these. Furthermore, it “shows its work,” highlighting the chemical features underlying the prediction. For example, 9H-pyrido[3,4-b]indole is predicted with 92% confidence not to be an aggregator.
 
 
Just adding a methyl group flips the odds in favor of aggregation to 92%.
 
Exploring the molecular features that lead to aggregation can reveal general trends, such as rigid, “flat” molecules with moieties that can serve either as hydrogen bond donors or acceptors. This is consistent with a paper we discussed last year, though unfortunately the researchers do not cite it.
 
To further assess the tool, it was tested against a set of drugs that had been characterized as aggregators or non-aggregators. MEGAN correctly classified 15 of 30 aggregators and 24 of 28 non-aggregators. In contrast, a different program caught only 2 of the aggregators. The researchers note that most of the training data for MEGAN came from a single screen in phosphate buffer at pH 7, and aggregation can be very dependent on buffer components and pH.
 
Practical Fragments has previously highlighted other aggregation predictors, most notably Aggregator Advisor and Liability Predictor. As for any computational model, the old chestnut “trust but verify” applies. MEGAN appears to be a useful tool, but please run physical experiments if the molecule is important.

24 March 2025

Fragments and nanodiscs: beware nonspecific binding

Membrane proteins make up roughly a quarter of human proteins, including many important drug targets. Biophysical methods for fragment screening typically require pure, isolated proteins, and removing membrane proteins from their native environment is not always possible. One solution has been to create nanodiscs which, as we described previously, are isolated little membranes containing the protein of interest. These nanodiscs can be immobilized to the sensor chips used for surface plasmon resonance (SPR), one of the most popular fragment finding methods. But in a recent open-access Chem. Biol. Drug Des. paper, Marcellus Ubbink and collaborators at Leiden University and ZoBio show that the precise composition of the nanodiscs can have a profound effect on the results.
 
The researchers chose cytochrome P450 3A4 (CYP3A4) as a model membrane protein. This enzyme metabolizes a large fraction of drugs and has a capacious active site able to bind a wide variety of substrates. Four different lipids were chosen for the nanodisc, all of which contained phosphatidylcholine headgroups and differing hydrophobic tails: POPC, DPPC, DMPC, and DPhPC.
 
Nanodiscs were prepared either with or without CYP3A4 and immobilized to SPR chips. Unlike some membrane proteins, it is possible to isolate and immobilize CYP3A4 in the absence of membranes, though the protein forms physiologically less relevant oligomers.
 
Next, the researchers examined 13 known (non-fragment) CYP3A4 ligands. Unfortunately, most of these bound to the empty nanodiscs, and in some cases more than ten ligands bound to a single empty nanodisc. This nonspecific binding correlated with lipophilicity, with only the three least lipophilic molecules showing no binding to empty nanodiscs. One of these was the antifungal drug fluconazole, with a clogP = 0.4. Happily though SPR studies using either free or nanodisc-bound CYP3A4 yielded dissociation constants of 10-20 µM, consistent with published values.
 
Thus encouraged, the researchers screened a diverse set of 140 fragments at 250 or 500 µM against empty and CYP3A4-loaded nanodiscs using SPR. Just as with the larger molecules, there was a good correlation between cLogP and nonspecific binding to empty nanodiscs. Fragments that bound to one type of nanodisc (POPC, for example) also tended to bind nonspecifically to other types of nanodiscs (DPPC, DMPC, and DPhPC). Fewer fragments bound nonspecifically to DMPC nanodiscs than to the others, suggesting this may be the best lipid to use.
 
Fragment hits were defined as those binding to CYP3A4-containing nanodiscs more than they bound empty nanodiscs (or, for isolated CYP3A4, the unmodified SPR chip). Hit rates varied dramatically, from 9 of 140 fragments tested against CYP3A4 in POPC nanodiscs to 33 of 140 tested against CYP3A4 in DMPC nanodiscs. There were also 33 hits against free CYP3A4, 11 of which were unique. However, all 11 of these are somewhat lipophilic (average cLogP ~2.3) and most also bound significantly to empty nanodiscs. The researchers suggest that these bind “aspecifically” to CYP3A4.
 
A Venn diagram of all the hits shows only two that bind to free CYP3A4 as well as all nanodiscs containing CYP3A4, and the researchers highlight these two as the most promising. Unfortunately these are not further characterized.
 
Near the beginning of the paper, the researchers note that very few fragment screens have been conducted against membrane proteins incorporated into nanodiscs. This analysis suggests why this is so. If you use nanodiscs, make sure to consider different types of ligands. And look carefully for nonspecific binding.

09 December 2024

They may be cons, but they’re our CONS

Practical Fragments has written repeatedly about various assay artifacts (vide infra). Different technologies are susceptible to different interference mechanisms, making general rules difficult. Earlier this year we wrote about the Metal Ion Interference Set, or MIIS: a collection of a dozen salts that could be used to assess the sensitivity of assays to metal contaminants. In a recent open-access JACS Au paper, Huabin Hu (Uppsala University), Jonathan Baell (Monash University), and collaborators extend the concept to small molecules.
 
The researchers have compiled a Collection Of useful Nuisance compounds, or CONS, perhaps with a nod to “Chemical con artists foil drug discovery” published a decade ago, which we highlighted here. The 103 members of the CONS are divided into three categories.
 
The first set contains five aggregators: molecules that have been shown to form colloidal clusters that non-specifically interfere with biological assays, as discussed here.
 
The largest set, at 67 members, consists of PAINS, or pan-assay interference compounds, which we first wrote about in 2010. These are themselves divided into various subcategories: non-specific electrophiles such as curcumin and an isothiazolone, redox cyclers such as quinones, contaminants such as the decomposition products of certain fused tetrahydroquinolines, miscellaneous, metal chelators, and additional mechanisms including optical interference and singlet oxygen quenchers, which are particularly problematic in AlphaScreen assays.  
 
The last set consists of 31 compounds that can cause problems in phenotypic assays. Some of these non-specifically disrupt cell membranes. Others have well-defined but toxic effects, such as interfering with tubulin or intercalating into DNA. Such bioactivity is not always a bad thing: some of these molecules, such as topotecan and colchicine, are approved drugs, but it’s useful to be aware of whether these types of activities will affect your assay.
 
One criticism of the PAINS concept is that it lumps together multiple mechanisms. (Pete Kenny wrote about this recently.) Another criticism is that, by focusing on chemical substructures, true hits may be unfairly deprioritized based on structure alone. What’s nice about the CONS list is that the potentially interfering mechanisms of each molecule are documented and categorized so they can be considered when establishing an assay. For example, you may not care whether a compound interferes in a phenotypic assay if you are performing a screen on an isolated enzyme.
 
The entire set of compounds is available from Enamine, and additional vendors are provided in a supplementary table. If you’re doing a lot of assays, particularly on new targets and mechanisms, it may be worth testing the CONS to understand what kinds of false positives might occur.

04 November 2024

Catching virtual cheaters

As experienced practitioners of fragment-based lead discovery will know, the best way to avoid being misled by artifacts is to combine multiple methods. (Please vote on which methods you use if you haven’t already done so.) Normally this advice is for physical methods, but what’s true in real life also applies to virtual reality, as demonstrated in a recent J. Med. Chem. paper by Brian Shoichet and collaborators at University of California San Francisco, Schrödinger, and University of Michigan Ann Arbor.
 
The Shoichet group has been pushing the limits of computational screening using ever larger libraries. Five years ago they reported screens of more than 100 million molecules, and today multi-billion compound libraries are becoming routine. But as more compounds are screened, an unusual type of artifact is emerging: molecules that seem to “cheat” the scoring function and appear to be virtual winners but are completely inactive when actually tested. Although rare, as screens increase in size these artifacts can make up an increasingly large fraction of hits.
 
Reasoning that these types of artifacts may be peculiar to a given scoring function, the researchers decided to rescore the top hits using a different approach to see whether the cheaters could be caught. They started with a previous screen in which 1.71 billion molecules had been docked against the antibacterial target AmpC β-lactamase using DOCK3.8, and more than 1400 hits were synthesized and tested. These were rescreened using a different scoring approach called FACTS (fast analytical continuum treatment of solvation). Plotting the scores against each other revealed a bimodal distribution, with most of the true hits clustering together. Of the 268 molecules that lay outside of this cluster, 262 showed no activity against AmpC even at 200 µM.
 
Thus encouraged, the researchers turned to other studies in which between 32 and 537 compounds had been experimentally tested. The top 165,000 to 500,000 scoring hits were tested using FACTS, and 7-19% of the initial DOCK hits showed up as outliers and thus likely cheaters. For six of the targets, none of these outliers were strong hits. For each of the other three, a single potent ligand had been flagged as a potential cheater.
 
To evaluate whether this “cross-filtering” approach would work prospectively as well as retrospectively, the researchers focused on 128 very high scoring hits from their previous AmpC virtual screen that had not already been experimentally tested. These were categorized as outliers (possible cheaters) or not and then synthesized and tested. Of the 39 outliers, none were active at 200 µM. But of the other 89, more than half (51) showed inhibition at 200 µM, and 19 of these gave Ki values < 50 µM. As we noted back in 2009, AmpC is particularly susceptible to aggregation artifacts, so the researchers tested the ten most potent inhibitors and found that only one formed detectable aggregates.
 
In addition to FACTS, the researchers also used two other computational methods to look for cheaters: AB-FEP (absolute binding free energy perturbation) and GBMV (generalized Born using molecular volume), both of which are more computationally intensive than either FACTS or DOCK. Interestingly, GBMV performed worse than FACTS, finding at best only 24 cheaters but also falsely flagging 9 true binders. AB-FEP was better, finding 37 cheaters while not flagging any of the experimentally validated hits.
 
This is an important paper, particularly as virtual screens of multi-billion compound libraries become increasingly common. Indeed, the researchers note that “as our libraries grow toward trillions of molecules… there may be hundreds of thousands of cheating artifacts.”
 
And although the researchers acknowledge that their cross-filtering aproach has only been tested for DOCK, it seems likely to apply to other computational methods too. I look forward to seeing the results of these studies.

29 July 2024

How to avoid metal artifacts

Back in 2017 we observed with characteristic subtlety that “heavy metals suck.” That post described a hit-finding campaign which foundered when the apparent activity of the fragments turned out to be due to contaminating zinc. A new paper in J. Med. Chem. by Thomas Gerstberger, Peter Ettmayer, and colleagues at Boehringer Ingelheim (BI) describes a similar story, along with suggestions of how to avoid being misled.
 
BI had a collaboration with FORMA Therapeutics that entailed screening roughly 1.7 million compounds against ten targets using biochemical and cell-based assays. The effort resulted in chemical probes against BCL6 and SOS1 and a clinical compound against the latter. Another target was the activated (GTP-loaded) form of KRASG12D. Of the 6917 hits from the primary AlphaScreen assay, 1535 gave dose-response curves and passed various counter screens. Of these, 87 representative compounds were tested in STD NMR and thermal shift assays. Only seven confirmed by STD NMR, but these did not confirm by SPR or crystallography.
 
In parallel, the researchers were successfully using FBLD to develop inhibitors of KRASG12D, which we wrote about here. Some of the fragment hits were structurally similar to those from the HTS screen, and further searching of the FORMA library led to fairly potent (high nanomolar or low micromolar) hits in the AlphaScreen assay. Two of these even yielded crystal structures, though despite their chemical similarity to one another they bound to the protein in completely different orientations.
 
Unfortunately, follow-up work “revealed erratic structure-activity relationships,” and upon resynthesis the compounds were much less active. At this point the researchers became suspicious, and analyses of the original samples showed they contained >20,000 ppm of palladium contamination. Furthermore, PdCl2 itself turned out to be a low micromolar inhibitor in the assay.
 
Metals are frequently used as catalysts or reagents in organic synthesis and can be difficult to completely remove during purification. Worse, their presence is often not detectable using standard purity assessments such as HPLC and NMR. Particularly in the case of fragments, which are expected to have low affinities, a small amount of metal contaminant could give a reasonable-looking but misleading signal in an assay.
 
To avoid this problem in the future the researchers developed a Metal Ion Interference Set, or MIIS, consisting of a dozen different metal ions and other salts, all soluble in DMSO so as to be compatible with typical screens. The MIIS is now routinely screened before initiating HTS campaigns, and the results of 74 assays are summarized in the paper. Pd2+, Au3+, and Ag1+ are particularly nasty, often giving IC50 values < 1 µM, but every metal gave IC50 values < 10 µM in at least two assays. Biochemical assays such as AlphaScreen or TR-FRET were more susceptible to artifacts, with 20.9% showing IC50 < 10 µM, while biophysics assays such as mass spectrometry were better behaved, with only 2.3% showing IC50 < 10 µM. Cellular assays were also surprisingly robust, with 6.3% showing IC50 < 10 µM.
 
This is a nice paper showing that even a massive screen may produce no useful chemical matter. Soberingly, the fact that some of the fragments gave reasonable-looking crystal structures even though the functional activity came from metal contaminants is a salutary reminder that just because you have a crystal structure of a bound ligand doesn’t mean you have a viable starting point.
 
Forewarned is forearmed, and the MIIS appears to be a valuable tool for assessing assay sensitivity to metal ions, which are all too often lurking invisibly in compound samples.

22 April 2024

The limits of published data

Machine learning (or, for investors, artificial intelligence) has received plenty of attention. To be successful you need lots of data. If you’re trying to, say, train a large language model to write limericks you’ve got oodles of text to draw from. But to train a model that can predict binding affinities you need lots of measurements, and they must be accurate. Among public repositories, ChEMBL is one of the most prominent due to its size (>2.4 million compounds) and quality, achieved through manual curation. But even here you need to be cautious, as illustrated in a recent open-access J. Chem. Inf. Model. paper by Gregory Landrum and Sereina Riniker (both at ETH Zurich).
 
The researchers were interested in the consistency of IC50 or Ki values for the same compound against the same target. They downloaded data for >50,000 compounds run at least twice against the same target. Activity data were compared either directly or after “maximal curation,” which entailed removing duplicate measurements from the same paper, removing data against mutant proteins, separating binding vs functional data, and several other quality checks. They used a variety of statistical tests (R2, Kendall τ, Cohen’s κ, Matthew’s correlation coefficient) to evaluate how well the data agreed, the simplest being the fraction of pairs where the difference was more than 0.3 or 1 log units, roughly two-fold or ten-fold differences.
 
The results were not encouraging. Looking at IC50 values, 64% of pairs differed by >0.3 log units, and 27% differed by more than 1 log unit. In other words, for more than a quarter of measurements a molecule might test as 100 nM in one assay and >1 µM in another.
 
Of course, as the researchers note, “it is generally not scientifically valid to combine values from different IC50 assays without knowledge of the assay conditions.” For example, the concentration of ATP in a kinase assay can have dramatic effects on the IC50 values for an inhibitor. Surely Ki values should be more comparable. But no, 67% of pairs differed by >0.3 log units and 30% differed by >1!
 
The situation improved for IC50 values using maximal curation, with the fraction of pairs differing by >0.3 and >1 log units dropping to 48% and 13%. However, this came at the expense of throwing away 99% of the data.
 
Surprisingly, using maximal curation data for Ki data actually made the situation worse. Digging into the data, the researchers found that 32 assays reporting Ki values for human carbonic anhydrase I, all from the same corresponding author, include “a significant number of overlapping compounds, with results that are sometimes inconsistent.” Scrubbing these improved the situation, but 38% of pairs still differed by >0.3 log units, and 21% differed by >1 log unit.
 
This is all rather sobering, and suggests there are limits to the quality of available data. As we noted in January there are all kinds of reasons assays can differ even within the same lab. Add in subtle variations in protein activity or buffer conditions and perhaps we should not be too surprised at log-order differences in experimental measurements. And this assumes everyone is trying to do good science: I’m sure sloppy and fraudulent data only make the situation worse. No matter how well we build our computational tools, noisy data will ensure they often differ from reality, whatever that may be.

15 January 2024

What makes molecules aggregate?

The propensity for some small molecules to form aggregates in water has bedeviled fragment-finding efforts for decades. Indeed, the phenomenon was not fully recognized until early this century. Although plenty of tools are available for detecting aggregates, I still see too many papers that omit these crucial quality controls. As annoying as aggregation can be in activity assays, in certain cases it could actually be useful for formulating drugs. There has been speculation that the good oral bioavailability of venetoclax is due to aggregation. But despite computational methods to predict aggregation, the structural features of molecules that cause them to aggregate are still not well understood. In a new open-access Nature Comm. paper, Daniel Heller and collaborators at Memorial Sloan Kettering Cancer Center and elsewhere provide some answers.
 
The researchers had previously published an article describing how indocyanine green (ICG) could be used to stabilize and visualize aggregates, and they applied the same technique to examine the aggregation potential of a small set of fragments. Benzoic acid and 2-napthoic acid did not aggregate, while 4-phenylbenzoic acid did. Intrigued, the researchers tested a set of 14 4-substituted biphenyl fragments and found that those containing both a hydrogen bond donor and acceptor, such as acids, sulfonamides, amides, and ureas, could aggregate, while those containing only donors (aniline) or acceptors (nitrile) did not.
 
Fourier transform infrared spectroscopy was used to examine the stretching region of the carbonyl of 4-phenylbenzoic acid in various states: in an aqueous aggregate, in solution in either t-butanol or DMSO, or in the solid state. Interestingly, the aggregate most resembled the solid state, consistent with close-packed self-assembly as opposed to free in solution.
 
From all this, the researchers hypothesized that a combination of aromatic groups and hydrogen bond donors and acceptors was necessary for aggregation. However, having these features does not mean aggregation is inevitable. Neither 3-phenylbenzoic acid nor 2-phenylbenzoic acid formed aggregates, with the former precipitating while the latter remained completely soluble. These three phenylbenzoic acid isomers behave very differently despite the fact that they have the same calculated logP values, and the suggestion is that the latter two molecules are less able to form pi-pi stacking interactions that lead to stable aggregation.
 
Next the researchers examined the approved drug sorafenib, which had previously been shown to aggregate. This was confirmed, and the aggregates were characterized with a battery of biophysical methods including dynamic light scattering, transmission electron microscopy, and X-ray scattering, along with molecular dynamics simulations. The conclusion is that sorafenib forms amorphous aggregates whose assembly is driven by a combination of pi-pi stacking and hydrogen-bonding. A series of sorafenib analogs was synthesized, and those that could not form strong intermolecular hydrogen bonds were less prone to aggregation.
 
All of this is fascinating from a molecular assembly viewpoint and will help to explain and predict which compounds are likely to aggregate, for better or for worse. But as of now, experimental assessment is still best practice for any new compound.

30 October 2023

NMR for SAR: All about the ligand

In last week’s post we described a free online tool for predicting bad behavior of compounds in various assays. But as we noted, you often get what you pay for, and computational methods can’t (yet) take the place of experimentation. In a new (open-access) J. Med. Chem. paper, Steven LaPlante and collaborators at NMX and INRS describe a roadmap for discovering, validating, and advancing weak fragments. They call it NMR for SAR.
 
Unlike SAR by NMR, the grand-daddy of fragment-finding techniques which involves protein-detected NMR, NMR for SAR focuses heavily on the ligand. The researchers illustrate the process by finding ligands for the protein HRAS, for which drug discovery has lagged in comparison to its sibling KRAS.
 
The researchers started by screening the G12V mutant form of HRAS in its inactive (GDP-bound) state. They screened their internal library of 461 fluorinated fragments in pools of 11-15 compounds (each at ~0.24 mM) using 19F NMR. An initial screen at 15 µM protein produced a very low hit rate, so the protein concentration was increased to 50 µM. After deconvolution, two hits confirmed, one of which was NMX-10001.
 
The affinity of the compound was found to be so low that 1H NMR experiments could not detect binding. Thus, the researchers kept to fluorine NMR to screen for commercial analogs. They used 19F-detected versions of differential line width (DLW) and CPMG experiments to rank affinities, and the latter technique was also used to test for compound aggregation using methodology we highlighted in 2019. Indeed, the researchers have developed multiple tools for detecting aggregators, such as those we wrote about in 2022.
 
Ligand concentrations were measured by NMR, which sometimes differed from the assumed concentrations. As the researchers note, these differences, which are normally not measured experimentally, can lead to errors in ranking the affinities of compounds. The researchers also examined the 1D spectra of the proteins to assess whether compounds caused dramatic changes via pathological mechanisms, such as precipitation.
 
The researchers turned to protein-detected 2D NMR for orthogonal validation and to determine the binding sites of their ligands. These experiments revealed that the compounds bind in a shallow pocket that has previously been targeted by several groups (see here for example). Optimization of their initial hit ultimately led to NMX-10095, which binds to the protein with low double digit micromolar affinity. This compound also blocked SOS-mediated nucleotide exchange and was cytotoxic, albeit at high concentrations.

I do wish the researchers had measured the affinity of their molecules towards other RAS isoforms as this binding pocket is conserved, and inhibiting all RAS activity in cells is generally toxic. Moreover, the best compound is reminiscent of a series reported by Steve Fesik back in 2012.
 
But this specific example is less important than the clear description of an NMR-heavy assay cascade that weeds out artifacts in the quest for true binders. The strategy is reminiscent of the “validation cross” we mentioned back in 2016. Perhaps someday computational methods will advance to the point where “wet” experiments become an afterthought. But in the meantime, this paper provides a nice set of tools to find and rigorously validate even weak binders.

23 October 2023

A Liability Predictor for avoiding artifacts?

False positives and artifacts are a constant source of irritation – and worse – in compound screening. We’ve written frequently about small molecule aggregation as well as generically reactive molecules that repeatedly come up as screening hits. It is possible to weed these out experimentally, but this can entail considerable effort, and for particularly difficult targets, false positives may dominate. Indeed, there may be no true hits at all, as we noted in this account of a five-year and ultimately fruitless hunt for prion protein binders.
 
A computational screen to rapidly assess small molecule hits as possible artifacts would be nice, and in fact several have been developed. Among the most popular are computational filters for pan-assay interference compounds, or PAINs. However, as Pete Kenny and others have pointed out, these were developed using data from a limited number of screens in one particular assay format. Now Alexander Tropsha and collaborators at University of North Carolina Chapel Hill and the National Center for Advancing Translational Science (NCATS) at the NIH have provided a broader resource in a new J. Med. Chem. paper.
 
The researchers experimentally screened around 5000 compounds, taken from the NCATS Pharmacologically Active Chemical Toolbox, in four different assays: a fluorescence-based thiol reactivity assay, an assay for redox activity, a firefly luciferase (FLuc) assay, and a nanoluciferase (NLuc) assay. The latter two assays are commonly used in cell-based screens to measure gene transcription. The thiol reactivity assay yielded around 1000 interfering compounds, while the other three assays each produced between 97 and 142. Interestingly, there was little overlap among the problematic compounds.
 
These data were used to develop quantitative structure-interference relationship (QSIR) models. The NCATS library of nearly 64,000 compounds was virtually screened, and around 200 compounds were tested experimentally for interference in the four assays, with around half predicted to interfere and the other half predicted not to interfere. The researchers had also previously built a computational model to predict aggregation, and this – along with the four models discussed here – has been combined into a free web-based “Liability Predictor.”
 
So how well does it work? The researchers calculated the sensitivity, specificity, and balanced accuracy for each of the models and state that “they can detect around 55%-80% of interfering compounds.”
 
This sounded encouraging, so naturally I took it for a spin. Unfortunately, my mileage varied. Or, to pile on the metaphors, lots of wolves successfully passed themselves off as sheep. Iniparib was recognized correctly as a possible thiol interference compound. On the other hand, the known redox cycler toxoflavin was predicted not to be a redox cycler – with 97.12% confidence. Similarly, curcumin, which can form adducts with thiols as well as aggregate and redox cycle, was pronounced innocent. Quercetin was recognized as possibly thiol-reactive, but its known propensity to aggregate was not. Weirdly, Walrycin B, which the researchers note interferes with all the assays, got a clean bill of health. Perhaps the online tool is still being optimized.
 
At this point, perhaps the Liability Predictor is best treated as a cautionary tool: molecules that come up with a warning should be singled out for particular interrogation, but passing does not mean the molecule is innocent. Laudably, the researchers have made all the underlying data and models publicly available for others to build on, and I hope this happens. But for now, it seems that no computational tool can substitute for experimental (in)validation of hits.

17 July 2023

A rule of two for using chemical probes?

Earlier this year we highlighted the growth of the Chemical Probes Portal, a free website that profiles more than 500 small molecules targeting more than 400 proteins. Each chemical probe is evaluated by experts based on published literature and then scored for use in cells or in vivo. More than 300 chemical probes have received three or four stars and are thus recommended. But even a good probe can be misused, and this is the subject of a recent (open-access) Nat. Commun. paper from Adam McCluskey, Lenka Munoz, and colleagues at the University of Sydney and the University of Newcastle. (The paper has also been discussed by Paul Workman and Derek Lowe.)
 
The researchers chose eight probes targeting histone methyltransferases, a histone demethylase, a histone acetyltransferase, and several kinases. All but one of these probes had first been disclosed before 2015. A literature search revealed 662 papers that used these probes in cellular studies, ranging from 21 to 134 publications per probe.
 
Centuries ago the alchemist Paracelsus noted that everything is poisonous at high enough doses, and indeed even the best probes might hit dozens or hundreds of protein targets. For this reason the Chemical Probes Portal recommends maximum concentrations for cellular assays. The researchers examined whether papers exceeded these concentrations. The overall results were encouraging, with just 22% of papers exceeding recommended limits. However, there was considerable variation: for one chemical probe, 70% of papers exceeded the limit. (For this particular case, the maximum recommended cellular concentration was just 250 nM.)
 
Because chemical probes can have off-target activity even at recommended concentrations, best practices are to include a related but inactive control compound plus a second chemically differentiated probe. All but one of the eight probes chosen for analysis had orthogonal probes available, and five had inactive controls. So how frequently were these used? Unfortunately, 58% of papers did not use an orthogonal probe, and a whopping 92% of papers did not use available inactive control compounds. In fact, just 4% of the papers “used chemical probes within the recommended concentration range and included inactive compounds as well as orthogonal chemical probes.”
 
A wider analysis of nearly 15,000 papers that cited the 662 publications produced similar results, with 17% exceeding recommended concentrations, 59% not using differentiated chemical probes, and 83% not using inactive controls.
 
The researchers propose a “'rule of two': At least two chemical probes (either orthogonal target-engaging probes, and/or a pair of a chemical probe and matched target-inactive compound) to be employed at recommended concentrations in every study.” To encourage best practices, the paper provides a simple “Researchers’ Flowchart” to help investigators select probes and controls. And because science is self-regulated, they provide a five-item “Reviewers’ Checklist.” The paper also includes a nice list of links to other resources, including webinars and slide decks.
 
Overall I think following these guidelines would be beneficial, and the Reviewers’ Checklist in particular could be usefully incorporated into journal publication requirements.
 
Of course, the vast majority of protein targets don’t have even a single good chemical probe, let alone two or more. Which means that there are plenty of opportunities to identify new probes and make better use of those that already exist.

23 January 2023

The Chemical Probes Portal at Eight

Back in 2015, Practical Fragments highlighted a new resource calling itself “The Chemical Probes Portal.” At the time it included just seven probes, and my post concluded, “I hope this takes off. Understanding the natural world is hard enough even with well-behaved reagents and carefully controlled experiments.”
 
Well, take off it has, as illustrated by a new (open access) paper in Nucleic Acids Res. by Susanne Müller (Goethe University Frankfurt), Bissan Al-Lazikani (MD Anderson Cancer Center), Paul Workman (Institute of Cancer Research), and collaborators.
 
The paper notes that “the widespread use of small molecule compounds that are claimed as chemical probes but are lacking sufficient quality, especially being inadequately selective for the desired target or even broadly promiscuous in behavior, has resulted in many erroneous conclusions in the biomedical literature.” As an antidote, the Portal is an “expert review-based public resource to empower chemical probe assessment, selection, and use.”
 
Any scientist can suggest a potential probe, and these are then internally reviewed and curated. Assuming enough public information is available about the molecule, probes are then sent to three members of a Scientific Expert Review Panel for further vetting. Reviewers rate probes from one to four stars for use in cellular and/or animal models and recommend relevant concentration ranges. Importantly, reviewers can also include comments to highlight off-targets, lack of certain data, oral bioavailability, or anything else.
 
From a mere seven probes in 2015 the Portal has grown to include more than 500 molecules covering more than 400 protein targets in about 100 protein families. About two thirds of the probes have three or more stars, meaning they are recommended. The Portal is very easy to use and can be searched by probe or protein. Laudably, all the data can also be easily downloaded in bulk.
 
In addition to the chemical probes, the Portal also contains around 250 “Historical Compounds” that have been described in the literature but “are not recommended to be used to study the function of specific proteins as they are seriously flawed.” These include molecules such as gossypol, a known aggregator that has been reported as an inhibitor of multiple proteins, and curcumin. If you see a molecule used as a probe in the literature, it’s worth checking to see whether it shows up in the Portal.
 
The Chemical Probes Portal features heavily in a Conversation between Cheryl Arrowsmith (Structural Genomics Consortium) and Paul Workman published (open access) last year in Nat. Commun. The researchers concisely define chemical probes as “small-molecule modulators to interrogate the functions of their target proteins, as opposed to protein location, or other physical properties.” Importantly, they differentiate chemical probes from drugs. “Drugs don’t necessarily need to be as selective as high-quality chemical probes. They just need to get the job done on the disease and be safe to use. In fact, many drugs act on multiple targets as part of their therapeutic mechanism.” I have frequently heard people make comments such as, “this is just a probe, not a drug,” but a good probe should actually be more selective than many drugs.
 
That said, you do want a drug to actually hit the target of interest. The researchers highlight iniparib, a putative PARP inhibitor that made it all the way to phase 3 clinical trials for breast cancer and was tested in >2500 cancer patients. It failed. Moreover, that failure cast a pall over the field which likely delayed the development of actual PARP inhibitor drugs.
 
The researchers also discuss aggregators, which are still being reported uncritically in the literature, along with PAINS. “Such compounds should never be considered further or used as chemical probes. They should be excluded from compound libraries. Yet many are sold by commercial vendors as chemical probes and widely used.”
 
This statement raised the hackles of Pete Kenny. In a recently published critique, he states: “it is asserted in the conversation that commercial vendors are selling compounds as chemical probes that are unfit for purpose and I strongly recommend that anybody making such assertions should carefully examine the supporting evidence.”
 
Dear reader, please try the following experiment. Enter “iniparib supplier” in your favorite search engine and see what comes up. For me, the first 10 results include several that describe it as a PARP inhibitor. I won’t link to them here because I don’t want to encourage traffic to their sites. (This is also part of the reason Practical Fragments has discontinued PAINS shaming, as it only increases the profile of sloppy or harmful papers.)
 
Pete goes on to write: “I would strongly advise against making statements that a compound is unfit for use as a chemical probe unless the assertion is supported by measured data in the public domain for the compound in question.”
 
Frankly, I don’t understand Pete’s position, which I parodied here. Life is short and biology is complicated, so why waste time with dirty or inadequately characterized reagents? For me, everything is an artifact until proven otherwise. And the Chemical Probes Portal goes a long way towards demonstrating whether a particular probe is fit for purpose.

14 November 2022

The agony and ecstasy of thiazoles

Earlier this year we highlighted an analysis of rings found in drugs. Thiazoles are tied for thirteenth place, occurring in at least 30 drugs. (A substructure search in DrugBank pulls up 49.) They pack a lot of diversity into just 5 heavy atoms, with a nitrogen atom capable of acting as a hydrogen bond acceptor, as well as a sulfur atom. But they can also be tricksy: an analysis several years ago found that 2-aminothiazoles are over-represented as hits in fragment screens but are often not advanceable. A new open-access paper in ACS Med. Chem. Lett. by Rok Frlan and collaborators at the University of Ljubljana confirms and broadens these conclusions.
 
The researchers assembled a library of 44 fragment-sized 1,3-thiazoles and five 1,3,4-thiadiazoles. These were then screened at 0.5 or 0.625 mM against four unrelated enzymes in biochemical assays. Two of the enzymes contain catalytic cysteine residues, and these had high hit rates: 14 hits for the SARS-CoV-2 3CLpro enzyme and 26 for the E. coli MurA enzyme. In contrast, MetAP1a had only 3 hits, while DdlB had none. Are any of these hits real?
 
None of the compounds had been classified as PAINS, and aggregation was deemed unlikely for all but one compound based on chemical searches and the presence of detergent in the assays for MurA, DdlB, and 3CLpro. One compound also seemed to interfere with the fluorescent assays and was ruled a false positive. So far, so good.
 
However, 8 of the compounds turned out to be unstable in aqueous buffer. Moreover, four compounds turned out to be redox active in at least one of three different assays. Redox cycling can generate reactive oxygen species, which inhibit cysteine-dependent proteins nonspecifically.
 
Next the researchers tested whether their fragments reacted with a small test thiol, 5-mercapto-2-nitrobenzoic acid. Shockingly, 19 of them did, and most of these inhibited at least one of the enzymes. Many of these contain potential leaving groups such as halogen atoms, but some didn't, leaving the nature of the reaction unclear. Still, the results suggest that the fragments are more thiol-specific than protein-specific, and thus represent another potential source of false leads.
 
When the researchers retested the ability of the fragments to inhibit the enzymes in the presence of the reducing agent DTT, only one of the 3CLpro hits reproduced – and that was the compound that showed fluorescence interference. The results were not quite so bad for MurA, though many hits fell out.
 
Finally, the researchers tried to correlate reactivity with quantum-mechanical calculations using several different methods. Unfortunately, as they note, “no meaningful relationships were observed.” Laudably, data for all the compounds are provided, so interested readers are free to try their own analyses.
 
In the end it is not clear whether any of the hits will be useful, but the high correlation between pathological mechanisms and activity does not make one optimistic. As the first paragraph above makes clear, this does not mean that thiazoles should be avoided. Indeed, the researchers explicitly state that “we do not want to establish a general knockout criterion to exclude thiazole or thiadiazole screening hits from further development, but it is essential to evaluate their reactivity if they prove to be hits.” This is where orthogonal biophysical methods, such as crystallography, can distinguish true hits from artifacts.

17 January 2022

An epidemic of aggregators, and suggestions for cures

COVID-19 has been with us for over two years now. While the human effects have been unquestionably negative, for science it has been the best of times and the worst of times. The development of remarkably effective vaccines in less than a year stands as a triumph of twenty-first century medicine, as does the discovery of nirmatrelvir, a covalent inhibitor of the SARS-CoV-2 main protease Mpro (also called 3CL-Pro). But there is a lot of junk science out there too, as illuminated in a recent J. Med. Chem. paper by Brian Shoichet and colleagues at University of California San Francisco.
 
Before vaccines and custom-built drugs were developed, labs everywhere started screening all the compounds they could get against targets relevant for COVID-19. The most popular molecules to test were approved drugs, the idea being that if any of these turned out to be effective they could immediately be put to use.
 
One of the most common artifacts in screening is caused by aggregation: small molecules can form colloids that non-specifically inhibit a variety of different assays. This phenomenon has been understood for more than two decades; Practical Fragments wrote about it back in 2009. Unfortunately, many labs ignore it.
 
The UCSF lab investigated 56 drugs that had been reported in 12 papers as inhibitors against two targets relevant for SARS-CoV-2, including 3CL-Pro. The molecules were characterized in multiple assays: particle formation and clean autocorrelation curves in dynamic light scattering (DLS), inhibition of an aggregation-sensitive enzyme in the absence of detergent but no inhibition in the presence of detergent, and a high Hill slope in the dose-response curve. Nineteen molecules, four of them fragment-sized, were positive in most of these assays, clearly indicating aggregation. (Interestingly, several of these gave reasonable Hill slopes (<1.4), and the researchers suggest treating the Hill slope as a “soft criterion.”) Another 14 molecules gave more ambiguous results, such as forming particles by DLS but not inhibiting the sentinel enzyme.
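The Hill slope criterion is easy to illustrate with a simple two-parameter model; the IC50 and slope values below are invented for illustration, not taken from the paper. Well-behaved 1:1 inhibitors give slopes near 1, while colloidal aggregators often switch from little inhibition to complete inhibition over a narrow concentration range.

```python
# Sketch of the Hill-slope criterion for flagging possible aggregators.
# All numbers are illustrative assumptions, not data from the paper.

def hill_inhibition(conc, ic50, hill):
    """Fractional inhibition from a simple Hill model."""
    return 1.0 / (1.0 + (ic50 / conc) ** hill)

def steepness_flag(hill, threshold=1.4):
    """The paper's 'soft criterion': slopes above ~1.4 merit extra scrutiny."""
    return hill > threshold

# A well-behaved 1:1 inhibitor (Hill ~1) vs a colloid-like steep curve (Hill ~3),
# both with a nominal IC50 of 10 (arbitrary concentration units):
for name, n in [("stoichiometric", 1.0), ("aggregator-like", 3.0)]:
    curve = [hill_inhibition(c, ic50=10.0, hill=n) for c in (3.0, 10.0, 30.0)]
    print(name, [round(x, 2) for x in curve], "flag:", steepness_flag(n))
```

As the post notes, several confirmed aggregators still gave slopes below 1.4, which is exactly why the researchers treat this as a soft criterion rather than a definitive test.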
 
OK, so maybe the molecules are aggregators, but perhaps they also act legitimately? Unfortunately, of the 12 drugs reported in the literature to inhibit 3CL-Pro, only two inhibited the enzyme in the presence of detergent, and one of these was five-fold less potent than reported. And as the researchers point out, detergent is not a magic elixir, and sometimes only right-shifts the onset of aggregation. Moreover, of the 19 molecules conclusively found to be aggregators, detergent was not included for 15 of them in the original publications. Brian may be too polite to write this, but channeling my inner Teddy, I would argue that the authors are negligent for failing to test for aggregation, as are the editors and reviewers who allowed these papers to be published.
 
And the problem is not confined to the COVID-19 literature. The researchers examined a commercial library of 2336 FDA-approved drugs, 73 of which are known aggregators. Another 356 were flagged in the very useful Aggregation Advisor tool (see here), and 6 of 15 experimentally evaluated tested positive in all the aggregation assays.
 
How do you avoid being misled by these artifacts? An extensive suite of tools for assessing aggregation is provided in a recent Nat. Protoc. paper by Steven LaPlante and colleagues at Université du Québec and NMX. The procedures are described in sufficient detail that they “can be easily performed by graduate students and even undergraduate students.”
 
Most of the focus is on various NMR techniques, such as one we wrote about here. The easiest is an NMR dilution assay, in which a 20 mM solution of a compound in DMSO is serially diluted into aqueous buffer at concentrations from 200 to 12 µM. If the number, shape, shifts, or intensities of the NMR resonances change, aggregation is likely.
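The arithmetic of such a series is worth sketching, since the DMSO carried over shrinks along with the compound concentration. A two-fold dilution scheme is my assumption here; the paper only gives the 200 and ~12 µM endpoints.

```python
# Sketch of the NMR dilution series: a 20 mM DMSO stock serially diluted
# in aqueous buffer from 200 uM down to ~12 uM. The two-fold scheme is an
# assumption consistent with those endpoints, not a protocol from the paper.

STOCK_MM = 20.0  # 20 mM compound in DMSO

def dilution_series(top_uM=200.0, bottom_uM=12.0, factor=2.0):
    """Concentrations (uM) and the DMSO carried over at each point (% v/v)."""
    points = []
    c = top_uM
    while c >= bottom_uM:
        dmso_pct = 100.0 * (c / 1000.0) / STOCK_MM  # dilution factor of stock
        points.append((round(c, 1), round(dmso_pct, 2)))
        c /= factor
    return points

print(dilution_series())
```

Note that at the 200 µM top concentration the sample already contains 1% DMSO, which is itself worth keeping constant or controlling for, since cosolvent can shift aggregation behavior.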
 
Another assay involves testing compounds in the absence and presence of various detergents, including NP40, Triton, SDS, CHAPS, Tween 20, and Tween 80. Again, changes in the NMR spectra suggest aggregation.
 
The researchers note that “no one technique can detect all the types of aggregates that exist; thus, a combination of strategies is necessary.” Indeed, the various techniques can distinguish different types of aggregates which can vary in size and polydispersity. On a lemons-to-lemonade note, these “nano-entities” might even be useful for “drug delivery, anti-aggregates, cell penetrators and bioavailability enhancers.”
 
We live in the age of wisdom and the age of foolishness. As scientists – and as people – it is our responsibility to aspire to the former by being aware of “unknown knowns,” such as aggregation. And perhaps, by even taking advantage of the weird phenomena that can occur with small molecules in water.

10 November 2019

A new tool for detecting aggregation

Historically the most popular method for finding fragments has been ligand-detected NMR. Preliminary results of our current poll (to the right) suggest crystallography has pulled ahead. (Please do vote if you haven’t already done so.) However, NMR has many uses beyond finding fragments, as illustrated in a recent J. Med. Chem. paper by Sacha Larda, Steven LaPlante, and colleagues at INRS-Centre Armand-Frappier Santé Biotechnologie, NMX, and Harvard.

Among the many artifacts that can occur in screening for small molecules, one of the most insidious is aggregation. A disturbing number of small molecules form aggregates in water, and these aggregates give false positives in multiple assays. Unfortunately, determining whether aggregation is occurring is not always straightforward. The new paper provides a simple NMR-based tool to do just that.

All molecules tumble in solution, but small fragment- or drug-sized molecules tumble more rapidly than large molecules such as proteins. Proton resonances “relax” faster in slowly tumbling molecules, and the Carr-Purcell-Meiboom-Gill spin-spin relaxation experiment (T2-CPMG) exploits this: relaxation delays are introduced, during which slowly tumbling molecules lose their resonances. Indeed, this technique has frequently been used in fragment screening: if a fragment binds to a protein, it will tumble more slowly, resulting in loss of signal.
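The underlying physics is just monoexponential decay, I(τ) = I0·exp(−τ/T2), where aggregated molecules have a much shorter T2 than freely tumbling ones. The T2 values and delay below are illustrative assumptions, not measurements from the paper.

```python
import math

def cpmg_signal(delay_ms, t2_ms, i0=1.0):
    """Monoexponential T2 decay of an NMR resonance: I = I0 * exp(-delay/T2)."""
    return i0 * math.exp(-delay_ms / t2_ms)

# Illustrative (not measured) T2 values: a freely tumbling fragment relaxes
# slowly (long T2); the same molecule in a colloidal aggregate tumbles like
# a large particle and relaxes fast (short T2).
DELAY_MS = 200.0                                 # total CPMG relaxation delay
free = cpmg_signal(DELAY_MS, t2_ms=1000.0)       # most of the signal survives
aggregated = cpmg_signal(DELAY_MS, t2_ms=20.0)   # essentially gone

print(f"free: {free:.2f}  aggregated: {aggregated:.5f}")
```

With these numbers the free fragment retains about 80% of its signal while the aggregated one loses effectively all of it, which is why a single CPMG spectrum can flag an aggregator in ~30 seconds.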

The researchers recognized that an aggregate could behave like a large molecule, and confirmed this: known aggregators showed the expected signal loss, while non-aggregators did not. The experiment is relatively rapid (~30 seconds) and has been used to profile a 5000-compound library to remove aggregators.

One of the frustrations of aggregators is that it is currently impossible to predict whether a molecule will aggregate, and indeed, the researchers show several examples of closely related compounds in which one is an aggregator while the other is not. Even worse, the phenomenon can be buffer-dependent: the researchers show a fragment that aggregates in one buffer but not in another, even at the same pH.

Many fragment screens are done with pools of compounds, and the researchers find that molecules can show a “bad apple effect”, whereby previously well-behaved molecules appear to be recruited to aggregates.

The limit of detection for T2-CPMG is said to be single-digit micromolar concentrations of small molecule, though the researchers note that double- or triple-digit micromolar concentrations – more typical of fragment screens anyway – are more practical. And some compounds may show rapid relaxation due to non-pathological mechanisms, such as tautomerization or various conformational changes.

Still, this approach seems like a powerful means to rapidly assess hits, and pre-screening a library makes sense. Another NMR technique using interligand nuclear Overhauser effect (ILOE) has also been used to test for aggregation, though not to my knowledge so systematically. For the NMR folks out there, which methods do you think are best to weed out aggregators?

26 August 2019

Biophysics beyond fragments: a case study with ATAD2

Three years ago we highlighted a paper from AstraZeneca arguing for close cooperation of biophysics with high-throughput screening (HTS) to effectively find genuine hits. A lovely case study just published in J. Med. Chem. shows just how beneficial this can be.

Paul Bamborough, Chun-wa Chung, and colleagues at GlaxoSmithKline and Cellzome were interested in the bromodomain ATAD2, which is implicated in cancer. (Chun-wa presented some of this story at the FragNet meeting last year.) Among epigenetic readers, bromodomains are usually quite ligandable, but ATAD2 is an exception, and when this work began there were no known ligands.

Recognizing that they might face challenges, the researchers started by carefully optimizing their protein construct to be stable and robust to assay conditions. This included screening 1408 diverse compounds, none of which were expected to bind. Disturbingly, a TR-FRET screen at 10 µM yielded a 4.1% hit rate, suggesting many false positives. Indeed, when an apparently 30 nM hit from this screen was tested by two-dimensional 15N-1H HSQC NMR, it showed no binding. The researchers thus made further refinements to the protein construct to improve stability and reduce the hit rate against this “robustness set.”

This exercise illustrates an important point: make sure your protein is the highest quality possible!

Having done this, the researchers screened 1.7 million compounds and obtained a relatively modest 0.6% hit rate. Of these 9441 molecules, 428 showed dose response curves and were tested using SPR and HSQC NMR. In the case of SPR, the researchers also tested a mutant form of the enzyme that was not expected to bind to the acetyl-lysine mimics being sought. Most of the hits did not reproduce in either the SPR or the NMR assays, and at the end of the process just 16 closely related molecules confirmed – a true hit rate of just 0.001%!
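The triage arithmetic is worth spelling out; a few lines, using the numbers quoted above, show just how dominated the primary screen was by false positives.

```python
# The ATAD2 triage funnel from the post: 1.7 million compounds screened,
# 9441 primary hits (~0.6%), 428 with dose-response curves, and just 16
# closely related molecules confirmed by SPR and NMR.
screened = 1_700_000
primary_hits = 9_441
dose_response = 428
confirmed = 16

print(f"primary hit rate:  {100 * primary_hits / screened:.2f}%")
print(f"confirmed rate:    {100 * confirmed / screened:.4f}%")
print(f"false positives among primary hits: "
      f"{100 * (1 - confirmed / primary_hits):.1f}%")
```

In other words, more than 99.8% of the primary hits were artifacts of one kind or another, which is exactly why the orthogonal biophysical cascade was essential.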

Compound 23 is the most potent molecule disclosed, but the researchers mention a manuscript in preparation that describes further optimization. The compound shows promising selectivity against other bromodomains; it certainly doesn’t look like a classic bromodomain binder. X-ray crystallography revealed that the binding mode is in fact different from other bromodomain ligands. Trimming down compound 23 produced compound 35, which shows reasonable activity and ligand efficiency.

This paper nicely demonstrates the power of biophysics to discern a still small signal in a sea of noise. As the researchers note, PAINS filters and computational approaches would not have worked due to the sheer diversity of the misbehaving compounds. (That said, if the library had been infested with PAINS, the false positive rate would have been even higher!)

The paper is also a good argument for FBLD. Compound 35 is probably too large to really qualify as a fragment, but perhaps related molecules could have led to this series. And GSK also discovered a different series of potent ATAD2 inhibitors from fragments, which Teddy wrote about.

15 April 2019

Fourteenth Annual Fragment-based Drug Discovery Meeting

CHI’s Drug Discovery Chemistry (DDC) meeting took place last week in San Diego. I think this was the largest yet, with >825 attendees, a third from outside the US, and nearly 70% from industry. The initial DDC meeting in 2006 had just four tracks, of which FBDD is the only one that remains. This one had nine tracks and four one-day symposia, so it was obviously impossible to see everything. Like last year, I’ll just stick to broad themes.

Success Stories
As always, clinical compounds received deserved attention. Among two I’ve covered recently, Paul Sprengeler described eFFECTOR’s MNK1/2 inhibitor eFT508, while Wolfgang Jahnke discussed Novartis’s allosteric BCR-ABL1 inhibitor ABL001. As previously mentioned, ABL001 is a case study in persistence: the project started in stealth mode and was put on hold a couple times until seemingly intractable problems could be overcome.

Another story of persistence, albeit with a less happy outcome, was presented by Erik Hembre, who discussed Lilly’s BACE1 program. Teddy wrote about their first fragment-derived molecule to enter the clinic, LY2811376, back in 2011. Unfortunately this molecule showed retinal toxicity in three-month animal studies, so the researchers further optimized their molecule to LY2886721, which made it to phase 2 studies before dropping out due to elevated liver enzymes. Reasoning that a more potent molecule would require a lower dose and thus lower the risk of toxicity, the researchers used structure-based drug design to get to picomolar LY3202626, which also made it to phase 2 before being scuttled due to the apparent invalidation of BACE1 as an Alzheimer’s disease target.

Talks on BCL2 and MCL1 inhibitors from Vernalis, AstraZeneca, and Servier all involved fragments in some capacity, but unfortunately they were in the protein-protein interaction track, which was held concurrently with the FBDD session I was chairing. Suffice it to say you can expect to hear more about the phase 1 compounds AZD5991 and S64315.

A few earlier-stage success stories included Till Maurer’s discussion of the Genentech USP7 program (see here), Santosh Neelamkavil on Merck’s Factor XIa inhibitors, and Rod Hubbard on Vernalis’s DYRK1A, PAK1, and LRRK2 inhibitors. We have previously written about how displacing “high-energy” water molecules can be useful, and this tactic was used by Sven Hoelder at the Institute of Cancer Research for their BCL6 inhibitors. Last week we highlighted halogen bonds, which proved important for transforming molecules that simply bind to MEK1 into molecules that bind and inhibit the protein, as described by AstraZeneca’s Paolo Di Fruscia.

Methods
The MEK1 story Paolo told began with a very weak (0.45 mM) fragment that the team was able to advance to 300 nM in the absence of structure, though they did eventually obtain a crystal structure that supported further optimization. On the topic of crystallography, Marc O’Reilly discussed the Astex MiniFrag approach, which we recently wrote about here. Only a couple of these fragments contain a bromine atom, but Marc did mention that, of the 10,051 X-ray complexes solved at Astex, a number show halogen bonds, including some to the hinge region in kinases.

At FBLD 2018 Astex’s Chris Murray showed the first cryo-EM structure of a fragment bound to a protein, and Marc confirmed that they have now obtained structures of fragments bound to two targets, with fragments as small as 120 Da and resolution as good as 2.3 Å. They are increasing automation, with turnaround times of less than 24 hours in some cases. Santosh also mentioned that Merck is applying cryo-EM to fragments.

Frank McCormick (UCSF) highlighted multiple fragment-finding methods used to discover inhibitors against RAS family proteins, which are responsible for more than a million cancer deaths each year. In addition to stalwarts such as crystallography and NMR, these include less common methods such as Tethering and the second harmonic generation (SHG) approach for detecting conformational changes used by Biodesy. RAS was reported as a cancer driver almost forty years ago, but only now are the first direct inhibitors entering the clinic – a testimony to both the challenging nature of the target and how far we’ve come.

SHG and Tethering were also highlighted elsewhere: Charles Wartchow described how SHG identified 392 hits from a collection of 2563 fragments against an E3 ligase bound to a target protein at Novartis, while Michelle Arkin described her use of Tethering at UCSF to find molecules that could stabilize a complex of 14-3-3 bound to a specific client protein (see here).

An effective sponsored talk was presented by Björn Walse of SARomics Biostructures and Red Glead Discovery, who described weak affinity chromatography (WAC). Once they saw the schedule for DDC, they looked for a target that would be presented shortly before their presentation, and chose the protein USP7 as a test case. Beginning in January, they screened a library of 1200 fragments to obtain 34 hits, of which 7 confirmed in a thermal shift assay. This led to an SAR-by-catalog experiment, and 11 of the 31 fragments tested showed activity, as did a Genentech positive control compound.

All methods can generate false positives and false negatives (see for example here and here), some of which were described in an excellent talk by Engi Hassaan of Philipps University. Engi discussed how improving the sensitivity of an STD assay by decreasing the salt concentration identified more of the fragments that had previously been found by crystallographic screening. She also presented a case study of how introducing a tryptophan residue into a small protein to facilitate purification led to problems down the road when the tryptophan side chain blocked a key pocket in the crystal lattice. Gregg Siegal (ZoBio) also highlighted a case where a fragment bound to the dimer interface in a crystal structure, whereas in solution the fragment bound to the active site, as observed by NMR.

Finally, among computational methods, Pawel Sledz (University of Zurich) gave a nice overview of the SEED and AutoCouple methods, while Paul Hawkins (OpenEye) described rapid searching of more than 10 billion chemical structures using ROCS (Rapid Overlay of Chemical Structures). SkyFragNet is looking closer with each passing year.

There is much more to say, so please feel free to comment. Several good events are still coming up this year, and mark your calendar for 2020, when DDC returns to San Diego April 13-17!