Showing posts with label false negative. Show all posts

04 November 2024

Catching virtual cheaters

As experienced practitioners of fragment-based lead discovery will know, the best way to avoid being misled by artifacts is to combine multiple methods. (Please vote on which methods you use if you haven’t already done so.) Normally this advice is for physical methods, but what’s true in real life also applies to virtual reality, as demonstrated in a recent J. Med. Chem. paper by Brian Shoichet and collaborators at University of California San Francisco, Schrödinger, and University of Michigan Ann Arbor.
 
The Shoichet group has been pushing the limits of computational screening using ever larger libraries. Five years ago they reported screens of more than 100 million molecules, and today multi-billion compound libraries are becoming routine. But as more compounds are screened, an unusual type of artifact is emerging: molecules that seem to “cheat” the scoring function and appear to be virtual winners but are completely inactive when actually tested. Although rare, these artifacts can make up an increasingly large fraction of hits as screens grow in size.
 
Reasoning that these types of artifacts may be peculiar to a given scoring function, the researchers decided to rescore the top hits using a different approach to see whether the cheaters could be caught. They started with a previous screen in which 1.71 billion molecules had been docked against the antibacterial target AmpC β-lactamase using DOCK3.8, and more than 1400 hits were synthesized and tested. These were rescreened using a different scoring approach called FACTS (fast analytical continuum treatment of solvation). Plotting the scores against each other revealed a bimodal distribution, with most of the true hits clustering together. Of the 268 molecules that lay outside of this cluster, 262 showed no activity against AmpC even at 200 µM.
 
Thus encouraged, the researchers turned to nine other studies in which between 32 and 537 compounds had been experimentally tested. The top-scoring 165,000 to 500,000 molecules were rescored using FACTS, and 7-19% of the initial DOCK hits showed up as outliers and thus likely cheaters. For six of the targets, none of these outliers were strong hits. For each of the other three, a single potent ligand had been flagged as a potential cheater.
 
To evaluate whether this “cross-filtering” approach would work prospectively as well as retrospectively, the researchers focused on 128 very high scoring hits from their previous AmpC virtual screen that had not already been experimentally tested. These were categorized as outliers (possible cheaters) or not and then synthesized and tested. Of the 39 outliers, none were active at 200 µM. But of the other 89, more than half (51) showed inhibition at 200 µM, and 19 of these gave Ki values < 50 µM. As we noted back in 2009, AmpC is particularly susceptible to aggregation artifacts, so the researchers tested the ten most potent inhibitors and found that only one formed detectable aggregates.
 
In addition to FACTS, the researchers also used two other computational methods to look for cheaters: AB-FEP (absolute binding free energy perturbation) and GBMV (generalized Born using molecular volume), both of which are more computationally intensive than either FACTS or DOCK. Interestingly, GBMV performed worse than FACTS, finding at best only 24 cheaters but also falsely flagging 9 true binders. AB-FEP was better, finding 37 cheaters while not flagging any of the experimentally validated hits.
 
This is an important paper, particularly as virtual screens of multi-billion compound libraries become increasingly common. Indeed, the researchers note that “as our libraries grow toward trillions of molecules… there may be hundreds of thousands of cheating artifacts.”
 
And although the researchers acknowledge that their cross-filtering approach has only been tested for DOCK, it seems likely to apply to other computational methods too. I look forward to seeing the results of these studies.

02 May 2016

A strong case for crystallography first

We noted last week that one theme of the recent CHI FBDD meeting was the increasing throughput of crystallography. Crystal structures can provide the clearest information on binding modes, and a key function of standard screening cascades is to whittle the number of fragments down to manageably small numbers for crystal soaking. Only a few groups have used crystallography as a primary screen. A team led by Gerhard Klebe at Philipps-Universität Marburg argues in ACS Chem. Biol. that crystallography should be brought to the forefront.

The researchers were interested in the model protein endothiapepsin. As discussed last year, they had previously screened this protein against a library of 361 compounds using six different methods, and the agreement among methods was – to put it charitably – poor. Nonetheless, many hits that did not confirm in orthogonal assays produced crystal structures when soaked into the protein. Thus emboldened, the researchers decided to soak all 361 fragments individually into crystals of endothiapepsin. This resulted in 71 structures, a hit rate of 20%, higher than any of the other methods (which ranged from 2-17%). Even more shocking, 31 of the fragments were not identified by any of the other methods, and another 21 were only identified by one other method. Thus, a cascade of any two assays would have found, at best, only a quarter of the crystallographically validated hits.
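The claim that any two-assay cascade would recover at most a quarter of the crystallographic hits is a set-coverage calculation, and it is easy to check for any screening campaign. Here is a small sketch with invented method names and hit IDs; only the logic, not the data, comes from the paper.

```python
from itertools import combinations

def best_pair_coverage(method_hits, reference_hits):
    """Find the pair of pre-screening methods whose combined hits cover the
    largest fraction of a reference (e.g. crystallographically validated) set.

    method_hits: dict mapping method name -> set of hit IDs
    reference_hits: set of validated hit IDs
    Returns ((method1, method2), fraction_covered).
    """
    best = (None, 0.0)
    for m1, m2 in combinations(method_hits, 2):
        covered = (method_hits[m1] | method_hits[m2]) & reference_hits
        frac = len(covered) / len(reference_hits)
        if frac > best[1]:
            best = ((m1, m2), frac)
    return best
```

Running this on the real screening data would show directly which (if any) pair of pre-screens could substitute for soaking the full library.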

In agreement with other recent work, the fragments bound in multiple locations, including eight subsites within the binding cleft as well as three potentially allosteric sites. Not all of these sites were found using other methods.

But are these fragments so weak as to be uninteresting? To find out, the researchers performed isothermal titration calorimetry (ITC) to determine dissociation constants for 59 of the crystallographic hits. Three of the 21 most potent (submillimolar) binders were not detected by any of the other methods, and another seven were only found by one.

What factors led to this crystallographic bonanza? First, the researchers used the very high concentration of 90 mM for each fragment (in practice sometimes <90 mM because of precipitation). Not surprisingly, solubility was important: 97% of the hits had solubilities of at least 1 mM in aqueous buffer, and the soaking solution contained 10% DMSO as well as plenty of glycerol and PEG. Achieving such high concentrations is harder when multiple fragments are present, and the researchers argue from some of their historical data that the common use of cocktails lowers success rates.

How did different methods compare? Interestingly, functional assays such as high-concentration screening or reporter-displacement assays fared best, while electrospray ionization mass-spectrometry (ESI-MS) and microscale thermophoresis (MST) were close to random. This is in marked contrast to other reports for ESI-MS and MST, and the researchers are careful to note that “the choice and success of the individual biophysical screens likely depend on the target and expertise of the involved research groups.”

Primary crystallographic screening was an early strategy at Astex, and although this may not have been fully feasible 15 years ago, it seems they were on the right track. Of course, not all targets are amenable to crystallography, and not everyone has ready access to a synchrotron beam with lots of automation. But for those that are, it might be time to drop the pre-screens and step directly into the light.

23 December 2013

Fragments in Australia

Last year we highlighted the first FBDD conference held in Australia. That meeting has now led to a dozen papers in the December issue of Aust. J. Chem. Many of the papers use the same fragment libraries, so this is a good opportunity to survey a variety of outcomes from different techniques and targets.

The collection of papers (essentially a symposium in print) starts with a clear, concise overview of fragment-based lead discovery by Ray Norton of Monash University. Ray also outlines the rest of the articles in the issue.

A well-designed fragment library is key to getting good hits, and the next two papers address this issue. Jamie Simpson, Martin Scanlon, and colleagues at Monash University discuss the design and construction of a library built for NMR screening. Compounds were selected using slightly relaxed rule-of-three criteria, and special care was taken to ensure that at least 10 analogs of each were commercially available to facilitate follow-up studies. Remarkably, of 1592 compounds purchased, only 1192 passed quality control and were soluble at 1 mM in phosphate buffer. The properties of the final library are compared with nearly two dozen other libraries reported in the literature; this is the most extensive summary I’ve seen on published fragment libraries. The paper also analyzes the results of 14 screens on various targets using saturation transfer difference (STD) NMR. As the researchers note, this technique is prone to false positives, and indeed the average hit rate of 22.5% is high, with only about 50% confirming in secondary assays. There is also a nice analysis of what features are common to hits, along with a list of the 24 compounds that hit in more than 90% of screens.

The other paper on library design, by Tom Peat, Jack Ryan and others (including Pete Kenny), discusses library design at CSIRO. The researchers started with 500 fragments commercially available from Maybridge and supplemented these with roughly the same number of fragments from a collection of small heterocycles that had been synthesized internally; additional “three-dimensional” fragments are also being constructed. At CSIRO the primary screening method appears to be surface plasmon resonance (SPR), in particular the ProteOn instrument that allows simultaneous analysis of six fragments against six targets. Eight of about ten targets have yielded confirmed hits. The researchers show examples of specific (good), nonspecific (probably bad) and ill-behaved (ugly) fragments.

Next up is an excellent discussion of PAINS by Jonathan Baell (at Monash) and collaborators. Although Practical Fragments has covered this topic repeatedly (here, here, here, here, here, and here) it is a sad fact that more examples appear in the literature every day, so there is always something new to write about.

Fragment-finding methods make up the next several papers, starting with a nice overview of native mass spectrometry by Sally-Ann Poulsen at Griffith University. This paper covers theory, practical issues, and recent examples. Roisin McMahon and Jennifer Martin at University of Queensland, along with Martin Scanlon, describe thermal shift assays. In addition to highlighting a number of published examples, the paper also delves into some of the technical challenges and issues with false positives and false negatives, concluding with a nuanced discussion of how to deal with conflicting data.

The subject of conflicting data is central to the work of Olan Dolezal and Tom Peat, both of CSIRO, and their collaborators. They screened the protein trypsin against 500 Maybridge fragments using SPR. Unfortunately they couldn’t go higher than 100 micromolar without running into problems of solubility and aggregation, but even at this relatively low concentration they found 18 hits. X-ray crystallography validated 9 of them, and isothermal titration calorimetry (ITC) also validated 9, with 7 confirmed by all three techniques. (Incidentally, there are lots of great experimental details here.) Four of the SPR hits could not be confirmed by either ITC or X-ray, and 3 turned out to be false positives when repurchased and tested; in one case this appeared to be due to cross-contamination with a more potent compound. In general, the more potent compounds tended to be the ones that reproduced best, and solubility seemed to be a limiting factor for ITC. Despite the imperfect agreement of biophysical techniques, these were still superior to computational approaches on the same target with the same library. As they conclude:

It is gratifying to know (at least for these authors) that experimental data are still of enormous value in the area of fragment-based ligand design and that the modelling community still has a way to go before the experimentalists are put out to pasture.
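The three-way agreement described above (7 of 18 SPR hits confirmed by both X-ray and ITC) is just a tally of how many orthogonal techniques confirm each compound. A generic version can be written in a few lines; the method names and compound IDs below are invented for illustration.

```python
def confirmed_by_at_least(hit_sets, k):
    """Return compounds that appear in at least k of the hit sets.

    hit_sets: dict mapping technique name -> set of hit IDs
    """
    counts = {}
    for hits in hit_sets.values():
        for cpd in hits:
            counts[cpd] = counts.get(cpd, 0) + 1
    return {cpd for cpd, n in counts.items() if n >= k}
```

With real data, `confirmed_by_at_least(hits, 3)` would give the fully cross-validated fragments, while `k=2` gives a more permissive shortlist for follow-up.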

But experimentalists should not get too cocky: the next paper, by Jamie Simpson and collaborators at Monash University, describes some of the things that can go wrong. An STD NMR screen of the antimicrobial target ketopantoate reductase (KPR) using the same Maybridge library of 500 compounds revealed 196 hits! The 47 with the strongest STD signals were then tested in a 1H/15N-HSQC NMR assay, leading to 14 hits, of which 4 gave measurable IC50 values in an enzymatic assay. Unfortunately, follow-up SAR was disappointing, and subsequent experiments revealed that aggregation was to blame: when the biochemical experiments were rerun in the presence of 0.01% Tween-20, only a single fragment gave a measurable IC50 value. The researchers redid their STD-NMR screen in the presence of detergent, resulting in 71 hits, all of which were tested in the biochemical screen. This led to the identification of a new (and fairly potent) hit that had previously been missed. This nicely illustrates the fact that false positives are not just a problem in terms of wasted resources, they can also overwhelm the signal from true positives. The moral? Always use detergent in your assay!

The question of whether structure is needed to prosecute fragments has come up before, and the next paper, by Stephen Headey, Steve Bottomley, and collaborators at Monash University, addresses this question directly. The target protein, a mutant form of α1-antitrypsin called Z-AAT, unfolds and polymerizes in vivo, causing a genetic disease. The researchers used an STD NMR fragment screen of 1137 fragments to identify several hundred hits, and focused on those that bound to the mutant form of the protein rather than the wild-type. They then used a technique called Carr-Purcell-Meiboom-Gill (CPMG) NMR (which relies on line broadening when fragments bind to a protein) to confirm 80 hits, the best of which had a dissociation constant of 330 micromolar. If you’ve stuck with this post this far, you’ll recall that the Monash library was designed for “SAR by catalog”, and 100 analogs of this fragment were purchased and tested, leading to several new hits, one with a dissociation constant of 49 micromolar. Although there is still a long way to go, metastable proteins are tough targets, so this is a nice start.

The next paper, by Ray Norton at Monash University and collaborators, describes a fragment screening cascade against the antimalarial target apical membrane antigen 1 (AMA1). An initial STD NMR assay of 1140 fragments produced 208 hits, but competition experiments with a peptide ligand whittled this number down to 57 that confirmed in both STD and CPMG NMR assays. Of these, 46 confirmed in an SPR assay, and although most are fairly weak, some SAR is starting to emerge as new analogs are synthesized.

Another antimicrobial target, 6-hydroxymethyl-7,8-dihydropterin pyrophosphokinase (HPPK), is the subject of a paper by James Swarbrick at Monash and collaborators. An initial STD NMR screen gave an unnervingly high hit rate (notice any themes emerging?), so 2D 15N-HMQC experiments were performed on 750 Maybridge fragments, yielding 16 hits. Competition experiments using CPMG NMR and close analyses of the chemical shifts suggested that these fragments bind in the substrate binding site, and SPR confirmed binding for some of the fragments.

Finally, Martin Drysdale of the Beatson Institute highlights some of the success stories of FBDD, including clinical compounds, and ends with a call for shapelier fragments.

All in all this is a great collection of papers, particularly for those relatively new to the field. It will be fun to revisit some of these projects in a few years to see how they’ve progressed.