Practical Fragments: fragment docking


02 September 2025

Keeping molecular dynamics cool for fragments

Accurately and reliably predicting fragment binding modes would be preferable to doing messy, expensive, and sometimes tedious experimental work, but we’re not there yet. One of the biggest problems is that, because fragments usually bind weakly to proteins, it is hard to tell which of several possible binding modes is most favorable. In an open-access J. Chem. Inf. Model. paper published earlier this year, Stefano Moro and colleagues at University of Padova report progress.

Their approach, called Thermal Titration Molecular Dynamics (TTMD), analyzes short molecular dynamics simulations across increasing temperatures; if the ligand remains bound to the protein, this indicates a more stable binding mode. (It seems a bit like the dynamic undocking we wrote about here.) The researchers had previously reported good results for larger, drug-sized molecules, but not for four fragment-protein complexes.

Recognizing the low affinities of fragments, the researchers decided to lower the (virtual) temperatures. Rather than heating from 300 to 450 K, they heated from 73 to 233 K; i.e., from just below the boiling point of liquid nitrogen to a moderately cold winter’s day in Minnesota. They first docked fragments using PLANTS-ChemPLP, which is free for academics, and chose the five best-scoring poses for evaluation.

Next, the researchers performed TTMD. There are several different ways to assess how well the ligand remains bound to the protein over the course of a molecular dynamics simulation, and four different scoring methods were chosen. When TTMD was tested on the four fragment-protein complexes that had previously failed, at least two of the scoring methods correctly identified the crystallographic binding mode for three of the fragments.

Thus encouraged, the researchers tested ten more compounds bound to six new proteins. The results were quite encouraging, with up to 86% of crystallographic binding modes being correctly identified by at least one of the scoring functions in TTMD vs 50% for docking alone. Impressively, two of the examples were MiniFrag-sized, with just 6 or 7 non-hydrogen atoms, yet the crystallographic pose was identified as the lowest energy in all four TTMD scores.

This is nice work, but the question arises how these specific ligands and proteins were chosen. Several years ago we highlighted a curated set of 93 protein-ligand structures that were used to benchmark other virtual approaches, and it would be nice to see how TTMD performs on these. Still, TTMD’s performance on its chosen examples is encouraging, and laudably the researchers have made their code freely available. If you try it out, please let us know how it works in your hands.

24 February 2025

Fragments beat lead-like compounds in a screen against OGG1

The twin rise of make-on-demand libraries and speedy in silico docking has supercharged fragment screening and optimization: we’ve written previously about V-SYNTHES, Crystal Structure First and a related method. Another advance is described by Jens Carlsson (Uppsala University) and a large group of multinational collaborators in an (open access) Nat. Commun. paper.

The researchers were interested in 8-oxoguanine DNA glycosylase (OGG1), a DNA-repair enzyme and potential anti-inflammatory and anticancer target. They started with a crystal structure into which they docked 14 million fragments (MW < 250 Da) or 235 million lead-like molecules (250-350 Da) from ZINC15. Multiple conformations and thousands of orientations were sampled for each molecule. In all, 13 trillion fragment complexes and 149 trillion lead-like complexes were evaluated using DOCK3.7, a process that took just 2 hours and 11 hours, respectively, on a 3500-core cluster.
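To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the 2-hour run belongs to the fragment screen and the 11-hour run to the lead-like screen, and that all 3500 cores were busy the whole time; the function and dictionary names are made up for illustration.

```python
# Rough throughput arithmetic from the figures quoted above.
# Assumptions (not stated explicitly in the post): 2 h = fragment screen,
# 11 h = lead-like screen, and all 3500 cores fully used throughout.
CORES = 3500

screens = {
    "fragments": {"molecules": 14e6, "complexes": 13e12, "hours": 2},
    "lead-like": {"molecules": 235e6, "complexes": 149e12, "hours": 11},
}


def throughput(molecules, complexes, hours, cores=CORES):
    """Return simple per-molecule and per-core-second rates for one screen."""
    core_seconds = hours * 3600 * cores
    return {
        "complexes_per_molecule": complexes / molecules,
        "complexes_per_core_second": complexes / core_seconds,
        "core_seconds_per_molecule": core_seconds / molecules,
    }


results = {name: throughput(**spec) for name, spec in screens.items()}

for name, r in results.items():
    print(
        f"{name}: {r['complexes_per_molecule']:.3g} complexes per molecule, "
        f"{r['core_seconds_per_molecule']:.2f} core-seconds per molecule"
    )
```

Under these assumptions each molecule costs on the order of a core-second, which is why docking hundreds of millions of compounds in an afternoon is feasible.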

After removing PAINS and molecules similar to previously reported OGG1 inhibitors, the top-scoring 0.05-0.07% molecules from each screen were clustered and, after manual evaluation, 29 fragments and 36 lead-like compounds were purchased from make-on-demand catalogs. These were tested at 495 µM (for fragments) or 99 µM (for larger molecules) in a DSF screen. None of the lead-like compounds significantly stabilized the protein, while several fragments did. Four of the fragments were successfully crystallized with OGG1, and in all cases the key interactions predicted in the computational screens were confirmed in the actual crystal structures.

Compound 1 showed the greatest stabilization of OGG1 (2.8 °C) and some inhibition in an enzymatic assay, but not enough to calculate an IC₅₀. Searching for analogs that contained compound 1 as a substructure in the Enamine REAL database of 11 billion compounds produced few hits, but, as before, thinking in fragments proved fruitful. Searching for molecules containing just the core heterocycle and amide (colored blue below) yielded nearly 43,000 possibilities. Docking these and making and testing a few dozen led to compound 5, with mid-micromolar inhibition. Further iterations led to low micromolar compound 7.

At this point the researchers turned from make-on-demand libraries to synthetically accessible virtual libraries to fine-tune the molecule. After docking 6720 virtual molecules, they synthesized and tested 16, of which 12 were more potent than compound 7, with five of them being submicromolar. Compound 23 showed low micromolar activity in two different cell assays and was selective against four other DNA repair enzymes.

The same high-throughput docking approach was applied to three other protein targets: SMYD3, NUDT5, and PHIP. In each case crystal structures of bound fragments were available to use as starting points. Multiple compounds with improved docking scores compared to the initial fragments were identified, though no compounds were actually synthesized and tested.

The success in finding compound 1 demonstrates experimentally the advantage fragments have in efficiently searching chemical space. The researchers note that 97% of the >30 billion currently available make-on-demand compounds have molecular weights >350 Da, while only 50 million are < 250 Da. Screening all of these fragments in silico is possible; screening everything, less so. Although the fragment hits for OGG1 were weak, this isn’t always the case, as noted here. The fact that fragment 1 could be advanced to a sub-micromolar inhibitor after synthesizing just a few dozen molecules also testifies to the efficiency of in silico approaches.

The paper contains lots of useful details and suggestions for streamlining the process and is well worth perusing if you are trying to find hits against a structurally-enabled protein.

20 June 2022

KinaFrag: a free, searchable database of kinase fragments

Four of the six approved fragment-derived drugs are kinase inhibitors, and three of these bind in the active site. Despite these successes, there are plenty of opportunities for new kinase-directed drugs, particularly those targeting cancer resistance mutations. In a recent Brief Bioinform. article, Guang-Fu Yang and colleagues at Central China Normal University describe a new tool to facilitate these discoveries.

The researchers started by trawling multiple databases such as kinase.com, DrugBank, ChEMBL, and the Protein Data Bank for kinase inhibitors. The results were combined and collated to yield a set of 7783 kinase-inhibitor fragment complexes, with more than 3000 unique fragments. Most of these bind in the “front cleft” of the active site, where the adenine of ATP normally binds, but several hundred also sit in the so-called back pocket or the intervening area.

What’s nice is that all this information is available on a free website called KinaFrag. You can download the structures yourself, but the site can also be browsed or searched. Fragments are annotated with links to various databases; here’s an example.

There are some bugs. While I was able to search by physicochemical parameters such as molecular weight and number of hydrogen bond donors, I could not get the substructure search to work. I’d be curious as to whether readers could do so.

To demonstrate the utility of KinaFrag, the researchers describe a case study in which they started with the anticancer drug larotrectinib, which inhibits TRK family kinases. However, the molecule is less effective against several mutations observed in the clinic. Examining the bound structure revealed that the mutations introduce steric clashes. Retaining the hinge-binding fragment while performing virtual screening of fragments from KinaFrag led to molecules such as YT3, potent against both wild type TRKA and two resistance mutants, and further optimization resulted in YT9.

Not only was YT9 active against the wild type and mutant forms of TRKA, it showed good oral bioavailability and pharmacokinetics in rats. Encouragingly, the molecule slowed tumor growth in both wild type and mutant TRKA mouse xenograft models.

One could debate whether this is an example of FBLD; the discovery of YT9 could also be considered a classic case of scaffold hopping. But semantics aside, this is a nice example of thinking in terms of fragmenting molecules. More broadly, KinaFrag looks like a useful tool for work on kinases – especially if the substructure search works.

18 January 2021

Does configurational entropy explain why fragment linking is so hard?

Linking two weak fragments to get a potent binder is something many of us hope for. Unfortunately, as a poll taken a few years back indicates, it often doesn’t work. But why? This is the question tackled by Lingle Wang and collaborators at Schrödinger and D. E. Shaw in a recent J. Chem. Theory Comput. paper.

When a ligand binds to a protein it pays a thermodynamic cost in terms of lost translational and orientational entropy. By linking two fragments, this cost is paid only once instead of twice. In theory this should lead to an improvement of 3.5-4.8 kcal/mol in binding energy, resulting in a 400-3000-fold improvement in affinity over what would be expected from simple additivity. As we noted here, this is possible, though rare. Linker strain often takes the blame as a primary villain. But is there more to the story?
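The conversion between a free energy gain and a fold-change in affinity is just the Boltzmann factor exp(ΔG/RT). The short Python sketch below shows it, assuming a temperature of 298.15 K (the post does not state one); the function names are illustrative only.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def fold_improvement(delta_g_kcal, temperature_k=298.15):
    """Affinity ratio corresponding to a free energy gain (kcal/mol)."""
    return math.exp(delta_g_kcal / (R_KCAL * temperature_k))


def linked_kd(kd_a, kd_b, fold=1.0):
    """Kd (molar) of a linked molecule under simple additivity.

    Additive free energies mean Kd_AB = Kd_A * Kd_B (1 M standard state);
    'fold' divides that by any extra superadditive boost.
    """
    return kd_a * kd_b / fold


for dg in (3.5, 4.8):
    print(f"{dg} kcal/mol corresponds to roughly {fold_improvement(dg):.0f}-fold")

# Two 1 mM fragments: simple additivity versus the full theoretical boost.
print(f"additive Kd: {linked_kd(1e-3, 1e-3):.1e} M")
print(f"with 4.8 kcal/mol bonus: {linked_kd(1e-3, 1e-3, fold_improvement(4.8)):.1e} M")
```

At room temperature the two ends of the range come out in the high hundreds and low thousands, in line with the rounded 400-3000-fold figure.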

The researchers computationally examined published examples of fragment linking (most of which we’ve covered on Practical Fragments) using free energy perturbation (FEP) to try to understand why the linked molecules bound more or less tightly than expected. Impressively, they were able to computationally reproduce experimentally derived numbers, and by building a thermodynamic cycle they could extract the various components of the “connection Gibbs free energy.” These included changes in binding mode or tautomerization, linker strain or linker interactions with the protein, and the previously mentioned entropic benefits of fragment linking.

The analysis also identified two additional components. If two fragments favorably interact with each other, covalently linking them may not give as much of a boost. This concept had been considered decades ago, though the current work provides a more general understanding.

The more important factor appears to be what the researchers refer to as “configurational entropy.” The notion is that even when a fragment is bound to a protein, both the ligand and protein retain considerable flexibility, which is entropically favorable. Linking two fragments reduces the configurational entropy of each component fragment, and the linked molecule binds less tightly than would be expected. The researchers argue that this previously unrecognized “unfavorable change in the relative configurational entropy of two fragments in the protein pocket upon linkage is the primary reason most fragment linking strategies fail.” They advise that maintaining a bit of flexibility in the linker can help, as has been previously suggested.

This is an interesting analysis, and explicitly considering configurational entropy is likely to improve our understanding of molecular interactions. But is it really the main barrier to successful fragment linking? The researchers explore only nine different protein-ligand systems, though they did consider multiple linked molecules for three of these (pantothenate synthetase, RPA, and LDHA). Still, these represent just a fraction of the 45 examples collected in a recent review, and they only considered one somewhat contrived case (avidin) in which especially strong superadditivity was observed. It will be interesting to see whether the analysis holds true for more examples of fragment linking.

11 January 2021

Hundreds of fragment hits for the SARS-CoV-2 Nsp3 Macrodomain

COVID-19 will be with us for some time. Despite the unprecedented speed of vaccine development, it is worth remembering that humanity has only truly eradicated two widespread viral diseases, smallpox and rinderpest. Thus, the long march of small molecule drug discovery against SARS-CoV-2 is justified. In a paper recently posted on bioRxiv, Ivan Ahel and more than 50 multinational collaborators take the first steps.

Last year we highlighted two independent crystallographic screens against the main protease of SARS-CoV-2. Another potential viral target is the macrodomain (Mac1) portion of non-structural protein 3 (Nsp3), an enzyme which clips ADP-ribose from modified proteins, thus helping the virus evade the immune response.

The researchers soaked crystals of Mac1 against a total of 2683 fragments curated from several collections. This yielded 214 hits, and most of the structures were solved at high resolution (better than 1.35 Å). About 80% of the fragments bound in the active site, with many binding in the adenosine sub-pocket. Two different crystal forms were used for soaking, and one set of 320 fragments was soaked against both. Interestingly, this yielded a hit rate of 21% for one crystal form and just 1.3% for the other. Even more surprising, of the five hits found in both crystal forms, only two bound in the same manner in both. This is a clear demonstration that it is worth investing up-front effort to develop a suitable crystal form of a protein before rushing into soaking experiments.

Independently, the researchers computationally screened more than 20 million fragments (mostly from ZINC15) against the protein using DOCK3.7, a process which took just under 5 hours on a 500-core computer cluster. Of 60 top hits chosen for crystallographic soaking, 20 yielded structures, all at high resolution (0.94-1.01 Å). The ultra-high resolution structures revealed that four fragments had misassigned structures (wrong isomers), which long-time readers may not find surprising. Importantly, most of the 20 experimentally determined structures confirmed the docking predictions.

A strength and weakness of crystallographic screening is that it can find extraordinarily weak binders, which may be difficult to optimize. To see whether they could independently verify binding, the researchers tested 54 of the docking hits in a differential scanning fluorimetry (DSF) assay. Ten increased thermal stability, and all of these had yielded crystal structures. Only four of 19 fragments tested yielded reliable data in isothermal titration calorimetry (ITC) assays, but encouragingly these four also gave among the most significant thermal shifts in the DSF assay. Finally, 57 of the docking hits and 18 of the crystallographic hits were tested in a homogeneous time-resolved fluorescence (HTRF) based peptide-displacement assay, yielding 8 and 3 hits respectively, the best with an IC₅₀ of 180 µM.

This paper is a tour de force, and may represent the largest collection of high-resolution crystallographic fragment hits against any target. Laudably, all 234 of the crystal structures have been released in the public domain, and the researchers have already suggested ideas for merging and linking. As they point out, many of the fragments bind in the adenine pocket, so selectivity will be an issue not just against human macrodomains but also against kinases and other ATP-dependent enzymes. Still, as the dozens of approved kinase inhibitors demonstrate, achieving selectivity is possible.

From a technology perspective, this publication affirms the rising power of both crystallographic and computational screening. Indeed, the hundreds of crystal structures will themselves be useful input for training new computational methods. And from a drug discovery perspective, each of these fragments represents a potential starting point for SARS-CoV-2 leads.

Let’s get busy!

14 December 2020

Benchmarking docking methods: a new public resource

Despite advances in crystallography, obtaining structures of fragments bound to proteins is still often elusive. Computational docking is likely to forever be faster than experimental methods, but how good is it? A new paper in J. Chem. Inf. Model. by Laura Chachulski (Jacobs University Bremen) and Björn Windshügel (Universität Hamburg) assesses four popular methods and also provides a public validation set for others to use.

When evaluating fragment docking methods, it is essential to have a well-curated set of experimental structures. To this end, the researchers started by combing the PDB for high quality, high resolution (< 2 Å) structures of protein-fragment complexes. They used automated methods to remove structures with poor electron density, close contacts with other ligands, and various other complications. Further manual curation yielded 93 protein-ligand complex structures. The fragments span a relaxed rule-of-three, with 7 to 22 non-hydrogen atoms (averaging 13) and ClogP ranging from -4.1 to 3.5 (averaging 1.1). I confess that some choices are rather odd, including oxidized dithiothreitol, benzaldehyde, and γ-aminobutyric acid. The researchers might have saved themselves some effort, and obtained a more pharmaceutically attractive set, by starting with previous efforts such as this one.

Having built their benchmark data set, called LEADS-FRAG, the researchers next tested AutoDock, AutoDock Vina, FlexX, and GOLD to see how well they would be able to recapitulate reality. The results? Let’s just say that crystallographers look likely to have job security for some time.

Only 13 of the 93 protein-fragment complexes were correctly reproduced as the top hit using all four methods (even with a reasonably generous RMSD cutoff criterion of < 1.5 Å). There were 18 complexes that none of the methods predicted successfully. Across the four methods, the top-ranked poses were “correct” 33-54% of the time. Docking methods usually provide multiple different poses with different scores; up to 30 were considered here. Looking at lower-ranked poses increased the number of successes to 27 of the 93 fragments, while only three failed in all methods. Overall, the correct structure was present among the poses in 53-86% of cases. Changing the scoring function sometimes led to further improvements.
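For readers who have not scored docking poses before, the "correct within 1.5 Å" test can be sketched in a few lines of Python. This is a minimal illustration, not the paper's code: it assumes poses are lists of heavy-atom (x, y, z) coordinates in the same atom order and in the protein's frame (so no superposition), and it ignores symmetry-equivalent atoms, which real tools handle. The function names and toy coordinates are invented.

```python
import math


def rmsd(pose, reference):
    """Heavy-atom RMSD (Å) between two poses with matching atom order."""
    if len(pose) != len(reference):
        raise ValueError("poses must contain the same number of atoms")
    squared = sum(
        (a - b) ** 2
        for atom, ref_atom in zip(pose, reference)
        for a, b in zip(atom, ref_atom)
    )
    return math.sqrt(squared / len(pose))


def first_correct_rank(ranked_poses, reference, cutoff=1.5):
    """1-based rank of the first pose under the RMSD cutoff, or None."""
    for rank, pose in enumerate(ranked_poses, start=1):
        if rmsd(pose, reference) < cutoff:
            return rank
    return None


# Toy three-atom "fragment": the crystal pose plus two docked poses.
crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
far_pose = [(x, y, z + 3.0) for x, y, z in crystal]   # shifted 3 Å in z
near_pose = [(x + 0.5, y, z) for x, y, z in crystal]  # shifted 0.5 Å in x

ranked = [far_pose, near_pose]  # best-scoring pose first
rank = first_correct_rank(ranked, crystal)
print("top pose correct:", rank == 1)
print("correct pose found at rank:", rank)
```

In this toy case the top-ranked pose fails but the second one passes, which is exactly the gap between the "top hit" and "present among the poses" success rates above.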

Why were some fragments more successfully docked than others? Fragments that were more buried within the protein (lower solvent-accessible surface area, or SASA) yielded better predictions than those that were more solvent-exposed. The researchers did not report on the effect of rotatable bonds; intuitively, one might think that a more flexible fragment would be harder to dock. A study we highlighted nearly ten years ago found that fragments with higher ligand efficiency also had higher docking scores, and it would be interesting to know if that reproduced with this larger data set.

The researchers conclude by noting that “these programs do not represent the optimal solution for fragment docking.” I think this is a fair assessment. And as the researchers acknowledge, the bar was set low: compounds were docked against the crystal structure of the protein with ligand computationally removed. In the real world, proteins often change conformation upon ligand-binding, which would make docking even more difficult.

In addition to trying to determine how a specific fragment binds, it can also be valuable to computationally screen large numbers of fragments. The programs used here took between 10 seconds and 42 minutes per ligand, but as we highlighted last year speed continues to increase.

Most importantly, the public availability of LEADS-FRAG will allow others to assess their own computational approaches. It will be fun to revisit this topic in a few years to see how much things have improved.

19 August 2015

Caveat Emptor...or marketing does not always tell you what's really in the package.

In case you missed it, I spoke at the ACS on Sunday. It was in a computational session looking at designing libraries and I am pretty sure I was the only non compchemist. It was about all the problems compchemists have caused in library design. My talk was even live tweeted by Ash (@curiouswavefn) and was well received. So, looking at the next paper in my queue, it's a computational-focused paper. So, after spending several hours on a Sunday listening to compchemists, have I softened?

This paper is the subject of today's post. It is an extension of this paper which describes their virtual screen. From a 2 million compound virtual screen, they tested 17 compounds in vitro leading to 2 micromolar compounds. This paper is the story of the most potent of the two micromolar compounds. The target is CREBBP, which is another in the long line of epigenetic targets. Compound A was one of the original in silico actives that was tested. Three analogs were obtained and tested (B-D).

Figure 1. Original active, A. Analogs B-D. Common structural motif is shown in blue.

Compound B was the most potent and became the focus of their optimization efforts. Of course, my eyes are drawn to that potential Michael acceptor, but the authors dismiss it based upon their docking results: the only alkylatable residue in the area of its putative binding is well buried. It is a moot point anyway because they were able to replace it with an isophthalate group and increase potency by 5x, to 0.9 uM (Compound 6). Interestingly, the potency of 6 is different depending on the assay used: 0.8 uM in a competition binding assay and 8.7 uM in a TR-FRET assay.

Figure 2. Compound 6

This compound was crystallized and showed that the predicted binding mode was correct.

They then performed some gobbledy-gook MD calculations (finite-difference Poisson, warning PDF) in order to evaluate the polar (electrostatic) contribution to binding of 6 and 7. Compound 6 had more favorable electrostatic interactions (0.8 kcal/mol) than 7, which had more favorable van der Waals interactions (1.4 kcal/mol). With this crucial information AND the crystal structure in hand, they then explored additional chemical space.

Despite the authors' claim, I don't think they actually improved the potency significantly. Compound 6 is 8 uM in the TR-FRET assay and the best compounds they claim are 1 or 2 uM. I really have to call monkeyshines here. They use the different assays interchangeably, yet never explain which one is used for what purpose. It's cherry-picking values. When talking about selectivity, they switch to using thermal shift values. And we all know the value of that. So, I find it hard to believe their "most potent" this or "selectivity" that. The title of the paper includes "nanomolar", but that is only in one assay. That's like saying I can run a 6 minute mile, since I did it once under optimum conditions. Honestly, my typical times (WAY back when) were more 8:30 miles. That's honesty in data reporting. Since they obviously had access to different assays, why weren't all compounds run in one, or optimally both? I don't see that the MD calculations had any positive impact. Maybe it's the heat, but this paper is not a sham, though it is definitely full of deceptive advertising.

25 November 2014

Docking covalent fragments

Most drugs interact non-covalently with their target. The conventional wisdom was that covalent drugs – especially irreversible ones – would have dangerous side effects. Although this is still a concern, the success of drugs such as ibrutinib and dimethyl fumarate has caused a resurgence of interest. In a new paper in Nature Chemical Biology, Brian Shoichet, Jack Taunton, and colleagues at the University of California San Francisco describe how computational chemistry can be used to find new covalent inhibitors.

The researchers created a modified version of the program DOCK called – wait for it – DOCKovalent. Happily, they have made this available for free to anyone. To start, you upload your crystal structure and choose which amino acid residue you are interested in targeting. You can then pick from 9 different libraries of various electrophiles, each covering a different class of covalent “warhead”: epoxides, aldehydes, etc. There are about 650,000 molecules in total, roughly half of which easily qualify as fragments, with the rest being lead-like (still < 350 Da). Each molecule is either commercially available or readily synthesized in one or two steps.

The program then virtually links each molecule with the selected protein residue (typically cysteine or serine) and calculates scores based on predicted van der Waals and electrostatic interactions as well as desolvation. Multiple conformations of each ligand are sampled (with fragments there are not that many) as are different rotamers of the nucleophile. Users then manually inspect and test the top hits.

The researchers first benchmarked the program against four proteins with known covalent inhibitors, where it performed well. In the case of the bacterial protein AmpC β-lactamase (which we previously discussed here), the program retrospectively predicted the correct structure of 15 out of 23 known boronic acid ligands. In one case where the prediction differed from the reported co-crystal structure, the researchers re-determined the co-crystal structure at high resolution and found that DOCKovalent was actually correct.

Thus confident, the researchers docked 23,000 commercial boronic acids against AmpC and selected 6 on the basis of score and structural novelty. Of these, 5 had inhibition constants of 3.55 µM or better, with the best being 40 nM. A crystal structure of this compound bound to the protein led them to purchase 7 additional compounds, one of which had Ki = 10 nM and a ligand efficiency of 0.73 kcal/mol/atom. Most of the molecules were also selective against 4 other proteases and were able to reverse antibiotic resistance in AmpC-expressing bacteria.
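As a reminder of where a number like 0.73 kcal/mol/atom comes from, ligand efficiency is the binding free energy (-RT ln Ki) divided by the number of non-hydrogen atoms. The Python sketch below assumes 298.15 K; the post does not give the heavy-atom count, so the second function simply back-calculates the count implied by the quoted Ki and ligand efficiency. Function names are illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(ki_molar, temperature_k=298.15):
    """Delta G of binding in kcal/mol (negative for Ki below 1 M)."""
    return R_KCAL * temperature_k * math.log(ki_molar)


def ligand_efficiency(ki_molar, heavy_atoms, temperature_k=298.15):
    """Ligand efficiency in kcal/mol per non-hydrogen atom."""
    return -binding_free_energy(ki_molar, temperature_k) / heavy_atoms


def implied_heavy_atoms(ki_molar, le, temperature_k=298.15):
    """Heavy-atom count consistent with a given Ki and ligand efficiency."""
    return -binding_free_energy(ki_molar, temperature_k) / le


ki = 10e-9  # 10 nM, as quoted above
atoms = implied_heavy_atoms(ki, 0.73)
print(f"Delta G = {binding_free_energy(ki):.1f} kcal/mol")
print(f"implied heavy atoms = {atoms:.1f}")
print(f"LE for that rounded atom count = {ligand_efficiency(ki, round(atoms)):.2f}")
```

The back-calculated atom count is an inference from the two quoted numbers and the assumed temperature, not a value taken from the paper.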

Of course, by design all of these molecules have a boronic acid warhead; will any such molecule inhibit this enzyme? To find out, the researchers tested 5 low-scoring molecules and found that 4 of them showed, as hoped, less than 10% inhibition at 10 µM. However, a fifth molecule showed reasonable inhibition, with Ki = 3.2 µM. To understand this false-negative, the team solved the crystal structure of the molecule bound to AmpC. Interestingly, the molecule bound in a conformation different than had been predicted – one that also required conformational changes in the protein, which are not allowed in DOCKovalent.

The researchers took a similar approach to seek novel inhibitors of the kinases RSK2 and MSK1 using reversible cyanoacrylamide-containing molecules (previously highlighted here). Here too the researchers were able to identify selective nanomolar cell-active inhibitors.

This looks like a very nice approach. Of course, it does require a crystal structure (or at least a good model). Also, as mentioned above, the fact that the protein is kept rigid means the program will be unable to detect ligands that bind to cryptic pockets, so there is still plenty of opportunity for empirical surprises. Still, the fact that DOCKovalent is freely available will hopefully encourage people to give it a try on their favorite protein.

12 December 2013

Upon Request

Dan and I blog here because we love it; we don't get paid, it takes a lot of time, and has very little reward. I love it when I meet someone new and they say, "Oh, I read your blog." However, this allows us to have freedom to review what we want, when we want, and how we want. We don't sell advertising, we don't generate revenue, and so on. Sometimes people agree with us, sometimes they don't. These posts are our opinions and like bellybuttons, everyone has one. Sometimes, we get pinged by somebody who just published a paper and would really like to see us blog about it. Sometimes we do, sometimes we already have and they missed it, and sometimes we don't.

I received a polite email recently, pointing out this paper. It was already on my radar to blog about, so I bumped it up in the queue. This paper caught my attention because it is a fragment screen against a DNA target, specifically the G-quadruplex from c-MYC. G-quadruplexes are found in the promoters of many oncogenes and the supposition is that by stabilizing them you can reduce their transcription. It is an intriguing idea which has already been investigated with a number of compounds to date. These authors decided to use fragments against the G-quadruplex without knowing if fragments would bind to a nucleic acid target with sufficient affinity and selectivity. Their primary screen was an Intercalator Displacement Assay (IDA) which has been used previously to find G-quadruplex binding ligands. A 1377 fragment library (@5mM) (previously used against riboswitches) was used and it obeyed the Voldemort Rule, had >95% purity, and 1mM aqueous solubility. The top 10 hits from this screen could be placed in three groups.

Then, in order to confirm their biochemical assay results they decided to dock these top 10 fragments. WHHAAAAT you say? That was my initial reaction. Why oh why doth they vex me so? They then go into EXCRUCIATING detail about the docking results, even concluding from the results some SAR hypotheses. I kid you not. They also evaluated these top 10 fragments in a cellular assay (125 uM and 250 uM) using a Western blot readout. These concentrations were chosen in order to not show short or long-term toxicity, but mirabile dictu, Data Not Shown. All fragments, except two (7A3 and 2G5), showed significant changes in c-Myc expression levels. Interestingly, "no significant changes" still gives a 20% reduction in c-Myc levels.

Four fragments were able to reproduce this effect, of which 11D6 was the best. The four best were then run pair-wise, and every combination induced a significant reduction of c-Myc.

So what does this tell us? Well, I think they have found fragments which bind to the c-MYC promoter G-quadruplex. They may be exhibiting this binding in the cells. There are a few experiments that I would like to see (and would have asked for if I had reviewed this paper): a binding assay (SPR, ITC, NMR, whatevs) being the primary one. We also continue to know that docking really does not add anything to the discovery process.

30 July 2013

Don't Read This!

Epigenetics is one of the hot new areas for drug research. It seems like every company has several targets among the bevy of readers, writers, erasers out there. We have blogged about this previously. Now, as the field sees more and more players, more and more papers are appearing describing various drug discovery efforts, largely on bromodomains, the readers. The current state of the art for bromodomain compounds is shown below.

Cpd 1 is the famous JQ1, cpd 2 from GSK has a very similar scaffold to JQ1, Cpd 3 from Oxford and GSK has the 3,5-dimethylisoxazole scaffold and acts as an AcK mimic (like DMSO), and Cpd 4 is from an FBDD approach and led to Cpd 4b of modest potency. In this paper, the authors aim to diversify the range of scaffolds for bromodomain inhibition.

This group describes how they put together their fragment library (which is a detail that is often lacking in papers). Starting from the ZINC database, they applied the following filters: 1. MW < 250 (HAC < 18); 2. logP < 3.5; 3. rotatable bonds < 5; 4. smallest set of smallest ring < 4. [I have no idea what this last filter means, I have reproduced it faithfully and maybe somebody can tell me what it means in the comments.] They then clustered the compounds with a Tanimoto cutoff of 0.7 and the center (centroid?) of the cluster was chosen to represent the clusters. They then applied the "Reality Check" and had a real person look at the compounds and 500 were chosen, with 487 being purchasable. The authors state that some of their compounds do not obey the Voldemort Rule. Good for them! More people should. They then spend several sentences defending this choice. Shame on the reviewer who made them. The properties of their library are presented graphically.
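For readers who have not done this kind of clustering, here is a minimal Python sketch of the idea. It assumes each fingerprint is simply a set of on-bit indices and uses a greedy "leader" scheme, in which the first member of a cluster stands in for it. The paper's actual fingerprints and clustering algorithm are not given in this post, so the function names and the toy data are hypothetical.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def leader_cluster(fps: dict, cutoff: float = 0.7) -> dict:
    """Greedy leader clustering: a compound joins the first leader it matches
    at >= cutoff similarity, otherwise it becomes the leader of a new cluster."""
    clusters = {}  # leader name -> list of member names
    for name, fp in fps.items():
        for leader in clusters:
            if tanimoto(fp, fps[leader]) >= cutoff:
                clusters[leader].append(name)
                break
        else:
            clusters[name] = [name]
    return clusters


# Hypothetical toy fingerprints, not real compounds.
toy = {
    "frag_a": {1, 2, 3, 4, 5, 6, 7, 8},
    "frag_b": {1, 2, 3, 4, 5, 6, 7, 9},  # shares 7 of 9 bits with frag_a
    "frag_c": {10, 11, 12, 13},
}
print(leader_cluster(toy))
```

Note that a leader is not the same thing as a centroid; picking the member with the highest average similarity to the rest of its cluster would be closer to what the authors appear to describe.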

They started by docking their fragments against the JQ1 structure of BRD4. All binding conformations were assessed for their interaction with Asn141. 41 Cpds were then put into co-crystallization experiments; they obtained 60 crystals, which were diffracted. Four of these fragments are exemplified below.

It is satisfying to see that four different moieties bind in roughly the same location with similar orientations. Compound 8 was chosen for further optimization, but with no explanation. They end up with Cpd 40a, which is a 0.23 uM inhibitor (LEAN = 0.26), compared to > 100 uM for the parent fragment.
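To put the LEAN number in context, here is a small Python sketch. It assumes the common definition LEAN = pIC50 / (number of heavy atoms); the heavy atom count of Cpd 40a is not given in this post, so the code only back-calculates what the reported values imply. The function names are mine.

```python
import math


def pic50(ic50_molar: float) -> float:
    """Negative log10 of the IC50 expressed in mol/L."""
    return -math.log10(ic50_molar)


def lean(ic50_molar: float, heavy_atoms: float) -> float:
    """LEAN, assumed here to be pIC50 divided by heavy atom count."""
    return pic50(ic50_molar) / heavy_atoms


def implied_heavy_atoms(ic50_molar: float, lean_value: float) -> float:
    """Heavy atom count implied by a reported IC50 and LEAN."""
    return pic50(ic50_molar) / lean_value


print(round(pic50(0.23e-6), 2))                      # pIC50 of a 0.23 uM inhibitor
print(round(implied_heavy_atoms(0.23e-6, 0.26), 1))  # heavy atoms implied by LEAN = 0.26
```

Under that assumption, 0.23 uM and LEAN = 0.26 together imply a molecule of roughly 25 to 26 heavy atoms, while the parent fragment at > 100 uM has a pIC50 below 4.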

Cpd 40a

The authors then performed PK testing, reasoning that bad PK here should kill the whole series, while also cautioning that PK does not necessarily extend to all the compounds in a series. 40a (and the parent compound) showed clean metabolic stability in human microsomes. 40a also exhibited < 50% inhibition of several CYPs at 10 uM, reinforcing the metabolic stability. They finally looked at the cellular activity. They found that compound inhibition is not well correlated with anti-proliferative activity. Reviewing the compounds used in this assay, they found that logP was crucial here: lower logP means better anti-proliferative activity. PSA did not correlate.

This is a nice paper, not only for showing the range of chemical diversity that can be identified in this target class, but because it gets to several important questions that are being discussed here and in the LinkedIn group. Namely, what should your library look like and how should you screen it? I applaud them for not being hidebound to the Voldemort Rule. As recently mentioned, there is a difference in how X-ray and NMR view fragments, with X-ray being notably different in requiring high occupancy. In this field, using X-ray makes a lot of sense because it reduces the likelihood of false positives due to such things as DMSO binding.

24 August 2012

ACS Fall Meeting 2012

I recently returned from Philadelphia, where the American Chemical Society held its 244th national fall meeting. As always this was a massive affair, but fragments were well-represented, particularly in a nice session organized by Percy Carter, Debbie Loughney, and Romyr Dominique.

I opened the session by giving an introduction to fragment-screening, as well as an overview of some of the work we’re doing at Carmot. Andrew Good had perhaps the best title (“Fragment fat wobbles too”), and discussed some of the work done at Genzyme on Pim-1 kinase. Eric Manas next described some of the computational tools being used at GlaxoSmithKline, in particular strategies to deal with water. He also discussed the utility of looking for fragment analogs early in a project. In the last talk before the intermission, Chris Abell from the University of Cambridge described a number of projects from his group, starting with antimicrobial targets (such as this one); we’ll cover another in a separate post. Chris is unabashedly going after difficult targets, not just protein-protein interactions, but oligonucleotides – specifically riboswitches. There is only limited precedent for targeting RNA with fragments, so it will be fun to see how this progresses.

Francisco Talamas next described a nice example from Roche using FBLD to discover hepatitis C NS5B Palm I allosteric inhibitors. An HTS campaign of around 900,000 molecules yielded just 3 hits, none of which were advanced. A fragment screen of about 2700 fragments gave a better hit rate (5.9%), but of the 29 co-crystal structures attempted only a single structure was obtained. However, by combining the information from this crystal structure with information from other crystal structures, both proprietary and public, the researchers put together a set of rules to design a de novo fragment library tailored to this protein. This effort ultimately yielded compounds that were optimized to a clinical candidate.

Next, Nick Wurtz from Bristol-Myers Squibb described his company’s approach to discover neutral Factor VIIa inhibitors. The researchers used a combination of computational, functional, and biophysical approaches to find uncharged fragments that would bind in the P1 pocket, leading to a couple dozen crystal structures. Despite the low affinities of these fragments (typically mM), many of them could successfully be merged onto an existing series, replacing a positively charged moiety to yield potent molecules with better permeability. This is the first time I’ve seen a fragment story out of BMS, so I'm glad to see that they’re active in this area. This is also a prime example of what has been described as fragment-assisted drug discovery.

Finally, Prabha Ibrahim of Plexxikon gave a lovely overview of the discovery and development of vemurafenib, including a more detailed description of the SAR than has been presented in their earlier papers.

In addition to this dedicated session, there was a scattering of other talks and posters, including a notable poster from Timothy Rooney at the University of Oxford using fragment-based approaches to discover bromodomain inhibitors, a target class we’ve previously discussed.

A session entitled “A medicinal chemist’s toolbox” ranged over several topics of interest. Ernesto Freire of Johns Hopkins gave a great overview of thermodynamics in drug discovery, a topic we’ve previously covered. Most readers are probably familiar with the concept of enthalpy-entropy compensation, in which (for example) an added hydrogen bond fails to achieve the desired boost in potency due to unfavorable entropy. Recognizing this, he suggested that one should target groups in proteins that are already well-structured, so you don’t have to pay the cost of structuring a disordered part of the protein. He also suggested that if you introduce one hydrogen bond, you might be better off introducing a second one too, as the incremental entropic cost is likely to be low.

György Keserű of Gedeon Richter discussed the importance of avoiding lipophilicity by using tools such as LELP, which we’ve covered here and here. Continuing this theme, Kevin Freeman-Cook of Pfizer described two examples of using LLE in lead discovery programs, in particular calculating LLE values before making compounds. Although this may seem obvious, what was quite striking was the dramatic effect subtle changes in structure could make to ClogP values.
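To make the point about ClogP concrete, here is a short Python sketch. It assumes the usual definitions LLE = pIC50 − ClogP and LELP = ClogP / LE, with LE approximated as 1.37 × pIC50 / heavy atoms (kcal/mol per heavy atom, near room temperature). The two "analogs" are invented numbers for illustration, not compounds from any of the talks.

```python
def lle(pic50: float, clogp: float) -> float:
    """Lipophilic ligand efficiency, assumed to be pIC50 minus ClogP."""
    return pic50 - clogp


def ligand_efficiency(pic50: float, heavy_atoms: int, kcal_per_log: float = 1.37) -> float:
    """Approximate LE in kcal/mol per heavy atom (1.37 kcal/mol per log unit)."""
    return kcal_per_log * pic50 / heavy_atoms


def lelp(pic50: float, clogp: float, heavy_atoms: int) -> float:
    """LELP, assumed to be ClogP divided by LE; lower is better."""
    return clogp / ligand_efficiency(pic50, heavy_atoms)


# Two hypothetical analogs: same potency and size, ClogP differing by one unit.
for label, clogp in (("analog_1", 3.5), ("analog_2", 2.5)):
    print(label, round(lle(7.0, clogp), 2), round(lelp(7.0, clogp, 25), 2))
```

With potency held constant, a single log unit of ClogP moves LLE by a full unit, which is why calculating these values before making compounds can change which analogs look worth the effort.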

Of course, these are just a few of thousands of presentations. Please feel free to point out any that caught your eye, or expand on some of those mentioned above. And just a reminder, it’s only 4 weeks to FBLD 2012 in San Francisco – the biggest fragment event of the year!

18 July 2012

Finding cryptic pockets computationally

The mobility of proteins is a constant source of wonder. I enjoy looking at experimentally-determined structures of small molecules bound to proteins, and it’s even more fun when the protein undergoes dramatic conformational changes to accommodate the ligand. But what may be fun for a chemist is a considerable challenge for molecular modeling: it’s hard enough to dock small molecules to a rigid model of a protein, and all the more so when an apparently flat protein surface yawns open to reveal a new pocket. In a recent paper in J. Comp. Chem., Olgun Guvench and collaborators at the University of New England College of Pharmacy and the University of Maryland look specifically for such “cryptic” binding sites.

The researchers used the cytokine IL-2, which is known to have cryptic pockets. In fact, small molecule inhibitors have been found that target the IL-2 receptor binding site in part by binding to cryptic pockets in the cytokine. The apo form of IL-2 (ie, without any small molecule bound) was used as a starting structure in a computational technique called Site Identification by Ligand Competitive Saturation (SILCS). In this approach, multiple molecular dynamics simulations are carried out with the protein “soaked” in a virtual solution of water and ligand (in this case very simple molecules such as benzene, propane, or acetonitrile). The idea is to let pockets form and see if the ligands bind in the pockets.

Molecular dynamics simulations, in which individual atoms within a protein are allowed to move, run the risk that the protein will deviate too far from a stable structure and denature completely. This can be avoided by introducing various restraints to keep atoms from moving too much, but if the restraints are too severe the protein is too rigid and you won’t see pockets form.

Also, as many people are painfully aware, small molecules can form aggregates in aqueous solution, and the same thing can happen in virtual water. In SILCS, the virtual fragments are programmed to repel each other, keeping the fragments more or less distributed in solution.

The researchers found that they could in fact identify the cryptic pockets in IL-2, either by using relatively loose restraints or by running multiple unrestrained simulations and simply discarding those in which the protein denatured dramatically. Only the hydrophobic fragments found the cryptic binding sites, perhaps reflecting the relatively hydrophobic nature of the pockets. Additional pockets were also found, though whether these are real or not is unclear.

It’s still a long way from simulations with propane to running molecular dynamics screening simulations on hundreds or thousands of unique fragments, but given the increasing speed of processing power, perhaps the gap will be bridged sooner than expected.

03 May 2009

More on DOCKing fragments and sampling chemical space

A few weeks ago, we highlighted a paper from Brian Shoichet’s group at UCSF demonstrating that computational screening could successfully identify fragments binding to a protein target, and that the binding modes predicted were actually observed experimentally. A companion paper just published online in PNAS now extends these results, and also beautifully illustrates that it is possible to cover much more chemical space with fragments than with lead-like molecules.

Denise Teotico, Shoichet, and colleagues used the program DOCK 3.5.54 to screen 137,639 fragments against AmpC beta-lactamase, a bacterial protein responsible for antibiotic resistance. The protein had previously been the target of HTS and computational screens of drug-like molecules. The computational screens had modest success rates (2-7%), but the HTS screen was a total bust: of the more than 1200 hits from the 70,000+ compound screening collection, more than 95% turned out to be false positives, mostly aggregators, with just a few dozen true inhibitors, all of which turned out to be covalent (irreversible).

In contrast, of the 48 high-scoring fragments that were experimentally tested, 23 had Ki values better than 10 mM, for a hit rate of 48%. The authors also assessed potential for false negatives by choosing 20 random fragments and testing these for inhibition; only one showed inhibition (with a Ki value of 3.1 mM), and this molecule had scored in the top 5% of docked fragments.
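The arithmetic behind these numbers is simple, but it is worth seeing the docked and random sets side by side. The sketch below uses only the counts quoted above; the "enrichment" ratio is my own back-of-the-envelope addition, not a figure from the paper, and with only 20 random fragments it is a very noisy estimate.

```python
def hit_rate(hits: int, tested: int) -> float:
    """Fraction of tested compounds that were active."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return hits / tested


docked = hit_rate(23, 48)      # high-scoring docked fragments
random_set = hit_rate(1, 20)   # randomly chosen fragments

print(f"docked:  {docked:.0%}")
print(f"random:  {random_set:.0%}")
print(f"enrichment: {docked / random_set:.1f}x")
```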

The paper presents a fascinating empirical test of the Hannian chemical complexity hypothesis. Starting with the 23 active fragments, the researchers calculated how many lead-like molecules (up to 25 non-hydrogen atoms) could contain these fragments. Of the roughly 47,000,000,000 to 430,000,000,000 possible lead-like molecules, only 675 are commercially available. By repeating this analysis with fragment-sized molecules (up to 17 non-hydrogen atoms), the size of the haystack was reduced by six orders of magnitude: only about 10,000 possible molecules contain these fragments, of which 93 are commercially available. Moreover, many of the active fragments represent unique chemotypes not previously observed in AmpC inhibitors. As the authors note:

The chances of discovering interesting chemotypes for biological targets is many orders of magnitude higher when targeting molecules in the fragment weight range than even at slightly higher size ranges.

But, as the paper asks, “are the docking predictions right for the right reasons?” The researchers solved the crystal structures of 8 fragments bound to AmpC. Four of these reproduced the docking predictions well, two were somewhat different, and two were way off. In these last two cases, the protein itself adopted different conformations than had been used in the docking studies.

Protein conformational flexibility is remarkably common, and likely to be a persistent difficulty for computational methods. Clearly, current computational methods can’t identify all possibilities, particularly with fluxional proteins. Still, especially with relatively rigid proteins, computational fragment-screening may reveal chemotypes that HTS won’t.

A notable feature of the fragments is their relatively poor ligand efficiency: with one unusual exception (a phosphinate), all of the active fragments have ligand efficiencies less than 0.3 (kcal/mol)/atom. AmpC has a large, open active site, and the authors suggest that the failure of other hit-ID methods against this target may reflect issues such as solubility.
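It helps to see why millimolar fragments so easily fall below 0.3 (kcal/mol)/atom. The sketch below assumes the standard definition LE = −RT ln(Ki) / heavy atoms at 298.15 K and asks how small a fragment would have to be to reach 0.3 at a given Ki. The Ki values are just the thresholds mentioned in this post, and the function names are mine.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def binding_energy(ki_molar: float, temp_k: float = 298.15) -> float:
    """Magnitude of the binding free energy, -RT ln(Ki), in kcal/mol."""
    return -R_KCAL * temp_k * math.log(ki_molar)


def ligand_efficiency(ki_molar: float, heavy_atoms: int, temp_k: float = 298.15) -> float:
    """Ligand efficiency in kcal/mol per heavy atom."""
    return binding_energy(ki_molar, temp_k) / heavy_atoms


def max_heavy_atoms(ki_molar: float, le_target: float = 0.3, temp_k: float = 298.15) -> float:
    """Largest heavy atom count that still reaches le_target at this Ki."""
    return binding_energy(ki_molar, temp_k) / le_target


for ki in (10e-3, 3.1e-3, 1e-3):
    print(f"Ki = {ki * 1e3:.1f} mM: LE >= 0.3 needs at most "
          f"{max_heavy_atoms(ki):.1f} heavy atoms")
```

Under these assumptions a 10 mM fragment would need about nine heavy atoms or fewer to reach 0.3, and even a 1 mM fragment tops out below 14, so a library that runs up to 17 heavy atoms will mostly land under that line unless the hits are considerably tighter.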

It remains to be seen whether these fragments can be advanced to low nanomolar inhibitors, but at least fragment-screening has provided many new starting points. And the paper demonstrates, once again, that triaging a fragment set computationally can be an effective means for concentrating the needles in a haystack.

22 March 2009

Fragments in silico, confirmed by X-ray

I’ve always been something of an empiricist, and have therefore been wary of computational fragment screening. It’s not that I think it’s impossible, just that the algorithms and parameters developed to date have not often shown themselves up to the task. A paper just published in Nature Chemical Biology from Brian Shoichet’s group at UCSF has caused me to reconsider my skepticism.

Shoichet and Yu Chen used the program DOCK to screen 67,489 commercially available fragment-sized molecules contained in the database ZINC against the active site of the beta lactamase CTX-M, a bacterial enzyme responsible for resistance to penicillin and cephalosporin. Of 69 top hits, 10 actually inhibited the enzyme when tested experimentally. In contrast, of 37 high-scoring hits from a similar computational screen of 1,147,326 larger lead-like molecules, none showed any inhibition up to the limit of their solubilities.

Interestingly, each of the ten active fragments contained an anionic group: 3 carboxylates, 2 sulfates, and 5 tetrazoles among the set. A reexamination of the docked lead-like molecules revealed a relatively high-scoring tetrazole, which exhibited an experimental Ki value of 21 micromolar (see figure). Although this was an in silico hit, it was swamped by the number of (inactive) hits and so had not been selected for experimental follow-up until the fragment results revealed tetrazoles to be privileged pharmacophores. Additional similarity searching of the lead-like molecules led to two additional low micromolar inhibitors.

Five of the inhibitory fragments and one of the lead-like molecules were characterized crystallographically, and the results were remarkable: all of them bound in a similar manner to that predicted by docking.

Chen and Shoichet also investigated the specificity of the fragments compared to the lead-like compounds, and the results agreed well with those predicted by Hann and colleagues (as discussed on our sister blog FBDD-Lit here). Namely, while the fragments had relatively low specificity against a mechanistically distinct beta lactamase (AmpC), the lead-like molecule exhibited roughly 100-fold tighter inhibition of CTX-M. In other words, fragments likely have a higher hit rate (and correspondingly lower specificity) due in part to their simplicity, but as fragments are elaborated, specificity can be readily built into the molecules.

So does this mean the era of computational fragment-based screening has arrived? While these results are impressive, it is important to keep them in perspective. CTX-M has a relatively rigid active site, while many proteins of interest show a level of flexibility that confounds modeling. Moreover, Chen and Shoichet were working with an ultra-high resolution (0.88-Angstrom) crystal structure of CTX-M in which they could actually see density for hydrogen atoms on some polar groups. Needless to say, this is atypical. Still, the paper does give hope that the computational tools are ready, as long as they are applied to appropriate systems.