Practical Fragments: computational chemistry


08 September 2025

Fragment growing in three dimensions made easy

Nearly a decade ago we highlighted a paper from Astex that exhorted chemists to develop new synthetic methodologies useful for fragment-based drug discovery. Peter O’Brien has taken on the challenge, and he and his collaborators at University of York and AstraZeneca report their progress in a recent (open-access) J. Am. Chem. Soc. paper.

The O’Brien group has previously published synthetic routes to shapely fragments, which we wrote about here. These could be useful for expanding fragment collections, but that happens infrequently. The new paper focuses on the far more common challenge of what to do when you have a fragment hit.

The idea was to create a “modular synthetic platform for the elaboration of fragments in three dimensions.” The researchers designed a set of bifunctional building blocks that could be coupled to existing fragments. The two functionalities were N-methyliminodiacetic acid boronate (BMIDA) and a Boc-protected amine. The amine is a versatile handle for multiple types of chemistry, while the BMIDA moiety is particularly useful for Suzuki-Miyaura cross-coupling. (Indeed, two separate groups of researchers had previously built libraries suited for cross-coupling using halogen-containing fragments, as we discussed here.)

For the new building blocks, the researchers considered azetidines, pyrrolidines, and piperidines with fused or spiro-cyclopropyl groups. These are rigid “three-dimensional” units, and the relative locations of the BMIDA group and the amine could provide very different distances and vectors. After modeling 27 possibilities, the researchers chose nine building blocks based on diversity and predicted ease of synthesis. These were synthesized on gram scale, and all nine are now commercially available.
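To make "distances and vectors" concrete, here is a minimal sketch of how one might compare two exit vectors on a scaffold. The function name and the coordinates are hypothetical and chosen only for illustration; nothing here is taken from the paper's modeling workflow.

```python
import math

def exit_vector_geometry(anchor_a, tip_a, anchor_b, tip_b):
    """Return (distance between anchor atoms in Å, angle between vectors in degrees).

    Each exit vector runs from an anchor atom on the scaffold to the first
    atom of the attached group (for example, the boron or the nitrogen).
    """
    va = [t - a for a, t in zip(anchor_a, tip_a)]
    vb = [t - a for a, t in zip(anchor_b, tip_b)]
    distance = math.dist(anchor_a, anchor_b)
    dot = sum(x * y for x, y in zip(va, vb))
    norm = math.hypot(*va) * math.hypot(*vb)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return distance, math.degrees(math.acos(cos_angle))

# Hypothetical coordinates in Å, not from any crystal structure.
d, angle = exit_vector_geometry((0, 0, 0), (1, 0, 0), (0, 3, 0), (0, 3, 1))
print(f"{d:.1f} Å apart, {angle:.0f}° between vectors")
```

Two scaffolds with the same anchor-to-anchor distance can still point their substituents in very different directions, which is why both numbers are worth tracking.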

To demonstrate that the building blocks would be generally synthetically useful, the researchers coupled them to a variety of (hetero)aryl bromides, with yields ranging from 10-90%, and most >60%. The Boc group was then deprotected and the crude amine was used in a variety of successful reactions.

The building blocks were each also coupled to 5-bromopyrimidine, the Boc group was deprotected, and the free amines were capped as methanesulfonamides. Small molecule crystallography of the resulting compounds confirmed modeling results that the two vectors had a wide range of orientations and were separated by 1.5-4.4 Å. Moreover, most compounds were rule-of-three compliant, had good measured aqueous solubility, and were even stable in human liver microsomes and rat hepatocytes.
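For readers who want to run the same kind of check on their own compounds, a rule-of-three test can be sketched in a few lines. The cutoffs below follow the commonly quoted form of the rule (molecular weight, cLogP, donors, and acceptors each at or below 300 or 3); exact conventions vary, and the class name and example values are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class FragmentProps:
    mol_weight: float        # Da
    clogp: float
    h_bond_donors: int
    h_bond_acceptors: int

def is_rule_of_three_compliant(p: FragmentProps) -> bool:
    """Apply the commonly quoted rule-of-three cutoffs (assumed, not from the paper)."""
    return (
        p.mol_weight <= 300
        and p.clogp <= 3
        and p.h_bond_donors <= 3
        and p.h_bond_acceptors <= 3
    )

# Hypothetical property values for illustration only.
small = FragmentProps(mol_weight=255.3, clogp=0.8, h_bond_donors=1, h_bond_acceptors=3)
large = FragmentProps(mol_weight=350.0, clogp=2.0, h_bond_donors=1, h_bond_acceptors=3)
print(is_rule_of_three_compliant(small), is_rule_of_three_compliant(large))
```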

As a use-case, the researchers considered the approved drug ritlecitinib, an irreversible JAK3 inhibitor. They imagined that its pyrrolopyrimidine moiety was a fragment hit, and then virtually combined it with their nine scaffolds, each functionalized with an acrylamide. These were then virtually docked, and the best two were synthesized and tested. Compound 96 was quite potent, albeit less so than ritlecitinib.

Whether three-dimensionality is desirable as a design feature remains unproven, as we noted recently. However, whether or not the high Fsp³ of the nine new scaffolds is itself a selling point, they do provide new vectors for fragment growing, and their synthetic enablement justifies including them at least in virtual campaigns.

02 June 2025

Small and simple, but novel and potent

Back in 2012 we wrote about GDB-17, a database of possible small molecules having up to 17 carbon, oxygen, nitrogen, sulfur, and halogen atoms, most of which have never been synthesized. Although novelty isn’t strictly necessary for fragments, as evidenced by the fact that 7-azaindole has given rise to three approved drugs, it’s certainly nice to have. In a new (open-access) J. Med. Chem. paper, Jürg Gertsch, Jean-Louis Reymond, and colleagues at the University of Bern synthesize fragments that had not been previously made and show that they are biologically active.

When you start drawing all possible small molecules you get lots of weird stuff, including an explosion of compounds containing multiple three- and four-membered rings, which may be difficult to make. The researchers wisely focused on “mono- and bicyclic ring systems containing only five-, six-, or seven-membered rings.” They further limited their search to molecules containing just carbon and one or two nitrogen atoms (as well as hydrogen, of course). Systematic enumeration led to 1139 scaffolds, ignoring stereochemistry, of which 680 had not been previously reported in PubChem. Out of these, three related scaffolds were chosen for investigation.

Computational retrosynthesis was used to devise routes to the three bicyclic scaffolds, and these were successfully synthesized, along with mono-benzylated versions, for a total of 14 molecules (including stereoisomers), all rule-of-three compliant. The online Polypharmacology Browser 2 (PPB2) was used to predict targets, and several monoamine transporters came up as potential hits. The molecules were tested against norepinephrine transporter (NET), dopamine transporter (DAT), serotonin transporter (SERT), and the σ-R1 receptor in radioligand displacement assays. None of the free diamines were active, but several of the benzylated compounds were, in particular compound 1a.

Compound 1a was initially made as a racemic mixture, and when the two enantiomers were resolved (R,R)-1a was found to be a mid-nanomolar inhibitor of NET while (S,S)-1a was 26-fold weaker. Compound (R,R)-1a was also a mid- to high nanomolar inhibitor of σ-R1, DAT, and SERT. Pharmacokinetic experiments in mice revealed that the molecule had poor oral bioavailability but remarkably high brain penetration and caused sedation. The researchers conducted additional mechanistic studies beyond the scope of this blog post and conclude that (R,R)-1a could be a lead for “neuropsychiatric disorders associated with monoamine dysregulation.”

There are several nice lessons in this paper. First, as we noted more than a decade ago, there is plenty of novelty at the bottom of chemical space. Moreover, and in contrast to our post last week, even small fragments can have high affinities. But novelty comes at a cost: synthesis of compound 2a required eight steps from an inexpensive starting material with an overall yield of just 9%, though this could certainly be optimized. Nonetheless, particularly for CNS-targeting drugs, which usually need to be small in order to cross the blood-brain barrier, the price might be worth paying.

Of course, even within this paper there are hundreds more scaffolds to look at than the three tested, and perhaps the researchers were lucky that their choices were biologically active. As computational methods continue to advance, it will be worthwhile turning them loose on GDB-17.

03 February 2025

Stitching together fragments with Fragmenstein

As we noted just last week, crystallography has unleashed a torrent of protein-ligand complexes, especially fragments. Historically a single structure might be used for fragment growing, but so many structures present an embarrassment of riches, with sometimes dozens of fragments that bind in the same region. Merging or linking these fragments can be done manually, as seen here and here, but how to do so when the binding modes are partially overlapping is not always intuitive. In a new open-access J. Cheminform. paper, Matteo Ferla and colleagues at University of Oxford and elsewhere describe an open-source solution called Fragmenstein.

We briefly described Fragmenstein in 2023, where it was used to combine pairs of low-affinity fragments bound to the Nsp3 macrodomain of SARS-CoV-2 to generate sub-micromolar inhibitors. The current paper describes the platform in detail.

Fragmenstein starts by taking two (or more) structures of fragments bound to a protein and virtually combining them. This is done by collapsing rings to their individual centroids, stitching these together along with their substituents, and then re-expanding the ring(s) so the substituents will be close to where they were in the initial fragments. This process produces a surprising array of molecules beyond the obvious. For example, if one fragment contains a phenyl ring and the other fragment contains a furan ring, the stitched molecule might just contain the phenyl (if the two rings bind in nearly the same position), a benzofuran (merging the rings), a phenyl ring linked to a furan by one or more atoms, or even a spiro compound if the rings are perpendicular to one another.
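The ring-collapsing step can be illustrated with a toy sketch. This is not Fragmenstein's code or API; the function names, atom labels, and coordinates are hypothetical, and fused rings that share atoms are ignored for simplicity.

```python
def centroid(coords):
    """Average position of a list of (x, y, z) coordinates."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))

def collapse_rings(atoms, rings):
    """Replace each ring with a single pseudo-atom at its centroid.

    atoms: dict mapping atom id -> (x, y, z)
    rings: list of lists of atom ids, one list per ring
    Non-ring atoms (substituents) keep their original positions.
    """
    ring_atom_ids = {a for ring in rings for a in ring}
    collapsed = {aid: xyz for aid, xyz in atoms.items() if aid not in ring_atom_ids}
    for i, ring in enumerate(rings):
        collapsed[f"ring{i}"] = centroid([atoms[a] for a in ring])
    return collapsed

# Hypothetical four-membered ring with one substituent, coordinates in Å.
atoms = {
    "C1": (0.0, 0.0, 0.0), "C2": (2.0, 0.0, 0.0),
    "C3": (2.0, 2.0, 0.0), "C4": (0.0, 2.0, 0.0),
    "O1": (3.0, 1.0, 0.0),
}
collapsed = collapse_rings(atoms, [["C1", "C2", "C3", "C4"]])
print(collapsed)
```

Working with centroids rather than individual ring atoms is what lets two partially overlapping rings be treated as "the same ring," "fused," or "linked," depending on how far apart the pseudo-atoms sit.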

In silico approaches sometimes suggest molecules that are synthetically challenging to make, but Fragmenstein can also be used to find purchasable analogs.

Next, the new molecules are energetically minimized, first by themselves and then while docked into the protein. In contrast to other docking programs, which might allow molecules to sample thousands of different conformations and sites in a protein, Fragmenstein maintains the new molecule in a similar position and orientation to the initial fragments, with the assumption that these have already identified energetically favorable interactions.

The researchers successfully applied Fragmenstein retrospectively to several targets. The COVID Moonshot (which we discussed here) crowd-sourced molecule ideas for the SARS-CoV-2 main protease based on structures of bound fragments. Of 87 ligands that had been crystallographically characterized and were designed based on two fragments, Fragmenstein successfully (RMSD < 2 Å) predicted the binding mode for 69%.
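The success criterion here is a root-mean-square deviation between predicted and crystallographic atom positions. A minimal sketch follows, assuming both poses are already in the same protein frame with atoms listed in the same order, so no superposition is done; the function name and coordinates are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """In-place RMSD (Å) between two equally ordered lists of (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))

# Hypothetical two-atom poses, each atom displaced by 1 Å along z.
predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
crystal = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
value = rmsd(predicted, crystal)
print(f"RMSD = {value:.2f} Å, success = {value < 2.0}")
```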

Fragmenstein can even be used for covalent ligands, as shown for the target NUDT7, which we wrote about here. Merging two fragments led to compound NUDT7-COV-1, and the RMSD between the Fragmenstein model and the crystal structure was an impressive 0.28 Å.

Of course, as the researchers acknowledge, the number of possible analogs might be daunting, and deciding which to make or buy is not necessarily straightforward. Also, Fragmenstein assumes that the fragments themselves are making productive interactions with the protein, which may not be the case, as we suggested here. Still, the tool is open-source and worth trying, especially if you are swimming in crystal structures.

07 October 2024

Discovery on Target 2024

Last week Boston hosted CHI's 22nd Annual Discovery on Target. With dozens of talks spread across seven or eight concurrent tracks over three days, and an additional day of pre-conference symposia, I’ll just touch on a few themes.

Computational Approaches

Artificial intelligence and machine learning were well represented. Brandon White described an ML model built at Axiom to predict liver toxicity, responsible for a quarter of clinical trial failures. As we noted last week, good ML models require lots of data, and Axiom has tested 50,000 small molecules in primary human hepatocytes from multiple donors using assays including high-content imaging. Just input a chemical structure and the model will predict toxicity. When run against the FDA’s database of drug-induced liver injury, the model performed with 74% sensitivity and 97% specificity, and even gave good dose predictions.
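As a reminder of what those two figures mean, here is a small sketch. The counts are hypothetical and chosen only so that they reproduce the reported percentages; the actual size of the test set was not given.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of truly toxic compounds that the model flags."""
    return true_positives / (true_positives + false_negatives)

def specificity(true_negatives, false_positives):
    """Fraction of truly non-toxic compounds that the model clears."""
    return true_negatives / (true_negatives + false_positives)

# Hypothetical counts: 100 toxic and 100 non-toxic compounds.
sens = sensitivity(true_positives=74, false_negatives=26)
spec = specificity(true_negatives=97, false_positives=3)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

High specificity means few safe compounds would be wrongly discarded, while the lower sensitivity means roughly a quarter of hepatotoxic compounds would still slip through.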

Woody Sherman (Psivant) laid out a series of “grand challenges for computers in drug discovery.” This is the working title for a publication he is spearheading to focus attention on key problems. They fall into five categories: chemistry (including synthesis, stability, and covalency), structure predictions (including protein-ligand structures, dynamics, and cryptic pockets), energetics (including affinity, selectivity, and kinetics), ADME (including everything from solubility and aggregation to bioavailability), and pharmacology (including toxicity). A sixth category, human considerations (including intellectual property and interpreting experimental data), is also being considered.

The success of AlphaFold in predicting protein structures shows what computers can achieve, but in that case the effort was enabled by massive amounts of high-quality public data in the Protein Data Bank. Few of these challenges can draw on anything approaching the PDB. Indeed, even parameters as seemingly simple as solubility can change dramatically depending on crystal form and subtle changes to pH.

Because these computational challenges are so daunting, collecting them into one forum may prove salutary. And other categories may be worth including, such as target discovery. Woody is looking for co-authors, so reach out to him if you’re interested.

Covalent approaches

Covalent approaches to drug discovery have gone mainstream, at least if this conference is any indication. But they are not without risk: Doug Johnson (Biogen) described research linking the piperidine acrylamide pharmacophore in approved BTK inhibitors to inhibition of ALDH1A1 and possible liver injury.

Several talks focused on methodologies. Alexander Federation (Talus) described data-independent acquisition (DIA) mass spectrometry methods, which can be more comprehensive than the more commonly used data-dependent acquisition (DDA) methods in identifying peptides in chemoproteomic studies, which we first discussed here. Talus is focused specifically on transcription factors.

As we noted earlier this year, Steve Gygi (Harvard) has been at the forefront of increasing the throughput of mass spectrometry methods, and he described how to increase the number of samples that can be analyzed simultaneously from 18 to 35. He also described two approaches, GoDig and CysDig, to look for up to 200 pre-specified proteins in a sample, ensuring identification of even low-abundance targets.

Turning to specific targets, Wai Cheung Adrian Chan described work done at Harvard to find covalent inhibitors against deubiquitinating enzymes (DUBs), reporting that screens of a small library of 178 covalent fragments in cell lysates found hits against several dozen DUBs. (We previously wrote about non-covalent USP7 inhibitors.)

Brooke Brauer described the optimization at AstraZeneca of a covalent inhibitor of Bfl-1, an interesting oncology target. AZ has published some nice papers on this project which I’ll write about soon.

Last week we mentioned work Michelle Arkin and collaborators had done on 14-3-3 proteins, and Lynn McGregor described work done at Novartis on the same system. A screen of 6000 covalent compounds identified hits that modified a specific cysteine in 14-3-3 more rapidly in the presence of a peptide derived from the estrogen receptor. Stabilizing this interaction could be useful for treating certain cancers.

Not everyone is focused on cysteine: Andrea Zuhl described work done at Hyku Biosciences, which as the name suggests is targeting histidine, tyrosine, and lysine. This has necessitated building a fragment library of more than 6000 compounds, more than 70% of which are stable in buffer. Andrea presented one example targeting the catalytic lysine residue of the oncogenic ALK fusion protein, though the selectivity against other kinases was not disclosed.

All of these examples focused on covalent molecules in which the warhead is maintained during optimization. But as we first wrote about here, fully functionalized fragments (FFFs) contain a photoreactive moiety that reacts covalently with nearby proteins but is subsequently discarded. Sherry Niessen described how Belharra has industrialized this process by creating a library of about 11,000 FFF probes. Because of the low efficiency of protein crosslinking (typically <5%), most of the library consists of enantiomeric pairs to facilitate hit identification. Also, the average molecular weight of the library is around 350 Da, and these super-sized fragments tend to perform better than the strictly rule-of-three compliant molecules.

Covalent success stories

At least two presentations covered covalent fragment-based drug candidates. Shota Kikuchi (Vividion) described the discovery of VVD-214/RO7589831, a WRN inhibitor we wrote about earlier this year. As I speculated at the time, the cyclopropyl group was introduced to lower the reactivity of the vinyl sulfone warhead. Interestingly though, even early molecules were quite selective for WRN. Like sotorasib, binding is largely driven by the k_inact term of k_inact/K_i, again demonstrating that high reactivity for the target does not necessarily mean high chemical reactivity.
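For those who don't work with covalent kinetics every day, the standard two-step model behind k_inact/K_i is worth writing out. This is the textbook relationship, not something specific to the talk, and K_I below corresponds to the K_i in the paragraph above.

```latex
% Two-step covalent inhibition: reversible binding, then bond formation
%   E + I  <=>  E.I  ->  E-I
\[
k_{\mathrm{obs}} = \frac{k_{\mathrm{inact}}\,[I]}{K_I + [I]},
\qquad
k_{\mathrm{obs}} \approx \frac{k_{\mathrm{inact}}}{K_I}\,[I]
\quad \text{when } [I] \ll K_I .
\]
```

Two compounds can share the same k_inact/K_I and get there differently: one through tight reversible binding (low K_I), the other through fast bond formation once bound (high k_inact).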

Finally, in his plenary keynote Steve Fesik (Vanderbilt University) covered multiple success stories, including the discovery of the KRAS^G12C inhibitor BI 1823911, which we wrote about here. Boehringer Ingelheim has since published molecules that hit multiple KRAS mutants as well as KRAS degraders, and Steve noted that all of these contain the same “squirrely-looking” fragment identified from SAR by NMR, an illustration of the power of fragment-based methods to explore new regions of chemical space.

I’ll close there, but please add your thoughts. There is still at least one good conference coming up this year, and 2025 is quickly approaching.

30 September 2024

FBLD 2024

The FBLD meetings have always been calendar highlights. Starting in 2008, before Practical Fragments even existed, they have graced cities around the world in 2009, 2010, 2012, 2014, 2016, and 2018. The plan was for 2020 to be held in Cambridge, UK, but for obvious reasons that didn’t happen. Last week, Boston hosted a triumphant return of the event. With more than 30 talks and dozens of posters I’ll just touch on a few major themes.

Crystallography

High-throughput crystallography was prevalent, as befits its growing role in fragment finding. (If you haven’t yet voted in our methods poll on the right side of the page please do so!) Debanu Das (XPose Therapeutics) described how crystallographic screens of just a few hundred fragments identified hits against DNA-damage response proteins such as APE1; these have been advanced to high-nanomolar inhibitors with cell activity. And Andreas Pica described the ALPX platform that enabled screening >4000 hits from an HTS screen against PDEδ resulting in >500 structures.

The Diamond Light Source was a pioneer in developing high-throughput crystallography methods, and several speakers described continued progress. Blake Balcomb noted that since 2015 they have collected >240,000 datasets and identified >30,000 ligands. Of these, some 3750 have been deposited into the Protein Data Bank.

A crystallographic fragment hit is just the start, and Frank von Delft emphasized that “fragment progression is neither fast nor cheap.” His goal is to take a 100 µM binder to a 10 nM lead in less than a week for less than £1000. Toward this end he and his team are using rapid chemical synthesis and crude reaction screening along with various computational approaches and crowd-sourced science. The COVID Moonshot, which we wrote about here, is one model, and Diamond is trying to create a “Moonshot factory” to pursue other viral targets.

Computational Approaches

Computational methods are potentially the least expensive fragment-to-lead method, and these were well represented. One challenge is screening the massive chemical space represented by make-on-demand libraries, and Pat Walters (Relay) described how this can be done using Thompson Sampling, an active-learning method that traces its origins to 1933. Applied to lead discovery, the method involves breaking larger molecules into component fragments and iteratively searching for better binders. Pat showed that searching just 0.1% of a library of 335 million molecules consistently found 90% of the best hits.
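To give a flavor of how Thompson sampling can search a combinatorial library without scoring every product, here is a toy sketch for a two-component library. It is not the implementation Pat described: the Gaussian belief per reagent, the function names, and the additive toy scoring function are all assumptions made for illustration, and reagent names are assumed to be unique across both lists.

```python
import random

def thompson_search(reagents_a, reagents_b, score, n_iterations, seed=0):
    """Return {(a, b): score} for the products actually evaluated."""
    rng = random.Random(seed)
    # Per-reagent belief: [sum of observed product scores, number of observations].
    stats = {r: [0.0, 0] for r in reagents_a + reagents_b}

    def sample(reagent):
        total, n = stats[reagent]
        mean = total / n if n else 0.0
        # Uncertainty shrinks as a reagent accumulates observations.
        return rng.gauss(mean, 1.0 / (n + 1) ** 0.5)

    evaluated = {}
    for _ in range(n_iterations):
        # Draw one sample per reagent and combine the best draw from each list.
        a = max(reagents_a, key=sample)
        b = max(reagents_b, key=sample)
        if (a, b) in evaluated:
            continue  # already scored; beliefs only change on new measurements
        s = evaluated[(a, b)] = score(a, b)
        for r in (a, b):
            stats[r][0] += s
            stats[r][1] += 1
    return evaluated

# Toy library: 50 x 50 = 2500 possible products with a hypothetical additive score.
reagents_a = [f"A{i}" for i in range(50)]
reagents_b = [f"B{i}" for i in range(50)]
quality = {name: i / 49 for i, name in enumerate(reagents_a)}
quality.update({name: i / 49 for i, name in enumerate(reagents_b)})

def toy_score(a, b):
    return quality[a] + quality[b]

results = thompson_search(reagents_a, reagents_b, toy_score, n_iterations=200)
best_pair = max(results, key=results.get)
print(len(results), "products scored; best found:", best_pair)
```

The point of the design is that each scored product updates beliefs about its reagents, so promising reagents get sampled more often while uncertain ones still get an occasional look.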

Most computational methods rely on experimental data, and over the past 25 years Astex has generated >100 crystal structures on each of more than 40 targets, with >6600 bound fragments in total. Paul Mortenson described how these are being used to develop generative models, with chemists providing feedback on suggested molecules.

Artificial intelligence is the centerpiece of Isomorphic Labs, which has unfettered access to AlphaFold 3. Rebecca Paul described an example starting from a literature fragment in which the predicted affinities matched well with experiment – and the molecules were considerably more potent than those suggested by an experienced medicinal chemist.

Recognizing the need for experimental affinity data for fragments, Isomorphic worked with Arctoris to screen 5420 fragments against 65 kinases covering the diversity of the kinome. After carefully curating the data, including rescreening the actives at a different CRO, they found 485 fragments with an IC₅₀ of 300 µM or better. Interestingly, only about half of these fragments are known kinase binders.

Sandor Vajda (Boston University) suggested there may be limitations to machine learning models. He found that using AlphaFold 2 to find cryptic pockets was dependent on their representation in the PDB, with rare experimental states not being predicted. Sandor also proposed an interesting hypothesis that cryptic pockets created only by the movement of side chains are not very ligandable because the side chains move on such a rapid time scale that they effectively act as competitive inhibitors to ligands.

Success Stories

No FBLD meeting would be complete without success stories, and FBLD 2024 was no exception. Chaohong Sun noted that nearly 80% of the targets at AbbVie taken into fragment-based screening are novel. Of these, more than 80% yield actionable hits, though 44% are not pursued for a variety of reasons, including finding hits from other sources, hits at novel sites with no obvious function, and changes to the portfolio. Chaohong described a series of STING agonists that was taken forward to low nanomolar leads with in vivo activity.

Michelle Arkin (UCSF) described progress on creating molecular glues to link 14-3-3 proteins to the estrogen receptor, which we last wrote about here. Covalent binders to the 14-3-3 protein stabilize the interaction with ERα by more than 100-fold and show activity in cancer cell models.

Multiple talks focused on SARS-CoV-2 targets. Ashley Taylor (Vanderbilt) described fragment screens against the papain-like protease PLpro that led to both covalent and non-covalent inhibitors. James Fraser (UCSF) described how a massive crystallographic screen against the Nsp3 macrodomain Mac1 led to high nanomolar compounds, which we wrote about here. And Adam Renslo (UCSF) discussed the further optimization of Mac1 inhibitors to yield molecules that could protect mice from a fatal challenge of the virus.

A drawback of pursuing novel targets is that sometimes the biology proves uncooperative. Andrew Woodhead described a successful fragment screen at Astex against the oncology target eIF4E that led to mid-nanomolar binders that could disrupt the protein-protein interaction with eIF4G in cells. Surprisingly, these molecules had no effect on cell viability, and a series of mutational and targeted-protein degradation experiments suggested that blocking a larger region of the protein-protein binding site might be necessary.

Drugs are the ultimate success stories, as David Rees reminded participants in “25 years of thinking small.” In addition to providing an overview of FBLD at Astex, David added up the sales of all seven FDA-approved fragment-derived drugs, which totals more than $3 billion. Harder to quantify—though infinitely more valuable—are the added years of life for patients with once-untreatable cancers. These numbers will only grow as the dozens of fragment-derived molecules in the clinic continue to advance.

I’ll close on that note. If you missed FBLD 2024, you’ll have another chance next year: FBLD 2025 is planned for Cambridge (UK), September 21-24. Barring global pandemics.

17 June 2024

Fragments vs MAT2a: a chemical probe

As many of us know all too well, traditional methods to treat cancer often result in severe and even intolerable side effects. An emerging, gentler approach is based on synthetic lethality: targeting a protein that is essential only in certain cancer cells but not in normal cells. One prominent target is MAT2a, one of two human methionine adenosyltransferases. We’ve written previously about AG-270, a fragment-derived MAT2a inhibitor that entered the clinic. AstraZeneca has also pursued this target, as we discussed here. In a new J. Med. Chem. paper, Stephen Atkinson, Sharan Bagal, and their AstraZeneca colleagues describe a new chemical probe.

A differential scanning fluorimetry (DSF) screen of about 55,000 compounds at 100 µM, nearly a third of which were fragments, resulted in a healthy 1.5% hit rate. Further DSF as well as biochemical testing ultimately delivered compound 8, which is quite potent for a fragment. A crystal structure of the compound bound to MAT2a demonstrated that it bound in the same allosteric site targeted by other compounds. The methoxy group was pointed towards a couple of backbone carbonyl oxygen atoms, and adding a couple of fluorine atoms created a weak hydrogen bond donor with a satisfying 50-fold boost in potency.
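It can help to translate a fold-change in potency into free energy. The sketch below uses the standard relationship ΔΔG = -RT ln(fold) and assumes a temperature of 298.15 K, which is my assumption rather than a detail from the paper; the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol·K)

def fold_change_to_ddg(fold, temperature_k=298.15):
    """Free energy change (kcal/mol) corresponding to a fold-improvement in affinity."""
    return -R_KCAL * temperature_k * math.log(fold)

ddg = fold_change_to_ddg(50)
print(f"50-fold boost ≈ {ddg:.1f} kcal/mol")  # prints: 50-fold boost ≈ -2.3 kcal/mol
```

Roughly 2.3 kcal/mol from two fluorine atoms is a generous return for so small a change.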

Adding a hydrogen bond acceptor (compound 12) slightly reduced potency but also decreased lipophilicity. Further inspection suggested opportunities for fragment growing, and free energy perturbation (FEP) calculations suggested that adding the methoxyphenyl group of compound 15 would be fruitful. This turned out to be the case, and further optimization led to AZ’9567. The paper provides plenty of meaty medicinal chemistry, with significant efforts focused on reducing lipophilicity and clearance. FEP was used extensively during the design process, and a retrospective analysis found a good correlation between predicted and measured affinity.

AZ’9567 was studied in considerable detail. It has excellent oral bioavailability and good pharmacokinetics in both mice and rats. The compound does not significantly inhibit cytochrome P450 enzymes or hERG and is reasonably clean against a panel of 86 off-targets. The main liability is poor solubility, a problem also faced by AG-270. Nonetheless, the AstraZeneca researchers were able to develop a liquid formulation.

The paper compares AZ’9567 with AG-270, showing that both compounds are potent in biochemical assays as well as against cell lines in which MAT2a is essential. A mouse xenograft model with AZ’9567 showed considerable and sustained tumor growth reduction.

Unfortunately, AG-270 is no longer in clinical development, and there is no mention of a MAT2a inhibitor in the AstraZeneca pipeline. Nonetheless, having a second well-characterized chemical probe will be useful for further characterizing the biology of MAT2a and assessing whether it will be a productive drug target.

28 May 2024

Free computational fragment growing with ChemoDOTS

Back in 2018 we highlighted diversity-oriented target-focused synthesis, or DOTS, a combined computational and experimental method for growing fragments. The computational piece of this has now been turned into a free web server, called ChemoDOTS, and is described in Nucleic Acids Research by Xavier Morelli, Philippe Roche, and colleagues at Aix-Marseille University.

To get started, the user draws or uploads the structure of a fragment hit they wish to expand. ChemoDOTS identifies potentially reactive functionalities, such as amine groups. For each functionality, the program also provides compatible reactions, derived from a set of 58 commonly used in industry. The user then chooses one or more reactions of interest, at which point the program generates a list of molecules that could be created by linking the fragment to various building blocks using the selected chemistries. The building blocks themselves consist of 501,542 commercially available molecules from MolPort and 988,112 molecules from Enamine having between 4 and 24 non-hydrogen atoms.
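The workflow above (find reactive groups, pick compatible reactions, pair the fragment with matching building blocks) can be sketched with a toy example. This is not ChemoDOTS code: the reaction table, group labels, and building block names are hypothetical, and plain string labels stand in for the substructure matching and reaction encoding a real tool would need.

```python
# Hypothetical reaction table: name -> (group needed on fragment, group needed on partner).
REACTIONS = {
    "sulfonamidation": ("amine", "sulfonyl chloride"),
    "amide coupling": ("amine", "carboxylic acid"),
    "Suzuki coupling": ("aryl halide", "boronic acid"),
}

def compatible_reactions(fragment_groups):
    """List reactions whose fragment-side functional group is present."""
    return [name for name, (frag_group, _) in REACTIONS.items()
            if frag_group in fragment_groups]

def enumerate_products(fragment, fragment_groups, chosen_reactions, building_blocks):
    """Pair the fragment with every building block carrying the partner group.

    building_blocks: dict mapping building block name -> set of functional groups.
    Returns (fragment, reaction, building block) records rather than structures.
    """
    products = []
    for reaction in chosen_reactions:
        frag_group, partner_group = REACTIONS[reaction]
        if frag_group not in fragment_groups:
            continue
        for name, groups in building_blocks.items():
            if partner_group in groups:
                products.append((fragment, reaction, name))
    return products

# Hypothetical inputs.
blocks = {
    "BB1": {"sulfonyl chloride"},
    "BB2": {"carboxylic acid"},
    "BB3": {"boronic acid"},
}
options = compatible_reactions({"amine"})
library = enumerate_products("frag", {"amine"}, ["sulfonamidation"], blocks)
print(options, library)
```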

The program generates molecules quite rapidly, between 1000 and 1500 per second. All of these can be downloaded at this point, but ChemoDOTS also allows further processing. Histograms showing molecular weight, cLogP, total polar surface area, the number of hydrogen bond donors and acceptors, and Fsp³ for the library are displayed, and the user can adjust sliders to select molecules having, for example, cLogP between 1 and 3 and 0-2 hydrogen bond donors. Finally, ChemoDOTS generates three dimensional conformers in a ready-to-dock format for each compound.
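The slider-based selection amounts to a range filter over precomputed properties. A minimal sketch follows, with hypothetical molecule records and property keys; it uses the cLogP and hydrogen bond donor ranges from the example above.

```python
def filter_library(molecules, ranges):
    """Keep molecules whose properties all fall inside the given inclusive ranges.

    molecules: list of dicts with property values
    ranges: dict mapping property name -> (low, high)
    """
    return [
        m for m in molecules
        if all(low <= m[prop] <= high for prop, (low, high) in ranges.items())
    ]

# Hypothetical library entries.
virtual_library = [
    {"name": "m1", "clogp": 2.1, "hbd": 1},
    {"name": "m2", "clogp": 3.8, "hbd": 1},  # too lipophilic
    {"name": "m3", "clogp": 1.5, "hbd": 3},  # too many donors
]
selected = filter_library(virtual_library, {"clogp": (1, 3), "hbd": (0, 2)})
print([m["name"] for m in selected])  # prints: ['m1']
```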

As a retrospective example, the researchers return to the BRD4 case study we wrote about here. Starting from the amine-containing fragment and the sulfonamidation reaction, ChemoDOTS generated 5546 molecules in just 5 seconds, including all 17 of those previously identified.

This is a nice approach, and I believe the researchers are correct when they say that to the best of their knowledge “ChemoDOTS is the only freely accessible functional and maintained web server to combine the design of medchem-compatible virtual libraries with an integrated graphical postprocessing analysis.” They plan to continue improving it, for example by adding new commercial building blocks from other sources.

If I could make one suggestion, it would be to include new types of chemistries beyond the 58, which came from a paper published in 2011. In particular, C-H bond activation methodologies have made impressive strides in recent years. Adding these is all the more important given that, according to a recent analysis, about 80% of successful fragment-growing campaigns involved growth from a carbon atom. But even in its current form, ChemoDOTS looks to be a useful approach for growing focused chemical libraries around fragment hits. Let us know how it works for you!

07 August 2023

Democratizing computational FBLD with BMaps

Computational approaches to FBLD continue to gain in power. For the most part, they require significant knowledge and installation of expensive, customized software. To remedy this, John Kulp, III and colleagues at Conifer Point Pharmaceuticals have introduced a new web-based application, BMaps, which they describe in a recent J. Chem. Inf. Model. paper.

As the researchers note (and appropriately reference), there are more than a dozen virtual fragment-based design tools and another dozen web-based tools. BMaps (for Boltzmann Maps) aims to provide a full range of functions, from visualizing proteins and finding hot spots to docking fragments and growing them. It also provides information on the energetics of bound water molecules, which as we’ve written can be crucial players in optimizing protein-ligand interactions.

Two key techniques used by BMaps are Grand Canonical Monte Carlo (GCMC) simulations and Simulated Annealing of Chemical Potential (SACP). The first entails comprehensive sampling of different fragment conformations on a protein of interest and assessing binding free energy. The second tool “forcefully inserts fragments into all the binding sites of the protein” and then removes them slowly to evaluate which are most difficult to remove, and thus most tightly bound. Together, GCMC-SACP can be used to evaluate fragment binding to any protein uploaded to the site from the Protein Data Bank, AlphaFold, or any other source.
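For readers curious about why lowering the chemical potential strips away weakly bound fragments, the textbook GCMC acceptance rules for a simple particle are shown below. This is the generic form for structureless particles, not the specific scheme implemented in BMaps; molecular fragments need additional orientational and conformational terms.

```latex
% Textbook grand canonical Monte Carlo acceptance probabilities.
% N: current number of particles, V: volume, \Lambda: thermal de Broglie wavelength,
% \mu: chemical potential, U: potential energy, \beta = 1/(k_B T).
\[
\mathrm{acc}(N \to N+1) = \min\!\left[1,\;
  \frac{V}{\Lambda^{3}(N+1)}\,
  \exp\!\bigl\{\beta\,[\mu - U(N+1) + U(N)]\bigr\}\right]
\]
\[
\mathrm{acc}(N \to N-1) = \min\!\left[1,\;
  \frac{\Lambda^{3} N}{V}\,
  \exp\!\bigl\{-\beta\,[\mu + U(N-1) - U(N)]\bigr\}\right]
\]
```

As the chemical potential is lowered, insertions become less likely and deletions more likely, so only fragments whose interaction energy with the protein is sufficiently favorable survive to low values.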

One nice feature of BMaps is a repository of several hundred proteins each with more than 100 fragment and water simulations. BMaps also contains a database of more than 4000 fragments, including MiniFrags. Users can import their own fragments or computationally deconstruct larger ligands. The paper itself is quite short, but the supporting information provides more guidance on how to use the software.

The researchers “aim to democratize the availability of accurate fragment and water maps,” a laudable goal. Most computational features are available with a free account, though with restrictions on the number of operations per month.

BMaps looks quite powerful and easy to use, but I do wish the researchers had included some full case studies, for example those used by the free FastGrow tool we highlighted last year. Try it out and let the community know what you think!

24 April 2023

RSC Medicinal Chemistry special FBDD issue

The Royal Society of Chemistry puts out RSC Med. Chem., and last year they asked David Rees (Astex), Anna Hirsch (Helmholtz Institute for Pharmaceutical Research Saarland), and me whether a special themed issue on FBDD would be useful for the community. Naturally we said yes, and the results have now been published. You can read our introduction here.

Unlike olden days, when special issues were bound between covers, this is a virtual special issue, with papers published over a period of several months. Indeed, we already wrote about two of them last year: one on combining DNA-encoded libraries (DEL) with FBLD and one on inhibitors of PRMT5/MTA. (Both of these were also topics at the CHI FBDD meeting earlier this month.) In the next few paragraphs we highlight the rest.

AstraZeneca has been doing FBDD since 2002, and has gained hard-won wisdom, some of which was shared in a 2016 review we wrote about here. After years of screening, their fragment library had started to deteriorate, so they rebuilt it entirely, as described by Simon Lucas and colleagues. Some of the starting fragments came from their previous library, but they also considered molecules from their larger collection. Rather than focusing on the rule of three, they developed their own multiparameter optimization function, “FragScore,” which incorporates logD_7.4, heavy atom count, number of rotatable bonds, and number of hydrogen bond donors. All compounds were inspected to make sure they would be synthetically tractable, and quality was assessed by SPR, NMR, redox activity, and solubility. The final set consists of 2741 fragments, with a subset of 1152 maximally diverse and attractive fragments for ligandability assessments or screening hard-to-make proteins. They also gathered 16,806 near neighbors for hit follow-up. So far the effort has paid off, with all four of the targets screened thus far yielding progressible hits. If you’re building or renovating a fragment library, you should read this paper.

Continuing on the theme of libraries, Bradley Doak, Martin Scanlon, and colleagues at Monash University describe their “MicroFrag” library, a set of 91 tiny (5-8 non-hydrogen atom) compounds similar to MiniFrags and FragLites. A crystallographic screen (at 1 M concentration!) of the MicroFrag library against the difficult E. coli target DsbA yielded a 52% hit rate, compared with a 2% hit rate with a conventional fragment library. Importantly, the MicroFrag screen identified the two main hot spots previously discovered from the conventional fragment library, along with ten others that may be less actionable. Interestingly, a crystallographic screen of 15 organic solvents at even higher concentrations (50-80%) was less informative: the primary hot spot did not distinguish itself from others. In the case of MicroFrags, not only did this hotspot bind the largest number of fragments, but all the molecular interactions seen for larger fragments were observed.

Fluorine NMR takes advantage of its own specialized library, the subject of a paper by Chojiro Kojima (Osaka University), Midori Takimoto-Kamimura (CBI Research Institute) and collaborators from several institutions. The researchers describe the construction of a 220-member library divided into pools of 10-21 compounds. This library was screened against four diverse proteins, yielding between 3 and 16 hits. The three hits against FKBP were characterized in more detail, including two-dimensional NMR and isothermal titration calorimetry. The researchers also discuss using ¹⁹F STD experiments to determine the binding mode of bound fragments.

Fluorine is not the only halogen of interest for library design. We’ve previously described the halogen-enriched fragment library (HEFLib, here and here), which consists of chlorine, bromine, and iodine-containing molecules. Frank Boeckler and collaborators at Eberhard Karls Universität Tübingen and the Max Planck Institute describe screening this library against the Y220C mutant of p53 in an expansion of work they first described back in 2012. Of 14 hits identified by thermal shift or STD NMR, ten were confirmed by two-dimensional ¹H-¹⁵N-HSQC NMR. Four of these bound in the cleft created by the Y220C oncogenic mutation. Two other fragments turned out to be covalent binders, though they reacted with more than one cysteine residue. Although all the fragments have low affinities, they could potentially serve as starting points for optimization.

An ongoing debate is whether there is an advantage to screening more “three dimensional” fragments as opposed to planar aromatic fragments. If your taste tends towards the former, the synthetic chemistry can get tricky. According to an analysis we highlighted last year, the piperidine ring is the third most common scaffold found in drugs. Now, Peter O’Brien (University of York) and an international group of collaborators report efficient synthetic routes to all 20 cis- and trans-piperidines substituted with a methyl group and a methyl ester. A virtual library of 80 compounds in which the secondary amine is capped with simple substituents such as methyl or acetyl groups was found to be quite shapely, particularly compared with the disubstituted pyridyl starting materials. Moreover, the fragments are still reasonably sized, with no more than 15 non-hydrogen atoms and ClogP values < 2.

Machine learning is gaining prominence everywhere, not least in drug discovery. In 2021 we highlighted an “autoencoder” designed for constructing fragment libraries biased towards “privileged” fragments more likely to generate hits. However, the method required considerable programming savvy. Now Angelo Pugliese (BioAscent) and collaborators at the Beatson Institute have implemented their model in the open-source KNIME platform, making it accessible to a wider range of researchers. As an example they use the method to construct a GPCR-focused fragment library, with the structures of all the members provided in the supporting information.

On the subject of fragment libraries, please make sure to vote in our 6-question poll on library design (right side of page; you may need to scroll up).

Not all the papers in this special issue involve library design. Marko Hyvönen, David Spring, and collaborators at University of Cambridge and National University of Singapore describe allosteric inhibitors of the kinase CK2α, which has been implicated in cancer cell survival. We highlighted some of their work against this target in 2017, in which they used fragment linking to find high nanomolar inhibitors of the enzyme. In the new paper, the researchers describe additional fragment binders at the so-called αD pocket, distant from the ATP-binding site. Virtual screening for analogs led to a fragment with mid-micromolar activity in biochemical and cell assays, and fragment merging led to low micromolar inhibitors.

This is a nice collection of papers, and for those of you without easy literature access make sure to check them out soon: for the next six months all of them are free to read after free RSC registration. Enjoy!

16 May 2022

SAMPL7: Epic computational fail or just no solution?

Every few years computational chemists are invited to compete in the Statistical Assessment of the Modeling of Proteins and Ligands (SAMPL) challenges. Researchers are asked to solve a problem for which the solution is known but not yet published; this blinded format allows a more rigorous test of methods than the typical retrospective studies. SAMPL7 focused on fragments binding to proteins, and the results have been published (open access) in J. Comp. Aided Mol. Des. by Philip Biggin and collaborators at University of Oxford and elsewhere.

The subject of this challenge was PHIP, a multidomain protein implicated in insulin signaling and tumor metastasis, though the biology is a bit complicated. PHIP contains two bromodomains, small modules that act as epigenetic readers by binding to acetylated lysine residues (Kac), and the researchers chose to focus on the second bromodomain (PHIP2). Bromodomains have proven to be highly ligandable, though this one is unusual in having a threonine in place of a conserved asparagine.

The experimental results that contestants were challenged to predict came from fragment screening using high-throughput crystallography at Diamond Light Source’s XChem. PHIP2 crystals diffracted to high resolution (1.2 Å) and were soaked with 20 mM fragment for 2 hours at 5 °C. In total 799 fragments were screened: 768 from the DSI-poised library (see here) and 31 FragLites (see here). The team took great pains to gather high-quality data, screening the FragLites twice and re-soaking 202 fragments that produced poor R factors or resolution worse than 2 Å. This resulted in 52 hits, a hit rate of 6.5%, consistent with the 2-15% typically seen at XChem. Most (47) of these were in the Kac-binding site, and these were the focus of the SAMPL7 challenge.
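
The screening numbers above are easy to check. Here is a minimal Python sketch that uses only the counts quoted in this post; the variable names are mine, not the paper's.

```python
# Counts quoted in the post; variable names are illustrative.
dsi_poised = 768
fraglites = 31
total_screened = dsi_poised + fraglites

hits = 52
kac_site_hits = 47

hit_rate = 100 * hits / total_screened        # percent of fragments that hit
kac_fraction = 100 * kac_site_hits / hits     # percent of hits in the Kac site

print(f"fragments screened: {total_screened}")
print(f"hit rate: {hit_rate:.1f}%")
print(f"hits in the Kac site: {kac_fraction:.0f}%")
```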

The first task was for modelers to simply predict which of the 799 fragments bound and which did not. Full experimental details were provided, including pH and the crystallization conditions. Entrants were given 1 month. There were eight submissions plus a control, which randomly selected compounds as binders or non-binders. Most of the contestants used some sort of docking strategy; details are provided in the paper.

Shockingly, none of the submissions scored better than random. Three of the entrants failed to correctly identify a single binder, and four identified between 1 and 5 of the 47.
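
To put "no better than random" in context, it helps to ask how many true binders a random picker would be expected to find. The sketch below assumes compounds are chosen uniformly at random from the 799. The selection sizes are hypothetical, since this post does not say how many compounds each entrant flagged.

```python
def expected_random_hits(n_selected, n_binders=47, n_total=799):
    """Expected true binders among n_selected compounds chosen uniformly at random."""
    return n_selected * n_binders / n_total

prevalence = 47 / 799
print(f"fraction of the library that binds the Kac site: {prevalence:.1%}")

# Hypothetical selection sizes; not taken from the paper.
for n_selected in (20, 50, 100):
    print(n_selected, round(expected_random_hits(n_selected), 1))
```

On these assumptions, flagging 50 compounds at random would be expected to catch about three true binders, so recovering between one and five is not obviously better than chance unless very few compounds were flagged.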

The second task was to predict the binding modes of the crystallographically identified ligands. Contestants were provided with the 47 hits and asked to submit up to five poses for each. Perhaps stung by their performance on the first task, or perhaps put off by the two-week requested turnaround time, only five groups submitted entries.

Performance was assessed by calculating the root mean square deviation (RMSD) between the experimental and docked structure(s), with RMSD ≤ 2 Å considered successful. Despite this fairly lenient cutoff, “the performance of the methods was disappointing.” The best scored 24%, while two methods scored 2% and 0%. I’ll leave it to chemists to opine whether even a 24% success rate for docking would give confidence to embark on analog synthesis.
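
For readers who have not computed one, here is a minimal sketch of an RMSD check against the 2 Å cutoff. It assumes the two poses list the same atoms in the same order and already sit in the same protein frame, so there is no superposition and no symmetry correction. Counting a ligand as a success when any submitted pose passes is my assumption, not necessarily the rule used in the paper. The coordinates are invented.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally ordered lists of (x, y, z) atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and the same length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

def pose_is_successful(experimental, poses, cutoff=2.0):
    """True if any submitted pose is within the cutoff (in Å) of the crystal pose."""
    return any(rmsd(experimental, pose) <= cutoff for pose in poses)

# Hypothetical three-atom fragment, coordinates in Å.
crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
shifted = [(x + 1.0, y, z) for x, y, z in crystal]     # every atom moved 1 Å
displaced = [(x, y, z + 4.0) for x, y, z in crystal]   # every atom moved 4 Å

print(rmsd(crystal, shifted))
print(pose_is_successful(crystal, [displaced, shifted]))
print(pose_is_successful(crystal, [displaced]))
```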

The third task was to select follow-up molecules from a large database for experimental validation, but alas “the COVID-19 pandemic resulted in a diversion of funds before this follow-up study could be done.” Nonetheless, four intrepid groups submitted entries, and these are discussed in the paper.

Taken at face value, this is downright damning for computational chemists. It is also at odds with many nice success stories, for example those described at last month’s DDC conference. So what’s going on?

For one thing, not everyone paid attention to the information provided. The crystals were at pH 5.6, but some of the entrants nonetheless assumed pH 7.4.

This raises a second and more important point. As the researchers acknowledge, “there is the possibility that our fragments do not necessarily bind in solution, whereas scoring functions are almost always calibrated and validated against solution and structural data.” In other words, perhaps the fragments were not identified computationally because they only bind extremely weakly to a crystalline protein soaking in dilute acid.

This highlights perhaps the biggest drawback of fragment screening by crystallography: no matter how beautiful the structure may appear, you get no measure of affinity. Indeed, a paper we highlighted last year was able to confirm binding by NMR for only a minority of crystallographically identified fragments against the SARS-CoV-2 main protease. This does not mean that the crystal structures are “wrong,” but the ligands may be so weak as to be unadvanceable.

A picture can be worth a thousand words, but it can also be misleading. Advancing fragments is best done with the help of multiple orthogonal methods.

29 November 2021

DeepFrag: fragment optimization by machine learning

Machine learning is becoming increasingly common in drug discovery. Just a few months ago we highlighted its use to design a library of privileged fragments. However, constructing a library is usually done infrequently (though continued renovation of a library is always a good idea). In two papers from earlier this year, Jacob Durrant and colleagues at University of Pittsburgh use machine learning to tackle the more common task of lead optimization.

The first paper, in Chem. Sci., describes DeepFrag, a “deep convolutional neural network for fragment-based lead optimization.” The researchers started with the Binding MOAD database, a collection of nearly 39,000 high-quality protein-ligand complex structures from the Protein Data Bank. Ligands were computationally fragmented by chopping off terminal appendages less than 150 Da. The fragments were then converted into molecular fingerprints encoding their structures. Meanwhile, the protein region around each ligand was converted into a three-dimensional grid of voxels, akin to how images used for computer vision training are processed.
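
To make the "grid of voxels" idea concrete, here is a bare-bones sketch that bins atom coordinates into a cubic grid centered on a point of interest. The box size, resolution, and simple atom counting are my own illustrative choices. DeepFrag's actual featurization is described in the paper and is not reproduced here.

```python
import math

def voxelize(atoms, center, box_size=8.0, resolution=1.0):
    """Count atoms per voxel in a cubic box centered on `center`.

    atoms: iterable of (x, y, z) coordinates in Å.
    Returns a nested list grid[i][j][k] of atom counts.
    """
    n = int(round(box_size / resolution))
    grid = [[[0] * n for _ in range(n)] for _ in range(n)]
    origin = [c - box_size / 2 for c in center]
    for atom in atoms:
        idx = [math.floor((a - o) / resolution) for a, o in zip(atom, origin)]
        if all(0 <= i < n for i in idx):   # ignore atoms outside the box
            i, j, k = idx
            grid[i][j][k] += 1
    return grid

# Hypothetical coordinates: two atoms near the center, two outside the box.
atoms = [(0.0, 0.0, 0.0), (0.2, 0.1, 0.3), (5.0, 5.0, 5.0), (100.0, 0.0, 0.0)]
grid = voxelize(atoms, center=(0.0, 0.0, 0.0))
occupied = sum(count for plane in grid for row in plane for count in row)
print(grid[4][4][4], occupied)
```

A three-dimensional convolutional network can then treat such a grid much like an image with an extra spatial dimension.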

The researchers describe the goal as follows. “We propose a new ‘fragment reconstruction’ task where we take a ligand/receptor complex, remove a portion of the ligand, and ask the question ‘what molecular fragment should go here.’”

About 60% of the data were used to train the machine learning model. The model was then evaluated on 20% of the data and further refined before the final evaluation on the remaining 20% of the data. The details are beyond the scope of this post (and frankly beyond me as well) but DeepFrag recapitulated known fragments about 60% of the time. Importantly, the model worked for diverse types of fragments, including both polar and hydrophobic examples. Even “wrong” answers were often similar to the “correct” responses, for example a methyl group instead of a chlorine atom. In some cases where DeepFrag’s predictions differed from the original ligand the researchers note that these may be acceptable alternatives, a hypothesis supported by subsequent molecular docking studies.
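
The 60/20/20 partition is a standard train, validation, and test split. A minimal sketch follows, assuming a simple shuffled split over hypothetical complex identifiers. How the authors actually grouped their complexes is not described in this post.

```python
import random

def split_dataset(items, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle items and split them into train, validation, and test lists."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = round(len(items) * train_frac)
    n_val = round(len(items) * val_frac)
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]   # whatever remains
    return train, val, test

# Hypothetical identifiers standing in for protein-ligand complexes.
complexes = [f"complex_{i}" for i in range(1000)]
train, val, test = split_dataset(complexes)
print(len(train), len(val), len(test))
```

The validation set is used for tuning along the way, while the test set is held back so that the final score is not flattered by that tuning.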

Of course, the goal for most of us is not to recapitulate known ligands but to optimize them, so the researchers applied DeepFrag to crystallographically identified ligands of the main protease from SARS-CoV-2. Many of them docked well, though they have yet to be synthesized and tested.

Laudably, the model and source code have been released and can be accessed here. However, as these require a certain amount of computer savvy to use, Harrison Green and Jacob Durrant have also created an open-source browser app which is described in an open-access application note in J. Chem. Inf. Mod.

The browser app runs entirely on a local computer, without requiring users to upload possibly sensitive data. The application note describes using the app to recapitulate an example from the original paper. It also describes using it on a fragment bound to antibacterial target GyrB, a fragment-to-lead success story we blogged about last year. DeepFrag correctly predicted some of the same fragment additions that were described in that paper.

The app is incredibly easy to use: just load a protein and ligand (from a pdb file, for example) and the structure appears in a viewer. Click the “Select Atom as Growing Point” button, choose an atom, and hit “Start DeepFrag.” The ranked results are provided as SMILES strings and chemical structures, and the coordinates can also be downloaded. You can also delete atoms before growing if you would like to replace a fragment.

In my own cursory evaluation, DeepFrag correctly suggested adding a second hydroxyl to the ethamivan fragment bound to Hsp90 (see here). It did not suggest an isopropyl replacement for the methoxy group, but it did suggest methyl. Trying a newer example unlikely to have been part of the training set did not recapitulate the ethoxy in the BTK ligand compound 18 (see here), but did suggest a number of interesting and plausible rings. Calculations took a few minutes on my aging personal Windows laptop using Firefox.

In contrast to the hyperbolic claims too often seen in the field, the researchers conclude the Chem. Sci. paper modestly: “though not a substitute for a trained medicinal chemist, DeepFrag is highly effective for hypothesis generation.”

Indeed – I recommend playing around with it. We may still be some way from SkyFragNet, but we’re making progress.

01 June 2021

New fragments suggested by machine learning

Machine learning has become a hot new thang in drug discovery, attracting massive attention and investment. While easy to parody, artificial intelligence techniques are becoming increasingly powerful. A new paper in J. Chem. Inf. Mod. by Angelo Pugliese and colleagues at the Beatson Institute applies the methodology to generate a new fragment library.

Machine learning entails collecting large amounts of data, passing that through various neural networks, and obtaining recommendations. In this case, the researchers wanted to generate “privileged fragments” that would hit in multiple assays. (Of course, the idea would be to make genuinely privileged fragments, such as 7-azaindole, rather than PAINS.) The researchers used a training set of 66 fragments that hit in at least three of 25 screens done at the Beatson, for which the average hit rate was 2.18%.

First though, the researchers needed to teach their model how to generate chemically valid fragments in the first place (for example, fewer than 5 bonds to carbon). To do this they used both SMILES (simplified molecular-input line-entry system) and chemical fingerprints from a set of 486,565 commercially available fragments. They then combined this model with the privileged fragments. Extensive details are provided; as they go well beyond my expertise I won’t even attempt to summarize them. (For example, “the classifier for the smi2smi model comprised sequential 64-unit and 32-unit dense ReLU layers followed by a single sigmoid output neuron.”) At the end of the exercise, and after triaging by medicinal chemists, the researchers came up with a set of 741 fragments.
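
The quoted classifier is less mysterious than it sounds: two fully connected layers with ReLU activations feed a single sigmoid neuron that outputs a number between 0 and 1. Here is a minimal forward-pass sketch. The input size and the random, untrained weights are placeholders of mine, and what the real model takes as input is not described in this post.

```python
import math
import random

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W·x + b) for each output unit."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def random_layer(n_in, n_out, rng):
    """Untrained placeholder weights; a real model would learn these."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def classify(features, layers):
    hidden = dense(features, *layers[0], relu)    # 64 ReLU units
    hidden = dense(hidden, *layers[1], relu)      # 32 ReLU units
    return dense(hidden, *layers[2], sigmoid)[0]  # one sigmoid output

rng = random.Random(0)
n_features = 128  # hypothetical input size
layers = [
    random_layer(n_features, 64, rng),
    random_layer(64, 32, rng),
    random_layer(32, 1, rng),
]
score = classify([rng.random() for _ in range(n_features)], layers)
print(score)
```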

What are their overall properties? For one thing, generated fragments tend to be more planar (as assessed by PBF) and have lower Fsp³ values than the nearly half-million fragments used for training. The researchers acknowledge that this could reflect the historical composition of the Beatson fragment library, although as we argued here it could also be true that flatter fragments just give higher hit rates.

Molecular complexity is a fundamental but poorly defined aspect of fragment-based lead discovery, and the researchers have come up with their own metric, called feature complexity (FeCo), which incorporates information on rotatable bonds, numbers of halogens, hydrogen bond donors and acceptors, charged groups, aromatic rings, and hydrophobic elements, all normalized by the number of heavy atoms. Hopefully this will be explored more fully in a dedicated publication.

What do the individual fragments actually look like? Five examples are shown in the paper, and nearly 200 more are provided in the supporting information. Below are seven chosen arbitrarily from that list (sampling every 30 structures).

Of course, the question remains as to whether these fragments will truly turn out to be privileged. As might be expected given the vastness of chemical space, only 78 of the 741 are commercially available. The researchers note that they are acquiring some of these, and it will be interesting to see how they perform in the screens to come.

30 December 2018

Review of 2018 reviews

As 2018 recedes into history, we are using this last post of the year to do what we have done since 2012 – review notable events along with reviews we didn’t previously cover.

This was a busy year for meetings, starting in January with a FragNet event in Barcelona, then moving to San Diego in April for the annual CHI FBDD meeting. Boston saw an embarrassment of riches, from the first US-based NovAliX meeting, to a symposium on FBDD at the Fall ACS meeting, followed closely by a number of relevant talks at CHI’s Discovery on Target. Finally, the tenth anniversary of the renowned FBLD meeting returned to San Diego. Look for a schedule of 2019 events next month.

If meetings were abundant, the same can be said for reviews.

Lead optimization

Writing in J. Med. Chem., Dean Brown and Jonas Boström (AstraZeneca) asked “where do recent small molecule clinical development candidates come from?” For three quarters of the 66 molecules published in J. Med. Chem. in 2016 and 2017 the answer is from known compounds or HTS, though fragments accounted for four examples. Although average molecular weight increased during lead optimization, lipophilicity did not, suggesting the importance of this parameter.

The importance of keeping lipophilicity in check is also emphasized by Robert Young (GlaxoSmithKline) and Paul Leeson (Paul Leeson Consulting) in a massive J. Med. Chem. treatise on lead optimization. Buttressed with dozens of examples, including several from FBLD, they show that the final molecule is usually among the most efficient (in terms of LE and LLE) in a given series, even when metrics were not explicitly used by the project team. Perhaps with pedants like Dr. Saysno in mind, they also emphasize the complexity of drug discovery, and note that “seeking optimum efficiencies and physicochemical properties are guiding principles and not rules.”

Lipophilic ligand efficiency (LLE) is also the focus of a paper in Bioorg. Med. Chem. by James Scott (AstraZeneca) and Michael Waring (Newcastle University). This is based largely on personal experiences and provides lots of helpful tips. Importantly, the researchers note that calculated lipophilicity values can differ dramatically from measured values, and go so far as to say that “this variation is sufficient to render LLEs derived from calculated values meaningless.”
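
The point about calculated versus measured lipophilicity is easy to see with the usual definition, LLE = pIC50 minus logP (or logD). The numbers in this sketch are invented for illustration and do not come from the paper.

```python
def lle(p_activity, lipophilicity):
    """Lipophilic ligand efficiency: pIC50 (or pKd) minus logP (or logD)."""
    return p_activity - lipophilicity

# Hypothetical compound: one potency, two lipophilicity estimates.
pic50 = 7.0
calculated_logp = 2.0
measured_logd = 3.5

lle_calculated = lle(pic50, calculated_logp)
lle_measured = lle(pic50, measured_logd)
print(lle_calculated, lle_measured)
```

A discrepancy of 1.5 log units between prediction and measurement moves the LLE by the same 1.5 units, which is enough to change how a compound ranks within a series.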

Turning wholly to fragments, Chris Johnson and collaborators (including yours truly) from Astex, Carmot, Vrije Universiteit Amsterdam, and Novartis have published an analysis in J. Med. Chem. of fragment-to-lead success stories from last year. This review, the third in a series, also summarizes all 85 examples published between 2015 and 2017, confirming and expanding some of the trends we mentioned last year.

Targets

Two reviews focus on specific target classes. Bas Lamoree and Rod Hubbard (University of York) cover antibiotics in SLAS Discovery. After a nice, concise review of fragment-finding methods, the researchers discuss a number of case studies, many of which will be familiar to regular readers of this blog, including an early example of whole-cell screening.

David Bailey and collaborators from IOTA and University of Cambridge discuss cyclic nucleotide phosphodiesterases (PDEs) in J. Med. Chem. The researchers provide a good overview of the field, including mining the open database ChEMBL for fragment-sized inhibitors. As they point out, the first inhibitors discovered for these cell-signaling enzymes were fragment-sized, so it is no surprise that FBLD has been fruitful – see here for an example from earlier this year. Interestingly though, although at least six fragment-sized PDE inhibitor drugs have been approved, none of these were actually discovered using FBLD.

PDEs are an example of “ligandable” targets, for which small molecule modulators are readily discovered. In Drug Discovery Today, Sinisa Vukovic and David Huggins (University of Cambridge) discuss ligandability “in terms of the balance between effort and reward.” They use a published database of protein-ligand affinities to develop a metric, LIG_exp, for experimental ligandability, and also describe their computational metric, Solvaware, which is based on identifying clusters of water molecules binding weakly to a protein. Comparisons with experimental data and with other predictive metrics, such as FTMap, reveal that while the computational methods are useful, there is still room for improvement.

We have previously written about how target-guided synthesis methods such as dynamic combinatorial chemistry have – despite decades of research – yielded few truly novel, drug-like ligands. Is this because the targets chosen were simply not ligandable? In J. Med. Chem., Anna Hirsch and collaborators at the University of Groningen, the Helmholtz Institute for Pharmaceutical Research, and Saarland University review some (though by no means all) published examples and examine their computationally determined ligandability scores. There seems to be no difference between these targets and a set of traditional drug targets.

Finding fragments

Crystallography continues to be a key tool for FBLD: as we noted in the review of the 2017 literature, 21 of the 30 examples made use of a crystal structure of either the starting fragment or an analog, and only 3 projects didn’t use crystallography at all. That said, FBLD is possible without crystallography, as illustrated through multiple examples in a Cell Chem. Biol. review by Wolfgang Jahnke (Novartis), Ben Davis (Vernalis), and me (Carmot).

In the absence of a crystal structure, NMR is best suited for providing structural information, and this is the subject of a review in Molecules by Barak Akabayov and colleagues at Ben-Gurion University of the Negev. The researchers provide a nice summary of NMR screening methods and success stories within a broader history of FBLD. They also include an extensive list of fragment library providers as well as a discussion of virtual screening.

Speaking of virtual screening, three reviews cover this topic. In Methods Mol. Biol., Durai Sundar and colleagues at Indian Institute of Technology Delhi touch on a number of computational approaches for de novo ligand design, though the lack of structures sometimes makes it challenging to read. A broader, more visually appealing review is published in AAPS Journal by Yuemin Bian and Xiang-Qun Xie at University of Pittsburgh. In addition to an overview and case studies, the researchers also provide a nice table summarizing 15 different computational programs. One of these, SEED, is a main focus of a review in Eur. J. Med. Chem. by Jean-Rémy Marchand and Amedeo Caflisch (University of Zürich). The researchers describe how this docking program can be combined with X-ray crystallography (SEED2XR) to rapidly identify fragments; we highlighted an example with a bromodomain. Their ALTA protocol uses SEED to generate larger, more potent molecules, as we described for the kinase EphB4. The researchers note that together these protocols have led to about 200 protein-ligand crystal structures deposited in the PDB over the past five years.

Rounding out methods, Sten Ohlson and Minh-Dao Duong-Thi (Nanyang Technological University) provide a detailed how-to guide in Methods for performing weak affinity chromatography, and how this can be combined with mass spectrometry (WAC-MS), as we noted last year.

Chemistry

One drawback of some computational approaches for fragment optimization is that they do not consider synthetic accessibility. In Mol. Inform., Philippe Roche, Xavier Morelli, and collaborators at Aix-Marseille University and Institut Paoli-Calmettes focus on hit to lead approaches that do, and provide a handy table summarizing nearly a dozen computational methods. We highlighted one from the authors, DOTS, earlier this year.

DOTS is an example of using DOS, or diversity-oriented synthesis. In Front. Chem., David Spring and colleagues at University of Cambridge review recent applications of DOS for generating new fragments, some of which we recently highlighted. Only a couple examples of successfully screening these new fragments are described, but the authors note that this is likely to increase as virtual library screening continues to advance.

Perhaps the most productive fragment of all time is 7-azaindole, the origin of three fragment-derived clinical compounds. (The moiety appears in both approved FBLD-derived drugs, vemurafenib and venetoclax.) Takayuki Irie and Masaaki Sawa of Carna Biosciences devote their attention to this little bicycle in Chem. Pharm. Bull. The researchers count six clinical kinase inhibitors that contain 7-azaindole (not all from FBLD) as well as more than 100,000 disclosed compounds containing the fragment. More than 90 kinases have been targeted by molecules containing 7-azaindole, and the paper provides a list of 70 PDB structures of 37 different kinases bound to molecules containing the moiety.

Finally, in J. Med. Chem., Brian Raymer and Samit Bhattacharya (Pfizer) survey the universe of “lead-like” drugs. Among the most highly prescribed small molecule drugs, 36% have molecular weights below 300 Da. Only 28 of 174 drugs approved between 2011 and 2017 fall into this category, consistent with the increasing size of newer drugs. The researchers discuss 16 recently approved drugs, and find that 13 have very high ligand efficiencies (at least 0.4 kcal mol⁻¹ per heavy atom). As noted above, optimization often entails adding molecular weight by growing or linking, and the researchers suggest that alternative strategies such as conformational restriction and truncation also be investigated.
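
As a reminder of where a figure like 0.4 kcal mol⁻¹ per heavy atom comes from, ligand efficiency is the binding free energy divided by the number of non-hydrogen atoms. The sketch below uses the standard relation ΔG = RT ln Kd at an assumed 298.15 K. The two example compounds are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1

def ligand_efficiency(kd_molar, heavy_atoms, temperature=298.15):
    """Ligand efficiency in kcal/mol per heavy atom, from a Kd in mol/L."""
    delta_g = R_KCAL * temperature * math.log(kd_molar)  # negative when Kd < 1 M
    return -delta_g / heavy_atoms

# Hypothetical examples, not taken from the review.
small_and_potent = ligand_efficiency(1e-8, heavy_atoms=20)   # 10 nM, 20 heavy atoms
larger_and_weaker = ligand_efficiency(1e-6, heavy_atoms=30)  # 1 µM, 30 heavy atoms
print(round(small_and_potent, 2), round(larger_and_weaker, 2))
```

The first compound clears the 0.4 threshold comfortably and the second does not, even though both would count as respectable binders.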

And with that, Practical Fragments wishes you a happy new year. Thanks for reading some of our 686 posts over the past decade plus, and please keep the comments coming!

28 November 2016

How do cryptic pockets form?

Earlier this year we highlighted crystallographic work out of Astex showing that secondary ligand binding sites on proteins are common; in addition to an active site, an enzyme may have several other pockets capable of binding small molecules. Many of these secondary sites are present even in the absence of a ligand. But there are also “cryptic” binding pockets that only appear when a ligand is bound. These are the subject of a new paper in J. Am. Chem. Soc. by Francesco Gervasio and collaborators at University College London and UCB Pharma.

Cryptic pockets are appealing in part because they can salvage an otherwise unligandable target: a featureless flat surface involved in a protein-protein interaction may crack open to reveal a crevasse capable of binding small molecules. Finding these pockets computationally, though, is difficult. In the current paper, the researchers performed molecular dynamics simulations on three different proteins with known cryptic pockets, and the pockets remained mostly closed over hundreds of nanoseconds. Increasing the temperature didn’t help, and even when the simulations were started with structures of the protein-small molecule complexes (with the small molecules removed), the pockets quickly slammed shut. Further calculations suggested that the open forms of the proteins are thermodynamically unstable.

The nice thing about computational approaches is that – unlike Scotty – you can change the laws of physics. In this case, the researchers changed the simulated water molecules to be more attractive to carbon and sulfur atoms in the proteins. (They call this SWISH, for Sampling Water Interfaces through Scaled Hamiltonians). This caused the known cryptic sites to open up during molecular dynamics simulations, even in the absence of ligand.

Next, the researchers added very small fragments (such as benzene), and found that these caused the cryptic pockets to open even further. The researchers speculate that this might reflect how cryptic pockets form in the real world: a ligand could worm its way into a transient pocket, stabilizing it and exposing more room for another ligand (or a different part of the first ligand) to bind.

Of course, just because something shows up in silico doesn’t make it real; how do you avoid false positives? Once the researchers found cryptic pockets using “enhanced” water, they reran simulations using standard parameters to see which pockets remained. The researchers found that subtracting the “density” of fragments bound in a conventional molecular dynamics simulation from the density of fragments in a SWISH simulation causes minor, irrelevant pockets to disappear for their three test proteins, leaving only the known cryptic pockets. Running this subtraction experiment on the protein ubiquitin caused a couple weak superficial pockets to disappear, consistent with the absence of cryptic pockets in this protein.
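
The subtraction step can be pictured with a toy example. The sketch below stands in for the real calculation, which would presumably work on three-dimensional density grids. The site labels, densities, and threshold are all invented.

```python
def difference_density(swish, conventional, threshold=0.0):
    """Subtract conventional-MD fragment density from SWISH density, site by site.

    Both inputs map a site label to a fragment density. Only sites whose
    difference exceeds the threshold are returned.
    """
    sites = set(swish) | set(conventional)
    diff = {s: swish.get(s, 0.0) - conventional.get(s, 0.0) for s in sites}
    return {s: d for s, d in diff.items() if d > threshold}

# Hypothetical fragment densities at three sites on one protein.
swish_density = {"cryptic_site": 0.9, "surface_groove": 0.4, "shallow_patch": 0.2}
conventional_density = {"cryptic_site": 0.1, "surface_groove": 0.4, "shallow_patch": 0.25}

remaining = difference_density(swish_density, conventional_density, threshold=0.1)
print(remaining)
```

Sites that attract fragments equally well with ordinary water cancel out, leaving only those that open up because of the modified water.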

SWISH is an interesting approach, and I look forward to seeing how it compares with other programs, such as Fragment Hotspots and FTMap. It would also be fun to apply SWISH prospectively to therapeutically important but currently undruggable targets to see whether it is worth taking another look at some of them.