Practical Fragments: crystallography

25 August 2025

Fragments vs KEAP1: Fragment growing this time

Kelch-like ECH-associated protein 1 (KEAP1) binds to nuclear factor erythroid 2-related factor 2 (NRF2), targeting it for degradation. Blocking this interaction has anti-inflammatory effects, and indeed the approved drugs dimethyl fumarate and omaveloxolone are believed to act in part through this mechanism. But those drugs hit a lot of other targets, and more specific molecules have long been sought; we wrote about one in 2016 and another in 2021. In an open-access paper just published in Angew. Chem. Int. Ed., Anders Bach and an international team of collaborators at University of Copenhagen and elsewhere describe a new chemical series.

As in the 2016 paper, the researchers started with a crystallographic screen, in this case using the 768-member DSI-poised library, which we wrote about here. This resulted in 80 hits, all binding in the so-called Kelch pocket, which has previously been targeted. Thirteen of these bound in the central region, and compound 1 showed modest but measurable affinity by SPR.

All previously reported non-covalent high-affinity KEAP1 ligands contain at least one acidic moiety to interact with arginine residues in the protein, so the researchers used structure-based design to add carboxylic acids, resulting in compound 4, with low micromolar affinity. This molecule, unlike the initial fragment, could also block the KEAP1-NRF2 interaction in a fluorescence polarization assay.

Building into a hydrophobic sub-pocket yielded compound 12, and adding strategically placed hydrogen-bond acceptors led to further improvements in affinity, culminating in compound 28, with low nanomolar activity. Crystallography revealed that these molecules bound in a similar fashion to the initial fragment.

Compound 28 and related molecules were tested in a variety of assays. They were selective for KEAP1 over 15 other human Kelch domains in a thermal shift assay. Compound 28 activated NRF2-regulated cytoprotective genes and decreased inflammatory markers in multiple cell lines. It also displayed RNA expression profiles similar to those of other reported non-covalent KEAP1 inhibitors. Cellular potency in some of these assays was as good as 60 nM.

This is a nice fragment-to-lead story, though no ADME or DMPK data are reported, and the combination of relatively high molecular weight, negative charge, and lipophilicity suggests that permeability and oral bioavailability may be challenging. Indeed, the researchers note that no non-covalent KEAP1-NRF2 inhibitors have entered the clinic. Perhaps this target is better suited for covalent inhibitors, preferably ones more selective than dimethyl fumarate. More on those later.

21 July 2025

How can we house our crystallographic data?

Three years ago we highlighted a growing debate about how and where to house crystallographic fragment data. With the recent surge in high-throughput crystallography, issues including access, accuracy, and capacity have only become more urgent. An open-access perspective in Nat. Comm. by Manfred Weiss (Helmholtz-Zentrum Berlin) and multiple coauthors, including yours truly, calls on the scientific community to make some difficult decisions. Indeed, a session at the 75th annual meeting of the American Crystallographic Association going on today is devoted to the topic.

High-throughput crystallography can involve soaking more than 1000 crystals with fragments, sometimes yielding hundreds of protein-ligand structures. The paper tabulates a dozen synchrotrons around the world with current or planned high-throughput capabilities. We’ve written recently about the XChem facility at the Diamond Light Source, which is currently running about 80 fragment screens per year. Assuming similar productivity at the other synchrotrons, we might soon see 1000 fragment campaigns per year worldwide. If each of these involves 1000 crystals and we get 10% hit rates, that could mean 100,000 new fragment structures annually.
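The back-of-the-envelope estimate above can be written out explicitly. The sketch below is only an illustration: the inputs are the figures quoted in the paragraph (a dozen facilities, about 80 screens per year each, 1000 crystals per screen, a 10% hit rate), while the function name and its parameters are hypothetical, and equal productivity across all synchrotrons is an assumption.

```python
def estimate_annual_structures(facilities, screens_per_facility,
                               crystals_per_screen, hit_rate):
    """Return (campaigns per year, fragment-bound structures per year)."""
    campaigns = facilities * screens_per_facility
    structures = campaigns * crystals_per_screen * hit_rate
    return campaigns, structures


# Figures quoted in the post; treating every synchrotron as equally
# productive is an assumption, not a measured value.
campaigns, structures = estimate_annual_structures(
    facilities=12,
    screens_per_facility=80,
    crystals_per_screen=1000,
    hit_rate=0.10,
)
print(f"{campaigns} campaigns/year, about {structures:,.0f} structures/year")
```

With a dozen facilities this works out to 960 campaigns and roughly 96,000 structures per year, which rounds to the 1000 campaigns and 100,000 structures mentioned above. Changing the hit rate or the number of crystals per screen shows how sensitive the total is to each assumption.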

That is a big number. For reference, 10,000 new crystal structures are currently being released by the Protein Data Bank (PDB) each year. (Director of the RCSB PDB Stephen Burley is one of the authors of the perspective.)

The problem is that, as we discussed in the 2022 blog post, most fragment structures from high-throughput screens are not refined to the level required for the PDB, a process which typically takes a day or two for the researcher and up to 3 hours for a biocurator at the PDB. Moreover, fragments are often identified using PanDDA (Pan-Dataset Density Analysis, which we wrote about here), a process which makes use of the many unbound structures obtained in a dataset. Ideally, these datasets should also be made available.

The challenge is balancing practicality with FAIR (Findable, Accessible, Interoperable, and Reusable) principles. The paper outlines four non-exclusive options. Very briefly, these are:

Option One: Fully refine and deposit all protein-fragment structures just as with other structures.

Option Two: Partially refine structures, and possibly flag or even segregate them from other structures in the PDB.

Option Three: Rather than treating each protein-ligand structure independently, treat each high-throughput screen as a single experiment, and archive all of the data in its entirety, including unbound structures. These data could be housed in the PDB or elsewhere.

Option Four: A hybrid approach, where fully refined structures would be deposited in the PDB and the rest of the data would be stored in a separate branch of the PDB or elsewhere entirely.

There are pros and cons for each option. At the extremes, the first option puts a tremendous burden on experimentalists and the PDB, and potentially valuable information regarding unbound structures is lost, while option three requires setting up new repositories to store vast quantities of data.

The paper intentionally avoids making a specific recommendation and instead calls for discussion within the scientific community. Personally, I favor some sort of hybrid approach such as option four. As the paper notes, no one could have foreseen AlphaFold2 when the PDB was launched in 1971. Over the next decade researchers around the world are likely to generate hundreds of thousands of protein-fragment structures. I don’t pretend to know what the artificial intelligence tools of the future will be able to make of such data, but I hope they will have access.

What do you think?

19 May 2025

Crystallography first in fragment optimization: Binding-Site Purification of Actives (B-SPA)

At FBLD 2024, Frank von Delft (Diamond Light Source) announced the ambitious goal of taking a 100 µM binder to a 10 nM lead in less than a week for less than £1000. Fragment-to-lead optimization usually takes longer, as dozens or even hundreds of compounds need to be synthesized and tested. One way to speed things up is through “crude reaction screening,” otherwise known as “direct to biology,” in which unpurified reaction mixtures are tested directly. In a new (open-access) Angew. Chem. Int. Ed. paper, Frank, John Spencer, and collaborators at University of Oxford, University of Sussex, and Creoptix apply this approach to crystallographic screening.

The researchers were interested in the second bromodomain of Pleckstrin Homology Domain-Interacting Protein, or PHIP(2), an oncology target. As we discussed in 2016, they had previously run a crystallographic screen and identified multiple hits, including F709, which, despite having no measurable affinity, had good electron density and multiple vectors for optimization. Six separate libraries based on this fragment were constructed, with between 58 and 1024 targeted small-molecule products per library and up to four steps done without purification.

One challenge for crude reaction screening is assessing whether or not a reaction has actually generated product. Typically this is done by analytical liquid chromatography mass spectrometry (LCMS), but analyzing results manually is tedious. Fortunately academics have graduate students and postdocs, and it was presumably these intrepid souls who spent 17 days analyzing the 1876 small-molecule products attempted.

I can say from personal experience that spending hours perusing LCMS chromatograms is not enjoyable, so the researchers built an automated tool called MSCheck, which appears to be freely available here. This showed 83% agreement with the manually curated data, and even identified additional true positives that had been missed. Altogether 1077 of the reaction mixtures had the desired product, with success rates for the various libraries ranging from 39% to 97%.

The successful reactions were soaked into crystals and screened, and nearly 90% of these generated usable data. A total of 29 crystals had interpretable density in the ligand binding site: 7 were starting materials and 22 were desired products. Of the products, 19 bound with the piperazine core in a similar position to the initial fragment, while three bound in an alternate manner.

Of course, the whole point of this exercise is to find improved binders, so the researchers tested pure versions of each of the 22 crystallographic hits in two different assays. Only compound PHIP-Am1-20 had measurable affinity, with modest ligand efficiency.
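To put the attrition in perspective, the counts quoted above can be arranged as a simple funnel. The sketch below uses only numbers from this post; the stage labels and the helper function are hypothetical, and the intermediate "usable data" stage is left out because only an approximate figure (nearly 90%) is given for it.

```python
# Counts quoted in the post for the PHIP(2) crude reaction screen.
funnel = [
    ("reactions attempted", 1876),
    ("desired product detected", 1077),
    ("product seen in electron density", 22),
    ("measurable affinity when pure", 1),
]


def stage_yields(stages):
    """Return (label, count, fraction of previous stage, fraction of first stage)."""
    first = stages[0][1]
    previous = first
    rows = []
    for label, count in stages:
        rows.append((label, count, count / previous, count / first))
        previous = count
    return rows


for label, count, step, overall in stage_yields(funnel):
    print(f"{label:34s} {count:5d}  step {step:6.1%}  overall {overall:6.2%}")
```

Laid out this way, a bit more than half of the attempted reactions gave product, about 2% of those gave a product visible in the crystal, and a single compound out of 1876 attempts showed measurable affinity.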

This is not the first example of crude reaction screening by crystallography; we wrote about REFiL_x and a related technique in 2020. In one of those papers, the crude reaction mixtures were assessed by SPR as well as crystallography, which revealed that the crystallographic screen missed some binders, and there is no reason to think the same did not happen here. Indeed, molecules that bind tightly in a different conformation may be more likely to shatter the crystal lattice and thus go undetected.

The researchers state that for non-crystallographic crude reaction screening “only strong assay readouts are informative.” But is this a bug or a feature? A 2019 publication that used crude reaction screening to identify KRAS ligands (which I wrote about here) used an assay cascade to quickly select the most potent hits. Even the fastest crystallographic screens can’t compete with plate-based assays in terms of speed.

Perhaps PHIP(2) is a particularly challenging test case. As we discussed in 2022, multiple computational screens performed poorly in predicting crystallographic binding modes of ligands for this protein. But as I wrote at the time, it may be that many crystallographic ligands are just too weak to be useful.

Although there is a strong case for using crystallography first for finding fragments, I am not yet convinced the same applies for optimizing fragments.

17 March 2025

Fragments vs eIF4E: a chemical probe

Cancer cells are known for growing and multiplying quickly, and to do so they need to produce large amounts of protein. The rate-determining step in protein translation happens early, when ribosomes are recruited to the 5’-end of mRNA by the eukaryotic initiation factor 4F (eIF4F) complex. This complex has long been a target for drug discovery, and in a recent open-access Nat. Comm. paper Paul Clarke, Andrew Woodhead, Caroline Richardson, and collaborators at Institute of Cancer Research and Astex describe a chemical probe. (Andrew spoke about this program last year at FBLD 2024.)

The eIF4F complex includes three core proteins, confusingly named eIF4E, eIF4G, and eIF4A. eIF4E binds to the 5’cap of mRNA and recruits eIF4G. Blocking the interaction of eIF4E either with mRNA or eIF4G could in principle shut down protein synthesis, but intensive efforts by multiple groups have struggled: the mRNA binding site is very polar, and disrupting protein-protein interactions is tough. Thus, the researchers took a fragment approach.

Developing a form of eIF4E suitable for fragment screening was itself a challenge because the protein mostly exists as part of a complex in cells and the native monomer is unstable. After making more than two dozen different constructs, the researchers developed a stable, soluble form that could be crystallized. This construct was screened against a library of 1371 fragments in pools of four, each at 500 µM, using CPMG NMR followed by crystallography, leading to 50 hits. A few bound at the mRNA cap-binding site but most bound to a previously unreported “site 2,” which is near where eIF4G binds.

One of these, compound 1, has a reasonable ligand efficiency despite its low affinity as assessed by ITC. The phenol appeared to be making no interactions and so was removed. Adding a fluorine usefully enforced the twisted biaryl conformation and filled a small dimple; fragment growing then led to mid micromolar compound 3. Further growing to pick up additional lipophilic and polar contacts eventually led to compound 4, with low nanomolar affinity. Understanding the importance of negative controls for chemical probes, the researchers also switched the stereochemistry at the benzylic carbon to produce compound 5, which has >30-fold lower affinity for eIF4E.

Crystallography revealed that binding of compound 4 to eIF4E causes conformational changes that should impair binding of the protein to eIF4G. Experiments in cell lysates bore out this hypothesis. Moreover, compound 4 also inhibited protein translation in cell lysates at low micromolar concentrations, while compound 5 did not.

Unfortunately, these observations did not extend to intact cells. A cellular thermal shift assay (CETSA) demonstrated that compound 4 did stabilize eIF4E in cells with an EC₅₀ = 2 µM, consistent with binding. But it was much less effective at blocking the interaction with eIF4G in cells, even at high concentrations, and showed no inhibition of protein translation.

To understand why, the researchers conducted a series of targeted protein degradation and genetic rescue experiments that are beyond the scope of this blog post. The upshot is that eIF4G binds to several regions of eIF4E, and that while compound 4 disrupts binding to the “non-canonical binding site”, it does not block binding to the “canonical binding site,” and thereby does not block overall complex formation. Why there should be a difference between intact cells and cell lysates is not obvious to me, but perhaps the more dilute conditions of cell lysates play a role, as seen for a paper we discussed last year.

One interesting feature of this story is that the initial fragment makes no polar interactions with the protein; all of the polar interactions in compound 4 were added during optimization. This is quite the opposite of ASTX660, where all the polar interactions in the final clinical compound came from the initial fragment. Indeed, a 2021 analysis of fragment-to-lead successes found that fewer than one in ten retained no polar interaction from the initial fragment.

This paper also illustrates the gap that can occur between research and publication; a couple of the authors listed as affiliated with Astex left in 2017. But better late than never, and this study nicely integrates fragment-based lead discovery with elegant biology. Compound 4 should be a useful tool for further exploring the nuances of eIF4E.

13 January 2025

Berotralstat: an overlooked fragment-derived drug

At the end of 2023 I mentioned that a paper by Dean Brown listed berotralstat as a fragment-derived drug. Readers will notice this molecule does not appear on our “fragments in the clinic” list. Did we miss it? After reading a (2021!) J. Med. Chem. paper by Pravin Kotian and colleagues at BioCryst, I believe the answer is yes.

Hereditary angioedema (HAE) is a rare genetic disease caused primarily by deficiencies in a protein that inhibits a serine protease called plasma kallikrein, or PKal. Drugs had already been developed to replace the inhibitor protein, but these need to be injected or infused. Since PKal is an enzyme, the researchers sought to make a small molecule inhibitor that could be taken as a pill.

BioCryst had developed an earlier drug called BCX4161, which is potent but has poor oral bioavailability. To find a better molecule, the researchers turned to the rich literature around serine protease inhibitors, which led them to make compound 2, a fragment of previously reported inhibitors of other serine proteases. The protonated benzylamine was expected to bind in the S1 pocket of the enzyme, and indeed the molecule did show weak but measurable activity.

Fragment growing led to compound 4, with double-digit micromolar activity. Building off the new phenyl ring led to more potent molecules such as compound 13, with low micromolar activity. Further structure-based design eventually led to BCX7353, or berotralstat. The paper provides good descriptions of the design rationale. For example, the fluorine was added to improve permeability, and the nitrile was added to improve the ADME profile. Modeling was used both to improve potency and to gain selectivity over other serine proteases. This proved to be successful: berotralstat is a subnanomolar inhibitor of PKal and at least several thousand-fold selective over trypsin and other serine proteases such as thrombin and FXa.

The pharmacokinetic properties of berotralstat in rats and monkeys were also good, and according to clinicaltrials.gov the molecule first entered the clinic in 2015. In December of 2020 the FDA approved berotralstat for prophylactic treatment of HAE attacks.

This is a nice story, and I agree with Dean that the discovery of berotralstat was “based on a legacy clinical candidate and fragment approaches.” The earlier molecule BCX4161 contained a benzamidine moiety, which was in part responsible for the poor oral bioavailability. Replacing this with a benzylamine fragment from the literature is a classic fragment strategy, and compound 2 is fully compliant with the rule of three.

So how was it missed? The abstract only states that berotralstat was discovered “using a structure-guided drug design strategy.” Indeed, the word “fragment” appears precisely once in the paper, albeit in a very telling sentence: “We evaluated these fragments in our PKal^pur inhibitor assay…”

From a timeline perspective, the approval of berotralstat makes it the fifth approved fragment-derived drug, after pexidartinib and before sotorasib. I’ll include it in the next update of clinical compounds, along with my standard disclosure that “the list is almost certainly incomplete.” What else are we missing?

30 December 2024

Review of 2024 reviews

Long winter is here in the global north, with its dim days and gaping nights. As is tradition, Practical Fragments looks back on the year almost done. 2024 was the best year for conferences since the arrival of COVID. I wrote about CHI’s Drug Discovery Chemistry in San Diego, FBDD-DU in Brisbane, and two meetings in Boston: FBLD 2024 and CHI’s Discovery on Target.

Another tradition is the annual J. Med. Chem. fragment-to-lead success story review; the latest covers the year 2022 and was written by Andrew Woodhead (Astex) and collaborators, including yours truly.

For a timely and accessible overview of “how to find a fragment,” look no further than a review of that title in ChemMedChem by Marcio Vinicius Bertacine Dias and collaborators at University of São Paulo and University of Warwick. This covers crystallography, cryo-EM, NMR, SPR, thermal shift, virtual screening, functional screening, ITC, mass spectrometry (including HDX-MS), MST, and BLI, and concludes with a nice comparison table. Some twenty other reviews were also published throughout the year, and these are discussed thematically.

Structure-based methods

Three reviews cover NMR. In J. Med. Chem., Janet Caceres-Cortes and colleagues at Bristol-Myers Squibb provide “perspectives on nuclear magnetic resonance spectroscopy in drug discovery.” Ligand- and protein-detected screening are covered thoroughly, with examples such as the discoveries of BI-2852 and venetoclax. Applications beyond hit finding are also discussed, such as the characterization of atropisomers in sotorasib, the identification and characterization of impurities and metabolites, in-cell NMR, and much more.

“Perspectives on applications of ¹⁹F-NMR in fragment-based drug discovery” is the title of an open-access review in Molecules by Qingxin Li and CongBao Kang at Guangdong Academy of Sciences and A*STAR, respectively. As we discussed in 2020, fluorine NMR is becoming increasingly common in FBLD, and this paper covers the various methods, including using ¹⁹F-NMR to measure ligand affinity. The authors also include a table summarizing 17 fragment screens that used fluorine NMR.

The rise of powerful permanent magnets has enabled low-maintenance benchtop NMR instruments that can be yours for as little as $50,000, compared to upwards of $1 million for a 600 MHz superconducting machine. Although sensitivity is at least 150-fold lower, hyperpolarization techniques such as photo-CIDNP, which we wrote about here, can close the gap. The latest developments are described (open access) in Chemistry–Methods by Felix Torres and collaborators at NexMR and the ETHZ.

X-ray crystallography has retained the top position among fragment-finding methods according to our most recent poll. In an open-access Applied Research paper, Daren Fearon, Frank von Delft, and collaborators describe high-throughput crystallographic fragment screening at the Diamond Light Source. As of August 2024 they have collected more than 240,000 academic data sets on hundreds of targets, and the paper distills some of the key lessons, some of which were applied to the COVID Moonshot, which we last wrote about here. The paper also describes future developments and needs, such as how and where to house such massive quantities of data.

Another center for high-throughput crystallographic screening is the Helmholtz-Zentrum Berlin (HZB) F2X-Facility at the BESSY II synchrotron, and in an open-access Applied Research paper Manfred Weiss and collaborators provide an overview of workflows and capabilities. One unique offering is the F2X-GO kit, in which F2X fragment libraries (which we wrote about here) as well as other supplies are shipped to users to do soaking experiments in their own laboratories prior to shipment to the synchrotron.

“Structure-based virtual screening of vast chemical space” is the topic of an open-access review in Curr. Opin. Struct. Biol. by Jens Carlsson (Uppsala University) and Andreas Luttens (MIT). The Enamine REAL collection currently contains 40 billion compounds, and the researchers predict that “make-on-demand chemical libraries will likely reach more than one trillion compounds in the next few years,” which presents both opportunities and challenges, particularly given the existence of the “virtual cheaters” we recently discussed. Machine learning and fragment-based methods such as V-SYNTHES could help.

Continuing the virtual theme, Li Wang and collaborators at Nantong University discuss “molecular fragmentation as a crucial step in the AI-based drug development pathway” in an open-access Commun. Chem. paper. This summarizes 15 different computational methods for dissecting larger molecules into fragments, and also includes a list of 11 library vendors.

Other methods

Among experimental methods, few can match the throughput of fluorescence techniques, the subject of an open-access Heliyon review by Neelagandan Kamariah and colleagues at inSTEM & NCBS in Bangalore. It has short sections on “fluorescence polarization (FP) and anisotropy (FA), Förster resonance energy transfer (FRET), time-resolved Förster resonance energy transfer (TR-FRET), fluorescence lifetime (FLT), protein-induced fluorescence enhancement (PIFE), fluorescence thermal shift assay (FTSA) and microscale thermophoresis.” It also describes applications to GPCRs, protein-protein interactions, and other biological systems.

Native mass spectrometry (nMS), which we wrote about most recently in 2022, is the subject of an RSC Med. Chem. review by Louise Sternicki and Sally-Ann Poulsen at Griffith University. Sally-Ann is a leading expert in nMS, and the paper describes the technique and how it compares to other fragment-finding methods. It also includes a nice table summarizing 17 studies published between 2013 and 2023 that used nMS for FBLD; a 2013 review of nMS covered earlier examples.

Covalent Fragments

Mass spectrometry plays a prominent role in finding covalent fragments, as discussed in an open-access SLAS Discovery review by Simon Lucas and colleagues at AstraZeneca. “Covalent hits and where to find them” also describes other biophysical, biochemical, and cellular approaches, and even DEL screening. It also discusses covalent libraries (which we wrote about earlier this month) and successful examples such as the discovery of sotorasib. In my opinion the researchers succeed in their “hope that this review will help serve as a useful roadmap to those seeking to drug the undruggable.”

A concise open-access review in Curr. Opin. Struct. Biol. by Katrin Rittinger and collaborators at The Francis Crick Institute and GSK focuses on using covalent fragments to assess target tractability, specifically ligandability and functionality. Target-based, proteome-wide and function-first approaches are summarized, and the researchers also discuss the importance of negative control compounds such as inactive enantiomers.

Continuing the theme of “assayability,” Micah Niphakis (Lundbeck) and Ben Cravatt (Scripps) review “ligand discovery by activity-based protein profiling” (ABPP) in a Cell Chem. Biol. paper. Because ABPP is usually conducted in cells or cell lysates, full-length proteins are assayed in their native environment, facilitating the discovery of allosteric ligands as in the case of WRN, which we wrote about earlier this year. The paper summarizes multiple examples of finding covalent ligands for challenging targets, and also highlights future challenges such as increasing throughput and targeting residues beyond cysteine.

Reversible covalent inhibitors are the topic of an open-access review by Dustin Duncan and colleagues at Brock University in ACS Chem. Biol. The researchers argue that reversible covalent inhibitors may cause less accumulation of the off-target adducts that could form with irreversible inhibitors. The paper includes a figure showing reversible covalent warheads, details on how to characterize them, a nice summary of general considerations, and success stories for JAK3, BTK, and proteases.

Targets

Covalent ligands have been particularly important for cancer targets, as reviewed by Xiaoyu Zhang (Northwestern University) and Ben Cravatt (Scripps) in an open-access Annual Review of Cancer Biology paper. The focus is on use of chemical proteomics “to expand the druggability of cancer proteomes.” The paper presents examples of finding and characterizing covalent ligands for a variety of oncology targets including KRAS^G12C.

E3 ligases are briefly mentioned, and this target class is the focus of an Expert Opin. Drug Discov. review by Jongmin Park and colleagues at Kangwon National University. As we’ve discussed recently, the 600 or so human E3 ligases are potentially valuable for targeted protein degradation applications such as PROTACs. The review focuses on “fragment-based approaches to discover ligands for tumor-specific E3 ligases.” In addition to summarizing successes against targets such as BCL6 and XIAP, it includes a list of 113 tumor-specific E3 ligases and another list of 52 E3 ligases that are overexpressed in certain tumors.

The “impact of fragment-based drug design on PROTAC degrader discovery” is also the subject of a review in Trends in Analytical Chemistry by Xiaoguang Lei and colleagues at Shenzhen Bay Laboratory. Here, the focus is more on using FBLD to discover ligands for target proteins rather than for E3s. For example, the researchers describe how the fragment-derived drug navitoclax was used as a starting point for developing DT2216, a clinical-stage BCL-x_L degrader.

As Vicki Nienaber noted more than a decade ago, fragment-based drug discovery is ideally suited for targeting the central nervous system, particularly when combined with a ruthless focus on molecular properties. This is the topic of an open-access review in Front. Chem. by Michael Kassiou and collaborators at University of Sydney, CSIRO, and Vast Bioscience. After a brief summary of FBLD the researchers present case studies published since 2015, the last year this topic was reviewed. We’ve covered quite a few on Practical Fragments, including apoE4, Notum, and PDE10A.

Other

We mentioned allostery above, and in an open-access FEBS Open Bio. article Andrea Bellelli and collaborators at Sapienza University of Rome and elsewhere ask “is allostery a fuzzy concept?” Digging into half-century old publications from Jacques Monod, the researchers conclude that the concept was “born with an original sin: two definitions.” Indeed, the first mathematical model did not even apply to monomeric proteins. Most readers of this blog will probably be satisfied with the notion that an allosteric ligand is one that binds outside of an active site, but it is worth remembering that “allostery is an umbrella that covers more than a single reaction mechanism and cannot be defined by a single mathematical expression.”

Structure comes up frequently on Practical Fragments, but James Fraser (UCSF) and Mark Murcko (Disruptive Biomedical) remind us in Cell that “structure is beauty, but not always truth.” We’ve written multiple posts about getting misled by crystal structures, and in this brief commentary the authors provide “four harsh truths: a structure is a model, not experimental reality; representing wiggling and jiggling is hard; in vitro can be deceiving; drugs mingle with many different receptors.” They conclude that “truth is a molecule that transforms the practice of medicine.”

Stumbling towards truth is a little easier with the help of a good chemical probe, and in Nucleic Acids Res. Paul Workman (Institute of Cancer Research) and collaborators provide an updated description of The Chemical Probes Portal. This free community resource now contains 803 probes against 570 targets, including 28 covalent ligands and 51 degraders. Moreover, 332 of the probes have structurally related negative controls. Importantly, the Portal also includes 258 “Unsuitable” compounds that are insufficiently potent or selective to serve as chemical probes. Checking this list can save you valuable time when reading papers about unfamiliar targets.

Finally, a brief open-access interview with Nobel Laureate Katalin Karikó in Issues in Science and Technology is an inspiring reminder that “you learn more from failure,” and that the pleasure of doing science can be its own reward.

Thanks for reading. Good luck in 2025, and remember that the sun is always out there, even when you can neither feel nor see it.

23 December 2024

Covalent fragments vs BFL1: a selective chemical probe

Last week we highlighted the construction of a covalent fragment library at AstraZeneca. The first fruits of this library have recently been published as a pair of papers.

The protein BFL1 (or Bfl-1) is a member of the BCL2 family and blocks apoptosis by binding to pro-apoptotic proteins such as BIM, BID, and Noxa. Blocking these types of protein-protein interactions should increase apoptosis in cancer cells. Indeed, BCL2 itself is the target of the approved fragment-derived drug venetoclax, which took heroic measures to discover.

Finding noncovalent inhibitors of BFL1 was also expected to be difficult, but fortunately the protein contains a unique cysteine (C55) in the protein-protein binding site, facilitating both covalent attachment and selectivity. As we mentioned last week, the protein was screened against the emerging AstraZeneca covalent library, resulting in the discovery of several hits, including compound 8. Its optimization is described by Simon Lucas and colleagues in the first J. Med. Chem. paper.

Compound 8 showed promising k_inact/K_I for BFL1 as well as micromolar inhibition in a TR-FRET assay using a BIM-derived peptide. Crystallography was initially unsuccessful, but synthesis of close analogs led to compound 13, which is slightly more potent and could be co-crystallized with the protein. The structure confirmed covalent binding and revealed that one of the phenyl rings binds in a lipophilic pocket created by movement of a phenylalanine side chain.

To explore more regions of the protein-protein binding site, the researchers performed a high-concentration crystallographic screen with 384 non-covalent fragments. This yielded nine hits, four of which made hydrogen bonds with a glutamic acid side chain (E78) that had previously been targeted by others. To try to engage with this residue, the researchers modeled and synthesized a series of amine-containing molecules. Happily, one of the highest priority compounds gave a ten-fold boost in potency. Adding a methyl to the benzylic position and tweaking substituents around one of the phenyl rings ultimately led to compound (R,R,S)-26, the best molecule in this paper.

Because C55 is unique to BFL1, the hope was that compounds would be selective against other BCL2 family members, and indeed (R,R,S)-26 showed no activity against BCL-xL, BCL2, or MCL1. In vitro ADME parameters were encouraging, and the molecule also showed moderate bioavailability in mice. (R,R,S)-26 showed some cellular activity, though a mass-spectrometry assay showed only ~50% target engagement in cells after treatment at 10 µM for five hours.

The second J. Med. Chem. paper, by Adeline Palisse and colleagues, describes further optimization. Structure-based design was supported by “multiple X-ray cocrystal structures,” and as in the first paper the researchers consistently measured the half-life of new molecules against the cellularly abundant thiol glutathione to ensure they were not simply optimizing non-specific reactivity. The paper is an excellent blow-by-blow account of some of the challenges of medicinal chemistry: improving activity at the expense of stability or permeability, for example. The most potent compound has k_inact/K_I = 120,000 M⁻¹s⁻¹, but the hepatocyte stability data suggested it would be rapidly cleared.

In the end, compound 20 was chosen as the best overall molecule, with a k_inact/K_I comparable to that of the approved drug sotorasib. As with (R,R,S)-26, it showed no activity against BCL-xL, BCL2, or MCL1, and it was also clean against a panel of 48 kinases and fairly clean against a panel of other potential off-target proteins.

Among the several BCL2 family members, the protein MCL1 can also bind to BIM, thereby blunting the effects of inhibiting BFL1. Thus, the researchers performed cell assays in the presence of the MCL1 inhibitor AZD5991, whose discovery we wrote about here. In the presence of 0.5 µM AZD5991, compound 20 had an EC₅₀ = 350 nM in a cell viability assay and also activated caspase 3, as expected in apoptosis. A similar effect is also seen in combination with venetoclax.

Pharmacokinetic studies in mice revealed that compound 20 is 55% orally bioavailable, and this combined with the other properties suggests this molecule will be a useful chemical probe for exploring the biology of BFL1.

02 December 2024

Mapping protein conformations with fragments

Proteins can be remarkably dynamic, and, as we noted recently, different conformational states can reveal different pockets for small molecule ligands. But how can one survey and categorize all the possibilities? In a recent J. Chem. Inf. Model. paper, Doeke Hekstra and colleagues at Harvard University present a new tool for doing so.

High-throughput crystallographic fragment screens are becoming faster and more widely accessible, and the researchers wondered whether the information from these screens could be used to map protein conformational landscapes. To do so, they built a Python program called COLAV, short for COnformational LAndscape Visualization. This open-source tool can compile data from hundreds of protein coordinate files and then, for each protein, calculate the dihedral angles between backbone atoms, the pairwise distances between the alpha-carbon atoms, and the strain.

To a first approximation, dihedral angles capture local movements, while distances between alpha-carbons capture global movements, such as the distance between the N-terminus and C-terminus. Strain measurements are also local but can reveal particularly important features such as hinge movements. Also, while dihedral angles and pairwise distances can be calculated for single proteins, strain measurements are calculated after first aligning multiple structures.
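
For readers who like to see the geometry, here is a minimal sketch of the two single-structure quantities described above. It is not COLAV's code: the function names are hypothetical, coordinates are assumed to be plain (x, y, z) tuples in ångströms already extracted from a coordinate file, and strain is left out because it requires aligning multiple structures first.

```python
import math
from itertools import combinations

# Hypothetical helpers for illustration only; not taken from COLAV.

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_degrees(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) about the p1-p2 bond.

    For a backbone phi angle the four points would be C(i-1), N, CA, C;
    for psi they would be N, CA, C, N(i+1).
    """
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def ca_pairwise_distances(ca_coords):
    """Flat list of distances between every pair of alpha-carbon positions."""
    return [math.dist(a, b) for a, b in combinations(ca_coords, 2)]

# Toy usage with made-up coordinates
print(dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
print(ca_pairwise_distances([(0, 0, 0), (3, 4, 0), (0, 0, 1)]))
```

Each structure then becomes one long vector of numbers (all its dihedrals, or all its pairwise distances), which is the form needed for comparing structures across a set.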

Having calculated these three parameters for individual protein structures, COLAV can compare them across the selected set of structures using principal component analysis (PCA). These comparisons can reveal clusters with similar dihedral angles, pairwise distances, or strain.
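
The PCA step can be illustrated with a toy example. The sketch below is a generic, dependency-free illustration rather than the authors' implementation: it finds only the first principal component by power iteration, the function name and the feature values are invented, and a real analysis would more likely use a library PCA routine and keep several components.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (component, scores) for the leading principal component.

    rows: one feature vector per structure (e.g. its pairwise distances).
    Hypothetical helper for illustration; uses simple power iteration.
    """
    n, m = len(rows), len(rows[0])
    # Centre each feature (column) on its mean across structures
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    centered = [[r[j] - means[j] for j in range(m)] for r in rows]

    # Non-uniform start vector; power iteration can stall if the start
    # happens to be orthogonal to the leading component
    v = [float(j + 1) for j in range(m)]
    norm = math.sqrt(sum(c * c for c in v))
    v = [c / norm for c in v]

    for _ in range(iterations):
        # Apply X^T X to v without forming the covariance matrix
        proj = [sum(x * w for x, w in zip(row, v)) for row in centered]
        new_v = [sum(centered[i][j] * proj[i] for i in range(n))
                 for j in range(m)]
        norm = math.sqrt(sum(c * c for c in new_v))
        if norm == 0.0:
            break  # no variance along the current direction
        v = [c / norm for c in new_v]

    scores = [sum(x * w for x, w in zip(row, v)) for row in centered]
    return v, scores

# Made-up features for four structures: two "closed-like", two "open-like"
features = [[1.0, 2.0], [1.1, 2.1], [5.0, 6.0], [5.1, 6.1]]
component, scores = first_principal_component(features)
print(scores)
```

Structures whose scores fall close together along the leading components are the ones that would appear as a cluster in a plot. The overall sign of a principal component is arbitrary, so only the relative positions of the scores carry meaning.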

The researchers provide two case studies. The first is the metabolic disease target PTP1B, which we recently wrote about here. This enzyme has been pursued intensively for decades, so the researchers were able to draw on 163 individual protein structures deposited in the Protein Data Bank (PDB) as well as 187 structures from a high-throughput crystallographic fragment screen. PTP1B contains two flexible loops, each of which adopts one of two conformations, and COLAV successfully segregated all 350 structures into four clusters. Importantly, these four clusters were found whether the structures were pulled from the PDB (representing experiments conducted across multiple labs and years) or from the fragment screen, suggesting that a single crystallographic fragment screen can identify most or all of the conformational states available to a protein. This is particularly impressive given that most of the fragments bound in allosteric sites while most of the ligands found in the PDB bound in the active site.

Next, the researchers turned to the main protease (MPro) of SARS-CoV-2, the subject of intense and successful drug discovery efforts. They used 656 structures from the PDB and 631 structures from high-throughput crystallographic screens to perform COLAV analyses. Unlike PTP1B, discrete conformational clusters were not observed; rather, a continuous band was seen, suggesting that the protein can assume myriad conformations. Here too, though, the fragment screens were able to sample most of the conformations observed in the PDB.

The fact that a single high-throughput crystallographic screen can capture the conformations seen in hundreds of hard-won discrete protein-ligand crystal structures is encouraging, though of course the paper only describes two case studies. Also, as the researchers note, any structure that cannot be crystallized is not sampled. Since COLAV is free to use, it will be fun to see it applied to other proteins.

11 November 2024

Poll results: fragment finding methods and structural information needed for fragment-to-lead efforts

Our most recent poll asked about fragment finding methods. The poll ran from September 21 through November 8 and received 135 responses from 20 countries. Two thirds of these were from the US, about 12% were from the UK, 4% from Germany, 3% from the Netherlands, and 2% from Australia.

The first question asked how much structural information you need to begin optimizing a fragment. In contrast to 2017, when we first asked this question, crystallography has significantly increased at the expense of the other choices.

I confess to being surprised, as I expected that by now people would be more comfortable beginning optimization in the absence of structural information, an approach that has been quite successful as discussed in a 2019 open-access Cell Chemical Biology review by Ben Davis, Wolfgang Jahnke, and me. Perhaps the increasing speed and accessibility of new methods has so lowered the bar to getting crystal structures that people have the luxury of waiting. Of course, with an online poll there is always the risk that many respondents from the same organization may skew the results.

The second question asked which methods you use to find and validate fragments. This is the fifth time we’ve run this poll, starting in 2011. As with our first question, X-ray crystallography came out on top, with nearly 80% of respondents choosing it. This was followed by SPR, at 67%, and thermal shift and ligand-detected NMR, each around 55%.

Functional screening was used by nearly half of respondents, with computational methods, protein-detected NMR, and literature starting points used by around a third. Mass spectrometry and ITC were each used by slightly more than a quarter of respondents.

For the first time we asked about cryo-EM, and nearly 20% of respondents reported using this technique.

MST and affinity-based methods each came in at 13%, with just 4% of respondents using BLI, and 5 individual respondents using other methods. I’d be curious to know what these are.

The average respondent reported using just over 5 different techniques, which is down slightly from 6 in 2019 but up from 4 in 2016. Using multiple orthogonal methods is clearly well established as best practice, even if the precise number varies.

How do these results compare with your own practices?