21 July 2025

How can we house our crystallographic data?

Three years ago we highlighted a growing debate about how and where to house crystallographic fragment data. With the recent surge in high-throughput crystallography, issues including access, accuracy, and capacity have only become more urgent. An open-access perspective in Nat. Comm. by Manfred Weiss (Helmholtz-Zentrum Berlin) and multiple coauthors, including yours truly, calls on the scientific community to make some difficult decisions. Indeed, a session at the 75th annual meeting of the American Crystallographic Association going on today is devoted to the topic.
 
High-throughput crystallography can involve soaking more than 1000 crystals with fragments, sometimes yielding hundreds of protein-ligand structures. The paper tabulates a dozen synchrotrons around the world with current or planned high-throughput capabilities. We’ve written recently about the XChem facility at the Diamond Light Source, which is currently running about 80 fragment screens per year. Assuming similar productivity at the other synchrotrons, we might soon see 1000 fragment campaigns per year worldwide. If each of these involves 1000 crystals and we get 10% hit rates, that could mean 100,000 new fragment structures annually.
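
The back-of-the-envelope estimate above can be written out as a minimal sketch. The function and variable names are illustrative rather than taken from the paper, and the inputs are the round numbers quoted in this post, not measured values.

```python
def estimate_annual_structures(campaigns_per_year, crystals_per_campaign, hit_rate):
    """Rough count of new protein-fragment structures per year."""
    return campaigns_per_year * crystals_per_campaign * hit_rate


# Assumption: every facility matches XChem's current throughput.
synchrotrons = 12            # "a dozen" facilities tabulated in the paper
screens_per_facility = 80    # XChem's approximate screens per year

campaigns = synchrotrons * screens_per_facility   # 960, rounded to ~1000 in the text
structures = estimate_annual_structures(1000, 1000, 0.10)

pdb_releases_per_year = 10_000   # figure quoted in this post for comparison
ratio = structures / pdb_releases_per_year

print(campaigns, round(structures), round(ratio))   # 960 100000 10
```

Under these assumptions, fragment screens alone would produce roughly ten times the number of structures quoted for the PDB each year.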
 
That is a big number. For reference, 10,000 new crystal structures are currently being released by the Protein Data Bank (PDB) each year. (Stephen Burley, Director of the RCSB PDB, is one of the authors of the perspective.)
 
The problem is that, as we discussed in the 2022 blog post, most fragment structures from high-throughput screens are not refined to the level required for the PDB, a process that typically takes the researcher a day or two and a PDB biocurator up to 3 hours. Moreover, fragments are often identified using PanDDA (Pan-Dataset Density Analysis, which we wrote about here), a process that makes use of the many unbound structures obtained in a dataset. Ideally, these datasets should also be made available.
 
The challenge is balancing practicality with FAIR (Findable, Accessible, Interoperable, and Reusable) principles. The paper outlines four non-exclusive options. Very briefly, these are:
 
Option One: Fully refine and deposit all protein-fragment structures just as with other structures.
 
Option Two: Partially refine structures, and possibly flag or even segregate them from other structures in the PDB.
 
Option Three: Rather than treating each protein-ligand structure independently, treat each high-throughput screen as a single experiment, and archive all of the data in its entirety, including unbound structures. These data could be housed in the PDB or elsewhere.
 
Option Four: A hybrid approach, where fully refined structures would be deposited in the PDB and the rest of the data would be stored in a separate branch of the PDB or elsewhere entirely.
 
There are pros and cons for each option. At the extremes, the first option puts a tremendous burden on experimentalists and the PDB, and potentially valuable information regarding unbound structures is lost, while option three requires setting up new repositories to store vast quantities of data.
 
The paper intentionally avoids making a specific recommendation and instead calls for discussion within the scientific community. Personally, I favor some sort of hybrid approach such as option four. As the paper notes, no one could have foreseen AlphaFold2 when the PDB was launched in 1971. Over the next decade researchers around the world are likely to generate hundreds of thousands of protein-fragment structures. I don’t pretend to know what the artificial intelligence tools of the future will be able to make of such data, but I hope they will have access.
 
What do you think?

14 July 2025

The importance of specific reactivity for covalent drugs

As we noted in our thousandth post, covalent drugs are becoming increasingly popular, particularly for tackling tough targets. But finding and optimizing covalent ligands entails unique challenges, as discussed in a new paper by Bharath Srinivasan at Cancer Research UK. (Derek Lowe also recently blogged about this.)
 
Interactions between noncovalent drugs and their targets are characterized by dissociation or inhibition constants KD or KI, where lower numbers mean stronger binding. In contrast, irreversible covalent drugs are characterized by a ratio we discussed last year, kinact/KI, where the rate constant kinact represents the covalent modification step. (Side note: although the term kinact is commonly used, covalent modulators can also be activators; my company Frontier Medicines recently announced a covalent activator of p53Y220C. Perhaps kcov would be more general?)
 
To explain kinact/KI, Srinivasan draws a useful analogy to enzymes, which are mechanistically described by the specificity constant kcat/Km in Michaelis-Menten kinetics. In both cases, higher numbers mean more rapid modification or greater catalytic efficiency. A study of several thousand enzymes found the median kcat/Km to be around 100,000 M⁻¹s⁻¹, with 60% between 1,000 and 1,000,000 M⁻¹s⁻¹. Enzymes operate by stabilizing the transition state of the reaction, which means that the affinities for the substrates do not necessarily have to be high, particularly if the structures of the substrates differ from the transition states.
 
Just as catalytic efficiency for enzymes can be increased either by increasing kcat or lowering Km, the inactivation efficiency of covalent drugs can be optimized either by increasing kinact or by decreasing KI. Historically, drug hunters have focused on the latter; we previously described the discovery of TAK-020 in which the affinity of a fragment for the kinase BTK was first optimized and then a covalent warhead was appended.
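
To make the two routes to a better kinact/KI concrete, here is a minimal sketch. It assumes the standard two-step scheme for irreversible inhibition (reversible binding followed by covalent bond formation), for which the observed inactivation rate is k_obs = kinact[I]/(KI + [I]). The two compounds and their values are hypothetical, chosen only so that both share the same kinact/KI; they are not taken from Srinivasan's paper.

```python
def k_obs(inhibitor_conc, k_inact, K_I):
    """Observed first-order inactivation rate (s^-1) for the two-step model
    E + I <=> E.I -> E-I, i.e. k_obs = k_inact * [I] / (K_I + [I])."""
    return k_inact * inhibitor_conc / (K_I + inhibitor_conc)


def efficiency(k_inact, K_I):
    """Inactivation efficiency k_inact/K_I in M^-1 s^-1."""
    return k_inact / K_I


# Hypothetical compounds (illustrative numbers only).
tight_binder = {"k_inact": 0.001, "K_I": 1e-7}   # strong affinity, slow bond formation
fast_reactor = {"k_inact": 0.1, "K_I": 1e-5}     # weak affinity, fast bond formation

low_conc = 1e-9    # 1 nM, far below either K_I
high_conc = 1e-4   # 100 uM, above either K_I

for name, cpd in [("tight binder", tight_binder), ("fast reactor", fast_reactor)]:
    print(
        name,
        f"kinact/KI = {efficiency(**cpd):.3g} M^-1 s^-1,",
        f"k_obs at 1 nM = {k_obs(low_conc, **cpd):.3g} s^-1,",
        f"k_obs at 100 uM = {k_obs(high_conc, **cpd):.3g} s^-1",
    )
```

In this model, both compounds inactivate the target at nearly the same rate when the inhibitor concentration is well below KI, because k_obs then reduces to (kinact/KI)[I]. They diverge only near saturation, where k_obs approaches kinact.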
 
However, focusing on kinact can also be productive, and Srinivasan argues this is particularly true for challenging targets with shallow pockets where noncovalent affinity is difficult to obtain. As a case in point he discusses covalent KRASG12C inhibitors such as sotorasib, which I wrote about here. Just as residues within enzyme active sites stabilize the transition state of a reaction, a lysine residue in KRAS forms a hydrogen bond to the carbonyl of the acrylamide electrophile, thereby increasing its reactivity for the protein.
 
Srinivasan emphasizes that kinact is specific for each particular protein-ligand pair as well as distinct from intrinsic or chemical reactivity. This is a critical point. Newcomers to the field often worry that a high kinact value means a molecule is generically reactive and thus likely to react with many proteins, but this is not necessarily true. For example, sotorasib’s favorable kinact/KI is driven by a high kinact for KRASG12C but it is still quite specific. Indeed, Srinivasan points out that even a chemically reactive molecule may not react with a protein if the geometry isn’t right.
 
A nice way of assessing specific reactivity (which unfortunately is not cited) is the reactivity enhancement factor, or REF, as defined by Alan Armstrong, David Mann, and colleagues at Imperial College London in an (open-access) 2020 ChemBioChem paper. Akin to the kcat/kuncat ratio used to assess rate enhancement for enzymes, REF is defined as the rate of reaction for a specific protein divided by the rate of reaction for glutathione, an abundant cellular thiol. The higher the REF score, the higher the specific reactivity for the protein of interest.
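
As a minimal sketch of the REF calculation described above: the function name and example numbers are hypothetical, and the sketch assumes both rates are measured under comparable conditions and expressed in the same units. The ChemBioChem paper should be consulted for the exact experimental definition.

```python
def reactivity_enhancement_factor(rate_with_protein, rate_with_glutathione):
    """REF: rate of reaction with the target protein divided by the rate of
    reaction with glutathione. Both rates must share the same units."""
    if rate_with_glutathione <= 0:
        raise ValueError("glutathione rate must be positive")
    return rate_with_protein / rate_with_glutathione


# Hypothetical second-order rate constants in M^-1 s^-1 (illustrative only).
ref = reactivity_enhancement_factor(rate_with_protein=5000.0, rate_with_glutathione=0.5)
print(ref)   # 10000.0
```

A molecule that reacts sluggishly with glutathione but rapidly with its target gets a high REF, which is the specific reactivity one wants.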
 
Srinivasan also considers tradeoffs between kinact and KI as kinact/KI approaches the rate of diffusion, suggesting that above 1,000,000 M⁻¹s⁻¹ or so any further improvement in affinity will come at the cost of specific reactivity. While this is theoretically interesting, from a practical perspective you can have a perfectly fine drug with a kinact/KI of just 10,000 M⁻¹s⁻¹.
 
Covalent drugs will only become more important as we pursue increasingly hard targets that have resisted previous efforts. For these targets in particular, focusing on specific reactivity will be rewarding.

07 July 2025

Fragment events in 2025 and 2026

For better or for worse, 2025 is halfway over. There are still some good conferences coming up, and 2026 is also starting to take shape.

September 21-24: FBLD 2025 will be held in the original Cambridge (UK), where it was supposed to be held in 2020. This will mark the ninth in an illustrious series of conferences organized by scientists for scientists. You can read impressions of FBLD 2024, FBLD 2018, FBLD 2016, FBLD 2014, FBLD 2012, FBLD 2010, and FBLD 2009.
 
September 22-25: You'll need to make a tough choice: FBLD 2025 or CHI’s Twenty-Third Annual Discovery on Target in Boston. As the name implies this event is more target-focused than chemistry-focused, but there are always plenty of FBDD-related talks. You can read my impressions of the 2024 meeting, the 2023 meeting, the 2022 meeting, the 2021 meeting, the 2020 virtual meeting, the 2019 meeting, and the 2018 meeting.
 
November 11-13: CHI holds its second Drug Discovery Chemistry Europe in beautiful Barcelona. This will include tracks on lead generation, protein-protein interactions, degraders and glues, and machine learning, with multiple fragment talks throughout. 

2026
February 17-19: The Twelfth NovAliX Conference will be held for the first time in San Diego! (Please note the date and location change.) You can read my impressions of the 2018 Boston event here, the 2017 Strasbourg event here, and Teddy's impressions of the 2013 event here, here, and here.
 
April 13-16: CHI’s Fragment-Based Drug Discovery turns 21, old enough to legally drink in the US! The longest-running annual fragment event returns as always to San Diego. This is part of the larger Drug Discovery Chemistry meeting. You can read impressions of the 2025 meeting; the 2024 meeting; the 2023 meeting; the 2022 meeting; the 2021 virtual meeting; the 2020 virtual meeting; the 2019 meeting; the 2018 meeting; the 2017 meeting; the 2016 meeting; the 2015 meeting here, here, and here; the 2014 meeting here and here; the 2013 meeting here and here; the 2012 meeting; the 2011 meeting; and the 2010 meeting.

September 14-16: The RSC-BMCS Tenth Fragment-based Drug Discovery Meeting will be held in Cambridge, UK. You can read my impressions of the 2024 meeting, the 2013 meeting, and the 2009 meeting.
 
Know of anything else? Please leave a comment or drop me a note.