03 October 2022

Metal-binding fragments vs glutaminyl cyclases

Metal-binding fragments have a long history in FBLD; the first mention on Practical Fragments was back in 2010. The idea is to use the strong interaction between a fragment and a protein-bound metal as an affinity anchor for further optimization. The latest example, by Jie-Young Song, Soosung Kang and collaborators at Korea Institute of Radiological & Medical Sciences, Ewha Womans University, and elsewhere, was published in ACS Med. Chem. Lett.
Glutaminyl cyclases such as glutaminyl-peptide cyclotransferase (QC) and glutaminyl-peptide cyclotransferase-like protein (isoQC) convert N-terminal glutamine or glutamate residues on proteins to pyroglutamates. This modification tends to stabilize proteins, and it has been implicated in several diseases. In particular, modification of CD47 by isoQC seems to be important for the ability of cancer cells to evade the immune system.
QC and isoQC are closely related enzymes with a zinc-containing active site. Capitalizing on this, the researchers tested a library of 36 potential metal-binding fragments in a functional assay against QC. Most of the compounds tested were inactive, though 11 had IC50 values less than 0.8 mM. A few of these, including compound ab, were used to generate a second library of just half a dozen larger fragments, and compound 9 turned out to be quite potent.

The researchers recognized that compound 9 has two potential zinc-binding moieties, and docking suggested the newly added amino-thiadiazole was likely responsible for the increased activity. Structure-based design ultimately led to compound 22b, with low nanomolar activity against QC and isoQC. The molecule did not seem to be generally cytotoxic, but it did increase phagocytosis of cancer cells in vitro, consistent with an effect on the “don’t eat me” function of CD47.
Unfortunately, no information is provided on the selectivity of compound 22b against other zinc-dependent enzymes. Moreover, unlike an earlier example of starting with metallophilic fragments, no ADME data are provided. But whether or not this particular series advances, it is nice to see metallophilic fragments being explored.

26 September 2022

FBLD meets DEL part two: let there be light

DNA-encoded libraries (DEL) are collections of peptides or small molecules attached to DNA tags. In a typical application, libraries are mixed with a protein of interest, non-binders are washed away and those that remain are identified by using PCR to amplify the DNA tags. Two years ago we highlighted an article in which previously identified fragments were merged with molecules identified from DEL. However, because fragments typically have low affinities, screening fragments directly by DEL would seem to be difficult. In a new open-access RSC Medicinal Chemistry paper, Rod Hubbard and collaborators at Vernalis and HitGen describe how to do so. (Rod presented some of this work in April at the CHI DDC meeting.)
To identify weak binders, the researchers turned to photoactivatable fragments that – in the presence of UV light – would bind irreversibly to a nearby protein. Specifically, they used the diazirine tag, which has proven useful both in cell-based screening and in screens of isolated proteins. Here, the researchers generated two libraries of fragments bound to DNA, with each library member also containing a diazirine tag. The libraries were built using different chemistries and consisted of 15,804 and 23,905 members, small by DEL standards (which often range in the millions) but large by fragment standards.
The PAC-FragmentDEL libraries were incubated against two proteins: the kinase PAK4 and the bacterial enzyme 2-epimerase. Each protein was incubated with both libraries for one hour at room temperature and then treated with ultraviolet light for 10 minutes on ice. Next, the proteins were captured on an affinity resin and washed extensively under denaturing conditions to remove any non-covalently bound library members. Finally, the DNA was amplified by PCR and quantified; any library members that bind to the protein stand out over background.
Of course, there is plenty of opportunity for non-specific binding, so the researchers incorporated several controls, such as omitting the UV-crosslinking step or protein. Moreover, they repeated the experiment in the presence of known high-affinity binders and looked for fragments that were competed.
In the case of PAK4, the researchers identified 301 fragments that could be competed. Eleven of these were further examined (without the DNA tags), all of which demonstrated binding by ligand-observed NMR, and ten of them yielded crystal structures bound to the protein. The examples shown in the paper occupy the hinge-binding site, which the researchers acknowledge is a low bar for fragment screens.
The second target, 2-epimerase, has a more challenging active site, and indeed the hit rate was lower: just 21 competitive fragments were found. But all 9 of those selected for further testing confirmed by ligand-observed NMR, and 5 of them yielded crystal structures.
This paper demonstrates that DEL can be used to identify fragment hits with a fairly low false-positive rate. But do we need yet another fragment-finding method? The researchers point out that PAC-FragmentDEL is fast, with screening and sequencing analysis taking just a few weeks. This also means that fragment libraries can be much larger than for most techniques. Protein requirements are also modest, at around 250 pmol (12.5 µg for a 50 kDa protein). They also note that – because of the DNA tag – less intrinsically soluble fragments can be screened, increasing chemical diversity, though one might counter that this could lead to problems down the road.
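As a quick sanity check on the protein requirement, the conversion from amount to mass can be sketched in a few lines of Python. The function name is made up for illustration; the only inputs are the 250 pmol and 50 kDa figures quoted above.

```python
def protein_mass_ug(amount_pmol: float, mol_weight_kda: float) -> float:
    """Convert an amount of protein in pmol to a mass in micrograms.

    1 pmol of a 1 kDa protein weighs 1 ng (1e-12 mol * 1e3 g/mol = 1e-9 g),
    so pmol * kDa gives nanograms; dividing by 1000 gives micrograms.
    """
    nanograms = amount_pmol * mol_weight_kda
    return nanograms / 1000


# Figures quoted in the post: 250 pmol of a 50 kDa protein.
mass_ug = protein_mass_ug(250, 50)
print(mass_ug)  # 12.5 (micrograms)
```

In other words, the screen needs micrograms rather than milligrams of protein, which is what makes the requirement modest.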
On the downside, it is not clear whether affinity information can be obtained from the primary screen. Also, the need for a competitive tool molecule could limit choice of targets, as some of the most interesting targets lack any chemical probes. Still, as the researchers note, the competitor could be a peptide or protein, and in a pinch the site of interest could be mutated.
In summary, this looks to be an interesting approach, and I look forward to seeing more applications.

19 September 2022

Crystallography first, then virtual screening: application to PKA

Fragment-based screening is often funnel-shaped: a virtual screen might identify dozens or hundreds of potential hits that are tested in various assays, eventually leading to a few chosen for crystallography. But a paper we highlighted back in 2016 argued that many assays miss genuine hits, and crystallography should be moved to the front of the line. A paper just published in J. Med. Chem. by Serghei Glinca and collaborators at CrystalsFirst, BioSolveIT, Enamine, and elsewhere provides a proof of concept.
The researchers started with a set of 19 crystal structures of fragments bound to Protein Kinase A (PKA) from a campaign we wrote about in 2020. Four diverse fragments were chosen for further study. Importantly, the affinities of these fragments had not been measured; selection was based on the diversity of chemical structures and binding modes.
Next, the crystal structures of the four fragments were used as starting points for four virtual screens using 208,293 Enamine REAL Space fragments (see here for more on these). These were docked using BioSolveIT’s FlexX algorithm, and 50 from each of the four screens were then computationally grown. Just over half a million of these elaborated molecules were then docked, and after clustering, triaging, and visual selection, 106 were chosen for synthesis, of which 93 were delivered and 75 were soluble at 200 mM in DMSO.
The soluble fragments were tested in a functional assay, and 30 of these showed inhibition. Most were weak (double digit micromolar or higher) but fragment EN093 (derived from Frag2) was a low micromolar inhibitor. All of the initial fragments were very weak inhibitors, with at best millimolar activity.

The 75 soluble compounds were also tested in a thermal shift assay (each at 2.5 mM), revealing 29 hits, of which 19 were also active in the functional assay. These included EN093. Interestingly, only one of the initial fragments (not Frag2) showed any activity in the thermal shift assay.
To assess how well the docking performed, 13 of the most active compounds were tested in co-crystallization experiments, yielding 6 high-quality bound structures. These confirmed the virtual screens, with the rmsd for EN093 being 0.74 Å.
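For readers who have not computed one, the rmsd quoted here is the root-mean-square distance between matched atoms in the docked and crystallographic poses. Below is a minimal sketch, assuming both poses are already in the same protein frame (so no superposition is performed) and that atoms are listed in the same order. The coordinates are invented for illustration and are not from the paper.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered lists of (x, y, z) atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical heavy-atom coordinates in angstroms, for illustration only.
docked = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
crystal = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (1.5, 1.5, 1.0)]

pose_rmsd = rmsd(docked, crystal)  # every atom is shifted by 1 angstrom along z
```

A value below 1 Å, as reported for EN093, means the predicted and observed poses are nearly superimposable; a common rule of thumb treats anything under about 2 Å as a successful pose prediction.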
Impressively, the whole study, including compound synthesis and crystallographic validation, took just 9 weeks.
This “Crystal Structure First” approach is conceptually similar to the V-SYNTHES approach we discussed earlier this year, with the difference being that while V-SYNTHES is entirely virtual, Crystal Structure First starts with an actual structure. As the researchers state, “using crystallographically validated fragments and bound ligands for template-based docking can be thought of as introducing a ‘magnet’ to help find the needle in an ever-growing haystack in a more targeted way.”
This is a nice case study, and intuitively it makes sense to start with an experimentally determined structure. Indeed, the increasing number of publicly available fragment structures should be a boon for this approach. That said, it is interesting that most of the molecules made and tested are quite weak, and only two have ligand efficiencies equal to or greater than 0.3 kcal/mol per heavy atom. As we suggested earlier this year, crystallography may find ligands that are just too weak to be useful. Perhaps adding a functional screen before computational elaboration could lead to even more and better binders.

12 September 2022

Growing fragments in silico with FastGrow

Growing fragments is probably the most common approach to improving affinity, and it is immeasurably faster to do this virtually than experimentally. But as anyone who has ever tried can attest, this is often easier said than done. In a new open-access J. Comput. Aided Mol. Des. paper, Matthias Rarey and collaborators at Universität Hamburg, Servier, and BioSolveIT describe a free tool to help.
The application is called FastGrow, and it can be accessed through this web server or the SeeSAR 3D software package. It relies on the “Ray Volume Matrix (RVM) shape descriptor,” which simplifies chemical fragments and protein binding pockets into three-dimensional shapes. This allows extremely rapid assessments of whether a given fragment can fit into a binding pocket. A scoring function called JAMDA assesses interactions beyond simple shapes, such as hydrogen bonds and hydrophobic contacts, and also allows fragments to shift slightly to optimize complementarity with the protein.
One nice feature of FastGrow is that users can input fragments into multiple binding sites with different amino acid conformations, allowing for protein flexibility. You can also specify an important interaction, such as a critical hydrogen-bond, that you prefer to maintain.
To validate the approach, the researchers turned to the database PDBbind and looked for examples in which two ligands with identical cores but different substituents bound to the same protein. They chopped off the substituents from the first ligand and used the resulting fragment as a starting point to try to grow the second ligand. Running 425 of these took just 3 and a half hours and successfully recapitulated the binding mode 71% of the time. This was higher than the popular program DOCK (version 6.9), which seemed to be a pleasant surprise. They attribute the difference to a higher clash tolerance for FastGrow in the initial stages.
For additional validation, the researchers turned to real-world examples of fragment-growing for the kinases DYRK1A/B, which we highlighted last year (here and here). Here too FastGrow outperformed DOCK and was also about five-fold faster when using JAMDA (and 600-times faster without JAMDA, though at some cost in performance).
FastGrow looks to be a valuable tool, and indeed the researchers note that it is currently in use at Servier. There is a lot more detail in the paper and supplementary materials, including the full code for the FastGrow web server and all the underlying data. It would be interesting to compare its performance to the V-SYNTHES approach we highlighted earlier this year.
If you have experience using FastGrow, please leave a comment!

05 September 2022

Is phenotypic fragment screening worthwhile?

Fragment-based drug discovery is almost always target-based. Indeed, not until the development of powerful biophysical techniques such as protein-labeled NMR did FBLD really begin in earnest. Phenotypic fragment screens against cells, tissues, or animals are uncommon. In an open-access Front. Pharmacol. paper, Chris Lipinski and Andrew Reaume (Melior Discovery) argue that they should be used more often.
The researchers analyzed all 184,139,678 compounds in the CAS registry with molecular weights between 100 and 999 Da. These were divided into 18 bins (100-149 Da, 150-199 Da, etc.). Next, they calculated the percentage of molecules within each bin with any biological data as evidenced by the “biological study” tag in SciFinder-n.
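The binning scheme is simple enough to sketch. The snippet below is only an illustration of 50 Da bins from 100 to 999 Da and of a per-bin percentage; the function names and the toy records are hypothetical, and nothing here reproduces the authors' actual SciFinder-n workflow.

```python
from collections import defaultdict


def mw_bin(mol_weight_da, low=100, high=999, width=50):
    """Return (start, end) of the bin holding a molecular weight, or None if out of range."""
    if not (low <= mol_weight_da < high + 1):
        return None
    start = low + int((mol_weight_da - low) // width) * width
    return (start, start + width - 1)


def percent_with_bio_data(records):
    """records: iterable of (mol_weight_da, has_bio_data). Returns {bin: percent}."""
    totals = defaultdict(int)
    tagged = defaultdict(int)
    for mol_weight, has_bio_data in records:
        bin_range = mw_bin(mol_weight)
        if bin_range is None:
            continue  # outside the 100-999 Da window
        totals[bin_range] += 1
        tagged[bin_range] += int(has_bio_data)
    return {b: 100 * tagged[b] / totals[b] for b in totals}


n_bins = (999 + 1 - 100) // 50  # number of 50 Da bins between 100 and 999 Da

# Toy records, invented for illustration: (molecular weight, has biological data?)
toy = [(180.2, True), (160.0, False), (202.0, False), (520.5, True), (1200.0, True)]
summary = percent_with_bio_data(toy)
```

With these made-up records the 150-199 Da bin comes out at 50% because one of its two members carries the tag, and the 1200 Da entry is ignored.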
In terms of raw numbers, fragments are well-represented, with the 250-299 Da bin containing close to 40 million molecules. However, only about 4% of these had any biological data. Molecules with molecular weights between 300 and 549 were abundant and also had considerably more biological data – up to roughly 50% of compounds in the 500-549 Da bin. In other words, people don’t seem to be screening lower molecular weight compounds in biological assays as often as they are screening larger molecules.
The assumption may be that small fragments are not biologically active, but the researchers revisit a classic In the Pipeline post in which Derek Lowe lists 56 drugs with molecular weights equal to or less than that of aspirin (180 Da). Most of these are old drugs, with all but three first reported in the chemical literature before 1980.
The researchers suggest that more effort should go into exploring the biology of smaller molecules, particularly those for which some activity is already reported. They also draw an interesting distinction between two uses of the word pleiotropic. People often say that a drug has pleiotropic effects if it acts on multiple targets; a classic example is imatinib, which hits several kinases in addition to the target BCR-ABL. However, the term pleiotropic originates in genetics and initially referred to one gene having multiple effects. Thus, a drug that acts on a single protein can have multiple effects, as in the case of the PDE5 inhibitor sildenafil.
As an example of a pleiotropic fragment, the researchers discuss MLR-1023, a fragment-sized molecule first discovered in a phenotypic screen at Pfizer in the 1970s. The molecule has shown promise in disease models ranging from atherosclerosis to myeloproliferative neoplasms and was taken into the clinic by Melior in 2014 as an anti-diabetic agent. All of these varied effects seem to stem from the ability of the compound to act as an activator of Lyn kinase. With just 15 non-hydrogen atoms and a molecular weight of 202 Da, MLR-1023 is comfortably within rule of three space. Despite its small size, the molecule is a potent activator of Lyn, with an EC50 around 50 nM, giving it a ligand efficiency of 0.66 kcal/mol per heavy atom.
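The ligand efficiency figure can be reproduced with the usual definition, LE = -RT ln(K) / N, where N is the number of non-hydrogen atoms. The sketch below assumes a temperature of 298.15 K and uses the EC50 in place of a true dissociation constant, which is an approximation; the function name is illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ligand_efficiency(potency_molar, heavy_atoms, temp_k=298.15):
    """Ligand efficiency in kcal/mol per heavy atom: -RT ln(K) / N."""
    delta_g = R_KCAL * temp_k * math.log(potency_molar)  # negative for K < 1 M
    return -delta_g / heavy_atoms


# MLR-1023 as described in the post: EC50 of about 50 nM, 15 non-hydrogen atoms.
le = ligand_efficiency(50e-9, 15)
print(round(le, 2))  # 0.66
```

For comparison, the same function gives roughly 0.3 kcal/mol per heavy atom for a 10 µM compound with 23 heavy atoms, which is the threshold many fragment programs treat as acceptable.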
Is MLR-1023 an outlier or an example of an underexplored pool of pharmacological riches? My suspicion is the former. It is rare to find fragments with EC50s < 1 µM, let alone < 100 nM. Moreover, I suspect that many proteins are so difficult to drug that a molecule will need to be well beyond fragment-space – and even rule-of-five space – to have an effect. The protein-protein interaction targeted by venetoclax (MW = 868 Da) immediately comes to mind.
That said, the idea that a large group of tiny molecules is underexploited is worth exploring. For some types of drugs perhaps we don’t need extreme potency: Mike Hann noted a decade ago that the EC50 values of approved drugs average 20-200 nM and cautioned against an “addiction to potency.” And because fragments are likely to have low affinities towards most proteins, they may even be more specific than larger drugs. It will be fun to discover how much room there really is at the bottom.

29 August 2022

Diverse function – not structure – in fragment libraries

Successful fragment-based lead discovery typically starts with a good library. But what is “good”? Given that most fragment libraries are small, diversity is generally prized. The idea is to cover as much chemical space as possible with the fewest molecules. When most chemists hear the word diversity they think of structural diversity; tetrahydrofuran looks quite different from pyridine, for example. Functionally though, both contain a hydrogen bond acceptor. In a paper recently published (open access) in J. Med. Chem., Charlotte Deane and collaborators at University of Oxford and Diamond Light Source argue that functional diversity is more important.
Frank von Delft and his XChem colleagues at the Diamond Light Source have been screening dozens of targets crystallographically, many of them using the DSI-poised library, designed to enable rapid elaboration of hits. (We described it here). For the present analysis, the researchers considered ten diverse proteins (maximum pairwise sequence identity of 27%) that had all been screened against 520 fragments. Of these, 225 bound to at least one target.
The researchers considered what types of interactions the bound fragments made with the protein at either the residue or atomic level. For example, a fragment might serve as a hydrogen bond acceptor to the hydroxyl group of a serine residue. These interaction fingerprints, or IFPs, were calculated and compared.
Interestingly, there was no correlation between fragments that made similar IFPs and their structural similarity. In other words, “structurally dissimilar compounds can exploit the same interactions.” Moreover, many different fragments made similar or identical interactions: “structurally diverse fragments can be described as functionally redundant.”
In fact, just 135 fragments could make all the interactions observed for the 225 fragments. Some made more novel interactions than others, with “promiscuous” fragments that bound to multiple targets tending to be more informative.
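Finding a small subset of fragments that reproduces every observed interaction is an instance of the set-cover problem. The post does not describe how the paper ranks fragments, so the greedy heuristic below is a swapped-in illustration rather than the authors' method, and the fragment names and interaction labels are invented.

```python
def greedy_cover(interactions_by_fragment):
    """Greedily pick fragments until every observed interaction is covered.

    interactions_by_fragment maps a fragment name to the set of interactions
    (for example residue-level fingerprint bits) seen for that fragment.
    """
    uncovered = set().union(*interactions_by_fragment.values())
    chosen = []
    while uncovered:
        # Take the fragment adding the most uncovered interactions;
        # iterating in sorted order breaks ties alphabetically.
        best = max(
            sorted(interactions_by_fragment),
            key=lambda frag: len(interactions_by_fragment[frag] & uncovered),
        )
        chosen.append(best)
        uncovered -= interactions_by_fragment[best]
    return chosen


# Hypothetical toy data, for illustration only.
toy = {
    "frag_A": {"Ser12:HBA", "Phe40:pi"},
    "frag_B": {"Ser12:HBA"},
    "frag_C": {"Asp88:HBD"},
    "frag_D": {"Phe40:pi", "Asp88:HBD", "Leu5:hydrophobic"},
}
selected = greedy_cover(toy)  # ['frag_D', 'frag_A']
```

Here two of the four toy fragments recover all four interactions, so frag_B and frag_C are functionally redundant even though they might look nothing like the others. Greedy selection is not guaranteed to find the smallest possible subset, but it is a standard and usually good approximation.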
The top 100 of these 135 functionally diverse fragments tended to have molecular weights between 175 and 240 Da and 12 to 16 non-hydrogen atoms, putting them comfortably within rule of three space. Interestingly, fragments that never hit any target skewed smaller, with many having molecular weights less than 175 Da and fewer than 12 non-hydrogen atoms; this is slightly at odds with work from Astex which found many tiny fragment hits.
The researchers considered sub-libraries consisting of either these functionally diverse fragments, randomly selected fragments, or structurally diverse fragments. The number of interactions discovered was significantly higher for the functionally diverse sets of fragments than for the other sets.
On one level the findings are not surprising: the whole concept of bioisosterism relies on the fact that different functional groups can make the same interactions, meaning that structurally disparate fragments can be functionally redundant. This suggests that libraries could be optimized to capture more information with fewer molecules. How to do so prospectively is not entirely clear, but laudably the researchers have provided chemical structures for all the fragment hits in the Supporting Information. It may be worth adding some of the functionally diverse fragments to your library; perhaps some enterprising vendor will start selling the top 100 as a set.

22 August 2022

Fragments vs human Adenosine 2a Receptor using SPR

Last week we highlighted the use of surface plasmon resonance (SPR) to find ligands against RNA. Although RNA is not a typical protein target, it is at least normally free in solution. Targets such as GPCRs are more technically challenging because they are bound within membranes. Challenging, but not impossible, as illustrated by this post from 2012. A new ACS Med. Chem. Lett. paper by Reid Olsen, Iva Navratilova, and colleagues at Exscientia, University of Dundee, and AstraZeneca provides the latest example.
Navratilova and colleagues previously described using SPR to screen the β2 adrenergic receptor. In the new paper, the researchers studied the human adenosine 2a receptor (hA2AR), a “rheostat for energy homeostasis” that also plays a role in cancer immunotherapy. hA2AR is one member of a small family of adenosine receptors, and the researchers expressed all four of them, each with a polyhistidine tag that could be captured in the SPR instrument using a nickel-NTA sensor chip. Other labs (such as Heptares) have used mutant, stabilized forms of GPCRs, but here the researchers used native proteins and stabilized them by crosslinking them to the surface of the chip. They confirmed that these GPCRs bound known ligands with similar affinities to those reported in the literature.
Next the researchers screened a library of 656 fragments, each at 50 µM, against hA2AR. This led to 72 potential hits taken into dose-response experiments, of which 17 confirmed with affinities ranging from 1.1 to 410 µM. All the sensorgrams are shown, as are the structures of the fragment hits. These confirmed hits were also screened against A1, A2B, and A3; most of the fragments bound to all the receptors, though two were selective for hA2AR.
To assess where the fragments bind, the researchers added a known high-affinity ligand; ten of the fragments could be competed, while seven showed less or no competition, suggesting that they may bind to an allosteric site.
GPCR biology is complicated, and just because a ligand binds does not mean it will have any effect on signaling. In cell experiments, none of the fragments behaved as agonists, but five fragments could act as antagonists of a known agonist. Another fragment seemed to increase the signal, suggesting it is an allosteric modulator. As the researchers conclude, “while SPR can screen fragment-like molecules that allow for extrapolation of extremely large and diverse chemical spaces, it cannot predict the biological activity of these binders.”
Nonetheless, this paper provides a nice guide on how to use SPR, with its low protein requirements, to screen GPCRs. And the fragments disclosed could be interesting starting points for medicinal chemistry.

15 August 2022

Fragments vs RNA with SPR: A guide

Fragment-based lead discovery on RNA has a long history: the first mention on Practical Fragments was in 2009. Most often, various NMR methods have been used (see this example from last year), though isothermal titration calorimetry (ITC) is also effective. However, both of these techniques generally require considerable amounts of RNA. In a recent Biochemistry paper, J. Winston Arney and Kevin Weeks describe using SPR, which could increase the speed and ease of screening RNA.
Non-specific binding is a significant problem in characterizing RNA ligands. RNA is negatively charged, and many ligands are positively charged, leading to non-specific interactions. In a typical SPR experiment, the target is bound to a surface and the analyte is allowed to flow over the immobilized target; binding causes a change in refractive index that can be detected. However, if the analyte interacts non-specifically with the target, this will also be detected. For high affinity ligands the non-specific interactions may be minimal at low concentrations, but for low-affinity ligands such as fragments, it can be difficult to differentiate specific from non-specific binding.
SPR experiments generally use a reference cell, in which the analyte is allowed to flow over the surface in the absence of target; this signal is then subtracted from the target channel. Arney and Weeks decided to use a reference cell containing mutant RNA not expected to bind to the ligand.
The researchers developed their approach using two different riboswitches, each with known high-nanomolar ligands. Immobilizing the riboswitches to the chip and flowing ligand led to non-specific binding at concentrations of 100 µM or so. However, when the reference cell contained a mutant riboswitch designed not to bind to the ligands, this non-specific binding could easily be subtracted, leading to simple single-site binding models.
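To make the logic concrete, here is a minimal sketch of reference subtraction followed by a single-site fit. It assumes non-specific binding adds a signal that is linear in concentration and identical in both channels, which is an idealization. The data are synthetic, and the grid-search fit is a simple stand-in rather than the analysis used in the paper.

```python
def one_site(conc, r_max, k_d):
    """Steady-state response for 1:1 binding."""
    return r_max * conc / (k_d + conc)


def fit_one_site(concs, responses, kd_grid):
    """Least-squares fit of r_max and k_d by scanning candidate k_d values."""
    best = None
    for k_d in kd_grid:
        shape = [c / (k_d + c) for c in concs]
        # For a fixed k_d the model is linear in r_max, so r_max has a closed form.
        r_max = sum(s * r for s, r in zip(shape, responses)) / sum(s * s for s in shape)
        sse = sum((r - r_max * s) ** 2 for s, r in zip(shape, responses))
        if best is None or sse < best[0]:
            best = (sse, r_max, k_d)
    return best[1], best[2]


# Synthetic example (all values invented): concentrations in micromolar.
concs = [0.1, 0.3, 1, 3, 10, 30, 100]
TRUE_RMAX, TRUE_KD, NONSPECIFIC_SLOPE = 50.0, 0.5, 0.2

# Target channel sees specific plus non-specific signal; the mutant-RNA
# reference channel is assumed to see only the non-specific part.
target = [one_site(c, TRUE_RMAX, TRUE_KD) + NONSPECIFIC_SLOPE * c for c in concs]
reference = [NONSPECIFIC_SLOPE * c for c in concs]
subtracted = [t - r for t, r in zip(target, reference)]

# Log-spaced candidate k_d values from 0.01 to 100 micromolar.
kd_grid = [10 ** (i / 50) for i in range(-100, 101)]
r_max_fit, k_d_fit = fit_one_site(concs, subtracted, kd_grid)
```

After subtraction the fit recovers a dissociation constant close to the 0.5 µM used to generate the data. In the unsubtracted target channel the response keeps climbing at 100 µM instead of saturating, which is the signature of non-specific binding described above.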
Of course, creating a mutant RNA assumes you already know where your ligand binds, which is not true if you are looking for ligands to a new target. To increase the generality of their approach, the researchers used a different riboswitch or a completely arbitrary RNA for the reference. These also worked, though not quite as well as the targeted mutants.
Finally, the researchers tested a dozen RNA-ligand pairs that had previously been rigorously characterized. Importantly, these varied considerably in affinity, from 8 nM to 2 mM. Most of them were also fragment-sized, with molecular weights as low as 119 Da. The correlation between SPR dissociation constants and those reported in the literature was excellent.
The technique does have limitations. First, the RNA-bound surfaces do seem somewhat unstable over a period of days. Also, larger RNAs present technical challenges, though the researchers do state that they have been able to examine molecules as large as 300 nucleotides. Overall this looks like a nice approach for measuring RNA-ligand affinities.

08 August 2022

Solving structures with selective labeling and NMR2

Protein-detected NMR first enabled fragment-based lead discovery way back in 1996, but improvements in crystallography have now allowed synchrotrons to surpass big magnets as preeminent tools to determine how fragments bind to proteins. One of the major challenges in NMR is assigning the chemical shift values of atoms in all the individual amino acid residues. A technique called NMR Molecular Replacement (NMR2) sidesteps the need for this tedious, time-consuming process. A refinement to this technique, making it more broadly applicable, has just been published (open-access) in Sci. Reports by Julien Orts (University of Vienna), Martin Scanlon (Monash University) and collaborators.
As we discussed previously, NMR2 relies on intensive calculations using experimental intermolecular NOEs between a protein and a ligand to generate a model. Although the method does not require assignment of backbone or side chain chemical shifts, it does require high-quality spectra. For example, if the spectra of several amino acid residues overlap it is impossible to distinguish them (this applies to conventional NMR methods too). The researchers realized that one way to simplify the spectra is through selective labeling, in which the methyl groups of the amino acid residues alanine, isoleucine, leucine, valine, and threonine are isotopically labeled with 13C. Going one step further, the entire protein can be deuterated (rendering most of the protein invisible to NMR), while these methyl groups retain ordinary hydrogen atoms.
For the present study, the researchers focused on the protein EcDsbA, an antibacterial target we’ve written about previously. They selectively labeled methyl groups so that, in isoleucine, leucine, and valine, only one of the two methyl groups was labeled. That reduced the total number of protons to just 6% of the number in the unlabeled protein.
The researchers then solved the structure of EcDsbA with a previously identified ligand. At 23 heavy atoms the ligand is on the large side, though with an affinity of just 0.9 mM it presents a difficult test case. A total of twelve intermolecular NOEs were used in NMR2 to build a model of the complex. One challenge with NMR2 is that there may not be a single solution. For example, if two methionine methyl groups are both near a ligand, it may be impossible to determine a unique binding mode. This turned out to be the case, and the top two structures had different positions for a carboxylic acid group and a phenyl in the ligand.
To benchmark NMR2, the protein-ligand complex was also determined using conventional two-dimensional techniques (HADDOCK and CYANA, which made use of assigned chemical shifts) as well as X-ray crystallography. These all agreed with the NMR2 model in placing a phenylpropyl moiety from the ligand in a hydrophobic groove, but they differed in the placement of the carboxylic acid and the other phenyl moiety: the top scoring NMR2 model agreed with the crystal structure and the CYANA NMR structure but differed from the HADDOCK structure, which was similar to the second-best NMR2 model. Before assuming that the crystallographic structure is correct, though, it is worth noting that the ligand makes crystal contacts with a neighboring protein, and the electron density around the ambiguous phenyl is weak.
This is a nice demonstration of the utility of NMR2. It seems to provide information similar to that from classic NMR methods, but the time taken is “orders of magnitude” less. And selective labeling should make NMR2 applicable to even larger proteins. I look forward to seeing more people use this strategy.

01 August 2022

What rings are found in drugs?

Recently we highlighted the “Ring Replacement Recommender,” which provides suggestions for how to improve affinity by replacing one ring with another. The recommendations are based on an analysis of hundreds of thousands of molecules. But what about the rings found in actual drugs? This is the focus of a J. Med. Chem. paper by Richard Taylor and collaborators at UCB and Bohicket Pharma Consulting.
The researchers examined FDA-approved and investigational drugs with disclosed structures as of January 2020. These were fragmented into component “ring systems” for analysis. (Ring systems include not just monocycles but fused rings, such as purine. For example, sotorasib consists of four ring systems: benzene, pyridine, piperazine, and pyrido[2,3-d]pyrimidin-2-one.) More than 90% of drugs contain at least one ring.
Approved drugs have just 378 unique ring systems in total – a small increase from when the researchers examined approved drugs in 2014. The phenyl ring is found 727 times, with pyridyl (86 examples) a distant second, followed by piperidine (76 examples), piperazine (65 examples), and cyclohexane (47 examples). After that the numbers drop off sharply, with pyrazine in 50th place with just six examples and fluorene in 100th place with three examples.
Investigational drugs at first appear to be more diverse, with 450 unique ring systems, 280 of which are not found in approved drugs. Of these 280, pyridazine is the most common, with nine examples, followed by oxetane, with seven, but things quickly become less common from there, with 271 of the ring systems found just once. In contrast, ring systems found in approved drugs tend to appear in multiple compounds, and in fact two thirds of investigational drugs only contain previously used ring systems.
Many of the new ring systems are closely related to those found in approved drugs, with nearly half differing by at most two atoms. Perhaps because of this the overall properties of the ring systems are similar between approved and investigational drugs, with no significant differences in heteroatom ratio, percentage of sp3 centers, or number of rings per system.
What new opportunities exist? The researchers identified nearly half a million synthetically accessible ring systems and winnowed these down to 3902 ring systems that have similar heteroatom ratios to those found in drugs and differ by at most two atoms. This attempt to explore new chemical space is similar to earlier work from the same group (here) as well as that from others (here, here and here).
The researchers also examined growth vectors and combinations of rings, the latter by using graph theory. These analyses suggest that investigational drugs do have greater variety. In other words, even if the component rings are shared with approved drugs, they might be combined in new ways.
Whether certain ring systems are more likely to fail in the clinic was intentionally not addressed, due to the difficulty of assessing why the failures occurred. For example, drugs can fail for commercial reasons; a company may choose to drop a drug against a particular target rather than be tenth to market. And even when the failure is due to the science, it might not be an indictment of the drug itself. Verubecestat did lower β-amyloid levels in people as designed, but had no effect on Alzheimer’s disease.
This paper is a fun read, and it will likely provide ideas for scaffold hopping and library design. It is also a reminder of how much chemical space remains to be explored.

25 July 2022

Fragments vs TEAD: noncovalent this time

Last week we described a fragment-derived covalent probe that targets the four closely related TEAD transcription factors, which are part of the Hippo signaling pathway implicated in some cancers. A new paper in J. Med. Chem. by Timo Heinrich and collaborators at Merck KGaA, iBET, and Cancer Research Horizons brings us another fragment-derived probe, this one noncovalent.
The researchers started by screening 1930 fragments, each at 2 mM, against TEAD1 and TEAD3 using SPR. Perhaps not surprisingly given the high concentration used, this led to a whopping 560 hits. These were then tested in dose-response format against TEAD1 with or without the coactivator YAP; 254 compounds showed differential affinity, among them compound 1. This molecule was crystallized bound to TEAD3, which revealed that it binds to the hydrophobic pocket normally occupied by a covalently-bound palmitoyl group required for activity. Despite being a fragment, compound 1 was active in a cell reporter assay, and the researchers state that further optimization was done using cellular assays rather than biophysical or biochemical experiments.

Analysis of the crystal structure suggested that enlarging the cyclopentyl moiety would let it fit more snugly into a hydrophobic pocket, while adding a small propyl moiety could extend into a separate pocket. Together these changes led to compound 6, with a 10-fold boost in activity. Replacing the propyl with an additional ring led to sub-micromolar compound 9. Finally, replacing the saturated ring with a substituted phenyl moiety led to MSC-4106, with low nanomolar activity in the cell reporter assay.
Thermal stabilization (specifically, nanoDSF) assays showed that MSC-4106 stabilized TEAD1 and TEAD3 but not TEAD2 or TEAD4. Palmitoylation assays confirmed this selectivity profile. The paper also includes a nice table comparing experimental selectivities of seven other non-covalent TEAD inhibitors, which vary from having activity only against TEAD1 to activity against all four homologs.
MSC-4106 was clean when tested at 10 µM against a panel of 58 receptors and 1 µM against nearly 400 kinases. It did not inhibit hERG or any of the common CYP450s. Finally, PK studies in mice, rats, and dogs showed that the compound is orally bioavailable with a long half-life. Given these favorable properties it was taken into xenograft studies, where it showed tumor growth inhibition at 5 mg/kg and tumor regression at 100 mg/kg. Analysis of tumor tissue showed downregulation of a TEAD-regulated gene, Cyr61.
Can we draw any lessons from comparing covalent MYF-03-176 (discussed last week) with non-covalent MSC-4106? Probably not, given that the former hits all TEAD homologs while the latter is selective for TEAD1 and TEAD3. Both molecules look to be excellent chemical probes for further dissecting Hippo signaling. I look forward to seeing how TEAD inhibitors ultimately fare in the clinic.

18 July 2022

From covalent fragment to lead against TEAD

As noted just last month, covalent fragment-based drug discovery is becoming ever more popular. However, many papers report relatively weak hits with little or no optimization. A new preprint posted to bioRxiv (HT Covalent Modifiers) by Tinghu Zhang, Nathanael Gray, and collaborators at Stanford and elsewhere describes a fragment-to-lead story for the TEAD family of transcription factors.
The four highly homologous members of the TEAD family play a role in the Hippo signaling pathway. When spurred by the coactivator YAP they cause gene expression that has been implicated in certain cancers, particularly mesothelioma. To bind YAP, TEAD needs to be palmitoylated on a specific cysteine residue. A covalent inhibitor that binds to this cysteine could prevent palmitoylation and thus block Hippo signaling.
Multiple academic and industrial groups have been pursuing this target, and one previously reported inhibitor is flufenamic acid. This molecule was used in the new paper to design a small library of analogs each functionalized with an acrylamide moiety. These were screened against TEAD2 and analyzed by mass spectrometry; MYF-01-37 modified the protein (though unfortunately time and exact concentrations are not specified). Proteolysis and tandem mass spectrometry confirmed that the molecule binds to C380, the site of palmitoylation.
Analysis of previously published crystal structures revealed a side pocket off the main hydrophobic channel that normally binds the palmitoyl group. The researchers created a focused library of analogs to try to access this pocket, which led to molecules such as MYF-03-69. This compound was active in a biochemical assay and showed rapid labeling of the protein as assessed by mass spectrometry. A crystal structure of the compound bound to TEAD1 confirmed the molecule forms a covalent bond to the target cysteine and does in fact bind in both pockets. 

MYF-03-69 inhibited palmitoylation of all four TEAD paralogs in biochemical assays. More importantly, it showed activity in several cell assays, including blocking palmitoylation and disrupting the interaction between TEAD and YAP. The molecule downregulated YAP-TEAD transcription in reporter gene assays as well as RNA sequencing assays. Finally, MYF-03-69 showed mid-nanomolar antiproliferative activity in mesothelioma cells but not in non-cancerous cell lines.
Despite this promising activity, MYF-03-69 lacked acceptable oral bioavailability. Further medicinal chemistry led to MYF-03-176, which has improved bioavailability and showed even better activity in reporter gene assays and better antiproliferative activity in mesothelioma cell lines. The molecule also led to tumor regression in a mouse xenograft model when dosed orally.
This is a nice story with lots of information, though were I a reviewer I would ask for the kinact/KI values for the molecules. This ratio describes the rate of covalent modification and is time and concentration independent, which makes comparisons with other molecules more straightforward (see this 2017 open-access paper for a good discussion). Since this is a preprint hopefully the final published paper will include these values. 
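To make the point concrete, here is a minimal sketch using the standard two-step model for irreversible inhibition, in which the observed rate constant is kobs = kinact[I]/(KI + [I]) and the fraction of protein labeled is 1 − exp(−kobs·t). The parameter values are hypothetical and are not taken from the preprint; the model also assumes the inhibitor is in large excess over the protein.

```python
import math


def k_obs(conc_uM, kinact_per_s, KI_uM):
    """Observed pseudo-first-order rate constant: kinact*[I] / (KI + [I])."""
    return kinact_per_s * conc_uM / (KI_uM + conc_uM)


def fraction_labeled(conc_uM, time_s, kinact_per_s, KI_uM):
    """Fraction of protein modified after time_s, assuming [I] >> [protein]."""
    return 1.0 - math.exp(-k_obs(conc_uM, kinact_per_s, KI_uM) * time_s)


def second_order_efficiency(kinact_per_s, KI_uM):
    """kinact/KI in M^-1 s^-1 (KI converted from micromolar to molar)."""
    return kinact_per_s / (KI_uM * 1e-6)


# Hypothetical parameters, chosen only for illustration.
kinact, KI = 0.001, 10.0  # s^-1 and micromolar

# Percent labeling changes with both concentration and incubation time...
for conc, t in [(1.0, 600), (1.0, 3600), (10.0, 3600)]:
    print(conc, t, round(fraction_labeled(conc, t, kinact, KI), 3))

# ...whereas kinact/KI is a single number for the compound.
print(second_order_efficiency(kinact, KI))
```

The same compound gives very different labeling percentages depending on the conditions chosen, which is why a single-point labeling result is hard to compare across papers.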
Regardless, MYF-03-176 looks like an excellent chemical probe for studying the effect of irreversible inhibition of Hippo signaling.

11 July 2022

Fragments in the clinic: HTL9936

Of the 50+ fragment-derived drugs that have entered the clinic, only two (both from Sosei Heptares) target transmembrane proteins, reflecting the difficulty of structure-based design for this hard-to-crystallize class of proteins. The story behind one of them was published late last year in Cell by Malcolm Weir, Andrew Tobin, and a large group of collaborators.
The researchers were interested in the M1 muscarinic acetylcholine receptor, which is involved in memory and learning. By activating the receptor the hope is to be able to treat symptoms associated with Alzheimer’s disease. The M1 receptor has been a long-standing target for this disease, but previous drugs have caused side effects ranging from salivation and sweating to gastrointestinal distress and seizures. The M1 receptor is one of five closely related subtypes, and some of the side effects have been attributed to hitting the M2 and M3 receptors. However, the M1 receptor itself may also not be entirely innocent, so the goal was to develop a partial agonist, the idea being that this may be more effective in the brain, where the M1 receptor is highly expressed, while sparing other tissues where the M1 receptor is rarer.
The campaign began with a virtual screen of 1.6 million molecules (with molecular weights up to 400 Da) against a homology model of the human M1 receptor bound to a known agonist. This led to the purchase of 322 compounds, of which 16 were active in a cell-based functional assay, including compound 4. Fragment growing led to compound 6 and ultimately to HTL9936, which is selective for M1 over M2, M3, and M4 receptors. It also showed no significant agonism against a panel of 62 GPCRs even at 10 µM concentration.

Sosei Heptares pioneered the use of mutagenesis to stabilize specific conformational states of GPCRs, and this process was used to produce co-crystals with HTL9936 to understand its binding mode. Like other reported agonists, which were also characterized crystallographically, HTL9936 binds in the orthosteric site of the M1 receptor, but the increased size of the homopiperidine ring relative to other ligands provides selectivity over other receptors such as M2.
HTL9936 was tested in mice, rats, dogs, and cynomolgus monkeys, and in general showed good safety and brain penetration. The molecule even showed cognitive benefits in a mouse model of neurodegeneration and in aged beagles. It did cause an increase in heart rate and blood pressure in dogs, and there was a single convulsive episode, but only at a very high dose.
The paper also summarizes the results of human clinical trials which demonstrated that HTL9936 is well tolerated up to 100 mg doses, though at higher doses sweating, salivation, and changes in heart rate and blood pressure were observed. A small trial in healthy elderly people did not show any improvement in memory tasks, though functional magnetic resonance imaging studies did show that the molecule activated regions of the brain associated with cognition.
And that’s where the story ends. The Sosei Heptares website does not list HTL9936, though a different M1 receptor agonist (HTL0018318) is described. This paper also illustrates the long gap that can occur between research and publication: ClinicalTrials.gov lists three Phase 1 studies for HTL0009936, one of which began in 2013, and all of which ended by early 2017. Like most approaches to Alzheimer’s disease that have been tested, perhaps targeting the M1 receptor is a dead end. But reaching that conclusion requires highly selective chemical probes. Kudos to the team at Sosei Heptares for their efforts.

03 July 2022

What belongs in the Protein Data Bank?

The rise of high-throughput crystallography is among the most exciting recent developments for fragment finding. Historically deemed too slow for primary screening, crystallography was reserved for select hits from an assay cascade. Now crystallographic screens up-front sometimes yield hundreds of hits. Many have been deposited in the Protein Data Bank (PDB). In a recent (open access) Protein Sci. commentary, Mariusz Jaskolski (Mickiewicz University), Bernhard Rupp (Medical University Innsbruck), and collaborators in the US question this practice.
In particular, the researchers ask whether molecules processed using Pan-Dataset Density Analysis (PanDDA) belong in the PDB. The method, which we described here, is typically used when hundreds of compounds have been soaked into crystals of the same protein. Most molecules will not bind, and these empty structures can be averaged to provide a background map to better identify weakly-bound ligands that may have only partial occupancy.
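The general idea of comparing one dataset against a background built from many empty ones can be sketched in a few lines. This is a toy illustration only and not the PanDDA implementation: the "maps" are invented one-dimensional lists of density values, and the Z-score cutoff of 3 is an arbitrary assumption.

```python
from statistics import mean, stdev

# Hypothetical density values at five grid points for four ligand-free datasets.
ground_state_maps = [
    [1.0, 0.9, 1.1, 1.0, 0.2],
    [1.1, 1.0, 1.0, 0.9, 0.1],
    [0.9, 1.1, 0.9, 1.1, 0.3],
    [1.0, 1.0, 1.0, 1.0, 0.2],
]
# One soaked dataset with extra density at the last grid point.
soaked_map = [1.0, 1.0, 1.05, 1.0, 0.9]


def z_map(dataset, references):
    """Per-point Z-score of a dataset against the reference ensemble."""
    columns = list(zip(*references))  # values at each grid point across datasets
    return [
        (value - mean(col)) / stdev(col) for value, col in zip(dataset, columns)
    ]


z_scores = z_map(soaked_map, ground_state_maps)

# Grid points that deviate strongly from the background (cutoff is an assumption).
outliers = [i for i, z in enumerate(z_scores) if abs(z) > 3.0]
print(outliers)
```

A signal that would look like weak, ambiguous density in a single conventional map stands out once it is judged against the spread of the empty datasets, which is the crux of the disagreement over how such ligands should be evaluated.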
The researchers seem suspicious of this technique, referring to “supposed ligands” that may “confuse most biomedical researchers” and “degrade the PDB integrity,” the effect of which “could be disastrous.” To support their argument, they provide two examples from the PDB where the atomic models diverge from the electron density calculated using conventional methods and one with wonky statistics.
To avoid “contamination of the PDB by suboptimal structures,” the researchers suggest depositing structures from large-scale crystallographic screens in a separate database. Alternatively, they suggest clearer annotation. (To be fair, all three of the examples cited are already prominently marked “PanDDA analysis group deposition.”)
Needless to say, this is controversial. In a bioRxiv preprint, Manfred Weiss (Helmholtz-Zentrum Berlin) and collaborators in the US, Germany, Sweden, and the Netherlands, some of whom co-developed PanDDA, take a different view.
The researchers agree that group depositions need to be marked clearly, but they argue that they squarely belong in the PDB rather than in a separate repository. Moreover, “commentaries that underestimate the knowledge of PDB users, that ignore the opportunities present in heterogenous crystallographic data, and that miss out on chances for education on structure quality do more harm than good.”
The three examples described by Jaskolski and colleagues are re-examined, and while it is true that two of them do show poor occupancy using conventional methods, the ligands are clearly visible when PanDDA is used. (In the third case, there was an error in the resolution cutoff during automated processing, but the data could be successfully reprocessed manually.)
PanDDA was developed specifically to identify small, low occupancy ligands, so the researchers argue that these entries “cannot and should not be treated in the same way” as other ligands. Banning them from the PDB would potentially impede future research.
Weiss and colleagues refer to the Structural Genomics campaign of the late 1990s and early 2000s to solve myriad structures of diverse proteins, most of which were not being otherwise studied. At the time some commentators derided this effort as “stamp collecting.” Yet the number and diversity of structures thus deposited into the PDB likely contributed to the success of automated protein folding algorithms such as AlphaFold2.
Similarly, including structures from PanDDA processing could lead to unforeseen advances. For example, Weiss and colleagues suggest we may be able to “extract all aspects of conformational as well as of compositional heterogeneity out of all these data sets.” A better understanding of the role of protein dynamics in ligand binding is likely to require thousands of similar datasets of the kind being uploaded.
Personally, I believe that scientists should be wary of all published information. As the old saying goes, trust, but verify. As evidenced by my five-part series “Getting misled by crystal structures,” even conventional structures in the PDB should not necessarily be taken at face value. With that precaution, I’ll hold with the conclusion of Weiss and colleagues: “As long as the data is there, let’s embrace it and make it available!”

27 June 2022

CovPDB: a free, searchable database of covalent protein-ligand structures

Last week we highlighted KinaFrag, a database of kinase-fragment complexes. Continuing the theme, this week brings us CovPDB, a database of high-resolution covalent protein-ligand structures. The database was described by Stefan Günther and colleagues at Albert-Ludwigs-Universität Freiburg in an open-access Nucleic Acids Res. paper earlier this year.
The researchers downloaded all structures from the Protein Data Bank (PDB) as of 31 August 2020 and extracted those with covalently bound ligands refined to at least 2.5 Å resolution. These were then manually curated to remove cofactors (such as retinal) and crosslinkers. Next, the chemical structures of the pre-reacted ligands were extracted from the primary citations. Everything was then combined into an easy-to-use database, and all the contents can also be downloaded.
CovPDB contains 2,294 unique protein-ligand complexes, with 733 different proteins and 1501 different ligands. A total of 93 different types of warheads are represented, from exotic (arsine oxide) to conventional (vinyl carbonyl, including acrylamides). These are further grouped into 21 covalent mechanisms. 
As expected, covalent bonds to cysteine and serine are most common, with 959 and 830 examples, respectively. Lysine, with 205 representatives, is a distant third, but I was surprised that various unreactive amino acid residues such as glycine, valine, and proline also showed up. Closer inspection revealed that these are N-terminal residues; the ligand reacts with the free amine. Though these sorts of bonds occur with several drugs, including carfilzomib and voxelotor, it might be nice to have separate annotations to keep these from being confused with residues that react exclusively at the side chain.
Browsing by ligand, protein, complex, warhead, covalent mechanism, or targeted residue is straightforward, as is searching by multiple methods, including ligand similarity and substructure. Each entry has its own page with a wealth of information, including an interactive 3D-viewer. Here’s the entry for one of the Tethering hits that ultimately led to sotorasib.

CovPDB should be especially useful to computational folks looking to build models based on high-quality data, but it's also fun to browse for new ideas and inspiration.
Importantly, the researchers state that they will update this database annually. As covalent drug discovery (including with fragments) becomes increasingly prominent, I expect the size of CovPDB to grow rapidly.

20 June 2022

KinaFrag: a free, searchable database of kinase fragments

Four of the six approved fragment-derived drugs are kinase inhibitors, and three of these bind in the active site. Despite these successes, there are plenty of opportunities for new kinase-directed drugs, particularly those targeting cancer resistance mutations. In a recent Brief Bioinform. article, Guang-Fu Yang and colleagues at Central China Normal University describe a new tool to facilitate these discoveries.
The researchers started by trawling multiple databases such as kinase.com, DrugBank, ChEMBL, and the Protein Data Bank for kinase inhibitors. The results were combined and collated to yield a set of 7783 kinase-inhibitor fragment complexes, with more than 3000 unique fragments. Most of these bind in the “front cleft” of the active site, where the adenine of ATP normally binds, but several hundred also sit in the so-called back pocket or the intervening area.
What’s nice is that all this information is available on a free website called KinaFrag. You can download the structures yourself, but the site can also be browsed or searched. Fragments are annotated with links to various databases; here’s an example.
There are some bugs. While I was able to search by physicochemical parameters such as molecular weight and number of hydrogen bond donors, I could not get the substructure search to work. I’d be curious as to whether readers could do so.
To demonstrate the utility of KinaFrag, the researchers describe a case study in which they started with the anticancer drug larotrectinib, which inhibits TRK family kinases. However, the molecule is less effective against several mutations observed in the clinic. Examining the bound structure revealed that the mutations introduce steric clashes. Retaining the hinge-binding fragment while performing virtual screening of fragments from KinaFrag led to molecules such as YT3, potent against both wild type TRKA and two resistance mutants, and further optimization resulted in YT9. 

Not only was YT9 active against the wild type and mutant forms of TRKA, it showed good oral bioavailability and pharmacokinetics in rats. Encouragingly, the molecule slowed tumor growth in both wild type and mutant TRKA mouse xenograft models.
One could debate whether this is an example of FBLD; the discovery of YT9 could also be considered a classic case of scaffold hopping. But semantics aside, this is a nice example of thinking in terms of fragmenting molecules. More broadly, KinaFrag looks like a useful tool for work on kinases – especially if the substructure search works.

13 June 2022

Fragments vs HIV-1 Protease: Pocket-to-Lead

The drugging of HIV-1 protease is a classic structure-based design success story, as discussed in a guest post by Glyn Williams from the early days of the SARS-CoV-2 pandemic. The peptide origins of approved inhibitors such as saquinavir are obvious, and the residual structural features can present problems for oral bioavailability. Although there have been fragment screens against the enzyme, the hits do not seem to have been pursued, perhaps due in part to the number of approved drugs. But viruses never stop mutating, and developing new chemical matter is prudent. In a recent J. Med. Chem. paper, Yuki Tachibana and colleagues at Shionogi describe a fragment-based approach.
The researchers started by performing a virtual screen, but none of the hits were active when tested in a biochemical assay. The active site of HIV-1 protease contains four hydrophobic subsites, and none of the virtual hits filled all four of them. Thus, the researchers chose to focus on fragments that could make some of the interactions while providing growth vectors to additional subsites. They call this a “pocket-to-lead” strategy.
Fragment 5 docked nicely into the active site; the hydroxyl group makes interactions with the catalytic aspartic acid residues, while the phenyl ring tucks into the S2 pocket. Growing into the S2’ and S1’ pockets led to molecules such as compound 9, which showed weak but detectable activity. (Astute readers will notice that the stereochemistry around the hydroxyl moiety has changed; both diastereomers are active.) A crystal structure of compound 9 bound to HIV-1 protease confirmed the predicted binding mode.

Examination of the crystal structure revealed that the para-fluorobenzyl substituent was not completely filling the S1’ pocket, and was also in a strained conformation. Replacing this with an alkyl substituent led to low micromolar compound 12. Finally, growing into the S1 subsite led to compound 14, a low nanomolar inhibitor with sub-micromolar antiviral activity.
This is a nice example of structure-guided, computationally-enabled fragment-based lead discovery that bears some similarity to the V-SYNTHES method we highlighted earlier this year. As the researchers note, the cyclic lactam found in fragment 5 had been used previously in HIV-1 protease inhibitors. It might have been possible to get to something similar to compound 14 from that earlier molecule. But regardless, compound 14 is emphatically non-peptidic. Whether it will lead to superior drugs remains to be seen, but the paper does say that further optimization is underway.

06 June 2022

What to make first? A new “Ring Replacement Recommender” provides suggestions

So you’ve run a fragment screen, gotten some hits, and validated them. What then? Looking for in-house or commercial analogs is always a good idea, but if you’re serious about a project you’ll eventually need to do chemistry, for example replacing one ring with another (say, a pyridyl for a phenyl). The possibilities are almost endless, especially if you don’t know how your fragment binds. In a new Eur. J. Med. Chem. paper, Peter Ertl and colleagues at Novartis describe a “Ring Replacement Recommender” to rapidly improve biological activity.
To determine which replacements are likely to improve affinity, the researchers turned to ChEMBL, a database of more than 2 million molecules and associated biological activity extracted from tens of thousands of publications. From these, more than 68,000 chemical series were chosen for analysis. Each series had on average 16 members, and at least three. The biological activity of each member of a series was compared with other members of the same series. (Importantly, the researchers intentionally excluded anti-targets such as hERG and CYPs so the tool wouldn’t inadvertently improve binding to these.) Focusing only on ring replacements that were reported in at least five publications led to a set of 26,762 changes. Changes could be as modest as adding a methyl substituent or more elaborate such as changing a single aromatic ring to a fused aromatic-aliphatic ring system.
One would think that most changes would have little effect, as had previously been seen in the case of methyl additions. Indeed, about 65% of the replacements caused shifts in potency of 2-fold or less, which is probably within experimental error. However, 2860 replacements of 245 rings improved affinity at least 2-fold (averaging 3.5-fold), with 223 cases yielding greater than ten-fold improvements.
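The underlying arithmetic is simple enough to sketch. The snippet below assumes potencies are available as pIC50 values for matched pairs that differ only by a ring swap, and it buckets each pair using the 2-fold cutoff mentioned above. The data are hypothetical, and the paper's actual averaging and filtering procedure may well differ.

```python
# Hypothetical matched pairs for one ring replacement: (pIC50 before, pIC50 after).
pairs = [(6.0, 6.1), (5.5, 6.5), (7.0, 6.9), (6.2, 6.8)]


def fold_change(p_before, p_after):
    """Potency ratio implied by a pIC50 difference (>1 means the swap helped)."""
    return 10 ** (p_after - p_before)


def classify(fold, threshold=2.0):
    """Bucket a fold change using a symmetric 2-fold cutoff."""
    if fold >= threshold:
        return "improved"
    if fold <= 1 / threshold:
        return "worse"
    return "within noise"


folds = [fold_change(before, after) for before, after in pairs]
labels = [classify(fold) for fold in folds]

for (before, after), fold, label in zip(pairs, folds, labels):
    print(before, after, round(fold, 2), label)
```

A 0.1 log unit shift corresponds to only about a 1.3-fold change, which is why so many replacements end up in the "within noise" bucket.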
Analyzing the data further, the researchers found 80 ring systems that frequently led to improvements in affinity, and they suggest these could be used as “universal” or privileged building blocks. Strikingly, 74 of these are aromatic, confirming work from Cohen we highlighted in 2020 that proteins may favor “flat” rather than shapely molecules.
The researchers also extracted 9515 drugs and clinical compounds from ChEMBL and examined the component fragments. Of the 80 ring systems in the universal set, 19 are found in 50 or more drugs, with another 37 found in at least 5 drugs. This set may be a particularly attractive go-to list.
Importantly, not only are all the replacements available in the Supporting Information, the researchers have created a handy and free online tool. Just click on a ring of interest and the Ring Replacement Recommender provides suggestions, along with the average fold improvement observed and the number of publications used for the calculation.
To see how well it works, I looked at a couple of recent examples which entailed ring changes. The indole to indazole replacement used in the TLR7/8 work described last month was not suggested by the Recommender, though in that case the researchers had the benefit of a crystal structure. On the other hand, a cyclobutyl to phenyl substitution for SARS-CoV-2-3CLp was correctly predicted to be beneficial.
Of course, as we’ve said repeatedly, affinity is only part of the battle in drug discovery, and the researchers emphasize that their recommendations may not improve physicochemical or pharmacokinetic properties. But for the earliest stage of a program, and especially in the absence of other data, it’s worth giving the Recommender a try.

30 May 2022

Covalent fragments vs Rgl2

Just over a year ago the FDA granted accelerated approval to sotorasib, the first marketed inhibitor of KRAS and the first approved fragment-derived covalent drug. In a recent ChemMedChem paper, Samy Meroueh and colleagues at Indiana University School of Medicine describe their efforts against a protein in a related pathway.
KRAS is a GTPase which cycles between an “on” state, where GTP is bound, and an “off” state, where GTP is hydrolyzed to GDP. KRAS is just one member of a superfamily of GTPases. Two other members associated with cancer are RalA and RalB. Sotorasib acts by binding to a mutant form of KRAS in which a glycine is replaced by a cysteine, but this mutation does not occur in RalA or RalB. An alternative approach to targeting GTPases is to prevent them from becoming activated by guanine nucleotide exchange factors (GEFs), which help exchange GDP for GTP. We’ve previously written about how fragments have led to noncovalent inhibitors of the GEF SOS1, which activates RAS proteins.
To sum up, there’s more than one way to block GTPase activity: directly, or by preventing activation by an associated GEF. The new paper focuses on Rgl2, a GEF that serves RalA and RalB.
Rgl2 sports four surface-exposed cysteine residues, so the researchers screened the protein against a library of 1260 electrophilic fragments at 75 µM for 24 hours at 4 °C and then assessed whether it could still activate RalB. 50 fragments inhibited guanine nucleotide exchange by at least 30%, and a dozen were studied in detail. All were time-dependent inhibitors and had EC50 values from 2.6 to 120 µM at 24 hours.
Next, the researchers mutated each of the four surface-exposed cysteine residues to serine. The twelve fragments still inhibited all the mutants except C284S. SOS1 does not contain a cysteine at the position corresponding to C284, and indeed none of the twelve fragments significantly inhibited SOS1 activation of KRAS. All this suggests the fragments act via modification of C284.
The easiest and most direct measurement of covalent binding is with intact protein mass spectrometry, and the researchers confirmed that 10 of the 12 fragments did in fact form adducts. Interestingly, Rgl2 was modified two or three times by each fragment, which is perhaps not surprising given that they had relatively reactive warheads (chloroacetamides or propiolamides). Mass-spec studies with the mutants revealed that most of the modifications were at C284 and C508.
Whether or not these fragments are advanceable, the discovery that modification of C284 inhibits Rgl2 is useful. Interestingly, C284 is near but not at the Ral binding interface, and the researchers suggest that their fragments block protein activity allosterically. I believe such allosteric sites are common throughout the proteome, and readily addressable using covalent approaches. Watch this space!