29 November 2021

DeepFrag: fragment optimization by machine learning

Machine learning is becoming increasingly common in drug discovery. Just a few months ago we highlighted its use to design a library of privileged fragments. However, constructing a library is usually done infrequently (though continued renovation of a library is always a good idea). In two papers from earlier this year, Jacob Durrant and colleagues at the University of Pittsburgh use machine learning to tackle the more common task of lead optimization.
The first paper, in Chem. Sci., describes DeepFrag, a “deep convolutional neural network for fragment-based lead optimization.” The researchers started with the Binding MOAD database, a collection of nearly 39,000 high-quality protein-ligand complex structures from the Protein Data Bank. Ligands were computationally fragmented by chopping off terminal appendages less than 150 Da. The fragments were then converted into molecular fingerprints encoding their structures. Meanwhile, the protein region around each ligand was converted into a three-dimensional grid of voxels, akin to how images used for computer vision training are processed.
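For readers who have not met voxel grids before, here is a minimal sketch of the idea in plain Python. It is not DeepFrag's featurization: the grid size, resolution, element channels, and the function name `voxelize` are all assumptions chosen for illustration, and the paper should be consulted for what the model actually uses.

```python
from math import floor

def voxelize(atoms, center, size=8, resolution=1.0, channels=("C", "N", "O")):
    """Map atoms onto a cubic occupancy grid centred on `center`.

    atoms: iterable of (element, x, y, z) tuples, coordinates in angstroms.
    Returns grid[channel][i][j][k] holding the number of atoms in each voxel.
    All defaults are illustrative assumptions, not DeepFrag's settings.
    """
    half = size * resolution / 2.0
    grid = [[[[0] * size for _ in range(size)] for _ in range(size)]
            for _ in channels]
    for element, x, y, z in atoms:
        if element not in channels:
            continue  # element types without a channel are ignored
        # Shift so the grid corner sits at the origin, then bin by resolution.
        idx = [floor((coord - c + half) / resolution)
               for coord, c in zip((x, y, z), center)]
        if all(0 <= i < size for i in idx):  # drop atoms outside the box
            grid[channels.index(element)][idx[0]][idx[1]][idx[2]] += 1
    return grid
```

Each element type gets its own channel, much as an image has separate red, green, and blue channels, which is what lets a convolutional network treat a binding site like a three-dimensional picture.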
The researchers describe the goal as follows. “We propose a new ‘fragment reconstruction’ task where we take a ligand/receptor complex, remove a portion of the ligand, and ask the question ‘what molecular fragment should go here.’”
About 60% of the data were used to train the model. The model was then evaluated on a further 20% of the data and refined before the final evaluation on the remaining 20%. The details are beyond the scope of this post (and frankly beyond me as well), but DeepFrag recapitulated known fragments about 60% of the time. Importantly, the model worked for diverse types of fragments, including both polar and hydrophobic examples. Even “wrong” answers were often similar to the “correct” responses, for example a methyl group instead of a chlorine atom. In some cases where DeepFrag’s predictions differed from the original ligand, the researchers note that these may be acceptable alternatives, a hypothesis supported by subsequent molecular docking studies.
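The 60/20/20 arrangement and the "recapitulated about 60% of the time" figure can be illustrated with a short sketch. The function names `split_dataset` and `top_k_accuracy` are hypothetical, the random shuffle is an assumption, and nothing here reproduces how the authors actually partitioned Binding MOAD or scored their predictions.

```python
import random

def split_dataset(items, fractions=(0.6, 0.2, 0.2), seed=0):
    """Shuffle items and cut them into train / validation / test lists."""
    items = list(items)
    random.Random(seed).shuffle(items)  # seeded so the split is reproducible
    n_train = round(len(items) * fractions[0])
    n_val = round(len(items) * fractions[1])
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # whatever remains is held out
    return train, val, test

def top_k_accuracy(ranked_predictions, truths, k=1):
    """Fraction of cases whose true fragment is among the top k suggestions."""
    hits = sum(1 for ranked, truth in zip(ranked_predictions, truths)
               if truth in ranked[:k])
    return hits / len(truths)
```

A plain random shuffle is the simplest choice but can put closely related complexes on both sides of the split, so in practice one would probably want to group related structures together before dividing the data.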
Of course, the goal for most of us is not to recapitulate known ligands but to optimize them, so the researchers applied DeepFrag to crystallographically identified ligands of the main protease from SARS-CoV-2. Many of them docked well, though they have yet to be synthesized and tested.
Laudably, the model and source code have been released and can be accessed here. However, as these require a certain amount of computer savvy to use, Harrison Green and Jacob Durrant have also created an open-source browser app which is described in an open-access application note in J. Chem. Inf. Model.
The browser app runs entirely on a local computer, without requiring users to upload possibly sensitive data. The application note describes using the app to recapitulate an example from the original paper. It also describes using it on a fragment bound to antibacterial target GyrB, a fragment-to-lead success story we blogged about last year. DeepFrag correctly predicted some of the same fragment additions that were described in that paper.
The app is incredibly easy to use: just load a protein and ligand (from a pdb file, for example) and the structure appears in a viewer. Click the “Select Atom as Growing Point” button, choose an atom, and hit “Start DeepFrag.” The ranked results are provided as SMILES strings and chemical structures, and the coordinates can also be downloaded. You can also delete atoms before growing if you would like to replace a fragment.
In my own cursory evaluation, DeepFrag correctly suggested adding a second hydroxyl to the ethamivan fragment bound to Hsp90 (see here). It did not suggest an isopropyl replacement for the methoxy group, but it did suggest methyl. Trying a newer example unlikely to have been part of the training set did not recapitulate the ethoxy in the BTK ligand compound 18 (see here), but did suggest a number of interesting and plausible rings. Calculations took a few minutes on my aging personal Windows laptop using Firefox.
In contrast to the hyperbolic claims too often seen in the field, the researchers conclude the Chem. Sci. paper modestly: “though not a substitute for a trained medicinal chemist, DeepFrag is highly effective for hypothesis generation.”
Indeed – I recommend playing around with it. We may still be some way from SkyFragNet, but we’re making progress.

22 November 2021

Selective fragments vs GPCRs, guided by modeling

Earlier this year we highlighted a fragment optimization success story against a G protein-coupled receptor (GPCR) which made no use of structural information. Due to the difficulty of crystallizing these membrane-bound proteins, structures have been rare for this large class of drug targets. Advances in crystallography are starting to change that. In a recent open-access Chem. Commun. paper, Jens Carlsson and collaborators at Uppsala University and the US National Institutes of Health make use of the increasing availability of such structures to develop potent, selective inhibitors.
The researchers were interested in A1 and A2A adenosine receptors (A1AR and A2AAR), targets for a variety of ailments from cancer to cardiovascular diseases. (A2AAR was the subject of this blog post a few months ago.) In the current study, the researchers wanted to know whether structures and molecular dynamics (MD) simulations could guide production of selective inhibitors.
Previous computational and experimental work from the authors had yielded compound 1, with low micromolar activity against A1AR and 7-fold selectivity over A2AAR. Crystal structures of both these proteins are available, though not bound to the small molecule. Docking studies suggested that the ligand would make similar interactions to both proteins, but that there might be an opportunity for increased selectivity towards A1AR due to the presence of a smaller threonine residue compared with a methionine in A2AAR. Nine analogs were designed to grow into this lipophilic pocket, and free energy perturbation and MD simulations suggested that they would have improved affinity for A1AR. This turned out to be the case when the molecules were made and tested in radioligand binding assays.

Although compounds 5 and 9 were more potent, selectivity was not improved. MD simulations suggested this might be due to the small size of the fragments, which could be accommodated in A2AAR by slight shifts in the binding modes. To try to anchor compounds within the pocket, the researchers grew off the phenyl ring, leading to molecules such as compound 15. Borrowing from this molecule and compound 9 led to compound 22, the most potent and selective molecule in the series. (A separate effort led to a somewhat weaker but A2AAR-selective ligand.) Both molecules were found to be antagonists when tested in cells, which was expected given that the crystal structures used for modeling were in the inactive conformation.
The correlation between predicted and measured binding energies was respectable, with a mean unsigned error (MUE) of 1.08 kcal/mol and Spearman’s rank correlation coefficient (ρ) of 0.8 for 24 compounds. Selectivity predictions were also impressive at MUE = 0.48 kcal/mol and ρ = 0.85.
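For those unfamiliar with these statistics, the sketch below shows how MUE and Spearman's ρ are computed, along with a helper that converts a free energy difference into a fold-change in affinity. The function names are hypothetical, and the 298.15 K temperature is my assumption rather than a value from the paper.

```python
from math import exp

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def mean_unsigned_error(predicted, measured):
    """Average absolute difference between predicted and measured values."""
    return sum(abs(p - m) for p, m in zip(predicted, measured)) / len(predicted)

def _ranks(values):
    """1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1  # extend the run of tied values
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = _ranks(x), _ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

def fold_change(ddg_kcal, temperature=298.15):
    """Affinity ratio corresponding to a free energy difference (kcal/mol)."""
    return exp(ddg_kcal / (R_KCAL * temperature))
```

Under that temperature assumption, `fold_change(1.08)` comes out at roughly 6, so a 1.08 kcal/mol error corresponds to being off by about six-fold in affinity, while 0.48 kcal/mol is a little over two-fold.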
This is a nice illustration of using computational methods to improve the affinity of a fragment by more than three orders of magnitude while also increasing selectivity. This particular system is probably on the easier side; we blogged about previous research from this group on A2AAR back in 2013. The researchers note that proteins with larger binding sites and weaker ligands are likely to be more challenging. It will be fun to see efforts towards Class B GPCRs, for example.

15 November 2021

Fragments vs SETD2: a chemical probe

Among the various epigenetic “writers,” only one is capable of trimethylating lysine 36 of histone H3. SET domain-containing protein 2 (SETD2) is thought to be a tumor suppressor, but some evidence suggests it may have the opposite effect in certain cancers. A chemical probe would be useful to resolve these conflicting ideas, and in an (open access) ACS Med. Chem. Lett. paper Neil Farrow and colleagues at Epizyme describe one.
Epizyme has been pursuing epigenetic targets for years and has built a methyltransferase-biased compound collection. A radiometric screen of this library yielded compound 1 and a related molecule. Both were weak inhibitors, but a co-crystal structure with the enzyme revealed the indole buried deep in the substrate binding pocket. Tweaking this led to compound 4, with low micromolar activity.
Substitution off the indole and phenyl moieties ultimately led to compound 25, with low nanomolar biochemical and cell activity. However, this molecule also had low aqueous solubility and poor pharmacokinetics in mice. Recognizing that the lipophilic and aromatic nature of the molecule was likely responsible, the researchers returned to the initial hit. Replacing the phenyl with a cyclohexyl moiety and making a few more modifications ultimately led to EPZ-719.
The pharmacokinetics of EPZ-719 in mice are reasonable, and the molecule is >8000-fold selective against a panel of 14 other histone methyltransferases. It is also fairly clean against a panel of 47 off-targets and 45 kinases. EPZ-719 showed antiproliferative activity in two multiple myeloma cell lines, and more detailed biological studies are promised in a future paper.
This is a nice hit-to-lead story. As the researchers note, “close attention to the physical chemical properties of the inhibitors, in particular basicity, lipophilicity, and aromatic character, led to compounds with attractive cellular activities and in vivo exposures.” Interestingly though, the word “fragment” does not appear once in the paper. Although compounds 1 and 4 venture a bit beyond the rule of three, I would argue that starting with small, low affinity binders and focusing closely on molecular properties is the very definition of fragment-based lead discovery.
A quarter-century of FBLD has influenced the scientific zeitgeist, and a fragment by any other name is still as sweet.

08 November 2021

Fragments in the clinic: 2021 edition

Since our last clinical update in 2020, two new fragment-derived drugs, asciminib and sotorasib, have been approved, bringing the total to six.

The current list contains 52 molecules, with 22 approved or in active trials. As always, this table includes compounds whether or not they are still in development (indeed, some of the companies no longer even exist). Because of this, the Phase 1 list contains a higher proportion of compounds that are no longer progressing. 
Drugs reported as still active in clinicaltrials.gov, company websites, or other sources are in bold, and those that have been discussed on Practical Fragments are hyperlinked to the most relevant post. The list is almost certainly incomplete, particularly for Phase 1 compounds. If you know of any others (and can mention them) please leave a comment.


Drug | Company | Target

Approved
Pexidartinib | Plexxikon | CSF1R, KIT
 | Amgen | KRAS G12C
Venetoclax | AbbVie/Genentech | Selective BCL-2

Phase 3
Pelabresib (CP-0610) | |

Phase 2
AT9283 | Astex | Aurora, JAK2
Indeglitazar | Plexxikon | pan-PPAR agonist
MAK683 | Novartis | PRC2 EED
Navitoclax (ABT-263) | Abbott | BCL-2/BCLxL
 | Cullinan Oncology / Wistar |

Phase 1
ABBV-744 | Abbott | BD2-selective BET
ABT-518 | Abbott | MMP-2 & 9
AT13148 | Astex | AKT, p70S6K, ROCK
AZD5099 | AstraZeneca | Bacterial topoisomerase II
BI 691751 | Boehringer Ingelheim | LTA4H
HTL0014242 | Sosei Heptares | mGlu5 NAM
Navoximod | New Link/Genentech | IDO1

With only two phase 3 molecules in active development it may be some time before the next fragment-derived drug is approved. Then again, in 2020 sotorasib was only in phase 2. While long timelines are common in our industry, good drugs can make remarkably rapid progress.