27 June 2022

CovPDB: a free, searchable database of covalent protein-ligand structures

Last week we highlighted KinaFrag, a database of kinase-fragment complexes. Continuing the theme, this week brings us CovPDB, a database of high-resolution covalent protein-ligand structures. The database was described by Stefan Günther and colleagues at Albert-Ludwigs-Universität Freiburg in an open-access Nucleic Acids Res. paper earlier this year.
 
The researchers downloaded all structures from the Protein Data Bank (PDB) as of 31 August 2020 and extracted those with covalently bound ligands refined to a resolution of 2.5 Å or better. These were then manually curated to remove cofactors (such as retinal) and crosslinkers. Next, the chemical structures of the pre-reacted ligands were extracted from the primary citations. Everything was then combined into an easy-to-use database, and all the contents can also be downloaded.
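
For readers who want to apply a similar filter to their own list of structures, here is a minimal Python sketch of the selection step. The record type, field names, placeholder IDs, and exclusion list are all assumptions made for illustration; this is not the researchers' pipeline, and their manual curation cannot really be reduced to a lookup table.

```python
from dataclasses import dataclass


# Hypothetical record type; the field names are illustrative,
# not the schema used by CovPDB or the PDB.
@dataclass(frozen=True)
class Complex:
    entry_id: str
    ligand: str
    resolution: float  # in angstroms; lower numbers mean better resolution
    covalent: bool


# Illustrative stand-in for the manual curation step (assumption).
EXCLUDED_LIGANDS = {"RET"}  # retinal, a cofactor

MAX_RESOLUTION = 2.5  # angstroms, the cutoff quoted in the post


def select_complexes(entries):
    """Keep covalent complexes at 2.5 angstroms or better that are not excluded."""
    return [
        e for e in entries
        if e.covalent
        and e.resolution <= MAX_RESOLUTION
        and e.ligand not in EXCLUDED_LIGANDS
    ]


# Made-up entries; the IDs are placeholders, not real PDB codes.
entries = [
    Complex("0AAA", "LIG", 1.8, True),   # kept
    Complex("0AAB", "LIG", 3.1, True),   # dropped: resolution too low
    Complex("0AAC", "RET", 2.0, True),   # dropped: cofactor
    Complex("0AAD", "LIG", 2.5, True),   # kept: exactly at the cutoff
    Complex("0AAE", "LIG", 1.5, False),  # dropped: not covalent
]

kept = select_complexes(entries)
print([e.entry_id for e in kept])  # ['0AAA', '0AAD']
```

Note that "at least 2.5 Å" translates to a numerical value of 2.5 or smaller, which is why the comparison uses `<=`.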
 
CovPDB contains 2,294 unique protein-ligand complexes, with 733 different proteins and 1,501 different ligands. A total of 93 different types of warheads are represented, from exotic (arsine oxide) to conventional (vinyl carbonyl, including acrylamides). These are further grouped into 21 covalent mechanisms.
 
As expected, covalent bonds to cysteine and serine are most common, with 959 and 830 examples, respectively. Lysine, with 205 representatives, is a distant third, but I was surprised that various unreactive amino acid residues such as glycine, valine, and proline also showed up. Closer inspection revealed that these are N-terminal residues; the ligand reacts with the free amine. Though these sorts of bonds occur with several drugs, including carfilzomib and voxelotor, it might be nice to have separate annotations to keep these from being confused with residues that react exclusively at the side chain.
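
To make the annotation suggestion concrete, here is a rough Python sketch of how an attachment might be labeled. It assumes you already know the name of the protein atom bonded to the ligand and the residue's position in the chain; the function and its arguments are hypothetical and not part of CovPDB. It also assumes the first modeled residue is the true N-terminus, which will not hold if leading residues are disordered or a tag is present.

```python
# Standard PDB names for backbone atoms.
BACKBONE_ATOMS = {"N", "CA", "C", "O"}


def attachment_site(atom_name, residue_index, first_index):
    """Label where a ligand is attached (hypothetical helper).

    atom_name:     PDB name of the protein atom bonded to the ligand
    residue_index: position of that residue in the chain
    first_index:   position of the first residue in the chain
    """
    if atom_name == "N" and residue_index == first_index:
        return "N-terminal amine"
    if atom_name in BACKBONE_ATOMS:
        return "backbone (other)"
    return "side chain"


# Illustrative cases, not taken from the database.
print(attachment_site("N", 1, 1))    # N-terminal amine
print(attachment_site("SG", 12, 1))  # side chain
print(attachment_site("NZ", 1, 1))   # side chain
```

The last case is the interesting one: a lysine that happens to be the first residue but reacts through its side-chain amine is still labeled as a side-chain attachment.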
 
Browsing by ligand, protein, complex, warhead, covalent mechanism, or targeted residue is straightforward, as is searching by multiple methods, including ligand similarity and substructure. Each entry has its own page with a wealth of information, including an interactive 3D viewer. Here’s the entry for one of the Tethering hits that ultimately led to sotorasib.
 



 
CovPDB should be especially useful to computational folks looking to build models based on high-quality data, but it's also fun to browse for new ideas and inspiration.
 
Importantly, the researchers state that they will update this database annually. As covalent drug discovery (including with fragments) becomes increasingly prominent, I expect the size of CovPDB to grow rapidly.

20 June 2022

KinaFrag: a free, searchable database of kinase fragments

Four of the six approved fragment-derived drugs are kinase inhibitors, and three of these bind in the active site. Despite these successes, there are plenty of opportunities for new kinase-directed drugs, particularly those targeting cancer resistance mutations. In a recent Brief Bioinform. article, Guang-Fu Yang and colleagues at Central China Normal University describe a new tool to facilitate these discoveries.
 
The researchers started by trawling multiple databases such as kinase.com, DrugBank, ChEMBL, and the Protein Data Bank for kinase inhibitors. The results were combined and collated to yield a set of 7783 kinase-inhibitor fragment complexes, with more than 3000 unique fragments. Most of these bind in the “front cleft” of the active site, where the adenine of ATP normally binds, but several hundred also sit in the so-called back pocket or the intervening area.
 
What’s nice is that all this information is available on a free website called KinaFrag. You can download the structures yourself, but the site can also be browsed or searched. Fragments are annotated with links to various databases; here’s an example.
 
 
There are some bugs. While I was able to search by physicochemical parameters such as molecular weight and number of hydrogen bond donors, I could not get the substructure search to work. I’d be curious as to whether readers could do so.
 
To demonstrate the utility of KinaFrag, the researchers describe a case study in which they started with the anticancer drug larotrectinib, which inhibits TRK family kinases. However, the molecule is less effective against several mutations observed in the clinic. Examining the bound structure revealed that the mutations introduce steric clashes. Retaining the hinge-binding fragment while performing virtual screening of fragments from KinaFrag led to molecules such as YT3, potent against both wild-type TRKA and two resistance mutants, and further optimization resulted in YT9.
 

Not only was YT9 active against the wild-type and mutant forms of TRKA, it showed good oral bioavailability and pharmacokinetics in rats. Encouragingly, the molecule slowed tumor growth in both wild-type and mutant TRKA mouse xenograft models.
 
One could debate whether this is an example of FBLD; the discovery of YT9 could also be considered a classic case of scaffold hopping. But semantics aside, this is a nice example of thinking in terms of fragmenting molecules. More broadly, KinaFrag looks like a useful tool for work on kinases – especially if the substructure search works.

13 June 2022

Fragments vs HIV-1 Protease: Pocket-to-Lead

The drugging of HIV-1 protease is a classic structure-based design success story, as discussed in a guest post by Glyn Williams from the early days of the SARS-CoV-2 pandemic. The peptide origins of approved inhibitors such as saquinavir are obvious, and the residual structural features can present problems for oral bioavailability. Although there have been fragment screens against the enzyme, the hits do not seem to have been pursued, perhaps due in part to the number of approved drugs. But viruses never stop mutating, and developing new chemical matter is prudent. In a recent J. Med. Chem. paper, Yuki Tachibana and colleagues at Shionogi describe a fragment-based approach.
 
The researchers started by performing a virtual screen, but none of the hits were active when tested in a biochemical assay. The active site of HIV-1 protease contains four hydrophobic subsites, and none of the virtual hits filled all four of them. Thus, the researchers chose to focus on fragments that could make some of the interactions while providing growth vectors to additional subsites. They call this a “pocket-to-lead” strategy.
 
Fragment 5 docked nicely into the active site; the hydroxyl group makes interactions with the catalytic aspartic acid residues, while the phenyl ring tucks into the S2 pocket. Growing into the S2’ and S1’ pockets led to molecules such as compound 9, which showed weak but detectable activity. (Astute readers will notice that the stereochemistry around the hydroxyl moiety has changed; both diastereomers are active.) A crystal structure of compound 9 bound to HIV-1 protease confirmed the predicted binding mode.

 
Examination of the crystal structure revealed that the para-fluorobenzyl substituent was not completely filling the S1’ pocket, and was also in a strained conformation. Replacing this with an alkyl substituent led to low micromolar compound 12. Finally, growing into the S1 subsite led to compound 14, a low nanomolar inhibitor with sub-micromolar antiviral activity.
 
This is a nice example of structure-guided, computationally enabled fragment-based lead discovery that bears some similarity to the V-SYNTHES method we highlighted earlier this year. As the researchers note, the cyclic lactam found in fragment 5 had been used previously in HIV-1 protease inhibitors. It might have been possible to get to something similar to compound 14 from that earlier molecule. But regardless, compound 14 is emphatically non-peptidic. Whether it will lead to superior drugs remains to be seen, but the paper does say that further optimization is underway.

06 June 2022

What to make first? A new “Ring Replacement Recommender” provides suggestions

So you’ve run a fragment screen, gotten some hits, and validated them. What then? Looking for in-house or commercial analogs is always a good idea, but if you’re serious about a project you’ll eventually need to do chemistry, for example replacing one ring with another (say, a pyridyl for a phenyl). The possibilities are almost endless, especially if you don’t know how your fragment binds. In a new Eur. J. Med. Chem. paper, Peter Ertl and colleagues at Novartis describe a “Ring Replacement Recommender” to rapidly improve biological activity.
 
To determine which replacements are likely to improve affinity, the researchers turned to ChEMBL, a database of more than 2 million molecules and associated biological activity extracted from tens of thousands of publications. From these, more than 68,000 chemical series were chosen for analysis. Each series had on average 16 members, and at least three. The biological activity of each member of a series was compared with other members of the same series. (Importantly, the researchers intentionally excluded anti-targets such as hERG and CYPs so the tool wouldn’t inadvertently improve binding to these.) Focusing only on ring replacements that were reported in at least five publications led to a set of 26,762 changes. Changes could be as modest as adding a methyl substituent or more elaborate such as changing a single aromatic ring to a fused aromatic-aliphatic ring system.
 
One would think that most changes would have little effect, as had previously been seen in the case of methyl additions. Indeed, about 65% of the replacements caused shifts in potency of 2-fold or less, which is probably within experimental error. However, 2860 replacements of 245 rings improved affinity at least 2-fold (averaging 3.5-fold), with 223 cases yielding greater than ten-fold improvements.
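
The 2-fold threshold is easy to apply to your own matched pairs. Below is a minimal Python sketch, assuming potency is reported as IC50 in nanomolar for the compound before and after a ring replacement. The function names and example values are made up for illustration, and this is not necessarily how the researchers computed their statistics. The text leaves the exact 2-fold boundary ambiguous; this sketch counts it as an improvement.

```python
def fold_change(ic50_before_nm, ic50_after_nm):
    """Potency ratio for a replacement; values above 1 mean improvement."""
    return ic50_before_nm / ic50_after_nm


def classify(fold, threshold=2.0):
    """Bucket a fold change using a symmetric threshold (2-fold by default)."""
    if fold >= threshold:
        return "improved"
    if fold <= 1 / threshold:
        return "worsened"
    return "within noise"


# Made-up matched pairs: (IC50 before, IC50 after), both in nM.
pairs = [(100, 10), (100, 80), (50, 200)]

for before, after in pairs:
    fold = fold_change(before, after)
    print(fold, classify(fold))
# 10.0 improved
# 1.25 within noise
# 0.25 worsened
```

A lower IC50 means a more potent compound, so dividing the starting value by the new one gives a number greater than 1 when the replacement helps.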
 
Analyzing the data further, the researchers found 80 ring systems that frequently led to improvements in affinity, and they suggest these could be used as “universal” or privileged building blocks. Strikingly, 74 of these are aromatic, confirming work from Cohen we highlighted in 2020 that proteins may favor “flat” rather than shapely molecules.
 
The researchers also extracted 9515 drugs and clinical compounds from ChEMBL and examined the component fragments. Of the 80 ring systems in the universal set, 19 are found in 50 or more drugs, with another 37 found in at least 5 drugs. This set may be a particularly attractive go-to list.
 
Importantly, not only are all the replacements available in the Supporting Information, the researchers have created a handy and free online tool. Just click on a ring of interest and the Ring Replacement Recommender provides suggestions, along with the average fold improvement observed and the number of publications used for the calculation.
 
To see how well it works, I looked at a couple of recent examples that entailed ring changes. The indole to indazole replacement used in the TLR7/8 work described last month was not suggested by the Recommender, though in that case the researchers had the benefit of a crystal structure. On the other hand, a cyclobutyl to phenyl substitution for SARS-CoV-2-3CLp was correctly predicted to be beneficial.
 
Of course, as we’ve said repeatedly, affinity is only part of the battle in drug discovery, and the researchers emphasize that their recommendations may not improve physicochemical or pharmacokinetic properties. But for the earliest stage of a program, and especially in the absence of other data, it’s worth giving the Recommender a try.