27 February 2017

When do ligands change their binding modes?

Elaborating a fragment to improve its affinity relies on the assumption that the fragment will maintain its position and orientation during optimization. Although this is usually the case, exceptions are common, and when flips go unrecognized the resulting SAR can be confusing. Is it possible to predict which ligands are most likely to change their binding mode? This question is addressed in a recent J. Med. Chem. paper by Shipra Malhotra and John Karanicolas of the Fox Chase Cancer Center and the University of Kansas.

The researchers scoured the Protein Data Bank (PDB) to find pairs of molecules bound to the same protein where one ligand was a substructure of the other. (In most cases these were not actually from fragment-based efforts, and the two structures were often solved by different research groups.) This generated 297 pairs of crystal structures. Computational and manual analyses revealed 41 instances (14%) in which the larger ligand had a significantly different binding mode than the smaller ligand. Careful inspection revealed that these observations were probably not due to crystallographic artifacts or differences in experimental conditions. The researchers then examined well over a dozen parameters to look for correlations with changes in binding mode.

Size matters: for the 73 rule-of-three compliant smaller ligands, the binding modes were not conserved in the larger ligands 23% of the time. Binding modes changed 30% of the time when the smaller ligand was ~100 Da, but only 5% of the time when the smaller ligand was ~400 Da.

Potency also matters: as might be expected, weaker ligands were statistically less likely to preserve their binding mode. (Of course, as the researchers observe, potency often correlates with size.) More polar ligands, as assessed by clogP, were also less likely to maintain their binding modes.

Looking beyond molecular properties to those of the initial complex, ligands binding to a small pocket were less likely to maintain their binding modes. Also, ligands for which a large amount of solvent-accessible surface area was buried upon binding to the protein were more likely to maintain their binding modes.

Many other properties showed no statistically significant correlation with binding modes. These included ligand efficiency, fraction of the ligand buried, and various descriptions of the protein binding site, such as hydrophobicity and the fraction of polar or aromatic amino acid residues.

The open-access hot-spot finding software FTMap has previously been used to assess when ligands change their conformation, and it performed well on this set of molecules, although because it requires structures of both the larger and smaller ligands it has limited predictive value. The researchers also introduced another computational tool, RMAC (RMSD after Minimization of the Aligned Complex), which did even better.
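
For readers who have not calculated one by hand, the RMSD behind the RMAC name is simple to compute once two sets of matched atom positions are in hand. The sketch below covers only that last step; it does not attempt the alignment or energy minimization the researchers perform, and the function name and coordinates are made up for illustration.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched lists of (x, y, z) positions."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

# Hypothetical coordinates (in angstroms) for three atoms shared by a fragment
# and its elaborated analog; every atom is displaced by 1 angstrom along z.
fragment_pose = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
elaborated_pose = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (1.5, 1.5, 1.0)]

print(rmsd(fragment_pose, elaborated_pose))  # 1.0
```

A small value means the shared substructure sits in essentially the same place in both poses; what counts as "significantly different" is a threshold I have not tried to reproduce here.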

This paper is a fun read, and there’s lots more than can be summarized here, including detailed analyses of specific examples. The researchers have done a great job collecting and synthesizing a huge amount of data. Admirably, all of the calculated properties are available in the supporting information. Of course – in view of the title of this blog – we have to ask, how practical is it? For any given fragment-to-lead program, it is still impossible to predict whether the binding mode will shift. But if you’ve got a particularly small, weak fragment binding to a little dimple on a protein surface, you should probably expect surprises.

20 February 2017

Many measures of molecular complexity

Molecular complexity is a fundamental concept underlying fragment-based lead discovery: fragments, being simple, can bind to more sites on proteins and thus give higher hit rates than larger, more complex molecules. The ultimate example of this is water, which – at 55 M concentration – binds to lots of sites on proteins. But although the concept is easy to describe, it is much harder to quantify: everyone can agree that palytoxin is more complex than methane, but by how much? And if complexity could be measured, could it help in optimizing libraries? This is the subject of a review by Oscar Méndez-Lucio and José Medina-Franco at the Universidad Nacional Autónoma de México published recently in Drug Discovery Today.

There are many ways to measure molecular complexity. Two of the simplest to calculate are the fraction of chiral centers (FCC) and the fraction of sp3 carbons (Fsp3). These range from 0 to 1, and larger numbers imply a higher number of unique molecules with the same formula.
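
Both fractions are easy to work out from atom counts. Here is a minimal sketch, assuming the counts are entered by hand rather than perceived from a structure file. The function names are my own, and dividing the number of chiral centers by the total number of carbons is an assumption about how FCC is normalized, not something taken from the review.

```python
def fraction(count, total):
    """Return count / total, rejecting impossible inputs."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= count <= total:
        raise ValueError("count must lie between 0 and total")
    return count / total

def fsp3(n_sp3_carbons, n_carbons):
    """Fraction of carbons that are sp3-hybridized."""
    return fraction(n_sp3_carbons, n_carbons)

def fcc(n_chiral_carbons, n_carbons):
    """Fraction of chiral centers; the all-carbon denominator is an assumption."""
    return fraction(n_chiral_carbons, n_carbons)

# Atom counts entered by hand from the structures.
print(fsp3(0, 6))  # benzene: no sp3 carbons, so 0.0
print(fsp3(6, 6))  # cyclohexane: every carbon is sp3, so 1.0
print(fcc(1, 3))   # alanine: one stereocenter among three carbons
```

By these measures benzene and cyclohexane sit at opposite ends of the Fsp3 scale, even though neither has a stereocenter.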

More complicated methods to measure complexity abound, but many of these require specialized software. Two that are publicly available are PubChem complexity and DataWarrior complexity. In PubChem, complexity incorporates the number of elements as well as structural features such as symmetry, though stereochemistry is not explicitly considered, and aromaticity is scaled such that both benzene and cyclohexane have the same complexity – a sharp contrast to FCC and Fsp3. DataWarrior uses its own metric, though I couldn’t find the definition. (Ironically, though the software itself is open source, the paper describing it is not.)

So, do more complex molecules have lower hit rates? The researchers looked at several public databases of screening data for dozens of assays against thousands of molecules. Using each of the four metrics, they classified molecules as “simple,” “intermediate,” or “complex”. For FCC and Fsp3, simple compounds did appear to be more promiscuous, in line with theory and with previous findings. However, for PubChem and DataWarrior, the trends were not clear – and even reversed in some cases. The researchers note that the median complexity of molecules in each dataset may vary, and, as Pete has also observed, simple binning strategies can be misleading.
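
To make the binning concrete, here is a rough sketch of the kind of analysis described: assign each molecule to a complexity class and average the hit rate within each class. The cutoffs, the records, and the function names are all hypothetical; I don't know the thresholds the researchers actually used.

```python
from collections import defaultdict

def complexity_class(value, low_cut, high_cut):
    """Label a complexity value using two hypothetical cutoffs."""
    if value < low_cut:
        return "simple"
    if value < high_cut:
        return "intermediate"
    return "complex"

def mean_hit_rate_by_class(records, low_cut, high_cut):
    """Average hit rate per class for (complexity, hit_rate) records."""
    totals = defaultdict(lambda: [0.0, 0])  # label -> [sum of rates, count]
    for complexity, hit_rate in records:
        label = complexity_class(complexity, low_cut, high_cut)
        totals[label][0] += hit_rate
        totals[label][1] += 1
    return {label: total / n for label, (total, n) in totals.items()}

# Made-up data: (Fsp3, fraction of assays in which the molecule was a hit).
records = [(0.1, 0.5), (0.2, 0.25), (0.5, 0.25), (0.9, 0.0)]
print(mean_hit_rate_by_class(records, low_cut=0.3, high_cut=0.6))
```

Moving either cutoff reshuffles which molecules land in which class, which is one reason conclusions drawn from binned data deserve caution.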

Do these different definitions of complexity even measure the same thing? The researchers plotted each pair-wise measurement of complexity for more than 400,000 molecules – for example, Fsp3 vs DataWarrior. Not only are there no universal correlations, those that do exist are conflicting. “For example,” the authors write, “compounds with high FCC values are associated with low PubChem complexity values, whereas the same molecules have high DataWarrior complexity.”

Teddy has previously invoked Justice Potter Stewart and his famous “I know it when I see it” expression, and I think that just about sums up where things stand in terms of molecular complexity. From a practical standpoint this probably doesn’t matter; a complex molecule is not even necessarily more difficult to make, as evidenced by the ease of oligonucleotide and peptide synthesis. Still, it would be nice if someone could come up with a reliable measurement for such a fundamental property – or even demonstrate whether or not such measures are possible.

13 February 2017

Fragments in Cell(s)

Last year we highlighted work out of Ben Cravatt’s group at Scripps on screening covalent fragments in cells. Now, in a new Cell paper, his group and collaborators at the École Polytechnique Fédérale de Lausanne, Bristol-Myers Squibb, and the Salk Institute have gone further by screening for non-covalent fragments in cells.

The researchers started by synthesizing a collection of just 14 “fully functionalized” fragments (FFF). In addition to the variable fragment (averaging 176 Da), each FFF probe contains a diazirine group, which, when exposed to UV light, generates a highly reactive species that can covalently react with proteins (or anything else) in close proximity. (In contrast, covalent fragments typically work via chemistries in which bonds form without requiring UV exposure.) The FFF probes in the current study also contain a "clickable" tag: an alkyne moiety that can react with azide-containing molecules using copper-catalyzed azide-alkyne cycloaddition.

Cells were incubated with 20 µM of each fragment for 30 minutes, then exposed to UV light for 10 minutes on ice to capture non-covalent interactions. The cells were then lysed, treated with an azide-containing fluorescent dye in the presence of copper, and analyzed by gel electrophoresis to visualize those proteins with bound fragments. The results were striking: lots of proteins were labeled, and each fragment labeled a different set of proteins. This is what you would expect for low-complexity molecules, but it is nice to see reality match predictions.

Not content to look at gels, the researchers switched to mass spectrometry for a more global analysis using “stable isotope labeling with amino acids in cell culture” (SILAC). In this approach, one population of cells grown under normal conditions was treated with one of the FFF probes, while a second population of cells containing isotopically labeled proteins was treated with a control probe containing just a methyl group instead of the variable fragment. The resulting cell lysates could then be proteolyzed and analyzed by mass spectrometry; most peptides would show two peaks of similar intensities, one from each isotopically distinct population of cells. However, if an FFF probe bound preferentially to a protein compared with the control probe, the resulting peptide would be enriched.
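
The readout boils down to a ratio of two peak intensities for each peptide. As a minimal sketch, assuming the probe-treated and control-treated intensities have already been extracted from the spectra: the function names, the intensities, and the two-fold cutoff below are all hypothetical, not values from the paper.

```python
import math

def silac_log2_ratio(probe_intensity, control_intensity):
    """log2 of probe-treated over control-treated peak intensity for one peptide."""
    if probe_intensity <= 0 or control_intensity <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(probe_intensity / control_intensity)

def is_enriched(probe_intensity, control_intensity, min_fold=2.0):
    """True if the probe signal exceeds the control by at least min_fold (hypothetical cutoff)."""
    return probe_intensity / control_intensity >= min_fold

# Made-up peak intensities for two peptides, as (probe, control) pairs.
peptides = {"peptide_A": (800.0, 100.0), "peptide_B": (105.0, 100.0)}
for name, (probe, control) in peptides.items():
    print(name, round(silac_log2_ratio(probe, control), 2), is_enriched(probe, control))
```

A log2 ratio near zero corresponds to the "two peaks of similar intensities" case, while a clearly positive value flags a peptide preferentially captured by the FFF probe.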

Examining 11 FFF probes at 200 µM concentration allowed the researchers to identify an impressive 2000 different protein targets. Both soluble and membrane proteins were found, with expression levels ranging over 100,000-fold (i.e. the technique seems to work for both rare and abundant proteins). Remarkably, only about 17% had known ligands. There was also little overlap with the proteins targeted by the researchers’ previously described covalent fragments.

Where did the FFF probes bind? An analysis of 186 proteins whose crystal structures had previously been reported showed that about 80% of the modified peptides were close to a computationally predicted pocket, as might be expected.

Next, the researchers made or purchased analogs of some of their FFF probes. When added to screens, these decreased labeling of hundreds of targets; this competition assay both suggests that the FFF probes make specific interactions and provides more potent analogs. Two proteins – the enzyme PTGR2 and the transporter SLC25A20 – were studied in some detail. Probe 8 modified two peptides near the active site of PTGR2 and could be competed by compound 20. Compound 20 was also a modest inhibitor of the enzyme. Further modification led to compound 22, with nanomolar activity against the isolated enzyme and in cells. Since this protein previously lacked any good chemical probes, this could be useful.

This approach also lends itself well to phenotypic screening, so the researchers expanded their FFF probes to 465 members, increasing the size of the variable fragment portion to an average of 267 Da. They also made competitor molecules for most of these, which contained the fragment but not the alkyne or photoreactive group.

The researchers screened about 300 of their new FFF probes (at 50 µM each) to look for molecules that would increase the differentiation of mouse preadipocytes to adipocytes. This led to nine hits, one of which was active at 10 µM. SILAC experiments revealed the target of this to be a somewhat obscure membrane protein called PGRMC2. Subsequent experiments suggested that PGRMC2 is a positive regulator of adipogenesis, and that the identified compound is an activator.

This is a remarkable paper, and it is impressive that the researchers have described not just the approach but several success stories – each of which could well form a stand-alone publication. The covalent platform described last year has already led to a company – Vividion – which recently raised $50 million, and I’m sure the new approach will find use in both academia and industry.

06 February 2017

Beware self-reacting fragments

Long-time readers will know that I have a peculiar fascination for artifacts of all kinds, particularly when they provide learning opportunities. A lovely example by Gerhard Klebe (Philipps-Universität Marburg) and collaborators has just appeared in Angew. Chem. Int. Ed.

The researchers have long been using the aspartic protease endothiapepsin (EP) as a model protein: we previously discussed how they compared half a dozen fragment-finding methods against EP, more recently arguing that crystallography is the best of the bunch. The new paper focuses on Compound 1. This molecule was a hit in five out of six fragment screens, each employing a different method, and was among the top ten hits in four of the screens. It produced the highest thermal shift (+3.4 °C), strongly inhibited the enzyme in two different biochemical assays, and even showed a dissociation constant of 115 µM by isothermal titration calorimetry (ITC).


Crystallography, though, told a different story. The researchers obtained high resolution structures (initially 1.25 Å, and ultimately 1.03 Å!). These revealed that the bound ligand was actually compound 2, which is composed of three molecules of compound 1. A variety of experiments, including anomalous scattering, high resolution mass spectrometry (HR-MS), and MS/MS fragmentation, support this assignment.

So what’s going on? The team notes that, although compound 1 does not aggregate and is not a PAINS compound, high-level quantum mechanical modeling suggested that the chlorine is susceptible to nucleophilic displacement, which probably wouldn’t surprise many medicinal chemists. Rearrangement of the resulting dimeric molecule produces compound 4, which could then react with yet another molecule of compound 1 through a radical mechanism to produce compound 2.

Allowing compound 1 to sit in buffer or methanol provided support for this mechanism and allowed the isolation of compound 4 and other degradation products, though compound 2 itself could not be detected. The researchers suggest it is particularly reactive and only stable when surrounded by the protein.

I applaud the investigators for pursuing this fascinating bit of science. This is academic research in the best sense of the phrase.

This is also the kind of investigation that would fall outside the scope of most industrial researchers, where the mandate is to discover promising drug leads as quickly as possible. More somberly, this story could have ended in embarrassment or worse had the researchers been less rigorous. The difficulty (and unlikelihood) of such lengthy investigations is why triaging shortcuts such as PAINS filters have been introduced, and why scientists using these tools must still be cautious: even molecules that aren’t PAINS can act through pathological mechanisms.

This is also why I believe that arguments that PAINS filters are inadequately defined and should thus be discarded are misguided. Sure, some PAINS molecules are drugs, and any rubric can be improved. But the nice thing about fragment screens is that they often produce a plethora of hits to pursue. A flawed triaging scheme will jettison some pearls among the pebbles, but without triage, far more resources will be lost chasing will-o'-the-wisps.