03 May 2009

More on DOCKing fragments and sampling chemical space

A few weeks ago, we highlighted a paper from Brian Shoichet’s group at UCSF demonstrating that computational screening could successfully identify fragments binding to a protein target, and that the binding modes predicted were actually observed experimentally. A companion paper just published online in PNAS now extends these results, and also beautifully illustrates that it is possible to cover much more chemical space with fragments than with lead-like molecules.

Denise Teotico, Shoichet, and colleagues used the program DOCK 3.5.54 to screen 137,639 fragments against AmpC beta-lactamase, a bacterial protein responsible for antibiotic resistance. The protein had previously been the target of HTS and computational screens of drug-like molecules. The computational screens had modest success rates (2-7%), but the HTS screen was a total bust: more than 95% of the 1200-plus hits from the 70,000+ compound screening collection turned out to be false positives, mostly aggregators, leaving just a few dozen true inhibitors, all of which proved to be covalent (irreversible).

In contrast, of the 48 high-scoring fragments that were experimentally tested, 23 had Ki values better than 10 mM, for a hit rate of 48%. The authors also assessed potential for false negatives by choosing 20 random fragments and testing these for inhibition; only one showed inhibition (with a Ki value of 3.1 mM), and this molecule had scored in the top 5% of docked fragments.
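For readers who want to check the arithmetic, here is a minimal Python sketch using only the counts quoted above. The function name `hit_rate` and the "enrichment" ratio are illustrative choices made here, not something taken from the paper, and with only 20 random fragments the control rate is a very rough estimate.

```python
def hit_rate(hits, tested):
    """Fraction of experimentally tested compounds that were active."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return hits / tested

# Counts quoted in the post
docked = hit_rate(23, 48)          # high-scoring fragments with Ki better than 10 mM
random_control = hit_rate(1, 20)   # randomly chosen fragments showing inhibition

# Illustrative ratio of the two rates (an assumption of this sketch, not a
# statistic reported by the authors)
enrichment = docked / random_control

print(f"docked hit rate:  {docked:.1%}")
print(f"random hit rate:  {random_control:.1%}")
print(f"rough enrichment: {enrichment:.1f}x")
```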

The paper presents a fascinating empirical test of the Hannian chemical complexity hypothesis. Starting with the 23 active fragments, the researchers calculated how many lead-like molecules (up to 25 non-hydrogen atoms) could contain these fragments. Of the roughly 47,000,000,000 to 430,000,000,000 possible lead-like molecules, only 675 are commercially available. By repeating this analysis with fragment-sized molecules (up to 17 non-hydrogen atoms), the size of the haystack was reduced by six orders of magnitude: only about 10,000 possible molecules contain these fragments, of which 93 are commercially available. Moreover, many of the active fragments represent unique chemotypes not previously observed in AmpC inhibitors. As the authors note:

The chances of discovering interesting chemotypes for biological targets is many orders of magnitude higher when targeting molecules in the fragment weight range than even at slightly higher size ranges.
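The haystack comparison is easier to appreciate as fractions. The short Python sketch below uses only the figures quoted above (the variable names are mine, and the counts of possible molecules are the rough estimates given in the post) to compute what share of each chemical space is actually purchasable.

```python
import math

# Figures quoted in the post; the "possible" counts are rough estimates
lead_like_possible = (47e9, 430e9)   # low and high estimates, up to 25 heavy atoms
lead_like_available = 675
fragment_possible = 10_000           # up to 17 heavy atoms
fragment_available = 93

# Fraction of each space that is commercially available
fragment_coverage = fragment_available / fragment_possible
lead_like_coverage = [lead_like_available / n for n in lead_like_possible]

# How much smaller the fragment haystack is, in log10 units
shrink_log10 = [math.log10(n / fragment_possible) for n in lead_like_possible]

print(f"fragment coverage:  {fragment_coverage:.2%}")
for n, cov, s in zip(lead_like_possible, lead_like_coverage, shrink_log10):
    print(f"lead-like space of {n:.1e}: coverage {cov:.1e}, "
          f"haystack 10^{s:.1f} times larger than the fragment space")
```

In other words, just under 1% of the relevant fragment space can be bought, versus roughly one part in a hundred million to a billion of the lead-like space.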

But, as the paper asks, “are the docking predictions right for the right reasons?” The researchers solved the crystal structures of 8 fragments bound to AmpC. Four of these reproduced the docking predictions well, two were somewhat different, and two were way off. In these last two cases, the protein itself adopted conformations different from those used in the docking studies.

Protein conformational flexibility is remarkably common, and likely to be a persistent difficulty for computational methods. Clearly, current computational methods can’t identify all possibilities, particularly with fluxional proteins. Still, especially with relatively rigid proteins, computational fragment-screening may reveal chemotypes that HTS won’t.

A notable feature of the fragments is their relatively poor ligand efficiency: with one unusual exception (a phosphinate), all of the active fragments have ligand efficiencies less than 0.3 (kcal/mol)/atom. AmpC has a large, open active site, and the authors suggest that the failure of other hit-ID methods against this target may reflect issues such as solubility.
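As a reminder of where that number comes from, ligand efficiency is usually computed as the binding free energy divided by the number of non-hydrogen atoms. The Python sketch below assumes the common convention LE = -RT ln(Ki) / N at 298.15 K, treating Ki as an approximation of the dissociation constant. The example fragment (Ki of 1 mM, 14 heavy atoms) is hypothetical and is not one of the compounds from the paper.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)

def ligand_efficiency(ki_molar, heavy_atoms, temperature_k=298.15):
    """Ligand efficiency in (kcal/mol)/atom, assuming LE = -RT ln(Ki) / N."""
    if ki_molar <= 0 or heavy_atoms <= 0:
        raise ValueError("ki_molar and heavy_atoms must be positive")
    delta_g = R_KCAL * temperature_k * math.log(ki_molar)  # negative for Ki < 1 M
    return -delta_g / heavy_atoms

# Hypothetical fragment: Ki = 1 mM, 14 non-hydrogen atoms
le = ligand_efficiency(1e-3, 14)
print(f"ligand efficiency: {le:.2f} (kcal/mol)/atom")
```

Under these assumptions, a millimolar fragment of this size lands just below the 0.3 mark, which gives a feel for the kind of affinities involved.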

It remains to be seen whether these fragments can be advanced to low nanomolar inhibitors, but at least fragment-screening has provided many new starting points. And the paper demonstrates, once again, that triaging a fragment set computationally can be an effective means for concentrating the needles in a haystack.
