One of the key selling points of
fragment-based lead discovery is that small fragments can search chemical space
much more efficiently than larger compounds, since there are fewer possibilities.
Nonetheless, the numbers are still daunting: more than 166 billion molecules
with up to 17 non-hydrogen atoms. The question of how many of these are
commercially available has come up before. In a paper just published online in Prog. Biophys. Mol. Biol., Chris Murray
and colleagues at Astex take a new look at this – and related – questions.
Rather than considering all possible
molecules, the researchers focused on six-membered rings with one or two small substituents
of no more than six non-hydrogen atoms. Six-membered rings are found in many
drugs, so this is a useful area of chemical space on which to focus. The
researchers first considered “topologies,” simple two-dimensional
representations of molecules. In the coarsest version, benzene, cyclohexane,
pyridine, and piperidine would all have identical topologies: a six-membered
ring with no substituents.
The researchers looked at how many
topologies having up to 16 atoms were listed in the Available Chemicals
Directory (ACD) of 2.7 million commercial molecules. Even using the coarse
definition where all non-hydrogen atoms were considered equivalent, fewer than
half of 16-atom topologies are commercially available. At finer resolution (for
example, differentiating carbon from nitrogen), the numbers dropped even more:
fewer than 4% of the 2223 16-atom topologies with a pyridazine core were
available.
However, things get better the smaller the
molecule. When considering only molecules with 11 non-hydrogen atoms, all of
the coarsest topologies are available, as are more than 70% of pyridazines.
From this, the researchers concluded:
We need to focus on fragments with lower heavy atom counts and… improve the sensitivity of our screening methods to make sure that we can identify the binding of these smaller fragments.
The rest of the paper discusses how they
applied this approach, and what lessons they learned.
The researchers assert that X-ray
crystallography (upon which Astex was founded) is the most sensitive screening method.
That may elicit some debate, but is defensible given the presence of extremely
weak binders (water, buffer components, detergents) in many crystal structures.
They also argue that while NMR may allow detection of fragments with lower
solubilities, this may not be a good thing.
Of the 1633 fragments that were in the
Astex library between 2001 and 2007, 22% came up as X-ray hits (i.e., they showed
up in at least one crystal structure). Strikingly, fragments with 11 or 12
atoms were enriched far above their representation in the overall library,
while fragments with 17 or more atoms were underrepresented. This is a
beautiful confirmation of the “molecular complexity” hypothesis, the idea that
there is a sweet spot where molecules are large enough to make productive
interactions with a target but not so complex that negative interactions become
dominant.
These results led the researchers to
redesign their library to focus on fragments having fewer than 17 non-hydrogen
atoms, which entailed considerable custom synthesis. The resulting library has 1371
fragments, of which 47% have shown up as X-ray hits. The average size of hits
is the same as that of the overall library (12.2 vs 12.4 non-hydrogen atoms and
172 vs 176 Da, respectively), though the hits are slightly more lipophilic
(cLogP = 1.1 vs 0.9).
What about “three-dimensionality”? This is
a topic that has been discussed quite a lot (here, here, here, here, and here, for starters), so it is nice to have some solid data. One problem is how to
define three-dimensionality: simple metrics such as Fsp3 don’t
account for the fact that aromatic compounds such as 2,6-substituted biphenyls
can be very non-planar. Many people use PMI, but the Astex researchers chose
deviation from planarity (DFP). This method puts a hypothetical plane through
the molecule that minimizes the deviation of all non-hydrogen atoms from the
plane; the average deviation from the plane for each molecule is calculated in
Ångstroms. So, for example, benzene has DFP = 0.0 Å, while cycloleucine has DFP
= 0.54 Å. In this study, the researchers used a single conformation for each
molecule, but since these fragments have on average only 1.3 rotatable bonds
this is probably a reasonable simplification.
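As a rough illustration of the DFP idea described above, here is a minimal sketch in plain Python. It assumes the best-fit plane is the least-squares plane through the centroid of the non-hydrogen atoms (the post does not state the exact fitting criterion), and the function names are hypothetical, not taken from the paper. The plane normal is found by power iteration rather than a library eigen-solver, purely to keep the example dependency-free.

```python
import math


def _plane_normal(cov, start):
    """Power iteration on (trace*I - cov). Its dominant eigenvector is the
    direction of least variance, i.e. the normal of the least-squares plane."""
    trace = cov[0][0] + cov[1][1] + cov[2][2]
    m = [[(trace if i == j else 0.0) - cov[i][j] for j in range(3)]
         for i in range(3)]
    v = list(start)
    for _ in range(500):
        w = [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # degenerate case: keep the current direction
            break
        v = [x / norm for x in w]
    return v


def deviation_from_planarity(coords):
    """Mean absolute distance (same units as the input, e.g. Angstroms) of
    heavy-atom coordinates from their least-squares plane (an assumption)."""
    n = len(coords)
    centroid = [sum(p[k] for p in coords) / n for k in range(3)]
    centered = [[p[k] - centroid[k] for k in range(3)] for p in coords]
    # 3x3 scatter matrix of the centered coordinates
    cov = [[sum(q[i] * q[j] for q in centered) for j in range(3)]
           for i in range(3)]
    best = None
    # Try three orthogonal starting vectors and keep the flattest fit, so a
    # start that happens to be orthogonal to the true normal cannot mislead us.
    for start in ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)):
        normal = _plane_normal(cov, start)
        dfp = sum(abs(sum(q[k] * normal[k] for k in range(3)))
                  for q in centered) / n
        if best is None or dfp < best:
            best = dfp
    return best


# Idealized benzene carbons: a regular hexagon of radius 1.39 in the xy-plane
benzene = [(1.39 * math.cos(math.radians(60 * i)),
            1.39 * math.sin(math.radians(60 * i)), 0.0) for i in range(6)]
print(deviation_from_planarity(benzene))
```

A perfectly flat ring gives a DFP of essentially zero, in line with the benzene value quoted above. Real use would take heavy-atom coordinates from a single 3D conformer, as the researchers did.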
Roughly 40% of the Astex library has a DFP
< 0.05 Å, but these “flat” fragments were enriched to ~50% among hits. Not
surprisingly, kinase hits tended to be even more two-dimensional (>60%), but
even protein-protein interaction (PPI) hits were, if anything, slightly more
planar than the overall collection, which is consistent with another recent study. Indeed, there seems to be nothing special at all about PPI hits, more
than half of which were also found against non-PPI targets. The researchers
argue that 3D-fragments are inherently more complex and thus less likely to
show up as hits, which supports Teddy’s Safran Zunft challenge.
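The enrichment of flat fragments can be put as a simple ratio. The sketch below uses the approximate figures quoted above (roughly 40% of the library and roughly 50% of hits); the function name is hypothetical, and the ratio is only as precise as those rounded percentages.

```python
def enrichment(fraction_in_hits, fraction_in_library):
    """Ratio of a class's share among hits to its share in the library.
    Values above 1 mean the class is over-represented among hits."""
    return fraction_in_hits / fraction_in_library


# Approximate values from the post: ~50% of hits vs ~40% of the library
flat_enrichment = enrichment(0.50, 0.40)
print(round(flat_enrichment, 2))
```

By the same measure, a class found equally often in hits and library would score 1.0.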
One of the arguments in favor of
three-dimensionality is that such molecules may have better physicochemical properties, and the researchers examine the DFP for fragments and resulting
leads. It turns out that there is a weak correlation between the shapeliness of
a fragment and that of the resulting lead, but there are many exceptions (such
as this one).
This paper contains one of the least scientifically relevant passages I have read in a long time.
"In order to assess the deviation from planarity for molecules we use an approach that has recently been described and analysed in detail by Firth et al. (Firth et al., 2012). A number of years ago, we independently developed this method and here we give a brief description of the approach."
I think I'll try to claim gravity the next time I'm writing up a manuscript!
I really enjoyed the paper, but I don't quite agree with some of the authors' points with respect to the 3D fragments. The authors state that "even for PPI targets, there is still a higher % of planar X-ray hits than the % in the library itself (although . . . not statistically significant)". Then the authors state "non-planar fragments . . . generally show a slightly poorer hit rate than flat molecules".
My interpretation of the data shown is that for kinases, the aromatic hit rate is higher. But for PPIs, the hit rates of planar and non-planar hits appear pretty comparable.
I really like the point about trying to de-emphasize hit rates from fragment libraries and instead to focus on whether or not the hits "can be optimised into good chemical leads or drugs".
Along those lines, for very difficult targets with shallow 3D pockets, the hit rates of non-planar molecules ought to be quite low, but perhaps with higher hit-rates than aromatics, as the complementarity of aromatics may be too weak to detect binding. However, this would depend on the fragment library containing the right non-planar fragments that can fit...a tough challenge given the much larger chemical space of non-planar fragments.
Thus, hit rates can be quite deceiving for that reason as well. If you happen to have the right fit for a tough target with a shallow, spherical pocket, hit rates might be exceedingly low, but also of tremendous value and able to generate leads. And in those cases, aromatics might not show detectable binding. And if your frag-lib's non-planar space is insufficient, none of the non-planars might be detectable either, but if you've got the right substituted, saturated ring, you'll get the hits.
Lastly, I really like the discussion of how Astex found the chemical space coverage of commercially available fragments to be sparse, thus propelling them to "enhance the library via synthesis".