This is the last
week for our poll on how much structural information you need to begin
optimizing a fragment – please vote on the right-hand side of the page if you
haven’t already done so. We’ve recently discussed crystallography and NMR, so
this post is focused on computation.
Predicting hot
spots – regions on proteins where fragments are likely to bind – is becoming
something of a cottage industry (see for example here and here). These can
provide some indication as to whether or not a protein is ligandable, and
ideally even provide starting points for a lead discovery program. But how should
someone searching for promising hot spots and binders choose a method, or evaluate
a new one? In a
recent paper in J. Med. Chem., Marcel
Verdonk and colleagues at Astex provide a method as well as a validation set,
both of which are freely available.
The validation
set consists of 52 high-quality crystal structures pulled from the Protein Data Bank
(PDB). These were chosen to be maximally diverse in terms of fragments (41 of
them) and proteins (45). The fragments were not taken in isolation; rather,
fragments of larger molecules were considered if they bound in the same region
of the protein when presented from at least three different ligands. For
example, the researchers note that there are no structures in the PDB of
resorcinol bound to HSP90A, even though this is a privileged fragment that
usually binds in a conserved fashion at the ATP-binding site in the context of
a larger molecule.
Fragments chosen
for the validation set have at most one rotatable bond and are quite small,
just 5 to 12 non-hydrogen atoms. However, as they are culled from larger
molecules, some (such as adamantane) are more lipophilic than standard “rule of three” guidelines would allow.
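The size and flexibility criteria described above can be written as a simple filter. The sketch below is only an illustration, not the selection code used by the researchers: the `Fragment` class, the `passes_size_filter` function, and the third entry in the list are hypothetical, and the heavy-atom and rotatable-bond counts are typed in by hand rather than computed from structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """Hypothetical record holding the two properties the filter needs."""
    name: str
    heavy_atoms: int       # non-hydrogen atom count
    rotatable_bonds: int


# Limits as described in the post: 5 to 12 non-hydrogen atoms,
# at most one rotatable bond.
MIN_HEAVY_ATOMS = 5
MAX_HEAVY_ATOMS = 12
MAX_ROTATABLE_BONDS = 1


def passes_size_filter(fragment: Fragment) -> bool:
    """Return True if the fragment is within the size and flexibility limits."""
    return (
        MIN_HEAVY_ATOMS <= fragment.heavy_atoms <= MAX_HEAVY_ATOMS
        and fragment.rotatable_bonds <= MAX_ROTATABLE_BONDS
    )


candidates = [
    Fragment("resorcinol", heavy_atoms=8, rotatable_bonds=0),
    Fragment("adamantane", heavy_atoms=10, rotatable_bonds=0),
    # Made-up entry standing in for a full-sized ligand.
    Fragment("hypothetical_larger_ligand", heavy_atoms=25, rotatable_bonds=6),
]

selected = [f.name for f in candidates if passes_size_filter(f)]
print(selected)
```

Note that nothing in this filter looks at lipophilicity, which is why something like adamantane gets through.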
The 52 examples
in the validation set were divided into 40 hot spots and 12 “warm” spots, depending
on the occupancy of the binding site in the protein across the PDB. For example,
the canonical purine binding site of kinases is a hot spot, while the nearby
chlorophenyl-binding site of the PKA-Akt chimeric kinase is classified as warm.
With this
validation set in hand, the researchers tested an in-house developed fragment
mapping method called PLImap (which relies on the previously published Protein-Ligand
Informatics force field, PLIff) to see how well they could reproduce the bound
conformations of the fragments. The results were quite favorable in comparison
with other docking methods tested. That’s exciting, and since PLImap is free to
download and use (here), it should be a useful tool for modelers everywhere.
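One common way to judge whether a predicted pose reproduces a crystallographic one is the heavy-atom root-mean-square deviation (RMSD) between the two. This post does not go into which metric or cutoff the paper uses, so the sketch below is an assumption on both counts: the function names and the 2 Å cutoff (a widely used docking convention) are illustrative, the coordinates are made up, and the code assumes atoms are listed in the same order in both poses, with no handling of symmetry-equivalent atoms.

```python
import math


def rmsd(predicted, experimental):
    """Root-mean-square deviation between two poses.

    Each pose is a sequence of (x, y, z) tuples in angstroms, with atoms
    in the same order in both. No superposition is performed, since both
    poses are assumed to sit in the same protein coordinate frame.
    """
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("poses must be non-empty and have the same atom count")
    squared_sum = sum(
        (p - e) ** 2
        for atom_p, atom_e in zip(predicted, experimental)
        for p, e in zip(atom_p, atom_e)
    )
    return math.sqrt(squared_sum / len(predicted))


def reproduces_pose(predicted, experimental, cutoff=2.0):
    """Assumed success criterion: RMSD at or below the cutoff (in angstroms)."""
    return rmsd(predicted, experimental) <= cutoff


# Made-up coordinates for a three-atom "fragment".
experimental_pose = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
close_pose = [(x + 1.0, y, z) for x, y, z in experimental_pose]
distant_pose = [(x, y, z + 5.0) for x, y, z in experimental_pose]

print(reproduces_pose(close_pose, experimental_pose))
print(reproduces_pose(distant_pose, experimental_pose))
```

A real evaluation would also need to account for symmetric fragments, where swapping equivalent atoms should not count as an error.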
But of course PLImap
also made mistakes, and some of these were “wrong in an interesting way.” Water
often plays a critical role in protein-ligand interactions, but water was not
included in the docking. Several cases where PLImap did not choose the
experimentally observed conformation of the fragment involved water molecules.
For example, PLImap placed the bromodomain-privileged 3,5-dimethylisoxazole
fragment in the right location but in a flipped orientation, because the highly
conserved water was not present.
Perhaps more
interestingly, in some cases warm spots were ignored in favor of hot spots. For
example, in the case of the PKA-Akt chimeric kinase mentioned above, PLImap
placed the chlorophenyl fragment not at the chlorophenyl-binding subsite, where it
sits in the context of larger ligands, but rather in the “hotter” purine
binding subsite. A similar phenomenon was observed experimentally several years ago
by Isabelle Krimm and colleagues; large BCL-xL ligands that were deconstructed
into component fragments bound mostly at a single site, rather than the two
sites occupied by the larger molecules. It would be fascinating to test this
same set of molecules using PLImap.
All of which is
to say that, while computational methods continue to make impressive strides,
we are still (happily!) some way from getting rid of the experimentalists.