23 May 2016

Calculating hotspots in detail

In the eight years since Practical Fragments first started, Moore’s law has held strong and computational power has increased accordingly. Last year we described how tools such as FTMap can be used to identify hot spots – regions on proteins where fragments are most likely to bind. Although FTMap is quite successful at identifying these, it is less able to point to specific interactions (such as hydrogen bond donors or acceptors) that are likely to drive binding. In other words, computational chemists have become adept at identifying where fragments might bind but lag in predicting how. A new paper in J. Med. Chem. by Chris Radoux at the Cambridge Crystallographic Data Centre and collaborators at UCB and the University of Cambridge addresses this challenge.

The approach starts with a set of three simple molecular probes: toluene, to look for hydrophobic interactions; aniline, to look for hydrogen bond acceptors; and cyclohexa-2,5-dien-1-one, to look for hydrogen bond donors. These probes are larger than those (such as ethanol) used in many other programs, the idea being that too-small molecules might find hot spots so small as to be useless. Indeed, with 7 non-hydrogen atoms, these probes are near the low end of the consensus size for fragments.

Calculations are performed on protein structures – either with no ligand bound or with a bound ligand computationally removed – to determine whether each surface atom of the protein is a hydrogen bond donor, acceptor, or hydrophobic, as well as how exposed that atom is. The three probes are then mapped onto the protein to look for favorable interactions. Regions where multiple probes can bind are scored higher, with hot spots defined as the regions of the protein having the highest scores. The probe type that scores highest at a given position also indicates which kind of interaction is likely to be favorable there, so different regions within a single hot spot can be distinguished. Although the researchers note that multiple software packages could be used for these calculations, they used a program called SuperStar, and calculations took just a few minutes on an ordinary laptop.
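To make the scoring idea concrete, here is a toy sketch in Python. It is not the authors' code and does not call SuperStar: the site names, the per-probe numbers, and the rule of simply summing the three probe scores are all assumptions chosen for illustration. It only shows the general logic of ranking regions where several probes score well and reporting which probe dominates at each one.

```python
from typing import Dict, List, Tuple

# The three probes named in the post. The scores attached to them below
# are invented numbers, not output from any real calculation.
PROBES = ("toluene", "aniline", "cyclohexadienone")


def score_site(probe_scores: Dict[str, float]) -> Tuple[float, str]:
    """Return (combined score, best-scoring probe) for one site.

    Assumption: the combined score is a plain sum, so a site where
    several probes score well outranks a site favored by only one.
    """
    total = sum(probe_scores[p] for p in PROBES)
    dominant = max(PROBES, key=lambda p: probe_scores[p])
    return total, dominant


def rank_hot_spots(
    sites: Dict[str, Dict[str, float]], top_n: int = 1
) -> List[Tuple[str, float, str]]:
    """Rank sites by combined score, highest first, and keep the top_n."""
    scored = [(name, *score_site(scores)) for name, scores in sites.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_n]


# Hypothetical example: two candidate sites on a protein surface.
example_sites = {
    "site_A": {"toluene": 8.0, "aniline": 3.0, "cyclohexadienone": 1.0},
    "site_B": {"toluene": 2.0, "aniline": 1.0, "cyclohexadienone": 0.5},
}

for name, total, dominant in rank_hot_spots(example_sites, top_n=2):
    print(f"{name}: score={total:.1f}, dominant probe={dominant}")
```

In this made-up example site_A ranks first and is dominated by the toluene probe, which would suggest a mainly hydrophobic hot spot. A real implementation would work on a 3D grid and weight by how buried each position is, which this sketch leaves out.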

To validate the approach, the researchers used a previously published data set (discussed here) of 21 fragment-to-lead pairs against a variety of proteins for which crystal structures and binding affinities were available. In general, the method was able to identify the fragment binding site quite effectively; the one outright failure was on the fragment with the lowest affinity, which also had poorly resolved electron density in the crystal structure. Importantly, the fragments tended to have the highest scores, with added portions of the leads scoring lower. This data set was used to calibrate the scoring system for identifying hot spots, as well as specific molecular interactions within each hot spot.

Having thus validated the approach, the researchers took a more detailed look at two published fragment-to-lead programs for protein kinase B and pantothenate synthetase. In both these cases, group efficiency analyses had previously been performed to establish which portions of the ligands contributed most significantly to binding. Gratifyingly, the computations correctly predicted these.
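For readers less familiar with group efficiency, the arithmetic can be sketched in a few lines of Python. This uses the usual definition (the change in binding free energy on adding a group, divided by the number of heavy atoms added, with the sign flipped so that a favorable group gives a positive value). The function names and the example affinities are hypothetical and are not taken from the paper.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar: float, temp_k: float = 298.15) -> float:
    """Binding free energy in kcal/mol from a dissociation constant in M."""
    return R_KCAL * temp_k * math.log(kd_molar)


def group_efficiency(
    kd_before: float,
    kd_after: float,
    heavy_atoms_added: int,
    temp_k: float = 298.15,
) -> float:
    """Group efficiency in kcal/(mol * heavy atom).

    Positive values mean the added group improves binding.
    """
    ddg = binding_free_energy(kd_after, temp_k) - binding_free_energy(
        kd_before, temp_k
    )
    return -ddg / heavy_atoms_added


# Hypothetical example: adding 3 heavy atoms improves Kd from 100 uM to 1 uM.
ge = group_efficiency(kd_before=100e-6, kd_after=1e-6, heavy_atoms_added=3)
print(f"Group efficiency: {ge:.2f} kcal/(mol*heavy atom)")
```

With these invented numbers, a 100-fold gain in affinity from three heavy atoms works out to roughly 0.9 kcal/mol per heavy atom at room temperature.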

Overall this approach appears promising. At a minimum, it is another tool for assessing the ligandability of potential targets. More significantly, by highlighting the hottest bits of hot spots, it could be useful for medicinal chemists trying to optimize and grow fragments and leads. Unfortunately, as currently described, the process will require a skilled modeler. It would be nice if the authors built a simple web-based interface for people to upload PDB files for analysis, as is the case for FTMap. Also, all the data presented are retrospective – a prospective example would be the true test. Does anyone have experience to share?

1 comment:

Peter Kenny said...

The choice of cyclohexadienone and aniline as probes seems a bit bizarre since the former can accept two hydrogen bonds and the latter can donate two hydrogen bonds (as well as accepting one). The problem with having two HB acceptor (or donor) sites is that deploying one site may place the other in an unfavorable environment. Five-membered rings might represent a better option for this sort of analysis. For example, ( pyrrole | furan | cyclopentadiene ) or ( N-methylpyrrole | 3-methylpyrrole | N-methylimidazole ). I think the authors are actually doing matched molecular pair analysis rather than Free-Wilson analysis given the focus on group efficiency. Given that the authors have quoted a group efficiency of 1.5 (I'll assume kcal/(mol.HeavyAtom) ) in the PKB example, I'll flag up slides 18 and 19 in this presentation since these may be relevant to the discussion.