In the eight years since Practical Fragments first started, Moore’s law has held strong and computational power has increased accordingly. Last year we described how tools such as FTMap can be used to identify hot spots – regions on proteins where fragments are most likely to bind. Although FTMap is quite successful at identifying these, it is less able to point to specific interactions (such as hydrogen bond donors or acceptors) that are likely to drive binding. In other words, computational chemists have become adept at identifying where fragments might bind but lag in predicting how. A new paper in J. Med. Chem. by Chris Radoux at the Cambridge Crystallographic Data Centre and collaborators at UCB and the University of Cambridge addresses this challenge.
The approach starts with a set of three simple molecular probes: toluene, to look for hydrophobic interactions; aniline, a hydrogen bond donor, to look for hydrogen bond acceptors on the protein; and cyclohexa-2,5-dien-1-one, a hydrogen bond acceptor, to look for hydrogen bond donors on the protein. These probes are larger than those (such as ethanol) used in many other programs, the idea being that too-small molecules might find hot spots so small as to be useless. Indeed, with 7 non-hydrogen atoms each, these probes are near the low end of the consensus size for fragments.
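As a quick check on the size argument, the heavy-atom counts can be read straight off the molecular formulas. The sketch below is only an illustration: the formulas are standard, but the helper name `heavy_atom_count` and the small formula parser are assumptions made here, not anything from the paper.

```python
import re

# Molecular formulas for the three probes, plus ethanol for comparison.
PROBE_FORMULAS = {
    "toluene": "C7H8",
    "aniline": "C6H7N",
    "cyclohexa-2,5-dien-1-one": "C6H6O",
    "ethanol": "C2H6O",
}


def heavy_atom_count(formula: str) -> int:
    """Count non-hydrogen atoms in a simple formula such as 'C6H7N'.

    Hypothetical helper: handles only plain element/count pairs,
    with no brackets, charges, or isotopes.
    """
    total = 0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        if element != "H":
            total += int(count) if count else 1
    return total


if __name__ == "__main__":
    for name, formula in PROBE_FORMULAS.items():
        print(f"{name}: {heavy_atom_count(formula)} heavy atoms")
```

All three probes come out at 7 heavy atoms, versus 3 for ethanol.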
Calculations are performed on protein structures – either with no ligand bound or with a bound ligand computationally removed – to determine whether each surface atom of the protein is a hydrogen bond donor, a hydrogen bond acceptor, or hydrophobic, as well as how exposed that atom is. The three probes are then mapped onto the protein to look for favorable interactions. Regions where multiple probes can bind are scored higher, with hot spots defined as the regions of the protein having the highest scores. The probe type with the highest score also indicates which type of interaction is likely to be favorable at each region within a given hot spot. Although the researchers note that multiple software packages could be used for these calculations, they used a program called SuperStar, and calculations took just a few minutes on an ordinary laptop.
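To make the scoring idea concrete, here is a deliberately simplified toy sketch. It is not the SuperStar calculation or the scoring function from the paper. It assumes each grid point already carries one score per probe type, that a point's overall score is the sum of those scores (so points where several probes fit rank higher), and that the dominant interaction is simply the probe type with the largest score. The function name `rank_hot_spots`, the grid layout, and every number are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[int, int, int]
ProbeScores = Dict[str, float]


def rank_hot_spots(
    grid: Dict[Point, ProbeScores], top_n: int = 2
) -> List[Tuple[Point, float, str]]:
    """Return the top_n grid points as (point, total score, dominant probe type).

    Assumption: total score = sum over probe types, so points that
    accommodate several probes are ranked above single-probe points.
    """
    ranked = []
    for point, scores in grid.items():
        total = sum(scores.values())
        dominant = max(scores, key=scores.get)  # probe type with the highest score
        ranked.append((point, total, dominant))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


# Made-up scores; labels name the interaction type favored at that point.
toy_grid: Dict[Point, ProbeScores] = {
    (0, 0, 0): {"hydrophobic": 4.0, "donor": 1.0, "acceptor": 0.5},
    (0, 0, 1): {"hydrophobic": 1.0, "donor": 6.0, "acceptor": 2.0},
    (5, 5, 5): {"hydrophobic": 0.5, "donor": 0.0, "acceptor": 0.5},
}

if __name__ == "__main__":
    for point, total, dominant in rank_hot_spots(toy_grid):
        print(point, total, dominant)
```

In this toy grid the two neighboring points rank highest, one dominated by a hydrogen-bonding score and the other by a hydrophobic score. That mirrors how a single hot spot can suggest different favorable interactions in different sub-regions.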
To validate the approach, the researchers used a previously published data set (discussed here) of 21 fragment-to-lead pairs against a variety of proteins for which crystal structures and binding affinities were available. In general, the method was able to identify the fragment binding site quite effectively; the one outright failure was on the fragment with the lowest affinity, which also had poorly resolved electron density in the crystal structure. Importantly, the fragments tended to have the highest scores, with added portions of the leads scoring lower. This data set was used to calibrate the scoring system for identifying hot spots, as well as specific molecular interactions within each hot spot.
Having thus validated the approach, the researchers took a more detailed look at two published fragment-to-lead programs, one for protein kinase B and one for pantothenate synthetase. In both cases, group efficiency analyses had previously been performed to establish which portions of the ligands contributed most significantly to binding. Gratifyingly, the computations correctly predicted which portions these were.
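For readers unfamiliar with group efficiency, it is commonly computed as the gain in binding free energy from adding a group, divided by the number of heavy atoms in that group. The sketch below illustrates that arithmetic under stated assumptions: free energies are estimated as RT ln(Kd) at 298.15 K, and the dissociation constants and atom count are invented for illustration, not taken from either program. The function names are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)
TEMPERATURE = 298.15  # assumed temperature in K


def binding_free_energy(kd_molar: float) -> float:
    """Estimate the binding free energy (kcal/mol) as RT * ln(Kd)."""
    return R_KCAL * TEMPERATURE * math.log(kd_molar)


def group_efficiency(
    kd_without: float, kd_with: float, added_heavy_atoms: int
) -> float:
    """Group efficiency: free energy gained per heavy atom of the added group.

    Positive values mean the added group improves binding.
    """
    if added_heavy_atoms <= 0:
        raise ValueError("added_heavy_atoms must be positive")
    delta_delta_g = binding_free_energy(kd_with) - binding_free_energy(kd_without)
    return -delta_delta_g / added_heavy_atoms


if __name__ == "__main__":
    # Invented example: a 100 uM fragment becomes a 1 uM compound
    # after a 5-heavy-atom group is added.
    ge = group_efficiency(kd_without=1e-4, kd_with=1e-6, added_heavy_atoms=5)
    print(f"Group efficiency: {ge:.2f} kcal/mol per heavy atom")
```

A group with high group efficiency is one that sits in a productive part of the binding site, which is what the hot spot maps are expected to highlight.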