Practical Fragments has written several times about “hot spots”: regions on proteins where small molecules and fragments readily bind. Knowing whether your target protein has a hot spot can help you decide whether to pursue the target in the first place. A variety of computational approaches have been developed for finding hot spots, most of which start with a crystallographically determined structure. In a new J. Chem. Inf. Model. paper, Sandor Vajda and collaborators at Boston University and Stony Brook University ask whether computational models of proteins can also be used for one of the more popular methods, FTMap.
The researchers started with a set of 62 proteins, each of which had a published crystal structure bound to a fragment (MW < 200 Da) as well as to a larger molecule. The predicted structures of these proteins were then downloaded from the AlphaFold2 (AF2) site, and these models were truncated to correspond to the residues seen in the crystal structures to facilitate comparisons. The computational models were quite similar to the experimental models, particularly when comparing the positions of the peptide backbone atoms, which define the overall shape of the proteins.
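To make the truncation step concrete, here is a minimal sketch of trimming a predicted model down to the residues observed in a crystal structure. The data layout (tuples of chain, residue number, and atom name) and the function names are hypothetical and chosen for illustration; the paper's actual tooling is not described here, and a real workflow would also need to reconcile residue numbering between the two files.

```python
# Hypothetical layout: each atom is (chain_id, residue_number, atom_name).
# Real structure files carry coordinates and more; this only shows the filtering idea.

BACKBONE_ATOMS = {"N", "CA", "C", "O"}


def truncate_to_observed(model_atoms, crystal_atoms):
    """Keep only model atoms whose residue also appears in the crystal structure."""
    observed = {(chain, resnum) for chain, resnum, _ in crystal_atoms}
    return [atom for atom in model_atoms if (atom[0], atom[1]) in observed]


def backbone_only(atoms):
    """Keep only peptide backbone atoms, the ones used to compare overall shape."""
    return [atom for atom in atoms if atom[2] in BACKBONE_ATOMS]


# Toy example: the model covers residues 1-4, the crystal structure only 2-3.
model = [
    ("A", 1, "CA"), ("A", 2, "CA"), ("A", 2, "CB"),
    ("A", 3, "CA"), ("A", 4, "CA"),
]
crystal = [("A", 2, "CA"), ("A", 3, "CA")]

trimmed = truncate_to_observed(model, crystal)
trimmed_backbone = backbone_only(trimmed)
```

After trimming, both structures cover the same residues, so a backbone comparison is not skewed by disordered termini or loops that the crystal structure never resolved.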
Next, the researchers applied the program FTMap, which computationally explores the surface of proteins using a set of 16 very small probes such as ethanol. Hot spots are regions where lots of probes bind, and the “hotness” of these spots correlates with the number of bound probes. FTMap assessed hotness on the AF2 structures and the crystallographically determined structures. (Before running FTMap, the bound ligands in the crystal structures were computationally removed.) Additionally, the researchers ran FTMap on unliganded crystal structures for the 47 proteins where these had been reported.
FTMap was broadly successful at finding the hot spots defined by bound fragments, succeeding 77% of the time starting with either the fragment-bound or unliganded structures and 71% starting with the AF2 models. Implementing stricter criteria (demanding the experimental fragment binding site be the top hot spot, for example) reduced the success rate to 56% for the crystallographic starting points and 47% for the AF2 models.
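The difference between the looser and stricter criteria can be illustrated with a small sketch. Everything here is an assumption made for illustration: hot spots are reduced to a center point plus a probe count, a fragment "hit" means any fragment atom lies within a distance cutoff of that center, and the 4.0 Å cutoff is arbitrary. Real FTMap output is richer, and the paper's exact overlap criteria may well differ.

```python
import math


def rank_hot_spots(hot_spots):
    """Order hot spots from hottest to coolest by number of bound probes."""
    return sorted(hot_spots, key=lambda spot: spot["probe_count"], reverse=True)


def overlaps_fragment(hot_spot, fragment_atoms, cutoff=4.0):
    """Assumed criterion: any fragment atom within `cutoff` angstroms of the center."""
    return any(
        math.dist(hot_spot["center"], atom) <= cutoff for atom in fragment_atoms
    )


def evaluate(hot_spots, fragment_atoms, cutoff=4.0):
    """Report success under a loose (any hot spot) and strict (top hot spot) test."""
    ranked = rank_hot_spots(hot_spots)
    hits = [overlaps_fragment(spot, fragment_atoms, cutoff) for spot in ranked]
    return {"found": any(hits), "top_ranked": bool(hits) and hits[0]}


# Toy example: the fragment sits in the second-hottest site.
hot_spots = [
    {"center": (0.0, 0.0, 0.0), "probe_count": 9},
    {"center": (10.0, 0.0, 0.0), "probe_count": 14},
]
fragment_atoms = [(1.0, 1.0, 0.0), (2.0, 1.0, 0.0)]

result = evaluate(hot_spots, fragment_atoms)
```

In this toy case the fragment site is found but is not the top-ranked hot spot, so it would count as a success under the looser test and a failure under the stricter one. That is the kind of case that pulls the stricter success rates below the looser ones.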
The paper discusses several examples in detail, in particular the two where the AF2 models were most different from the experimental models. Both of these were large, multidomain proteins. When AF2 models of just the ligand-binding domains were used, the models were significantly improved. This seems to be a generally useful hack: generating truncated AF2 models for other proteins also improved the performance of FTMap.
The utility of AF2 models for docking has been the subject of some debate, with some arguing that even though the overall protein folds may be accurate, local side chain conformations may be wrong, and a single side chain rotation may make the difference between ligand binding or not. This paper suggests that hot spots are not too sensitive to these subtleties, and that AF2 models can be used for finding hot spots.