14 December 2020

Benchmarking docking methods: a new public resource

Despite advances in crystallography, structures of fragments bound to proteins often remain elusive. Computational docking is likely to forever be faster than experimental methods, but how good is it? A new paper in J. Chem. Inf. Model. by Laura Chachulski (Jacobs University Bremen) and Björn Windshügel (Universität Hamburg) assesses four popular methods and also provides a public validation set for others to use.
 
When evaluating fragment docking methods, it is essential to have a well-curated set of experimental structures. To this end, the researchers started by combing the PDB for high-quality, high-resolution (< 2 Å) structures of protein-fragment complexes. They used automated methods to remove structures with poor electron density, close contacts with other ligands, and various other complications. Further manual curation yielded 93 protein-ligand complex structures. The fragments conform to a relaxed rule of three, with 7 to 22 non-hydrogen atoms (averaging 13) and ClogP ranging from -4.1 to 3.5 (averaging 1.1). I confess that some choices are rather odd, including oxidized dithiothreitol, benzaldehyde, and γ-aminobutyric acid. The researchers might have saved themselves some effort, and obtained a more pharmaceutically attractive set, by starting with previous efforts such as this one.
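For readers who want to apply a similar property filter to their own compounds, here is a minimal sketch using RDKit, with the LEADS-FRAG ranges above as thresholds. This is my illustration rather than the authors' curation code, and RDKit's Wildman-Crippen MolLogP is used as a stand-in for ClogP.

    from rdkit import Chem
    from rdkit.Chem import Crippen

    def passes_fragment_filter(smiles):
        # Hypothetical helper mirroring the LEADS-FRAG property ranges
        mol = Chem.MolFromSmiles(smiles)
        if mol is None:
            return False
        heavy = mol.GetNumHeavyAtoms()    # 7 to 22 non-hydrogen atoms
        clogp = Crippen.MolLogP(mol)      # Wildman-Crippen logP, standing in for ClogP
        return 7 <= heavy <= 22 and -4.1 <= clogp <= 3.5

    print(passes_fragment_filter("O=Cc1ccccc1"))   # benzaldehyde: True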
 
Having built their benchmark data set, called LEADS-FRAG, the researchers next tested AutoDock, AutoDock Vina, FlexX, and GOLD to see how well each could recapitulate the experimental structures. The results? Let’s just say that crystallographers look likely to have job security for some time.
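As a concrete illustration of what such a redocking run looks like, here is a minimal sketch driving AutoDock Vina from Python. The file names and search-box values are placeholders, and receptor and ligand would first need to be prepared as PDBQT files; this is standard command-line usage, not the authors' exact protocol.

    import subprocess

    # Redock a fragment into its ligand-stripped crystal structure with AutoDock Vina.
    # File names and box center/size are placeholders for the crystallographic site.
    subprocess.run([
        "vina",
        "--receptor", "protein.pdbqt",
        "--ligand", "fragment.pdbqt",
        "--center_x", "12.5", "--center_y", "8.0", "--center_z", "-3.2",
        "--size_x", "20", "--size_y", "20", "--size_z", "20",
        "--num_modes", "30",   # retain up to 30 poses, as considered in the paper
        "--out", "fragment_poses.pdbqt",
    ], check=True)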
 
Only 13 of the 93 protein-fragment complexes were correctly reproduced as the top hit by all four methods (even with a reasonably generous root mean square deviation, or RMSD, cutoff of < 1.5 Å). There were 18 complexes that none of the methods predicted successfully. Across the four methods, the top-ranked poses were “correct” 33-54% of the time. Docking methods usually provide multiple poses with different scores; up to 30 were considered here. Looking at lower-ranked poses increased the number of fragments correctly docked by all four methods to 27 of the 93, while only three failed with every method. Overall, the correct structure was present among the poses in 53-86% of cases. Changing the scoring function sometimes led to further improvements.
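For anyone applying the same success criterion, a symmetry-aware heavy-atom RMSD between a docked pose and the crystal pose can be computed with RDKit. A minimal sketch, assuming both poses share the same coordinate frame (as they do when redocking into the crystal structure); file names are placeholders:

    from rdkit import Chem
    from rdkit.Chem import rdMolAlign

    ref = Chem.MolFromMolFile("crystal_pose.sdf")   # placeholder file names
    probe = Chem.MolFromMolFile("docked_pose.sdf")

    # CalcRMS accounts for symmetry-equivalent atoms but does not re-align
    # the probe, which is exactly what a docking evaluation needs.
    rmsd = rdMolAlign.CalcRMS(probe, ref)
    print("correct pose" if rmsd < 1.5 else "incorrect pose", f"(RMSD = {rmsd:.2f} Å)")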
 
Why were some fragments more successfully docked than others? Fragments that were more buried within the protein (lower solvent-accessible surface area, or SASA) yielded better predictions than those that were more solvent-exposed. The researchers did not report on the effect of rotatable bonds; intuitively, one might expect a more flexible fragment to be harder to dock. A study we highlighted nearly ten years ago found that fragments with higher ligand efficiency also had higher docking scores, and it would be interesting to know whether that finding holds with this larger data set.
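As a rough illustration of this buriedness measure, the fraction of a ligand's surface buried on binding can be estimated by comparing its SASA alone with its SASA in the complex. A sketch assuming RDKit's FreeSASA wrapper and protonated structure files; all names here are placeholders, and the details differ from the paper's exact SASA protocol:

    from rdkit import Chem
    from rdkit.Chem import rdFreeSASA

    def total_sasa(mol):
        # Total solvent-accessible surface area (Å²) of a 3D structure
        radii = rdFreeSASA.classifyAtoms(mol)
        return rdFreeSASA.CalcSASA(mol, radii)

    # Placeholder file names; structures need 3D coordinates (and ideally hydrogens).
    ligand = Chem.MolFromPDBFile("ligand.pdb", removeHs=False)
    protein = Chem.MolFromPDBFile("protein.pdb", removeHs=False)
    complex_mol = Chem.CombineMols(protein, ligand)

    free = total_sasa(ligand)
    total_sasa(complex_mol)   # also stores per-atom "SASA" properties

    # Ligand atoms come after the protein's in the combined molecule
    n_prot = protein.GetNumAtoms()
    bound = sum(complex_mol.GetAtomWithIdx(i).GetDoubleProp("SASA")
                for i in range(n_prot, complex_mol.GetNumAtoms()))

    print(f"fraction of ligand surface buried: {1 - bound / free:.2f}")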
 
The researchers conclude by noting that “these programs do not represent the optimal solution for fragment docking.” I think this is a fair assessment. And as the researchers acknowledge, the bar was set low: compounds were docked against the crystal structure of the protein with the ligand computationally removed. In the real world, proteins often change conformation upon ligand binding, which would make docking even more difficult.
 
In addition to trying to determine how a specific fragment binds, it can also be valuable to computationally screen large numbers of fragments. The programs used here took between 10 seconds and 42 minutes per ligand, but as we highlighted last year, speeds continue to increase.
 
Most importantly, the public availability of LEADS-FRAG will allow others to assess their own computational approaches. It will be fun to revisit this topic in a few years to see how much things have improved.
