X-ray crystallography is tied for second place among methods used in fragment-based lead discovery, according to our most recent poll. This makes sense, since structures are usually essential for advancing fragments to leads. Faster fragment-finding methods are usually used to triage fragments down to a manageable number of hits to feed into crystallography, but the high incidence of false negatives means that promising fragments might be inadvertently discarded. If structures are key goals at the end of a fragment screening campaign, why not start directly with crystallography?
In fact, this is exactly what more and more groups seem to be doing. The problem, historically, has been throughput. Increasing automation has been solving some of the mechanical issues (such as mounting crystals and collecting data at a synchrotron), but what about the actual processing? A recent paper in Structure by Andreas Heine and collaborators at Philipps-University Marburg and Helmholtz-Zentrum Berlin für Materialien und Energie provides some useful advice.
The protein in question is endothiapepsin, a model aspartic protease that is easy to crystallize and diffracts to high resolution. Earlier this year, we discussed the researchers’ work soaking more than 360 fragments into crystals of this protein, and a companion paper gives detailed information on how several dozen fragment hits bind. The Structure paper describes an automated refinement pipeline and highlights some of its most important features.
Determining a crystal structure involves iterative cycles of modeling the protein backbone and side chains into regions of “electron density.” One risk is “model bias,” illustrated memorably in this brief video. The risk is particularly acute for small molecules: since they represent such a tiny fraction of the overall structure, it is all too easy to see what you want to see. To avoid this, people often look for regions of unexplained electron density – which, in addition to a bound small molecule, could represent co-solvents, buffer, or an amino acid side chain that has unexpectedly moved – before doing much refinement.
The problem is that the electron density might be very spotty and easy to overlook. This is especially true for fragments, which are small by definition and often bind weakly. Some initial refinement can thus improve the quality of the electron density maps. The researchers find that adding water molecules and including these in the refinement is the single most important step. Adding hydrogen atoms to the protein model is also helpful: even though each hydrogen contributes only one electron to the overall density, there are enough of them to make a meaningful difference. Finally, for very high resolution structures (better than 1.5 Å), it can help to refine the B factors (atomic displacement parameters) of each protein atom anisotropically, so that each atom's displacement is allowed to vary with direction. At lower resolution, however, the extra parameters can lead to overfitting. Incorporating these steps into the automated process revealed that 25% of fragments would have been missed had conventional methods been used.
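For readers who think in code, the decision logic described above can be sketched in a few lines of Python. This is only an illustration, not the authors' pipeline: the names `RefinementPlan`, `plan_refinement`, and `parameters_per_atom` are hypothetical, and the 1.5 Å cutoff is simply the figure quoted in this post. The parameter counts are the standard ones (three coordinates plus one isotropic B factor, or three coordinates plus six anisotropic displacement terms), which is why anisotropic refinement needs so much more data to avoid overfitting.

```python
from dataclasses import dataclass

# Cutoff quoted in the post; "better than 1.5 Å" means a numerically smaller value.
ANISO_RESOLUTION_CUTOFF = 1.5


@dataclass(frozen=True)
class RefinementPlan:
    """Hypothetical container for the pre-screening refinement choices."""
    add_waters: bool
    add_hydrogens: bool
    anisotropic_b: bool


def parameters_per_atom(anisotropic: bool) -> int:
    """Refined parameters per atom: x, y, z plus 1 isotropic B or 6 anisotropic terms."""
    return 3 + (6 if anisotropic else 1)


def plan_refinement(resolution_angstrom: float) -> RefinementPlan:
    """Pick refinement options from the data resolution (illustrative only)."""
    if resolution_angstrom <= 0:
        raise ValueError("resolution must be a positive number of angstroms")
    return RefinementPlan(
        add_waters=True,      # reported as the single most important step
        add_hydrogens=True,   # many one-electron contributions add up
        # Only use anisotropic B factors when the data are good enough;
        # otherwise the extra parameters invite overfitting.
        anisotropic_b=resolution_angstrom < ANISO_RESOLUTION_CUTOFF,
    )


if __name__ == "__main__":
    for resolution in (1.2, 2.0):
        plan = plan_refinement(resolution)
        print(resolution, plan, parameters_per_atom(plan.anisotropic_b))
```

In practice the cutoff would be a judgment call informed by the data-to-parameter ratio and cross-validation statistics rather than a hard-coded constant.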