04 May 2026

(Not) getting misled by crystal structures part 6: low ligand occupancies

It’s been a while since our last “getting misled by crystal structures” post. That one described unrecognized conformational heterogeneity of ligands. A more basic issue is ligand occupancy. It’s normally assumed that every protein in the crystal lattice has a bound ligand. A new paper in Structure by Timothy Stachowski and Marcus Fischer at St. Jude Children’s Research Hospital reveals that this is not always the case.
 
The researchers selected roughly 10,000 protein-ligand structures from the Protein Data Bank (PDB) and did a simple re-refinement of the ligand occupancies and B-factors (which reflect conformational heterogeneity as well as modeling errors). Ten percent of the deposited structures already modeled ligand occupancies at or below 0.9, but after re-refinement that fraction jumped to 35%, a 3.5-fold increase. There were no overall differences between covalent and non-covalent ligands, but 37% of fragments (defined as having MW <300 Da) saw decreased occupancy in re-refinement, compared to just 22% of larger ligands. A few structures even saw occupancy drop completely. The authors wrote that “manual reviewing these revealed that ligands were built into spurious electron-density.”
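
To see what a deposited model actually claims before any re-refinement, you can read the occupancy and B-factor columns straight out of a PDB-format file. The snippet below is a minimal sketch, not the authors' pipeline: it assumes the fixed-column legacy PDB format, treats every non-water HETATM residue as a "ligand", and uses function names (ligand_occupancies, partially_occupied) that are ours and purely illustrative. The 0.9 cutoff mirrors the threshold quoted above.

```python
from collections import defaultdict


def ligand_occupancies(pdb_text, skip=("HOH",)):
    """Return {(resname, chain, resseq): (mean occupancy, mean B-factor)}.

    Assumes legacy fixed-column PDB records; only HETATM lines are read,
    and residue names listed in `skip` (waters by default) are ignored.
    """
    atoms = defaultdict(list)
    for line in pdb_text.splitlines():
        if not line.startswith("HETATM"):
            continue
        resname = line[17:20].strip()      # columns 18-20
        if resname in skip:
            continue
        chain = line[21]                   # column 22
        resseq = int(line[22:26])          # columns 23-26
        occupancy = float(line[54:60])     # columns 55-60
        b_factor = float(line[60:66])      # columns 61-66
        atoms[(resname, chain, resseq)].append((occupancy, b_factor))

    summary = {}
    for key, values in atoms.items():
        n = len(values)
        mean_occ = sum(occ for occ, _ in values) / n
        mean_b = sum(b for _, b in values) / n
        summary[key] = (mean_occ, mean_b)
    return summary


def partially_occupied(summary, cutoff=0.9):
    """Keep only ligands whose mean occupancy is at or below `cutoff`."""
    return {key: val for key, val in summary.items() if val[0] <= cutoff}
```

Two caveats. A ligand modeled in alternate conformations will have per-atom occupancies that sum to its total, so the simple mean here understates it. And this only reports what was deposited: a ligand listed at 1.00 may still refine to something lower, which is the whole point of the paper.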
 
Crystallographers use several metrics to assess the quality of their structures. In addition to B-factors, unique to each residue or atom, real-space correlation coefficients (RSCC) and real-space R values are commonly used, and the researchers compared these parameters before and after re-refinement. In many cases lowering the occupancy improved these metrics, but not every metric improved, and the improvements were not always meaningful. Standard assessments therefore do not reliably flag partially occupied ligands.
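
For readers less familiar with these metrics, here is a toy sketch of how they are commonly defined. It assumes you already have observed and calculated electron density values sampled at the same grid points around a ligand; the function names and numbers are ours and made up for illustration, and real validation software handles map scaling and atom masking in ways not shown here. RSCC is taken as the Pearson correlation between the two sets of values, and real-space R as the summed absolute difference divided by the summed absolute total.

```python
from math import sqrt


def rscc(obs, calc):
    """Pearson correlation between observed and calculated density values."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_calc = sum(calc) / n
    cov = sum((o - mean_obs) * (c - mean_calc) for o, c in zip(obs, calc))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_calc = sum((c - mean_calc) ** 2 for c in calc)
    return cov / sqrt(var_obs * var_calc)


def rsr(obs, calc):
    """Real-space R: sum|obs - calc| / sum|obs + calc| over the grid points."""
    numerator = sum(abs(o - c) for o, c in zip(obs, calc))
    denominator = sum(abs(o + c) for o, c in zip(obs, calc))
    return numerator / denominator


# Made-up density values: the "true" ligand is 80% occupied, but the model
# assumes 100%, so the calculated density is uniformly too strong.
observed = [0.2, 0.5, 0.9, 0.4]
calculated = [value / 0.8 for value in observed]

correlation = rscc(observed, calculated)   # unaffected by the uniform scale
residual = rsr(observed, calculated)       # nonzero: the magnitudes disagree
```

In this idealized case the correlation stays perfect while the real-space R does not, because a correlation coefficient ignores uniform scaling. Real maps are messier, and B-factors can absorb part of the mismatch, but the toy case gives some intuition for why a single metric might not flag an over-estimated occupancy.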
 
OK, so a ligand binds at only 80% occupancy rather than the 100% assumed: does this matter? The researchers describe three categories where the answer may be yes. In the first, correcting ligand occupancy reveals alternative conformations of protein side chains, which could be informative for understanding the mechanism of binding. In the second, correcting ligand occupancy can reveal water molecules that interact with the ligand and/or protein. As we noted more than a decade ago, water is an essential player in protein-ligand interactions, and a single water molecule can make the difference between a binder and a non-binder. Finally, correcting ligand occupancy can reveal alternative binding modes for the ligand, and even ligands binding to other sites.
 
Importantly, this analysis was done on individually refined structures in the PDB, and it seems likely that the issues would be even more severe for structures batch-refined in high-throughput crystallographic fragment screens. As we wrote last year, the community needs to figure out how to deal with the increasing number of these structures.
 
The fact that roughly a third of the protein-ligand structures examined refine to ligand occupancies at or below 0.9 has implications for training AI models. It also has implications for individual targets. As the researchers note, “non-crystallographers who rely on the PDB often assume that deposition itself is an implicit stamp of approval. Structural biologists, however, know that this is not always the case.” Before using a structure, it would be wise to re-refine it yourself.