Four years ago we highlighted an
analysis of fragments taken from the Protein Data Bank (PDB). Of 462 unique
fragments, just 21 bound in more than one pocket. With the assumption that such
“versatile” fragments may be particularly valuable starting points, Esther
Kellenberger and colleagues at CNRS Université de Strasbourg have done their
own exploration of the PDB, as reported (open access) in Front. Chem.
Structures deposited in the PDB
starting in 2000 with resolution better than 3 Å were examined to find those
containing fragment-sized molecules (MW < 300 Da). Crystallization additives,
phosphate and sulfate ions, and other unlovable molecules such as PAINS were
excluded. Further triaging for fragments that bound in more than one pocket and
in more than one binding mode (i.e., different types of interactions) ultimately yielded a set of 203 versatile
fragments. (One reason so many more fragments were found in this study is that
the previous analysis required the word “fragment” to be present in
the PDB entry.)
The versatile fragments are mostly
compliant with the rule of three, with violations mostly related to the number
of hydrogen bond donors or acceptors. Only a single molecule had ClogP > 3,
though 50 were quite hydrophilic, with ClogP < 0. Interestingly, 45 of the
molecules are listed as small molecule drugs, and 98 are substructures of
approved drugs. Perhaps this is not surprising; drugs themselves are studied
particularly intensively and frequently included in screening libraries.
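For anyone who wants to run a similar check on their own library, here is a minimal Python sketch of a rule-of-three filter. It assumes the descriptors have already been calculated elsewhere (for example with a cheminformatics toolkit). The `FragmentProps` class, the function name, and the example values are hypothetical and not taken from the paper, and the thresholds are the commonly cited ones (MW < 300 Da, ClogP ≤ 3, no more than three hydrogen bond donors or acceptors).

```python
from dataclasses import dataclass


@dataclass
class FragmentProps:
    """Precomputed descriptors for one fragment (hypothetical container)."""
    mw: float     # molecular weight in Da
    clogp: float  # calculated logP
    hbd: int      # number of hydrogen bond donors
    hba: int      # number of hydrogen bond acceptors


def rule_of_three_violations(p: FragmentProps) -> list[str]:
    """Return the names of the rule-of-three criteria that p violates."""
    violations = []
    if p.mw >= 300:
        violations.append("MW")
    if p.clogp > 3:
        violations.append("ClogP")
    if p.hbd > 3:
        violations.append("HBD")
    if p.hba > 3:
        violations.append("HBA")
    return violations


# Illustrative values only, not taken from the paper.
polar_fragment = FragmentProps(mw=242.2, clogp=-1.1, hbd=3, hba=5)
print(rule_of_three_violations(polar_fragment))
```

Returning the list of violated criteria, rather than a simple pass/fail, makes it easy to see whether a library skews toward too many donors and acceptors, as the versatile set does.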
The researchers had previously
analyzed commercial libraries, and in the new paper they compared versatile
fragments with the SpotXplorer library we wrote about here and the functionally diverse fragments used at XChem. Surprisingly, there was very little overlap,
even though most of the versatile fragments or analogs are commercially
available. That said, some of the versatile fragments are molecules one may not
want in a fragment library, such as the cofactor lipoic acid and the metal chelator
1,10-phenanthroline.
Binding modes for the same
fragment in different pockets could vary considerably. The “universal fragment”
4-bromopyrazole, which we wrote about here, bound in two different binding
modes, while the nucleoside thymidine showed a whopping 26 different binding
modes. Conformations of the fragments could vary too, with only 43% of fragments
showing a conserved conformation in all binding sites (defined as < 0.5 Å RMSD). Conformational changes,
along with different protonation states, could be among the reasons why predicting
fragment binding continues to be challenging.
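To make the "conserved conformation" criterion concrete, here is a small Python sketch. It assumes the RMSD between every pair of bound copies of a fragment has already been computed after superposition, and it treats a conformation as conserved only if every pair falls under the 0.5 Å cutoff. That all-pairs reading is an assumption on my part, as are the function name and the example numbers; the paper may define it differently.

```python
from itertools import combinations


def is_conformation_conserved(pairwise_rmsd, n_sites, cutoff=0.5):
    """Check whether all pairs of binding sites agree within the cutoff.

    pairwise_rmsd maps (i, j) with i < j to the RMSD in angstroms
    between the fragment copies bound in sites i and j.
    """
    return all(
        pairwise_rmsd[(i, j)] < cutoff
        for i, j in combinations(range(n_sites), 2)
    )


# Illustrative RMSD values (in angstroms) for one fragment seen in three sites.
example_rmsd = {(0, 1): 0.2, (0, 2): 0.4, (1, 2): 0.7}
print(is_conformation_conserved(example_rmsd, n_sites=3))
```

In this made-up example a single pair above the cutoff is enough for the fragment to count as conformationally variable, even though the other two pairs agree closely.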
This is a nice analysis, and it
may be worth adding some of these versatile fragments to your own library. Laudably,
SMILES strings for all of them are provided in the supplementary material.
Hi,
Thanks for sharing a nice study!
Don't you think that all activities related to mining ligands in PDB (numerous studies over last 10 years!) and making library designs on its basis lead to very polar, rich in HBD/HBA collections? Such structures are usually rather far from a typical fragment hit, don't score often in screening campaigns, and are not that easy to progress (if are evolvable at all, to be honest). Just a thought.
Hi Anonymous,
It's an interesting point, and I think some of these fragments are perhaps too polar to be easily advanced, but it is worth noting that most fragments that have advanced to leads do have at least one polar interaction with the protein according to this analysis. Indeed, here's an example of a clinical compound that started from a very polar fragment that made multiple hydrogen-bonding interactions with the protein, all of which were maintained during optimization.