27 March 2023

Crystallography heats up, it seems for the better

Of the roughly 150,000 crystal structures in the Protein Data Bank (PDB), about 94% were collected at cryogenic temperatures (≤ 200 K), typically after being frozen in liquid nitrogen. Frozen crystals are easier to store and transport and can better survive bombardment by intense X-rays, but are the resulting structures still physiologically relevant? A newly published open-access paper by Daniel Keedy (CUNY) and a multinational team of collaborators suggests caution.
The researchers were interested in the protein tyrosine phosphatase PTP1B, a diabetes target that has implacably thwarted generations of drug hunters. A previous screen had identified dozens of fragments that bound at cryogenic temperatures to multiple sites on the protein, most prominently the active site and three secondary sites. In the new work, 143 fragments from that screen were chosen for reanalysis at room temperature; more than half had been hits in the previous screen and the rest had been non-hits. The goal was to assess whether the same fragments would bind in the same manner, and whether any of the non-hits would bind at room temperature.
Two methods were used for room-temperature crystallography. In both cases, fragments were soaked into crystals of PTP1B. In the first, crystals were harvested from drops as normal, but rather than being flash frozen they were enclosed in plastic capillaries to keep them hydrated. The second “in situ” method was performed by mounting the crystallization trays directly onto the goniometer at the synchrotron.
Both methods gave similar results, though the average resolution for the in situ method (1.99 Å) was better than that for the capillary method (2.30 Å) and even surpassed that of the previous cryo screen (2.10 Å). Fragment hits were initially identified using the automated PanDDA method (see here), followed by manual analysis and careful data processing to ensure that even the weakest binders were not overlooked.
Surprisingly, the hit rate was quite low: only about a third of the fragment hits that had been identified under cryogenic conditions were found in the screen at ambient temperature. Moreover, the room-temperature fragments tended to have lower occupancy factors, meaning that a higher fraction of a given binding site was empty. The researchers took special care to ensure that the results were robust, for example by checking to make sure that fragments had been correctly added to the droplets.
Of the fragments that confirmed, the binding modes often differed, for example by flipping 180°. In some cases water molecules around the fragment varied depending on the temperature; as we’ve written, water can be essential for binding interactions, so these differences could be significant. And in some cases the room-temperature fragments bound to different sites entirely, including one site that had not been observed to bind any fragments under cryogenic conditions.
Only one of the fragments that had previously not produced hits under cryogenic conditions showed up as a hit at room temperature, but interestingly this one formed a covalent bond to two different lysine residues.
So what are we to make of all this? The researchers speculate that the overwhelming focus on cryogenic structures “may favor protein-ligand interactions that overweight enthalpic considerations and underweight entropic ones, feature inaccurate solvation environments, or suggest artificially rigid proteins.” The fact that most machine-learning models of protein-ligand interactions are trained on cryogenic structures may systematically bias their results.
In 2016 we highlighted an argument for moving crystallography to the front of a screening cascade, but as we noted last year, doing so may lead to myriad hits too weak to advance. Perhaps room-temperature crystallography can bridge this gap.

20 March 2023

Versatile fragments from the Protein Data Bank

Four years ago we highlighted an analysis of fragments taken from the Protein Data Bank (PDB). Of 462 unique fragments, just 21 bound in more than one pocket. With the assumption that such “versatile” fragments may be particularly valuable starting points, Esther Kellenberger and colleagues at CNRS Université de Strasbourg have done their own exploration of the PDB, as reported (open access) in Front. Chem.
Structures deposited in the PDB starting in 2000 with resolution better than 3 Å were examined to find those containing fragment-sized molecules (MW < 300 Da). Crystallization additives, phosphate and sulfate ions, and other unlovable molecules such as PAINS were excluded. Further triaging for fragments that bound in more than one pocket and in more than one binding mode (i.e., different types of interactions) ultimately yielded a set of 203 versatile fragments. (One reason why so many more fragments were found in this study is the fact that the previous analysis required the word “fragment” to be present in the PDB entry.)
The versatile fragments are largely compliant with the rule of three, with violations mostly related to the number of hydrogen bond donors or acceptors. Only a single molecule had ClogP > 3, though 50 were quite hydrophilic, with ClogP < 0. Interestingly, 45 of the molecules are listed as small molecule drugs, and 98 are substructures of approved drugs. Perhaps this is not surprising; drugs themselves are studied particularly intensively and frequently included in screening libraries.
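For readers who want to run the same kind of check on their own library, here is a minimal Python sketch of a rule-of-three filter. It is not the code used in the paper. The thresholds are the commonly cited ones (MW < 300 Da, ClogP ≤ 3, no more than three hydrogen bond donors, no more than three acceptors), the `Fragment` class and function name are made up for illustration, and the descriptor values in the example are hypothetical rather than taken from any real compound.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """Precomputed descriptors for one fragment (all values supplied by the user)."""
    name: str
    mw: float     # molecular weight in Da
    clogp: float  # calculated logP
    hbd: int      # number of hydrogen bond donors
    hba: int      # number of hydrogen bond acceptors


def rule_of_three_violations(frag: Fragment) -> list[str]:
    """Return the names of the rule-of-three criteria that the fragment violates."""
    violations = []
    if frag.mw >= 300:
        violations.append("MW")
    if frag.clogp > 3:
        violations.append("ClogP")
    if frag.hbd > 3:
        violations.append("HBD")
    if frag.hba > 3:
        violations.append("HBA")
    return violations


# Hypothetical descriptor values, for illustration only.
compliant = Fragment("hypothetical_A", mw=150.0, clogp=1.2, hbd=1, hba=2)
polar = Fragment("hypothetical_B", mw=242.0, clogp=-1.0, hbd=3, hba=5)

print(rule_of_three_violations(compliant))  # []
print(rule_of_three_violations(polar))      # ['HBA']
```

The second example mirrors the pattern described above: a hydrophilic molecule that passes on size and lipophilicity but carries too many acceptors.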
The researchers had previously analyzed commercial libraries, and in the new paper they compared versatile fragments with the SpotXplorer library we wrote about here and the functionally diverse fragments used at XChem. Surprisingly there was very little overlap, even though most of the versatile fragments or analogs are commercially available. That said, some of the versatile fragments are molecules one may not want in a fragment library, such as the cofactor lipoic acid and the metal chelator 1,10-phenanthroline.
Binding modes for the same fragment in different pockets could vary considerably. The “universal fragment” 4-bromopyrazole, which we wrote about here, bound in two different binding modes, while the nucleoside thymidine showed a whopping 26 different binding modes. Conformations of the fragments could vary too, with only 43% of fragments showing a conserved conformation in all binding sites (defined as < 0.5 Å RMSD). Conformational changes, along with different protonation states, could be among the reasons why predicting fragment binding continues to be challenging.
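To make the 0.5 Å criterion concrete, here is a rough Python sketch of how one might flag a fragment as having a conserved conformation. It assumes the copies of the fragment have already been superposed and that atoms are listed in the same order in each copy; how the paper handled superposition and symmetry is not described here, so treat the function names and the all-pairs comparison as assumptions.

```python
import math
from itertools import combinations

Coords = list[tuple[float, float, float]]


def rmsd(a: Coords, b: Coords) -> float:
    """Root-mean-square deviation between two matched, pre-superposed atom lists (in Å)."""
    if len(a) != len(b) or not a:
        raise ValueError("coordinate lists must be non-empty and the same length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(a, b)
    )
    return math.sqrt(squared / len(a))


def conformation_conserved(conformers: list[Coords], cutoff: float = 0.5) -> bool:
    """True if every pair of conformers differs by less than `cutoff` Å RMSD."""
    return all(rmsd(a, b) < cutoff for a, b in combinations(conformers, 2))


# Toy three-atom "fragment" with made-up coordinates.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
nudged = [(x + 0.3, y, z) for x, y, z in reference]    # every atom moved 0.3 Å
distorted = [(x, y + 1.0, z) for x, y, z in reference]  # every atom moved 1.0 Å

print(conformation_conserved([reference, nudged]))             # True
print(conformation_conserved([reference, nudged, distorted]))  # False
```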
This is a nice analysis, and it may be worth adding some of these versatile fragments to your own library. Laudably, SMILES strings for all of them are provided in the supplementary material.

13 March 2023

A very useful list: common linkers and bioisosteric replacements

Last week’s post highlighted an example of fragment linking, which despite being less common than fragment growing can still be effective. But how do you choose the linker? We’ve previously written about the most common rings found in drugs. In a new Bioorg. Med. Chem. paper Peter Ertl and colleagues at Novartis tabulate the most common linkers found in bioactive molecules.
The researchers start by defining linkers “as moieties connecting 2 ring systems.” To focus on druglike molecules, linkers could contain no more than eight non-hydrogen atoms total and no more than five consecutive bonds between the two ring systems. This means that para-disubstituted phenyl or 1,4-disubstituted butyl would both be considered in the analysis, but longer linkers such as this recent example would not.
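The size limits can be written as a simple predicate. The Python sketch below is only an illustration: the function name is invented, and the atom and bond counts are supplied by hand under the assumption that the bond count includes the two bonds attaching the linker to the rings. That way of counting is a guess, but it does put both para-phenyl and 1,4-butyl exactly at the five-bond limit, consistent with the examples above.

```python
def is_druglike_linker(
    heavy_atoms: int,
    bonds_between_rings: int,
    max_heavy_atoms: int = 8,
    max_bonds: int = 5,
) -> bool:
    """Apply the two size limits quoted above to a candidate linker.

    heavy_atoms: non-hydrogen atoms in the linker.
    bonds_between_rings: bonds on the shortest path from one ring system to the
        other, including the two attachment bonds (an assumed convention).
    """
    return heavy_atoms <= max_heavy_atoms and bonds_between_rings <= max_bonds


# Hand-counted examples: (heavy atoms, bonds between the two ring systems).
examples = {
    "methylene": (1, 2),
    "amide": (3, 3),
    "para-phenyl": (6, 5),
    "1,4-butyl": (4, 5),
    "hexyl chain (hypothetical longer linker)": (6, 7),
}

for name, (atoms, bonds) in examples.items():
    print(f"{name}: {is_druglike_linker(atoms, bonds)}")
```

Only the hypothetical hexyl chain fails, because its seven-bond path exceeds the limit even though it has fewer than eight heavy atoms.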
Molecules were extracted from the databases ChEMBL and ZINC, yielding a total of 1686 unique linkers. Various descriptors were calculated for all, which in addition to size and length included the number of heteroatoms and electronic properties. Bioactivity data for molecules in ChEMBL was used to assess which replacements were most frequently tolerated. If one linker could be replaced by another without causing a drop in affinity (or inhibition, etc.), the two linkers were considered to be bioisosteres.
So, what are the most common linkers? A single methylene is the most common, followed by an amide bond. I was surprised that, of the 40 most common linkers, only five are rings: para-disubstituted phenyl, 1,4-piperazine, 1,4-piperidine, 1,2,4-oxadiazole, and meta-phenyl, in that order. Not coincidentally, phenyl rings, piperidines, and piperazines are also the most common rings found in drugs, according to an analysis last year.
Last year we highlighted a paper from the Ertl group that included a link to a “Ring Replacement Recommender,” which suggests bioisosteric replacements for any ring. Alas, there is no “Linker Replacement Recommender,” but the new paper does provide a “bioisosteric replacement network,” which is a full-page 10 x 15 grid with the 150 most common linkers arranged such that nearby linkers are likely to be bioisosteric. For example, para-phenyl is adjacent to 2,5-thiophene and quite some distance from sulfone. These make sense, but there are also less obvious examples: the table suggests that a 1,4-pyrazole makes a good replacement for a carbamate.
The next time you’re doing SAR, it may be worth consulting the bioisosteric replacement network for ideas.

06 March 2023

Fragment linking on the bacterial TPP riboswitch

Last week’s post focused on fragment screening against RNA, and we continue the theme this week with a paper published in Proc. Natl. Acad. Sci. USA by Kevin Weeks and collaborators at University of North Carolina Chapel Hill, New York University, and Université de Sherbrooke.
The researchers developed a screening technology called SHAPE-MaP (Selective 2’-Hydroxyl Acylation analyzed by Primer Extension and Mutational Profiling). Essentially, RNA in the presence or absence of potential ligands is treated with an acylating agent that reacts with the 2’-hydroxyl group on ribose subunits. This reaction requires the hydroxyl groups to be exposed, so ligands that bind in the vicinity may change the pattern of acylation, either by directly blocking access or by causing conformational changes. Conveniently, acylation causes mutations when the modified RNA is sequenced, making modified sites easy to detect. Moreover, by clever use of “barcodes” in other regions of the RNA, multiple samples can be pooled and analyzed.
Because the approach uses sequencing to identify binding sites, long strands of RNA can be tested. In this case, the researchers built an RNA construct containing a ‘pseudoknot’ structure in the dengue virus genome as well as a thiamine pyrophosphate (TPP) riboswitch, which changes conformation when it binds to TPP. (We wrote about a different fragment screen against this riboswitch back in 2014.) A set of 1500 rule-of-three compliant fragments from Maybridge was screened, resulting in 41 hits. These were then rescreened in triplicate, which winnowed the field to just eight fragments, of which seven bound the TPP riboswitch and one appeared to be nonspecific. All eight were assessed by isothermal titration calorimetry (ITC), which produced measurable affinities for six.
Compound 2 was the most potent hit, and when the researchers tested 16 analogs they found some, such as compound 17, with improved affinity. With an eye towards fragment linking, they took an approach conceptually similar to SAR by NMR, rescreening the original 1500 fragments in the presence of compound 2 to look for ligands that would bind at a second site. This yielded five hits, including compound 28. ITC characterization of a more soluble analog, compound 31, revealed that it had weak but measurably improved affinity in the presence of compound 2. A handful of linked analogs were made, and while most of these had affinity similar to or worse than that of the best initial fragment, compound Z1 bound with sub-micromolar affinity as assessed by ITC.

The natural ligand TPP binds to the riboswitch with a Kd of 110 nM, and in doing so blocks in vitro transcription of bound RNA. Despite having an affinity similar to that of TPP, compound Z1 was much less effective at blocking transcription. Unfortunately, although the researchers were able to obtain crystal structures of several molecules bound to the riboswitch, including compound 17, they were unable to obtain one with compound Z1.
This is a rare example of fragment linking on RNA, and although the linked molecule does not show fully additive affinity, it does have reasonable ligand efficiency. But like the example last week, this paper illustrates how difficult discovering RNA binders is likely to be. The confirmed hit rate is less than 0.5%, and this is for an RNA sequence that evolved specifically to bind low molecular weight ligands. As the researchers note, none of the fragments bound the dengue virus pseudoknot. Perhaps most RNA is truly undruggable, at least with small molecules.
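For anyone who wants to check the arithmetic behind that hit rate, here is a quick Python calculation using only the numbers quoted above: 1500 fragments screened, 41 primary hits, eight surviving triplicate rescreening, and seven specific binders. The stage labels and the helper function are mine.

```python
def percent(hits: int, screened: int) -> float:
    """Hit rate as a percentage of the fragments screened."""
    return 100.0 * hits / screened


SCREENED = 1500
stages = {
    "primary hits": 41,
    "reproduced in triplicate": 8,
    "specific binders": 7,
}

for label, count in stages.items():
    print(f"{label}: {count}/{SCREENED} = {percent(count, SCREENED):.2f}%")
# primary hits: 41/1500 = 2.73%
# reproduced in triplicate: 8/1500 = 0.53%
# specific binders: 7/1500 = 0.47%
```

The figure of less than 0.5% corresponds to the seven specific binders; counting the nonspecific fragment as well would nudge the rate just above that mark.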