09 December 2024

They may be cons, but they’re our CONS

Practical Fragments has written repeatedly about various assay artifacts (vide infra). Different technologies are susceptible to different interference mechanisms, making general rules difficult. Earlier this year we wrote about the Metal Ion Interference Set, or MIIS: a collection of a dozen salts that could be used to assess the sensitivity of assays to metal contaminants. In a recent open-access JACS Au paper, Huabin Hu (Uppsala University), Jonathan Baell (Monash University), and collaborators extend the concept to small molecules.
 
The researchers have compiled a Collection Of useful Nuisance compounds, or CONS, perhaps with a nod to “Chemical con artists foil drug discovery” published a decade ago, which we highlighted here. The 103 members of the CONS are divided into three categories.
 
The first set contains five aggregators: molecules that have been shown to form colloidal clusters that non-specifically interfere with biological assays, as discussed here.
 
The largest set, at 67 members, consists of PAINS, or pan-assay interference compounds, which we first wrote about in 2010. These are themselves divided into various subcategories: non-specific electrophiles such as curcumin and an isothiazolone, redox cyclers such as quinones, contaminants such as the decomposition products of certain fused tetrahydroquinolines, metal chelators, miscellaneous compounds, and compounds acting through additional mechanisms, including optical interferers and singlet oxygen quenchers, which are particularly problematic in AlphaScreen assays.
 
The last set consists of 31 compounds that can cause problems in phenotypic assays. Some of these non-specifically disrupt cell membranes. Others have well-defined but toxic effects, such as interfering with tubulin or intercalating into DNA. Such bioactivity is not always a bad thing: some of these molecules, such as topotecan and colchicine, are approved drugs, but it’s useful to be aware of whether these types of activities will affect your assay.
 
One criticism of the PAINS concept is that it lumps together multiple mechanisms. (Pete Kenny wrote about this recently.) Another criticism is that, by focusing on chemical substructures, true hits may be unfairly deprioritized based on structure alone. What’s nice about the CONS list is that the potentially interfering mechanisms of each molecule are documented and categorized so they can be considered when establishing an assay. For example, you may not care whether a compound interferes in a phenotypic assay if you are performing a screen on an isolated enzyme.
 
The entire set of compounds is available from Enamine, and additional vendors are provided in a supplementary table. If you’re doing a lot of assays, particularly on new targets and mechanisms, it may be worth testing the CONS to understand what kinds of false positives might occur.

02 December 2024

Mapping protein conformations with fragments

Proteins can be remarkably dynamic, and, as we noted recently, different conformational states can reveal different pockets for small molecule ligands. But how can one survey and categorize all the possibilities? In a recent J. Chem. Inf. Model. paper, Doeke Hekstra and colleagues at Harvard University present a new tool for doing so.
 
High-throughput crystallographic fragment screens are becoming faster and more widely accessible, and the researchers wondered whether the information from these screens could be used to map protein conformational landscapes. To do so, they built a Python program called COLAV, short for COnformational LAndscape Visualization. This open-source tool can compile data from hundreds of protein coordinate files and then, for each protein, calculate the dihedral angles between backbone atoms, the pairwise distances between the alpha-carbon atoms, and the strain.
 
To a first approximation, dihedral angles capture local movements, while distances between alpha-carbons capture global movements, such as the distance between the N-terminus and C-terminus. Strain measurements are also local but can reveal particularly important features such as hinge movements. Also, while dihedral angles and pairwise distances can be calculated for single proteins, strain measurements are calculated after first aligning multiple structures.
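To make the first two descriptors concrete, here is a minimal Python sketch of how a dihedral angle and a set of pairwise alpha-carbon distances can be computed from raw coordinates. It is not COLAV's own code: the function names `dihedral_deg` and `pairwise_ca_distances` are hypothetical, the coordinates are made up, and strain is left out because it requires aligning multiple structures first.

```python
import numpy as np

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points.

    For a backbone phi angle the four points would be C(i-1), N, CA, C;
    for psi they would be N, CA, C, N(i+1).
    """
    p0, p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p0, p1, p2, p3))
    b0 = p0 - p1
    b1 = p2 - p1
    b2 = p3 - p2
    b1 = b1 / np.linalg.norm(b1)
    # Remove the components along the central bond, leaving the two
    # vectors whose in-plane angle is the dihedral.
    v = b0 - np.dot(b0, b1) * b1
    w = b2 - np.dot(b2, b1) * b1
    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    return np.degrees(np.arctan2(y, x))

def pairwise_ca_distances(ca_coords):
    """Flattened upper triangle of the alpha-carbon distance matrix."""
    ca = np.asarray(ca_coords, dtype=float)
    diff = ca[:, None, :] - ca[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    i, j = np.triu_indices(len(ca), k=1)
    return dist[i, j]

# Toy coordinates (made up, not taken from any real structure).
toy_ca = np.array([[0.0, 0.0, 0.0], [3.0, 4.0, 0.0], [0.0, 0.0, 5.0]])
toy_distances = pairwise_ca_distances(toy_ca)
toy_angle = dihedral_deg([0, 1, 0], [0, 0, 0], [1, 0, 0], [1, 0, 1])
```

Each structure then becomes one long vector of angles or distances, which is the form needed for comparing many structures at once.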
 
Having calculated these three parameters for individual protein structures, COLAV can compare them across the selected set of structures using principal component analysis (PCA). These comparisons can reveal clusters with similar dihedral angles, pairwise distances, or strain.
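As a rough illustration of the PCA step, the sketch below projects a matrix of per-structure features onto its first two principal components using a plain singular value decomposition. Everything here is an assumption for illustration: `pca_project` is a hypothetical helper rather than part of COLAV, and the "structures" are synthetic feature vectors in which two features are shifted to mimic a loop that adopts a second conformation.

```python
import numpy as np

def pca_project(features, n_components=2):
    """Project rows of `features` onto the top principal components.

    Returns the scores and the fraction of variance each component explains.
    """
    features = np.asarray(features, dtype=float)
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:n_components].T
    explained = s ** 2 / np.sum(s ** 2)
    return scores, explained[:n_components]

# Synthetic data: 20 "structures" in each of two conformational states,
# each described by 6 features (for example, selected CA-CA distances).
rng = np.random.default_rng(0)
state_a = rng.normal(loc=0.0, scale=0.1, size=(20, 6))
state_b = rng.normal(loc=0.0, scale=0.1, size=(20, 6))
state_b[:, :2] += 3.0  # two features differ between the states

features = np.vstack([state_a, state_b])
scores, explained = pca_project(features)
```

Plotting `scores[:, 0]` against `scores[:, 1]` would show the two states as separate clusters. Real dihedral angles are periodic, so they would typically need a transformation (such as sine and cosine) before an analysis like this.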
 
The researchers provide two case studies. The first is the metabolic disease target PTP1B, which we recently wrote about here. This enzyme has been pursued intensively for decades, so the researchers were able to draw on 163 individual protein structures deposited in the Protein Data Bank (PDB) as well as 187 structures from a high-throughput crystallographic fragment screen. PTP1B contains two flexible loops, each of which adopts one of two conformations, and COLAV successfully segregated all 350 structures into four clusters. Importantly, these four clusters were found whether the structures were pulled from the PDB (representing experiments conducted across multiple labs and years) or from the fragment screen, suggesting that a single crystallographic fragment screen can identify most or all of the conformational states available to a protein. This is particularly impressive given that most of the fragments bound in allosteric sites, while most of the ligands found in the PDB bound in the active site.
 
Next, the researchers turned to the main protease (MPro) of SARS-CoV-2, the subject of intense and successful drug discovery efforts. They used 656 structures from the PDB and 631 structures from high-throughput crystallographic screens to perform COLAV analyses. Unlike with PTP1B, discrete conformational clusters were not observed; rather, a continuous band was seen, suggesting that the protein can assume myriad conformations. Here too, though, the fragment screens were able to sample most of the conformations observed in the PDB.
 
The fact that a single high-throughput crystallographic screen can capture the conformations seen in hundreds of hard-won discrete protein-ligand crystal structures is encouraging, though of course the paper only describes two case studies. Also, as the researchers note, any structure that cannot be crystallized is not sampled. Since COLAV is free to use, it will be fun to see it applied to other proteins.

18 November 2024

Covalent fragments vs chikungunya nsP2

Perhaps because it sounds like “chicken,” when I first heard of chikungunya I thought it was a joke. But there’s nothing funny about a disease whose name comes from a word meaning “to become contorted,” referring to contortion caused by pain, which can last for months. The mosquito-borne alphavirus was first identified in 1952 in East Africa, was introduced to the Americas in 2013, and is now spreading rapidly worldwide. There is no specific treatment. In three recent papers, a large group of researchers, mostly from the Structural Genomics Consortium, take the first steps towards one.
 
Like many viruses, the chikungunya genome encodes polyproteins that are cleaved by viral proteases, in this case a domain of the nonstructural protein 2 (nsP2). This cysteine protease is essential for viral replication, and the three papers collectively describe finding and exploring selective probes against it.  
 
In Proc. Nat. Acad. Sci. USA, Kenneth Pearce (University of North Carolina at Chapel Hill) and collaborators describe a screen of 6120 covalent fragments from Enamine against this target. Compounds were preincubated in a FRET-based functional assay at 20 µM for 30 minutes, resulting in 153 hits that inhibited activity by at least 50%. Of these, 43 were repurchased for full dose-response curves, and 20 had IC50 values < 20 µM. Compound RA-0002034 was the most potent, with IC50 = 180 nM.


The proper way to assess irreversible covalent inhibitors is not the time-dependent IC50, but rather the (theoretically) invariant kinact/KI ratio. The researchers measured this for the best hits and found the value for RA-0002034 to be 6400 M⁻¹s⁻¹, which is not far below that for the approved covalent drug sotorasib for its target.
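To see why IC50 is a moving target for an irreversible inhibitor, here is a small Python sketch. It is an idealized, textbook-style model and not the analysis used in the paper. It assumes inhibitor concentrations well below KI, so that the observed inactivation rate is approximately (kinact/KI)·[I], and it ignores substrate competition and other assay details. The function names are hypothetical, and the output is not meant to reproduce the reported IC50.

```python
import math

def fraction_active(conc_molar, time_s, kinact_over_ki):
    """Fraction of enzyme still active after preincubation.

    Assumes [I] << KI, so k_obs ~ (kinact/KI) * [I] (units: M^-1 s^-1 * M).
    """
    return math.exp(-kinact_over_ki * conc_molar * time_s)

def apparent_ic50(time_s, kinact_over_ki):
    """Concentration (M) leaving 50% activity after `time_s` seconds."""
    return math.log(2) / (kinact_over_ki * time_s)

KINACT_OVER_KI = 6400.0  # M^-1 s^-1, the value quoted for RA-0002034

for minutes in (5, 30, 120):
    ic50 = apparent_ic50(minutes * 60, KINACT_OVER_KI)
    print(f"{minutes:>4} min preincubation -> apparent IC50 ~ {ic50 * 1e9:.0f} nM")
```

In this simplified picture the apparent IC50 falls in inverse proportion to the preincubation time while kinact/KI stays fixed, which is why the ratio is the more meaningful number for comparing covalent inhibitors.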
 
Mass spectrometry experiments after tryptic digestion revealed the compound binds to the catalytic cysteine of nsP2, as expected, and not to other cysteines. RA-0002034 contains a potentially reactive vinyl sulfone warhead, but the half-life against the biologically relevant nucleophile glutathione is a respectable 130 minutes. A screen against 13 other cysteine proteases was also quite clean, as was chemoproteomic profiling in human cells.
 
The compound was also tested in cellular viral replication assays and found to be remarkably potent, with a low nanomolar EC50 value. Encouragingly, it was also potent against three other alphaviruses, Ross River virus, Venezuelan Equine Encephalitis virus, and Mayaro virus.
 
RA-0002034 appears to be an attractive chemical probe for exploring the biology of chikungunya. Best practices are to also have an inactive control molecule, and the researchers made a substitution off the central pyrazole ring to produce RA-0003161, which is 500-fold less active.
 
The paper includes some SAR-by-catalog, and the chemistry is more extensively explored in an open-access J. Med. Chem. paper by Timothy Willson (UNC Chapel Hill) and collaborators. Although no crystal structures of the compounds bound to nsP2 were available, the researchers used modeling to guide modification of all portions of the molecule. The most potent molecule was 8d, which is slightly more active than RA-0002034. Also, methyl substitution near the electrophilic center is tolerated, which could improve stability, as seen with the covalent WRN inhibitor from Vividion, which we wrote about here.
 
One annoying feature of RA-0002034 is its tendency to cyclize to inactive compound 2, a process explored in an open-access Pharmaceuticals paper by Timothy Willson and collaborators. This occurs even at neutral pH. However, replacing the central pyrazole with an isoxazole (compound 10) fixes this problem.
 
Collectively these three publications provide new insights and tools for investigating chikungunya. RA-0002034 is a far more attractive starting point than a molecule Teddy described on Practical Fragments back in 2015. The pharmacokinetics of RA-0002034 need to be improved before in vivo experiments are warranted, but this seems achievable, and I look forward to watching this story develop.

11 November 2024

Poll results: fragment finding methods and structural information needed for fragment-to-lead efforts

Our most recent poll asked about fragment finding methods. The poll ran from September 21 through November 8 and received 135 responses from 20 countries. Two-thirds of these were from the US, about 12% from the UK, 4% from Germany, 3% from the Netherlands, and 2% from Australia.
 
The first question asked how much structural information you need to begin optimizing a fragment. In contrast to 2017, when we first asked this question, crystallography has significantly increased at the expense of the other choices. 
 
 
I confess to being surprised, as I expected that by now people would be more comfortable beginning optimization in the absence of structural information, an approach that has been quite successful as discussed in a 2019 open-access Cell Chemical Biology review by Ben Davis, Wolfgang Jahnke, and me. Perhaps the increasing speed and accessibility of new methods has so lowered the bar to getting crystal structures that people have the luxury of waiting. Of course, with an online poll there is always the risk that many respondents from the same organization may skew the results.
 
The second question asked which methods you use to find and validate fragments. This is the fifth time we’ve run this poll, starting in 2011. As with our first question, X-ray crystallography came out on top, with nearly 80% of respondents choosing it. This was followed by SPR, at 67%, and thermal shift and ligand-detected NMR, each around 55%. 
 
 
Functional screening was used by nearly half of respondents, with computational methods, protein-detected NMR, and literature starting points used by around a third. Mass spectrometry and ITC were each used by slightly more than a quarter of respondents.
 
For the first time we asked about cryo-EM, and nearly 20% of respondents reported using this technique.
 
MST and affinity-based methods each came in at 13%, with just 4% of respondents using BLI, and 5 individual respondents using other methods. I’d be curious to know what these are.
 
The average respondent reported using just over 5 different techniques, which is down slightly from 6 in 2019 but up from 4 in 2016. Using multiple orthogonal methods is clearly well established as best practice, even if the precise number varies.
 
How do these results compare with your own practices?