19 February 2024

Hot spots real and imagined

Practical Fragments has written several times about “hot spots”: regions on proteins where small molecules and fragments readily bind. Knowing whether your target protein has a hot spot can help you decide whether to pursue the target in the first place. A variety of computational approaches have been developed for finding hot spots, most of which start with a crystallographically determined structure. In a new J. Chem. Inf. Mod. paper, Sandor Vajda and collaborators at Boston University and Stony Brook University ask whether computational models of proteins can also be used for one of the more popular methods, FTMap.
The researchers started with a set of 62 proteins, each of which had a published crystal structure bound to a fragment (MW < 200 Da) as well as to a larger molecule. The predicted structures of these proteins were then downloaded from the AlphaFold2 (AF2) site, and these models were truncated to correspond to the residues seen in the crystal structures to facilitate comparisons. The computational models were quite similar to the experimental models, particularly when comparing the positions of the peptide backbone atoms which define the overall shape of the proteins.
Next, the researchers applied the program FTMap, which computationally explores the surface of proteins using a set of 16 very small probes such as ethanol. Hot spots are regions where lots of probes bind, and the “hotness” of these spots correlates with the number of bound probes. FTMap was used to assess hotness on both the AF2 structures and the crystallographically determined structures. (Before running FTMap, the bound ligands in the crystal structures were computationally removed.) Additionally, the researchers ran FTMap on unliganded crystal structures for the 47 proteins where these had been reported.
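To make the idea of “hotness” concrete, here is a minimal sketch of ranking sites by how many probes land in them. This is not FTMap's actual algorithm or output format: it assumes the probes have already been assigned to sites, and the function name `rank_hot_spots` and the site labels are hypothetical.

```python
from collections import Counter


def rank_hot_spots(probe_sites):
    """Rank sites by number of bound probes (hottest first).

    probe_sites: iterable of site labels, one entry per bound probe.
    Returns a list of (site, probe_count) tuples.
    """
    counts = Counter(probe_sites)
    # Sort by descending probe count, then by label so ties are deterministic.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Made-up example: 16 probes distributed over three hypothetical sites.
example = ["site_A"] * 9 + ["site_B"] * 5 + ["site_C"] * 2
ranking = rank_hot_spots(example)
for site, n_probes in ranking:
    print(site, n_probes)
```

In this toy example the site with the most probes comes out on top, which is all that “hottest” means here.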
FTMap was broadly successful at finding the hot spots defined by bound fragments, succeeding 77% of the time starting with either the fragment-bound or unliganded structures and 71% starting with the AF2 models. Implementing stricter criteria (demanding the experimental fragment binding site be the top hot spot, for example) reduced the success to 56% for the crystallographic starting points and 47% for the AF2 models.
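The difference between the lenient and strict success rates is easy to see with a small calculation. The sketch below is only illustrative: the post does not spell out the paper's exact criteria, so it assumes the lenient test is "any hot spot overlaps the fragment site" and the strict test is "the top-ranked hot spot overlaps it." The function name `success_rate` and the ranks are made up.

```python
def success_rate(ranks, max_rank=None):
    """Fraction of proteins whose fragment site matches a hot spot.

    ranks: for each protein, the 1-based rank of the hot spot overlapping
           the experimental fragment site, or None if no hot spot overlaps.
    max_rank: if given, only count matches ranked at or above this position
              (max_rank=1 means the site must be the top hot spot).
    """
    if not ranks:
        raise ValueError("ranks must not be empty")
    hits = sum(
        1
        for r in ranks
        if r is not None and (max_rank is None or r <= max_rank)
    )
    return hits / len(ranks)


# Made-up ranks for eight hypothetical proteins (not data from the paper).
ranks = [1, 2, None, 1, 3, 1, None, 2]
lenient = success_rate(ranks)             # any overlapping hot spot counts
strict = success_rate(ranks, max_rank=1)  # only the top hot spot counts
print(f"lenient: {lenient:.0%}, strict: {strict:.0%}")
```

Tightening the criterion can only lower the rate, since every strict hit is also a lenient hit.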
The paper discusses several examples in detail, in particular the two where the AF2 models were most different from the experimental models. Both of these were large, multidomain proteins. When AF2 models of just the ligand-binding domains were used, the models were significantly improved. This seems to be a generally useful hack: generating truncated AF2 models for other proteins also improved the performance of FTMap.
The utility of AF2 models for docking has been the subject of some debate, with some arguing that even though the overall protein folds may be accurate, local side chain conformations may be wrong, and a single side chain rotation may make the difference between ligand binding or not. This paper suggests that hot spots are not too sensitive to these subtleties, and that AF2 models can be used for finding hot spots.

12 February 2024

Fragment screening across the proteome, noncovalently

Last week we discussed methodological improvements to industrialize covalent fragment screening across the proteome. While I’m a huge fan of covalent binders, their noncovalent counterparts are the vanilla ice cream of FBLD: also tasty and much more common. Back in 2017 we described how “fully functionalized fragments,” or FFFs, could be used to screen noncovalent fragments in cells. A new paper in Nat. Chem. Biol. by Christopher Parker and collaborators at Scripps and BMS further optimizes the approach.
FFFs contain, in addition to the variable fragment, a photoreactive group (often a diazirine) and an alkyne tag. When exposed to light the photoreactive group can react with nearby proteins and the alkyne tag can be used to fish out the proteins. In the new paper the researchers started with a dozen FFFs.
One challenge, which we discussed in 2021, is that the FFFs may react with many sites on a given protein. During analysis, a protein is typically digested into peptides for mass spectrometry. If an FFF reacts at several sites on a peptide the resulting spectra will be “chimeric” and more difficult to characterize.
The researchers developed methods to take these chimeric spectra into account when searching for sites of modification. The approach, called Dizco (for diazirine probe-labeled peptide discoverer), can identify three times as many peptides as standard approaches, as well as more detailed information on sites of modification.
Two pairs of FFF probes consisted of enantiomers, and these showed differential labeling across the proteome, consistent with specific molecular recognition. The researchers also confirmed binding of a few FFF probes to several proteins using a cellular thermal shift assay (CETSA).
In all, the probes modified 3603 peptides on 1669 proteins. The sites of modification were then mapped onto predicted or modeled three dimensional structures of the proteins. Importantly, and consistent with the 2017 work, most of the labeled sites were near predicted pockets. The researchers confirmed this for four proteins by showing that FFF probe binding could be competed by adding ligands known to bind to the pockets.
Next, the researchers docked (using AutoDock) their FFF probes onto 175 proteins (108 from structures in the Protein Data Bank and 67 from AlphaFold structures). They found that the docking experiments recapitulated the experimental data, and in fact often placed the diazirine tag near the protein residues found to react. Strikingly, and in another step forward for in silico approaches, docking against structures from AlphaFold was nearly as effective as docking against those from the Protein Data Bank.
As the researchers conclude, “we identified many binding pockets that have no reported ligands… these probes may serve as leads for further optimization.” It will be fun to see how far they go.

05 February 2024

Fragment screening across the proteome, industrialized

Last week we discussed covalent fragment screens against isolated enzymes, which can be very effective. But screening in cells or cell lysates preserves proteins in a more physiological environment and allows many proteins across the proteome to be screened simultaneously. In 2016 we wrote about covalent screens in human cell lysates which identified fragment hits for 758 cysteine residues in 637 proteins. Mass spectrometry techniques have improved since then in terms of both speed and sensitivity, as illustrated in a new Cell Chem. Biol. paper from Steve Gygi, Qing Yu, and collaborators at Harvard Medical School and Biogen. (Disclosure: Steve Gygi is on the Scientific Advisory Board of my current company, Frontier Medicines.)
The approach is called TMT-ABPP, or tandem mass tag activity-based protein profiling, and it involves multiple improvements to previous methods, some of which Steve discussed at the Discovery on Target meeting last year. Covalent fragments are added separately to cell lysate aliquots, after which a desthiobiotin iodoacetamide (DBIA) probe is introduced. If a given site on a protein has reacted with a fragment, it will not be available to react with the DBIA probe.
Next, proteins are digested to peptides and labeled with TMT (tandem mass tag) reagents, which allow multiple samples (18 in this case, either individual fragments or DMSO-only controls) to be combined for simultaneous analysis. Peptides functionalized with the DBIA probe are captured on streptavidin resin, while those that had previously reacted with a covalent fragment will not stick to the resin and are lost. Peptides eluted from the resin are then analyzed by mass spectrometry. The “competition ratio” between treated and untreated lysate gives a measure of how strongly a given site on a given protein is labeled by a fragment.
Multiple other tweaks, such as capturing proteins using magnetic beads and using a special type of mass spectrometry (high-field asymmetric waveform ion mobility spectrometry, or FAIMS), further streamline the process to a 96-well plate format, with each well containing a mere 10-20 µg of cell lysate, as much as 100-fold less than earlier approaches.
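As a rough illustration of the competition ratio, here is a minimal sketch. It assumes the ratio is the mean DMSO-control signal for a peptide divided by the signal in the fragment-treated channel, so that a higher ratio means more of the site was blocked by the fragment. The post does not give the paper's exact definition, and the function name `competition_ratio` and the intensities are hypothetical.

```python
def competition_ratio(control_intensities, treated_intensity):
    """Ratio of mean control (DMSO) signal to fragment-treated signal.

    A site fully blocked by a fragment leaves no DBIA-labeled peptide,
    so the treated signal approaches zero and the ratio grows large.
    """
    if not control_intensities:
        raise ValueError("need at least one control intensity")
    mean_control = sum(control_intensities) / len(control_intensities)
    if treated_intensity <= 0:
        # No detectable signal in the treated channel: treat as fully competed.
        return float("inf")
    return mean_control / treated_intensity


# Made-up reporter-ion intensities for one cysteine-containing peptide.
controls = [1000.0, 1200.0]  # two DMSO-only channels
treated = 275.0              # one fragment-treated channel
ratio = competition_ratio(controls, treated)
print(ratio)
```

With these made-up numbers the treated channel retains a quarter of the control signal, for a ratio of 4. Any cutoff for calling a site "liganded" would be a separate choice.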
The researchers benchmarked TMT-ABPP using three reactive “scout fragments,” including compound 1 from last week’s post. Collectively they identified 6813 cysteine residues hit by one or more of the scouts.
To demonstrate throughput, the researchers next screened 192 fragments, a third of which were acrylamides while the rest were chloroacetamides. Even with two controls for every 16 samples, this only required 12 injections on a mass spectrometer and resulted in hits against 38,450 cysteines, about 50-fold more than the 2016 paper. Proteins that were more highly expressed were better represented, as were proteins with known reactive cysteine residues, such as thioredoxins. Surprisingly though, surface-exposed cysteine residues were only slightly enriched over more buried cysteines.
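The arithmetic behind those numbers can be checked in a few lines. This sketch uses only figures quoted in the post (18 TMT channels, two controls per injection, 192 fragments, 38,450 cysteines now versus 758 in 2016); the function name `injections_needed` is made up for illustration.

```python
import math


def injections_needed(n_fragments, channels=18, controls_per_plex=2):
    """Number of multiplexed injections needed to screen n_fragments."""
    fragments_per_plex = channels - controls_per_plex
    if fragments_per_plex <= 0:
        raise ValueError("controls leave no channels for fragments")
    # Round up: a partially filled plex still costs one injection.
    return math.ceil(n_fragments / fragments_per_plex)


n_injections = injections_needed(192)  # 16 fragments + 2 controls per plex
fold_increase = 38450 / 758            # cysteines hit now vs. the 2016 screen
print(n_injections, round(fold_increase, 1))
```

The defaults mirror the 18-plex setup described above; changing `channels` or `controls_per_plex` shows how the injection count scales with other multiplexing schemes.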
The researchers also applied TMT-ABPP to five well-characterized covalent molecules, including the mutant KRAS G12C inhibitor ARS-1620, which we wrote about here. In addition to the G12C site of KRAS, several other proteins were also liganded, including adenosine kinase (ADK). The researchers confirmed that ARS-1620 inhibited ADK in an enzymatic assay.
As the researchers note, “proteome-wide profiling of thousands of compounds remains a formidable challenge, both technically and financially.” This paper reveals how to significantly reduce the costs. By using such approaches, it is possible to build a catalog of fragment ligands for thousands of proteins. Doing so with a well-curated library could enable rapid fragment-to-lead campaigns.