Back in 2023 we highlighted a
computational fragment linking/merging approach which was used to find high nanomolar inhibitors of the SARS-CoV-2 macrodomain (Mac1), a COVID-19 target. However, those molecules contained carboxylic acids, often associated
with poor cell permeability. In a new open-access Sci. Adv. paper, James Fraser and collaborators
at UCSF, Relay Therapeutics, Enamine, and Chemspace describe a related approach to find
new, non-charged inhibitors.
The new approach, called FrankenROCS,
“takes pairs of fragments as input to query a database using the rapid overlay
of chemical structure (ROCS) method of comparing 3D shape and pharmacophore
distribution;” the goal is to find larger molecules that most closely resemble the initial fragment pairs. As with the previous publication, the team started with more than 200 crystallographic fragment hits published in 2021. A set of 7,181 pairs of adjacently bound
fragments was searched against 2.1 million compounds commercially available
from Enamine. The top 1000 were inspected, and 39 were purchased and soaked
into crystals of Mac1. This led to 10 successful structures, of which AVI-313 did
not contain a carboxylic acid. This molecule had weak but measurable activity
in an HTRF competition assay.
Two million compounds is a lot
but pales in comparison to Enamine’s “make-on-demand” REAL space, which at the
time this research was done consisted of more than 22 billion molecules. The REAL space molecules are constructed from 960,398 building blocks that can be combined using
143 reactions. We previously described an approach called V-SYNTHES to screen Enamine’s
REAL space. FrankenROCS takes a different active-learning approach called
Thompson Sampling, which dates back nearly a century.
Imagine two sets of 1000 building
blocks, R1 and R2, which could be coupled to generate 1,000,000 molecules.
Rather than searching all possibilities, each R1 building block is linked to
three random R2 building blocks, and each R2 building block is linked to three
random R1 building blocks. These are virtually screened, and the building blocks that
appear in the highest-scoring compounds are prioritized in further iterations.
In theory, after tens of thousands of iterations, the best compounds will have
been identified.
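For readers who want to see the idea in miniature, here is a rough Python sketch of Thompson Sampling over a two-component library. It is not the researchers' implementation: the function names, the Gaussian belief kept for each building block, and the toy scoring function (a stand-in for a ROCS overlay score) are all assumptions made for illustration.

```python
import random
import statistics


def make_toy_score(n_r1, n_r2, seed=1):
    """Hypothetical stand-in for a virtual-screening score (e.g. a ROCS overlay)."""
    rng = random.Random(seed)
    q1 = [rng.random() for _ in range(n_r1)]  # hidden "quality" of each R1
    q2 = [rng.random() for _ in range(n_r2)]  # hidden "quality" of each R2
    return lambda i, j: q1[i] + q2[j]


def thompson_sample(n_r1, n_r2, score, n_warmup=3, n_iter=500, seed=0):
    rng = random.Random(seed)
    # Scores observed so far for every building block, per position.
    seen = [[[] for _ in range(n_r1)], [[] for _ in range(n_r2)]]
    evaluated = {}  # (i, j) -> score, so no product is scored twice

    def evaluate(i, j):
        if (i, j) not in evaluated:
            s = score(i, j)
            evaluated[(i, j)] = s
            seen[0][i].append(s)
            seen[1][j].append(s)

    # Warm-up: pair each building block with a few random partners.
    for i in range(n_r1):
        for j in rng.sample(range(n_r2), n_warmup):
            evaluate(i, j)
    for j in range(n_r2):
        for i in rng.sample(range(n_r1), n_warmup):
            evaluate(i, j)

    # Assumption: one shared spread estimate, taken from the warm-up scores.
    spread = statistics.pstdev(list(evaluated.values()))

    def draw(scores):
        # Sample a plausible mean score; uncertainty shrinks with more data.
        return rng.gauss(statistics.fmean(scores), spread / len(scores) ** 0.5)

    # Search: sample from each block's belief, combine the winners, score, update.
    for _ in range(n_iter):
        i = max(range(n_r1), key=lambda k: draw(seen[0][k]))
        j = max(range(n_r2), key=lambda k: draw(seen[1][k]))
        evaluate(i, j)

    best = max(evaluated, key=evaluated.get)
    return best, evaluated[best], len(evaluated)


# Toy run: 100 x 100 building blocks, i.e. 10,000 possible products.
n_r1 = n_r2 = 100
score = make_toy_score(n_r1, n_r2)
best, best_score, n_eval = thompson_sample(n_r1, n_r2, score)
print(f"scored {n_eval} of {n_r1 * n_r2} products; best pair {best}")
```

The point of the sketch is the bookkeeping: only a fraction of the possible products is ever scored, and building blocks that keep turning up in good compounds get picked more often. As with any such search, the result can only be as good as the scoring function it is given.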
The researchers fed 97 fragment
pairs from the 2021 paper into Thompson Sampling FrankenROCS to find molecules
that would best overlay with the fragment pairs. Ultimately 32
compounds were purchased, six of which were successfully crystallized with Mac1.
Unfortunately, the most potent was a weaker inhibitor than AVI-313 and contained
a carboxylic acid. The researchers speculate that the inability to find better
molecules in larger chemical space may have stemmed from limitations of the
scoring function, a problem we’ve previously discussed.
The researchers returned to focus
on AVI-313, making substitutions at multiple positions, ultimately synthesizing
148 compounds, 121 of which could be characterized crystallographically. Importantly,
several compounds had low micromolar activity, even without a carboxylic acid. The
crystal structures show the binding site to be somewhat flexible, as evidenced
by side chain and main chain movements to accommodate some of the binders.
This is a nice, thorough
investigation, and the 137 protein-compound crystal structures deposited into
the Protein Data Bank provide useful training data for next-generation computational
approaches. Moreover, the fact that immeasurably weak fragments can be advanced
to low micromolar, ligand-efficient hits is yet another reason for the research community to
figure out how to make crystallographic fragment screening data more widely
available, as we exhorted here.