Practical Fragments has quite a few posts about PAINS, or pan-assay
interference compounds. In part this reflects their sad prevalence in the
literature, but it’s also fair to say that they are easy targets because many
are readily recognizable.
But not all artifacts are so easily spotted, as discussed in a new paper just
published in J. Med. Chem. by John Irwin, Brian Shoichet, and colleagues at the
University of California San Francisco (see also here for Derek Lowe's
excellent summary).
The researchers took on one of the most insidious problems, compound
aggregation, in which small molecules
form colloids that bind to and partially denature proteins, causing false
positives in all sorts of assays. This can happen even at nanomolar
concentrations of compound, and is all the more problematic at the higher
concentrations used in fragment screening and early hit-to-lead optimization.
In many cases aggregates can be disrupted or passivated by including nonionic
detergents such as Triton X-100 or Tween-80, but not all assays tolerate
detergent, and some aggregates form even in the presence of detergent.
Worse, all sorts of molecules can form aggregates, including many approved
drugs. Previous attempts to predict which molecules will aggregate have not
been very successful. Colloid
formation is essentially a phase transition, and like other such transitions
(crystallization, for example) it is fiendishly difficult to predict what
molecules will do this under what conditions. But if we can’t predict from
first principles which molecules will form aggregates, can we at least draw
empirical lessons?
The researchers assembled a set of >12,600 known aggregators and put together a
very simple model that assesses how similar a molecule of interest is to any of
these aggregators (using Tanimoto coefficients, or Tcs). Aggregators have a
wide range of physicochemical properties, with ClogP values from -5.3 to 9.8,
but 80% have ClogP > 3.0. The
team hypothesized that a molecule sufficiently similar to a known aggregator –
and also somewhat lipophilic – would have a higher probability of being an
aggregator than a molecule chosen at random.
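
For readers who haven't worked with Tanimoto coefficients, here is a minimal sketch of the idea in Python. It is not the authors' code: the fingerprints are just hypothetical sets of integer feature IDs (a real workflow would generate them with a cheminformatics toolkit), the function names are made up for illustration, and both cutoffs are left as explicit parameters rather than fixed values.

```python
from typing import Iterable, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient: shared features divided by all distinct features."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


def max_similarity(query: Set[int], known: Iterable[Set[int]]) -> float:
    """Highest Tc between the query and any fingerprint in the known set."""
    return max((tanimoto(query, k) for k in known), default=0.0)


def resembles_aggregator(
    query: Set[int],
    clogp: float,
    known: Iterable[Set[int]],
    tc_cutoff: float,
    clogp_cutoff: float,
) -> bool:
    """Flag a molecule that is both similar to a known aggregator and lipophilic."""
    return max_similarity(query, known) >= tc_cutoff and clogp > clogp_cutoff


if __name__ == "__main__":
    # Toy data only: one "known aggregator" with ten features,
    # and a query molecule that shares nine of them.
    known_aggregators = [set(range(1, 11))]
    query_fp = set(range(1, 10))
    print(tanimoto(query_fp, known_aggregators[0]))
    print(resembles_aggregator(query_fp, 4.2, known_aggregators, 0.85, 3.0))
```

A flag from something like this only says a molecule looks like a known offender; it says nothing about whether it actually aggregates under your assay conditions.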
To test this idea, the team took a batch of 40 molecules and tested them for
aggregation. Among those most similar to known aggregators (Tc ≥ 95%), 5 of 7
molecules were confirmed as aggregators. This fell to 10 of 19 for the next set
(Tc 90-94%), 3 of 7 after that (Tc 85-89%), and only 1 of 7 for the least
similar (Tc 80-84%). Thus, Tc ≥ 85% was chosen as the cutoff.
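
To make the drop-off easier to see, the counts quoted above can be turned into hit rates with a few lines of Python. This is just arithmetic on the numbers in this post; the variable and function names are invented for illustration.

```python
# (similarity bin, confirmed aggregators, molecules tested), as quoted above
BINS = [
    ("Tc >= 95%", 5, 7),
    ("Tc 90-94%", 10, 19),
    ("Tc 85-89%", 3, 7),
    ("Tc 80-84%", 1, 7),
]


def hit_rates(bins):
    """Fraction of tested molecules confirmed as aggregators, per bin."""
    return {label: confirmed / tested for label, confirmed, tested in bins}


def pooled(bins):
    """Total confirmed and total tested across the given bins."""
    confirmed = sum(c for _, c, _ in bins)
    tested = sum(t for _, _, t in bins)
    return confirmed, tested


if __name__ == "__main__":
    for label, rate in hit_rates(BINS).items():
        print(f"{label}: {rate:.0%}")
    confirmed, tested = pooled(BINS[:3])  # the three bins at or above 85%
    print(f"Tc >= 85% pooled: {confirmed} of {tested}")
```

Pooling the three bins at or above 85% gives 18 confirmed aggregators out of 33 tested, a bit over half, versus 1 of 7 just below the cutoff. With only 40 molecules in total, these rates are rough.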
Next, the researchers examined molecules that had been reported as active in
some sort of biological assay, and found that 7% were ≥85% similar to a known
aggregator and had ClogP > 3. Ominously, this rate is an order of magnitude
greater than the fraction of commercially available compounds that fit the same
criteria. More damning, most of this enrichment has occurred since 1995, when
high-throughput and virtual screening really went mainstream. In other words,
the past couple of decades have seen a sizable enrichment of potential
aggregators in the literature.
All of this is fascinating, but what really makes this paper significant is
that the researchers have made all their primary data available, and also built
a simple-to-use website called “Aggregator Advisor”. Just draw your molecule or
paste a SMILES string to generate a report. For example, entering gossypol
tells you that this molecule has previously been reported as an aggregator.
(With two catechol moieties, it’s also a PAINS.) Perhaps not coincidentally, it
shows up in more than 1800 publications.
Of course, as the researchers note, “just because a molecule aggregates, under
some conditions, in the same
concentration range as it is active, does not establish that its activity is
artefactual.” Indeed, 3.6% of FDA-approved drugs are known aggregators. Still,
particularly if your hit has only modest activity (0.1 µM or worse), similarity
to a known aggregator should at least make you cautious.
The researchers are at pains to emphasize that their model is “primitive and
subject to false negatives and
false positives.” Thus, any hits need to be tested to see if they behave
pathologically in any given assay. More importantly, a molecule that comes up
as a negative should not be presumed to be innocent.
All these caveats aside, Aggregator Advisor is very easy to use. It’s certainly
worth running the next time you find an interesting molecule – whether in
your lab or in the literature – particularly if there was no detergent in the
assay.