08 May 2012


I have been thinking a lot lately about library design, especially after the roundtable at breakfast in SD. I found a rant from an old friend/colleague, from my very early days in FBDD (2002 or 2003). With his permission, but no citation for obvious reasons, I am reprinting it here. I find it particularly interesting because AMW was still being used as recently as last year in papers describing fragment library design.

The Average Molecular Weight [Ed:AMW] is currently an accepted orthodoxy within the Medicinal Chemistry community, a role reinforced by the recent popularity of things like the Rule of Five. To be sure, molecules with large molecular weights are not typically observed to be successful drugs.

In the rest of this note, I’d like to focus on the common scenario of selecting molecules for purchase or testing. I’ve often seen people apply AMW cutoffs or scalings to these processes. I’d like to show why this may be sub-optimal.

First, I contend that the number of heavy atoms may be a much better proxy for “size” than AMW. Certainly, if you want to discriminate against heavier elements like Phosphorus, Sulphur, Chlorine, Bromine and Iodine, then by all means, use AMW. But with an AMW contribution of about 127, a molecule with a single Iodine atom would be considered heavier/less desirable than the same molecule with a C6H9N2O substituent (about 125)! Again, if you have good reasons for wanting to suppress these elements, AMW is a (very) good way of doing that.
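To make the Iodine comparison concrete, here is a minimal sketch that computes both the weight and the heavy atom count for the two substituents. The function names and the tiny formula parser are my own illustration, not anything from the original note, and the atomic weights are rounded standard values.

```python
import re

# Rounded standard atomic weights for the elements used in this example only.
ATOMIC_WEIGHT = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "I": 126.904}

def parse_formula(formula):
    """Return {element: count} for a simple formula such as 'C6H9N2O'.

    Hypothetical helper: handles plain element/count pairs only
    (no brackets, charges or isotopes).
    """
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + int(number or 1)
    return counts

def weight(formula):
    """Sum of atomic weights over all atoms in the formula."""
    return sum(ATOMIC_WEIGHT[el] * n for el, n in parse_formula(formula).items())

def heavy_atoms(formula):
    """Number of non-hydrogen atoms in the formula."""
    return sum(n for el, n in parse_formula(formula).items() if el != "H")

for substituent in ("I", "C6H9N2O"):
    print(substituent, round(weight(substituent), 1), heavy_atoms(substituent))
```

By weight the single Iodine atom looks slightly larger than the C6H9N2O group, while by heavy atom count it is one atom against nine.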

The case of the third row and beyond elements is fairly straightforward, but what happens within the organic group of Carbon, Nitrogen and Oxygen? Does AMW vs natoms mean anything in there?

Imagine we are selecting molecules for an assay and impose an AMW cutoff of 180, admittedly very low. We would then ignore Aspirin, with an AMW of 180.16, and instead we would test the “lighter” molecule (structure image missing).

But it gets worse. Imagine if the Lilly chemists looking for antidepressants had imposed an AMW cutoff of 308. That would have excluded Prozac, with an AMW of 309, and instead they would have tested the “more desirable lighter molecule” with an AMW of just 306.
Again, an AMW cutoff or bias could see us miss Zyprexa, AMW 312.4, and instead use the “lighter” molecule with an AMW of 308.

Or the even lighter Oxygen variant! Now, of course, no medicinal chemist faced with the choice between these molecules would choose the second one; that’s obvious. LogP estimates would probably eliminate molecules that are all or mostly carbon atoms.

But what the examples above show is that in any automated system that is basing decisions on AMW, there will be a systematic, albeit small, bias towards Carbon atoms and against Nitrogen and Oxygen atoms. Why? A single CH2 group contributes 14 to AMW, whereas an NH contributes 15, and a two-connected Oxygen atom 16. As terminal groups, CH3 is 15, NH2 is 16 and OH is 17. A Pyridine Nitrogen contributes 14, but an aromatic Carbon (CH) contributes 13, and t-butyl groups (57) are preferred over CF3 (69). Carbon wins the low AMW contest every time.
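The group contributions quoted above can be checked with a few lines of arithmetic. This is only a sketch using integer (nominal) masses, which is what the figures in the text appear to be; the group labels and helper names are mine.

```python
# Integer (nominal) masses, matching the rounded figures quoted in the text.
NOMINAL_MASS = {"H": 1, "C": 12, "N": 14, "O": 16, "F": 19}

# (label, element counts) for the groups named in the paragraph above.
GROUPS = [
    ("CH2", {"C": 1, "H": 2}),
    ("NH", {"N": 1, "H": 1}),
    ("O", {"O": 1}),
    ("CH3", {"C": 1, "H": 3}),
    ("NH2", {"N": 1, "H": 2}),
    ("OH", {"O": 1, "H": 1}),
    ("aromatic CH", {"C": 1, "H": 1}),
    ("pyridine N", {"N": 1}),
    ("t-butyl", {"C": 4, "H": 9}),
    ("CF3", {"C": 1, "F": 3}),
]

def mass(counts):
    """Nominal mass contribution of a group."""
    return sum(NOMINAL_MASS[el] * n for el, n in counts.items())

def heavy(counts):
    """Number of non-hydrogen atoms in a group."""
    return sum(n for el, n in counts.items() if el != "H")

for label, counts in GROUPS:
    print(f"{label:12s} mass={mass(counts):3d} heavy_atoms={heavy(counts)}")
```

Within each comparison the heavy atom count is identical (one atom for the single-atom groups, four for both t-butyl and CF3), so a heavy atom count treats them as the same size while AMW ranks the Carbon version as lighter.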

Now, how significant is this? Probably very small in most practical applications. I’d say that if you are setting up some kind of automated process, and you have equivalent access to AMW and the number of heavy atoms, use the number of heavy atoms in order to eliminate any small, pro-Carbon bias.
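As a rough illustration of what such an automated process might look like, the sketch below applies a weight cutoff and a heavy atom cutoff to the same two candidates. Aspirin (C9H8O4) is taken from the earlier example; the C13H20 hydrocarbon is a hypothetical stand-in I chose because it has the same number of heavy atoms, not a structure from the original note. The function names and cutoff values are likewise assumptions.

```python
# Rounded standard atomic weights for the elements used in this example only.
ATOMIC_WEIGHT = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}

# Element counts per candidate. The hydrocarbon is a made-up comparison case.
CANDIDATES = {
    "aspirin": {"C": 9, "H": 8, "O": 4},
    "hypothetical C13 hydrocarbon": {"C": 13, "H": 20},
}

def molecular_weight(counts):
    """Sum of atomic weights over all atoms."""
    return sum(ATOMIC_WEIGHT[el] * n for el, n in counts.items())

def heavy_atom_count(counts):
    """Number of non-hydrogen atoms."""
    return sum(n for el, n in counts.items() if el != "H")

def passes_weight_cutoff(counts, max_weight):
    return molecular_weight(counts) <= max_weight

def passes_size_cutoff(counts, max_heavy_atoms):
    return heavy_atom_count(counts) <= max_heavy_atoms

for name, counts in CANDIDATES.items():
    print(
        name,
        "weight<=180:", passes_weight_cutoff(counts, 180.0),
        "heavy<=13:", passes_size_cutoff(counts, 13),
    )
```

With these assumed cutoffs, the weight filter rejects aspirin (just over 180) and keeps the all-carbon molecule, while the heavy atom filter treats the two identically because both have 13 heavy atoms.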


We see that automated procedures using AMW instead of natoms will not only systematically suppress elements like P, S, Cl, Br and I, but may also work to drive out N and O atoms as well!

1 comment:

Peter Kenny said...

I've always counted non-hydrogen atoms in preference to using molecular weight. Even before we got into fragment screening in the late 90s at Zeneca, we used a count of non-hydrogen atoms in analysis of high throughput screening output.

While it's fun to discuss the relative merits of different measures of molecular size, it's worth remembering that using a single set of cutoffs is not the only way to do things. The Core and Layer approach that we've used at both Zeneca and AstraZeneca applies a series of progressively less restrictive cutoffs.

We have a series of posts on fragment library design and these have been set up with 'next' links so people can read them in order. Here's the url of the first one: