I have been thinking a lot lately about library design, especially after the
breakfast roundtable in SD. I found a rant from an old friend and colleague,
dating from my very early days in FBDD (2002 or 2003). With his permission, but
without attribution for obvious reasons, I am reprinting it here. I find it
particularly interesting because AMW was still being used as recently as last
year in papers describing fragment library design.
The Average Molecular Weight [Ed: AMW] is currently an accepted
orthodoxy within the Medicinal Chemistry community, a role reinforced by the
recent popularity of things like the Rule of Five. To be sure, molecules with
large molecular weights are not typically observed to be successful
drugs.
For the rest of this note, I’d like to focus on the
common scenario of selecting molecules for purchase or testing. I’ve often seen
people apply AMW cutoffs or scalings to these processes. I’d like to show why this
may be sub-optimal.
First, I contend that the number of heavy atoms may be
a much better proxy for “size” than AMW. Certainly, if you want to
discriminate against heavier elements like Phosphorus, Sulphur, Chlorine,
Bromine and Iodine, then by all means, use AMW. But with an AMW contribution of
about 127, a molecule with a single Iodine atom would be considered heavier, and
so less desirable, than the same molecule with a C6H9N2O substituent (about 125,
spread over nine heavy atoms)! Again, if you have good reasons for wanting to
suppress these elements, AMW is a (very) good way of doing that.
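As a rough sketch of the arithmetic above, the comparison can be checked in a few lines of Python. The helper names (`parse_formula`, `formula_weight`, `heavy_atom_count`) are made up for illustration, the atomic weights are rounded standard values, and the parser only handles simple formulas without brackets or charges.

```python
import re

# Rounded standard atomic weights for the elements discussed in this note.
ATOMIC_WEIGHT = {
    "H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "F": 18.998,
    "P": 30.974, "S": 32.06, "Cl": 35.45, "Br": 79.904, "I": 126.904,
}


def parse_formula(formula):
    """Turn a simple formula such as 'C6H9N2O' into {element: count}."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def formula_weight(formula):
    """Sum of atomic weights for every atom in the formula."""
    return sum(ATOMIC_WEIGHT[s] * n for s, n in parse_formula(formula).items())


def heavy_atom_count(formula):
    """Number of non-hydrogen atoms in the formula."""
    return sum(n for s, n in parse_formula(formula).items() if s != "H")


if __name__ == "__main__":
    for substituent in ("I", "C6H9N2O"):
        print(substituent,
              round(formula_weight(substituent), 1),
              heavy_atom_count(substituent))
```

By weight the single Iodine atom is the "bigger" substituent; by heavy-atom count it is one atom against nine.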
The case of the third row and beyond elements is fairly
straightforward, but what happens within the organic group of Carbon, Nitrogen
and Oxygen? Does AMW vs natoms mean anything in there?
Imagine we are selecting molecules for an assay and
impose an AMW cutoff of 180, admittedly very low. We would then ignore
Aspirin, with an AMW of 180.16, and instead we would test
the “lighter” [structure image not shown].
But it gets worse. Imagine if the Lilly chemists looking
for antidepressants had imposed an AMW cutoff of 308. That would have excluded
Prozac, with an AMW of 309, and instead they would have tested the “more
desirable lighter molecule” [structure image not shown] with an AMW of just 306.
Again, an AMW cutoff or bias could see us miss Zyprexa,
AMW 312.4, and instead use the “lighter” molecule [structure image not shown],
AMW 308.
Or the even lighter Oxygen variant! Now, of
course, no medicinal chemist faced with the choice between these molecules
would choose the second one; that’s obvious. logP estimates would probably
also eliminate molecules that are all or mostly carbon atoms.
But what the examples above show is that in any automated
system that is basing decisions on AMW, there will be a systematic, albeit
small, bias towards Carbon atoms and against Nitrogen and Oxygen
atoms. Why? A single CH2 group contributes 14 to AMW, whereas an NH contributes
15, and a two-connected Oxygen atom 16. As terminal groups, CH3 is 15, NH2 is
16 and OH is 17. A Pyridine Nitrogen contributes 14, but an aromatic Carbon (CH)
contributes 13. Likewise, a t-butyl group (57) would be preferred over CF3 (69).
Carbon wins the low AMW contest every time.
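The group contributions quoted above can be reproduced with integer (nominal) masses. This is only an illustrative sketch: the `GROUPS` table and the function names are my own, and aromatic Carbon is taken to mean a ring CH.

```python
# Nominal (integer) masses, matching the rounded figures used in the text.
NOMINAL_MASS = {"H": 1, "C": 12, "N": 14, "O": 16, "F": 19}

# Each group as {element: count}; the labels are for display only.
GROUPS = {
    "CH2 (chain)": {"C": 1, "H": 2},
    "NH (chain)": {"N": 1, "H": 1},
    "O (two-connected)": {"O": 1},
    "CH3 (terminal)": {"C": 1, "H": 3},
    "NH2 (terminal)": {"N": 1, "H": 2},
    "OH (terminal)": {"O": 1, "H": 1},
    "aromatic CH": {"C": 1, "H": 1},
    "pyridine N": {"N": 1},
    "t-butyl": {"C": 4, "H": 9},
    "CF3": {"C": 1, "F": 3},
}


def group_mass(composition):
    """Nominal mass contributed by a group."""
    return sum(NOMINAL_MASS[e] * n for e, n in composition.items())


def heavy_atom_count(composition):
    """Non-hydrogen atoms contributed by a group."""
    return sum(n for e, n in composition.items() if e != "H")


if __name__ == "__main__":
    for label, composition in GROUPS.items():
        print(f"{label:18s} mass={group_mass(composition):3d} "
              f"heavy atoms={heavy_atom_count(composition)}")
```

Within each comparison the heavy-atom count is the same, so only the mass column separates Carbon from its Nitrogen, Oxygen or Fluorine counterpart.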
Now, how significant is this? The effect is probably very small in most
practical applications. I’d say that if you are setting up some kind of
automated process, and you have equivalent access to AMW and the number of
heavy atoms, use the number of heavy atoms in order to eliminate any small,
pro-Carbon bias.
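To make the recommendation concrete, here is a minimal sketch of the two kinds of filter applied to the same pair of candidates. The hydrocarbon C13H16 is a hypothetical all-carbon analogue of Aspirin (every Oxygen swapped for Carbon) that I chose for illustration; it is not necessarily the structure pictured in the original note, and the function names and cutoffs are assumptions too.

```python
# Rounded standard atomic weights; enough precision for a cutoff illustration.
ATOMIC_WEIGHT = {"H": 1.008, "C": 12.011, "O": 15.999}

# Compositions as {element: count}. The hydrocarbon is hypothetical.
CANDIDATES = {
    "aspirin (C9H8O4)": {"C": 9, "H": 8, "O": 4},
    "hypothetical hydrocarbon (C13H16)": {"C": 13, "H": 16},
}


def molecular_weight(composition):
    """Sum of atomic weights over all atoms."""
    return sum(ATOMIC_WEIGHT[e] * n for e, n in composition.items())


def heavy_atom_count(composition):
    """Number of non-hydrogen atoms."""
    return sum(n for e, n in composition.items() if e != "H")


def select_by_weight(candidates, max_weight):
    """Keep candidates whose molecular weight is at or below the cutoff."""
    return [name for name, comp in candidates.items()
            if molecular_weight(comp) <= max_weight]


def select_by_heavy_atoms(candidates, max_heavy_atoms):
    """Keep candidates whose heavy-atom count is at or below the cutoff."""
    return [name for name, comp in candidates.items()
            if heavy_atom_count(comp) <= max_heavy_atoms]


if __name__ == "__main__":
    print("weight <= 180:     ", select_by_weight(CANDIDATES, 180.0))
    print("heavy atoms <= 13: ", select_by_heavy_atoms(CANDIDATES, 13))
```

Both molecules have 13 heavy atoms, so the heavy-atom filter treats them alike, while the weight filter keeps only the hydrocarbon.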
Conclusion
We see that automated procedures using AMW instead of
natoms will not only systematically suppress elements like P, S, Cl, Br and
I, but may also work to drive out N and O atoms as well!