17 July 2013

The rule of three at ten

One of the rewards of following a field for years is being able to revisit classic papers to see how they’ve held up. Two years ago we re-examined molecular complexity. This year marks the tenth anniversary of the publication of the “rule of three”, and Harren Jhoti and colleagues from Astex have marked the occasion with a brief but trenchant letter in Nature Reviews Drug Discovery.

Think back, if you will, to 2003. Abbott researchers had published their seminal SAR by NMR paper seven years previously, but fragment-based efforts were still widely scattered, with each organization more or less figuring things out on its own. It was in this primordial environment that Jhoti and colleagues published a short “discussion forum” in Drug Discovery Today. It framed its premise as a question (A ‘rule of three’ for fragment-based lead discovery?) and suggested that fragments have the following characteristics:
  • Molecular weight (MW) < 300
  • ClogP ≤ 3
  • # of hydrogen bond donors ≤ 3
  • # of hydrogen bond acceptors ≤ 3
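As a rough illustration, the four criteria amount to a simple property filter. The sketch below is hypothetical (the function name and interface are mine, not from the paper) and assumes the molecular properties have already been computed, for example with a cheminformatics toolkit; note that donor/acceptor definitions vary between toolkits, which matters for the last two cutoffs.

```python
# Hypothetical sketch of a rule-of-three filter.
# Assumes MW, ClogP, and donor/acceptor counts were computed upstream
# (e.g. by a cheminformatics toolkit); how donors and acceptors are
# defined differs between toolkits, so results can vary at the margins.

def passes_rule_of_three(mw, clogp, hbd, hba):
    """Return True if a compound satisfies the 2003 'rule of three'."""
    return mw < 300 and clogp <= 3 and hbd <= 3 and hba <= 3

# Example: a typical small fragment vs. a larger, more lipophilic molecule
print(passes_rule_of_three(mw=220.3, clogp=1.8, hbd=1, hba=2))  # True
print(passes_rule_of_three(mw=420.5, clogp=4.2, hbd=2, hba=6))  # False
```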

In the new publication, the researchers note that most of the focus has been on the first two criteria. Indeed, as has been pointed out, there is some ambiguity in how one defines hydrogen bond donors and acceptors.

Since proposing the rule of three, Astex has been moving towards ever smaller compounds; the majority of their fragments now have fewer than 17 non-hydrogen atoms, with a molecular weight < 230 Da. One consequence is that the other properties automatically fall into line: a quick search of ~100,000 compounds with ≤ 16 heavy atoms reveals that 86% have ClogP ≤ 3, while out of 370,000 compounds with MW < 300, only 72% have ClogP ≤ 3.

This push towards smallness has been questioned, particularly in the context of protein-protein interactions, where some have suggested that larger fragments may be required. Jhoti (and others) counter with two arguments.

On a theoretical level, all proteins are made up of amino acids, so there shouldn’t be anything special about protein-protein interactions:

Fragments are – or should be – simple enough to probe the basic architecture of all proteins yet have sufficient complexity to allow them to be elaborated into lead compounds.

On a practical level, after screening more than 30 targets, the researchers find that many fragments that hit protein-protein interactions also hit other targets.

The researchers are in favor of “three-dimensional” fragments, but not at the cost of increased size. They note that the perception that fragment libraries are dominated by “flat molecules” may be distorted by the fact that many fragment success stories (including nearly half of clinical-stage compounds) involve kinases, which have a predilection for planar adenine-like fragments. That said, they acknowledge that many fragment libraries are sub-optimal, leading to heartbreak during optimization. As they note with restraint, “not all fragment libraries are alike.”

Finally, there is a nice analysis of what to do with fragments that don’t reproduce in orthogonal assays. They typically observe 30%-40% correlation between fragment hits from ligand-observed NMR and X-ray crystallography, but note that this isn’t bad given that an NMR hit can be detected at just 5% binding, while crystallography typically needs at least 70% occupancy. Thus, NMR can detect fragments whose solubilities are less than their dissociation constants, something crystallography generally cannot do. Although it is reassuring when multiple techniques agree on a hit, the danger is that:

This strategy implicitly places a reliance on the least sensitive technique. This is of particular concern as the most potent fragment is often not the best starting point for hit-to-lead chemistry.

Not surprisingly to those familiar with Astex, the researchers put a premium on crystallographic information.
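The gap between 5% binding and 70% occupancy is worth making concrete. Assuming simple 1:1 binding with ligand in large excess, the fraction of protein bound is [L] / ([L] + Kd); the sketch below (my own back-of-the-envelope illustration, with a hypothetical 1 mM fragment) shows why crystallography effectively demands solubility above the dissociation constant while NMR does not.

```python
# Back-of-the-envelope occupancy calculation for simple 1:1 binding,
# with ligand assumed in large excess so free [L] ~ total [L]:
#   fraction bound = [L] / ([L] + Kd)
# Rearranged, the ligand concentration needed for occupancy f is
#   [L] = Kd * f / (1 - f)

def occupancy(ligand_conc_mM, kd_mM):
    """Fraction of protein bound at a given free ligand concentration."""
    return ligand_conc_mM / (ligand_conc_mM + kd_mM)

def ligand_needed(occ, kd_mM):
    """Ligand concentration (mM) required to reach a target occupancy."""
    return kd_mM * occ / (1.0 - occ)

kd = 1.0  # hypothetical 1 mM fragment
print(f"5% occupancy needs ~{ligand_needed(0.05, kd):.3f} mM ligand")   # ~0.053 mM
print(f"70% occupancy needs ~{ligand_needed(0.70, kd):.2f} mM ligand")  # ~2.33 mM
```

For this hypothetical 1 mM binder, an NMR-detectable 5% occupancy needs only ~50 µM of fragment, while the ~70% occupancy wanted for crystallography needs over 2 mM, i.e. more than twice the Kd in solution.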

Returning to the rule of three, I think part of what bothers some folks is the notion of “rules” in general; nature has never read an issue of Nature, and drug discovery will never be as reductionist as physics. Indeed, the researchers acknowledge that the rule of three “is just a guideline that should not be overemphasized.” The rule of three is a play on Chris Lipinski’s (equally contentious) rule of five; perhaps the “guideline of three” would have been less controversial. But the spirited discussion over the ensuing years has generated light as well as heat, which the authors welcome:

We trust that our comments, some of which are deliberately provocative, on these many facets of FBDD will generate active discussion and might assist in improving the success of this approach for the broader drug discovery community.

The comments are open for those who would like to continue the discussion here!


Dr. Teddy Z said...

The Ro3 is dead. Finally. Now we can commence the discussion of what fragment libraries should look like. I think the only correct answer is what Justice Potter Stewart said: "I know it when I see it." However, in this case, it's the target that sees it. There is also good discussion fodder in what the triaging criteria for screen actives should be.

Anonymous said...

What strange comments; I don't understand what you are going on about. I also don't know your credentials for commenting on this blog, given a distinct lack of publications from you.

Dr. Teddy Z said...

Let me clarify my statements. The rule of 3 has been shoveled under the bus, appropriately so. I think everyone can agree it shouldn't be a rule and it should be applied very sparingly.

Point #2: What should fragment libraries look like? I don't think there is an answer here. I think every library is designed for a purpose and as long as somebody is happy with it, it works. However, I would propose that for literature references, any starting molecule greater than 18 heavy atoms SHOULD NOT be allowed to be called a fragment.

Point #3: How do you triage actives? Or more accurately, how should you screen? The point is raised that as little as 5% binding will be detected by NMR, while you need >70% for X-ray. This is a very important point. I would love to hear people's comments on this.

Dan Erlanson said...

Actually, I still think the rule of three is useful, both as a tool and as a springboard for discussion. As a tool its value is in setting upper bounds on the size and lipophilicity of fragments. This is not to be underrated. I would venture to say that on balance the rule of three has had a positive effect by keeping fragment libraries populated with more attractive molecules than they otherwise would have been.

Far from being dead, I predict that the rule of three will be alive, well, and still catalyzing discussion in another ten years.

Glyn Williams said...

Thanks, Dan, for the article and leading the discussion.

We here at Astex continue to believe that the Ro3 is a useful indicator of the boundaries of fragment space, particularly when the aim is to develop an oral drug. However it is also true that, after practical experience of X-ray and biophysical screening, we tend to weight size and lipophilicity more heavily than other Ro3 features.

Our view that smaller molecules make the best fragments is driven in part by the theoretical advantage of better sampling of chemical space, but also by the practical advantages of screening fewer, smaller fragments. So it was interesting to see Dan’s data on the proportion of compounds with cLogP<3 for compounds with <16 (~225Da) and <21 (~300Da) heavy atoms - this could provide a useful benchmark if compared with average values for fragment libraries.

On average, it is likely that smaller fragments will bind less tightly to the target and therefore will need to be screened at higher concentrations in order to detect (ligand efficient) hits. However smaller fragments are generally more soluble and, with smaller (less-numerous) libraries, it becomes feasible to measure phys. chem. properties (logP, solubility, stability, aggregation etc.) for all library members.

Exactly where the best compromise between decreasing fragment size and decreasing affinity lies depends on the detection method employed and will not be the same for all practitioners of FBDD. X-ray screening and protein-detected NMR screening can be robust at very high concentrations of ligand and thus can detect very weak binders (say Kd>20mM), but that requires highly soluble compounds. In contrast, ligand-detected NMR screening is able to detect hits whose solubility is less than their affinity, but would require considerably higher protein concentrations than are typically used in order to detect very weak binders.

Our approach has been to understand the strengths and limitations of different detection methods with the ultimate aim of creating a library and methods that allow us to generate X-ray structures of multiple protein-fragment complexes. This gives a number of starting points for chemistry; in addition comparing multiple structures can allow fuller exploration of the 3D nature of the binding site, as well as the ligand, and can highlight areas of protein flexibility.

As we all refine our approaches there are additional guidelines (not rules!) that could be added, but many will depend on our experimental methods, target classes and ultimate destination, from tool molecule to oral drug.

Dipen Shah said...

Thanks for this interesting discussion. Is "Type of detection based library composition" a more pragmatic approach?

Dr. Teddy Z said...

Dipen, I am not sure what you mean. Could you clarify?

Dipen Shah said...

Hi Teddy, I meant whether in general different criteria used for selection of fragments in a library further needs to be influenced by the type of detection method?

Dr. Teddy Z said...

Dipen, very interesting thought. I have an opinion, but I am curious as to what other people's are.

Glyn Williams said...

In practice, only a small fraction of chemical space can be sampled at whichever fragment size we might choose. However, that still leaves open the question of how few fragments will give ‘sufficient coverage’ and what their properties (size, lipophilicity, shape, etc.) should be. It should go without saying that those fragments should be free of PAINS and ‘well behaved’ (e.g. no aggregation or fluorescence/quenching artifacts) under screening conditions.

Arguments based on the output from previous fragment screens will necessarily be weighted by the methods used to discover fragment hits. This is a positive advantage if the same methods are available to you but it would be dangerous to formulate laws (although obviously not guidelines) based solely on those observations. For example, 1H ligand-detected (LD) NMR screening in cocktails favours molecules containing protons with long relaxation times and which lie in regions of the spectrum which are relatively uncluttered by biological buffers: so LD-NMR tends to favour fragments which contain aromatic protons. Since aromatic rings are flat, taken alone this could look like a preference for 2D hits. And since aliphatic hits can be disfavoured, this could be interpreted by the unwary as a bias against rotatable bonds.

The advantage of LD-NMR is that hits can be detected at fragment concentrations below their Kd, provided the protein concentration is sufficient to give a few % of complexed ligand. In contrast, methods where a property of the protein is detected (X-ray screening, thermal unfolding, protein-detected NMR, protein-immobilised SPR, etc.) require fragment concentrations that approach or exceed Kd.

If resources allow, I would advocate running screens that are as large as possible, provided the library is well characterised and the strengths and limitations of the screening methods are understood. However, in the practical world, and particularly when evaluating targets, it may pay to organise screening to deliver most hits early in the screen. That might entail leaving either less soluble or predominantly aliphatic fragments to the end of the process.

Dave Stepp said...

Great discussion. I must agree with much of the message in the original Letter as well as Glyn's comments. In particular I feel that an emphasis on smaller fragments which bind with weaker affinity is the right way to go. We tend to be somewhat potency-agnostic, with a focus on crystallographic binding mode, synthetic tractability, and use of computational methods to guide fragment elaboration. I totally agree that without structural information it is extremely difficult to drive fragment elaboration (sorry Dan!).

Another area that has been an active area of discussion/debate is the concept of selectivity of fragment hits. For some there has been an expectation that fragments should be selective (bind only to target protein and not to others) and non-selective fragment “hits” should be discarded. However, we (and others) have found that non-selective fragments can be successfully elaborated into highly selective (and potent) lead compounds. This is in line with the Letter’s assertion that proteins present common recognition motifs and thus many fragments can bind to different proteins/classes.

I would, however, like to challenge the assertion that LD-NMR is the only fragment detection method that can detect binding at concentrations well below a compound’s KD. If designed and executed properly, SPR can be used to detect binding at 5-10% occupancy as well. We routinely observe binding of fragments with KDs > 2 mM when screening at 200 µM. Additionally, SPR provides qualitative information about compound binding (sensorgram shape) that is not available by other methods. FWIW, our crystallographic success rate for SPR hits varies per target/class, but we’ve seen upwards of 75% for some (full disclosure: ~20% for others). Bottom line: no one method is perfect (as many have asserted), and now that we (Genzyme/Sanofi) have LD-NMR capabilities & expertise we can use both methods to identify hits (but not requiring a fragment to be a hit in both methods), especially for more difficult targets & classes.

Glyn Williams said...

I was aware that my knowledge of SPR was getting out-of-date, so I am happy to hear from Dave that SPR is proving useful even at low protein occupancy. Provided that sensitivity can be maintained when using smaller fragments, the availability of sensorgram information and its automation justify SPR’s position, with LD-NMR, as the dominant fragment screening methods - see the Practical Fragments October 2011 poll and NMR/SPR /X-ray comparison (Jan 2013).

Staying with libraries, I also believe that the experience of medicinal chemists should not be overlooked when selecting fragments. Although fragments do not need (or benefit from) ‘synthetic handles’, they do need to provide tractable design ideas with a range of growth vectors. We should avoid the common HTS experience of looking through hundreds of hits and deciding none are suitable for med-chem follow-up.

Synthesising novel fragments has not proved to be easy though – cramming a lot of heteroatoms into a small space presents enough challenges to make it interesting.

Dan Erlanson said...

Thanks everyone for your comments - really good discussion here.

Glyn's point about ligand-detected NMR favoring fragments with aromatic groups is particularly interesting in the context of trying to judge fragment library compositions from the resulting hits. Has anyone compared the aromaticity of fragment hits arising from NMR versus SPR screens on the same library?

Dave brings up good points too; I do agree that fragment optimization in the absence of structure is difficult, but it is possible, as shown here and in some of the work that we've been doing at Carmot.

On the issue of selectivity, I've come to believe that this is not something to worry about at the fragment stage. On a theoretical level, the molecular complexity model argues that small fragments should not be very selective. On an experimental level, Paul Bamborough and colleagues at GlaxoSmithKline showed that selective fragments do not necessarily lead to selective leads, while non-selective fragments can be optimized to highly selective molecules. Finding a fragment is just the beginning!