17 November 2025

xLE: solving problems or missing the point?

Ligand efficiency (LE) has been discussed repeatedly and extensively on Practical Fragments, most recently in September. Two criticisms are its dependence on standard state and the observation that larger molecules frequently have lower ligand efficiencies than smaller molecules. In a just-published open-access ACS Med. Chem. Lett. paper, Hongtao Zhao proposes a new metric, xLE, to address these concerns.
 
LE is defined as the negative Gibbs free energy of binding (ΔG) divided by the number of non-hydrogen (or heavy) atoms, and of course ΔG is state-dependent. The standard-state assumptions are 298 K and 1 M concentrations, choices that some people see as arbitrary since few biologically relevant molecules ever achieve concentrations near 1 M. To remove the dependence on standard state, Zhao proposes to remove the translational entropy term of the unbound ligand from the free energy calculation.
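The definition is easy to compute. A minimal sketch (my own, not from the paper), assuming the conventional ΔG° = RT·ln(Kd) at 298 K:

```python
import math

RT = 1.987e-3 * 298  # gas constant (kcal/mol/K) times 298 K, about 0.592 kcal/mol

def ligand_efficiency(kd_molar, heavy_atoms):
    """LE = -dG / N, in kcal/mol per heavy atom, with dG = RT*ln(Kd)."""
    delta_g = RT * math.log(kd_molar)  # negative whenever Kd < 1 M
    return -delta_g / heavy_atoms

# A 1 mM binder with 12 heavy atoms:
print(round(ligand_efficiency(1e-3, 12), 2))  # → 0.34
```

Note that expressing Kd in any unit other than molar would shift ΔG° by a constant, which is exactly the standard-state dependence under discussion.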
 
Zhao also addresses the second criticism, that larger molecules often have lower ligand efficiencies. This phenomenon was observed in an (open-access) 1999 paper titled “The maximal affinity of ligands,” which found that, beyond a certain threshold, larger ligands do not have stronger affinities; there are very few femtomolar binders even among the largest small molecules. Thus, Zhao proposes attenuating the size dependence.
 
The new metric, xLE, is defined as follows:
 
xLE = (5.8 + 0.9*ln(Mw) – ΔG)/(a*N^α) – b
where N is the number of non-hydrogen atoms, the exponent α is chosen to reduce size dependence, and a and b are “scaling variables.” He chooses α = 0.2, a = 10, and b = 0.5, with little explanation.
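Translated into code (my sketch; parameter values as given above, with ΔG from Kd at 298 K):

```python
import math

RT = 1.987e-3 * 298  # about 0.592 kcal/mol at 298 K

def xle(kd_molar, mw, heavy_atoms, alpha=0.2, a=10.0, b=0.5):
    """xLE = (5.8 + 0.9*ln(Mw) - dG) / (a * N**alpha) - b, dG in kcal/mol."""
    delta_g = RT * math.log(kd_molar)
    return (5.8 + 0.9 * math.log(mw) - delta_g) / (a * heavy_atoms ** alpha) - b

# A 1 mM binder with MW 160 and 12 heavy atoms:
print(round(xle(1e-3, 160, 12), 2))  # → 0.38
```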
 
To assess performance, Zhao examined nearly 14,000 measured affinities from PDBbind. When plotted by number of atoms, median affinity increased up to about 35 heavy atoms but then leveled off. Median LE values decreased sharply from 6 to 12 heavy atoms and then leveled off somewhere in the 20s. But median xLE values were consistent regardless of ligand size.
 
Zhao also examined LE and xLE changes for 175 successful fragment-to-lead studies from our annual series of J. Med. Chem. perspectives. LE decreased from fragment to lead for 48% of these, but xLE increased for all but a single pair.
 
And this, in my opinion, is a problem.
 
In the seminal 2004 paper, LE was proposed as "a simple ‘ready reckoner’, which could be used to assess the potential of a weak lead to be optimized into a potent, orally bio-available clinical candidate." The metric was particularly important before FBLD was widely accepted, when chemists were even less inclined to work on weak binders.
 
Here is the situation for which LE was devised. Imagine two molecules, compounds 1 and 2. The first has just 12 non-hydrogen atoms, a molecular weight of 160, and a modest 1 mM affinity for a target - similar to some fragments that have yielded clinical compounds. The second is much larger: 38 non-hydrogen atoms, a molecular weight of 500, and 10 µM affinity for the same target. Considering potency alone, compound 2 is the winner.
 
However, the LE for compound 1 is a respectable 0.34 kcal/mol/atom, while the LE for compound 2 is 0.18 kcal/mol/atom. So while a 10 µM HTS hit may initially look appealing, the LE suggests that this is an inefficient binder, and further optimization may require adding too much molecular weight to get to a desired low nanomolar affinity.
 
In contrast, the xLE values for both compounds are nearly identical, 0.38, and so this metric would not help a chemist prioritize which hit to pursue. In other words, xLE does not provide the insight for which LE was created. It might even lead to suboptimal choices. 
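The numbers in this example are easy to reproduce side by side. A quick sketch (assuming ΔG° = RT·ln Kd at 298 K):

```python
import math

RT = 1.987e-3 * 298  # about 0.592 kcal/mol at 298 K

def le(kd, n):
    """Classic ligand efficiency, kcal/mol per heavy atom."""
    return -RT * math.log(kd) / n

def xle(kd, mw, n, alpha=0.2, a=10.0, b=0.5):
    """Zhao's xLE with the published parameter values."""
    dg = RT * math.log(kd)
    return (5.8 + 0.9 * math.log(mw) - dg) / (a * n ** alpha) - b

# compound 1: 1 mM, MW 160, 12 heavy atoms
# compound 2: 10 uM, MW 500, 38 heavy atoms
for name, kd, mw, n in [("compound 1", 1e-3, 160, 12),
                        ("compound 2", 1e-5, 500, 38)]:
    print(f"{name}: LE = {le(kd, n):.2f}, xLE = {xle(kd, mw, n):.2f}")
# LE separates the two hits (0.34 vs 0.18); xLE rates them identically (0.38)
```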
 
Moreover, unlike LE, xLE is non-intuitive. And finally, with three scaling or normalization factors, xLE is arguably even more arbitrary than a metric dependent on the widely accepted definition of standard state.
 
Personally I find the practical applications of xLE limited, but I welcome your thoughts.

8 comments:

Hongtao Zhao said...

Thank you for commenting on the viewpoint—much appreciated. xLE actually has only one parameter; the other two are used solely to place its distribution on the same scale as LE. One can remove those two by setting a = 1 and b = 0 without changing the conclusions. The parameter α was determined empirically to minimize the dependence of the median xLE on molecular size; I realize the original wording may have been misleading.

In the context of LE for fragments with 12 heavy atoms, the “respectable 0.34 kcal/mol/atom” warrants a second thought. In contrast, xLE indicates that both compounds have relatively low binding efficiency compared with the median of 0.55. In the viewpoint, xLE is recommended as an efficiency metric to guide potency optimization rather than as an order-ranking tool for prioritizing starting points.

When xLE falls in the first quartile, further potency optimization will likely require adding more heavy atoms. For both compounds in your example, it may be necessary to identify any suboptimal interactions and optimize those before increasing heavy atom count. In HTS triage, ranking compounds by efficiency metrics is understandable: we prefer to start from compounds that make optimal interactions and are easy to elaborate by adding atoms. However, do we truly believe that 0.34 kcal/mol/atom for a 12–heavy-atom fragment is a good starting point? What would "fit quality" indicate for compound 1? If we argue it is always easier to start with smaller compounds, wouldn’t heavy atom count alone suffice (albeit with an implicit and arbitrary activity cutoff)?

One point I considered, but did not include in the viewpoint, is how to choose a dataset to empirically set the parameter α, and why the median is used instead of maximal affinity.

Hongtao Zhao said...

I calculated the fit quality (FQ) and the size‑independent ligand efficiency (SILE) for both compounds. For compounds 1 and 2, FQ is 0.51 and 0.64, respectively, and SILE is 1.95 and 2.30. I hope I calculated them correctly. If so, does this not illustrate how deceptive LE can be when taken without the context of heavy atom count?
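These values can be reproduced with a short sketch, assuming (my reconstruction, not stated in the comment) FQ = LE/LE_Scale with the Reynolds et al. LE_Scale polynomial, SILE computed on the free-energy scale as −ΔG/N^0.3, and ΔG approximated as 1.37·pKd kcal/mol:

```python
def le_scale(n):
    # maximal-LE fit vs heavy atom count (Reynolds et al. polynomial;
    # assumed here to be the version behind the quoted FQ values)
    return 0.0715 + 7.5328 / n + 25.7079 / n**2 - 361.4722 / n**3

def fit_quality(pkd, n):
    dg = 1.37 * pkd            # kcal/mol, roughly 1.37 kcal/mol per log unit
    return (dg / n) / le_scale(n)

def sile(pkd, n):
    dg = 1.37 * pkd
    return dg / n ** 0.3       # free-energy form of SILE

# compound 1: pKd 3 (1 mM), 12 heavy atoms; compound 2: pKd 5 (10 uM), 38 atoms
print(f"FQ:   {fit_quality(3, 12):.2f}  {fit_quality(5, 38):.2f}")  # → FQ:   0.51  0.64
print(f"SILE: {sile(3, 12):.2f}  {sile(5, 38):.2f}")                # → SILE: 1.95  2.30
```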

Dan Erlanson said...

Hi Hongtao,

Thanks for your detailed response.

We seem to agree that xLE is not recommended for prioritizing starting points. But even for guiding potency optimization, I'm not sure how much information xLE provides given the low dependence on molecular size; as you found in 174 of 175 published F2L studies, almost any improvement in potency increases xLE. If this is the case, why not just use potency?

Regarding the hypothetical example, I think most practitioners of FBLD would agree that 0.34 kcal/mol/atom is a reasonable starting point. I do not think most practitioners would agree that "it is always easier to start with smaller compounds," which is why LE is useful for prioritizing starting points.

I do think a 10 µM, 500 MW compound may be harder to optimize than a 1 mM, 160 MW fragment. Perhaps this is why Fit Quality seems to be seldom used, judging by three polls on Practical Fragments.

Which would you prefer to start with?

Hongtao Zhao said...

I didn’t recommend using efficiency metrics for ranking because certain cautions should be taken. As mentioned earlier, if the aim is to triage hits following library screening—whether a fragment library or an HTS library—then it is fine to do so. However, when there are only a handful of hits, it may be better to pause and reflect on why their binding efficiencies differ. It is not unheard of for a single methyl group to turn a single‑digit nanomolar inhibitor into a micromolar inhibitor.

Additive growth is pragmatically convenient for exploring large chemical space in a high‑throughput fashion, but it will inevitably make compounds larger. Reductive optimization is much harder: it often requires a clever, human‑guided tweak and is unlikely to be carried out in high‑throughput fashion. From this point of view, starting with a weak‑binding fragment that has high binding efficiency does have a competitive edge in the context of high‑throughput experimentation (for example, virtual library enumeration followed by FEP+ or R‑group‑based QSAR).

If we introduce a third compound with 38 heavy atoms but only 30 µM potency, xLE would suggest higher efficiency for both 1 and 2 over 3. It is not always the case that improved potency results in higher efficiency.

Arguably, it can be relatively easier to correct suboptimal interactions in smaller compounds than in larger molecules. That is a slightly different, more pragmatic point, but it may nevertheless help justify FBLD for the reason above.

Peter Kenny said...
This comment has been removed by the author.
Peter Kenny said...

Something worth thinking about, Dan, is how you would generate a concentration response from the measure of ‘activity’ used to formulate xLE (or CLE which you reviewed recently). The xLE metric does appear to have a whiff of Heath Robinson to it and I don’t see it as having any relevance whatsoever to real world drug discovery. My stock advice to drug discovery scientists in this context is to carefully examine the response of activity to molecular size for your project compounds (and don’t get distracted by published analyses like the ‘fit quality’ studies in which activity data for other compounds against other targets have been aggregated).

My criticism of LE is simply that perception of efficiency varies with the arbitrary choice of the standard concentration (or more commonly the concentration unit used to express potency) and there is no ‘right’ concentration unit. Temperature is not arbitrary (nor have I ever suggested that it is) and you need to use the experimental temperature when converting Kd to ΔG° (drug discovery assays are typically run at normal human body temperature rather than 298 K).

Dan Erlanson said...

Hi Hongtao,

I agree that improved potency doesn't always result in higher xLE, but I'm concerned that the metric insufficiently penalizes molecular obesity. Continuing with our example, let's say we optimize the 10 µM, 500 Da hit to a 1 nM, 1000 Da lead with 75 heavy atoms. xLE improves from 0.38 to 0.55, signaling success. But is this molecule likely to have good drug-like properties? In contrast, LE drops from a low 0.18 to an even lower 0.16 kcal/mol/atom, suggesting inefficient binding that only gets worse.

Hi Pete,

Thanks for adding your thoughts. It seems that we are more in agreement with regards to xLE than LE.

Good discussion both!

Swiss of the alp said...

That alpha value is awfully near the SILE exponent value, innit? And one Pete K has given us some sh#t about that one and fit quality over the years.

It is just a number, gents. Use it if it works for you. Drug designers are pragmatists. If it does not work for you, use something else.

I, for one, do not see the point of splitting hairs on the heads of angels on a pin.