20 May 2013

Fragment Mixes for NMR

The number of fragments in a mixture for NMR screening has been the subject of a poll.  Some people have stated that they go much higher than 10 fragments (of course for 19F it is totally different).  For many people who are interested in doing ligand-observed NMR screening, it is daunting to figure out how to compute the mixes.  This paper addresses the issue.  Unlike many current approaches, which use the spectra and then deconvolute them, this approach encodes the spectra into "fingerprints" and uses a Monte Carlo algorithm to minimize signal overlap.  The paper itself delves deeply into a lot of computer-ese gobbledygook (e.g. "the knapsack problem, one of the typical, non-deterministic polynomial time (NP-complete) problems") that I don't find interesting at all.  What I do find interesting is that they are targeting mixtures of 5 fragments.  Other than that, they go into serious detail about their algorithm and which version was best.  They worked with 342 fragments from their in-house library.
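
For those who do like the computer-ese, here is a minimal sketch of the general idea: bin each fragment's peak list into a "fingerprint", then use Monte Carlo swaps between mixtures to drive down the number of shared bins. This is not the authors' code. The bin width, the overlap score, the Metropolis acceptance rule, and every function name below are my own assumptions for illustration.

```python
import math
import random


def make_fingerprint(peaks_ppm, lo=0.0, hi=10.0, bin_width=0.05):
    """Encode a peak list as the set of occupied chemical-shift bins.

    The 0-10 ppm window and 0.05 ppm bin width are illustrative assumptions.
    """
    return frozenset(
        int((ppm - lo) / bin_width) for ppm in peaks_ppm if lo <= ppm < hi
    )


def mixture_overlap(mixture, fingerprints):
    """Count bin clashes: each fragment's bins already used by an earlier one."""
    seen = set()
    clashes = 0
    for frag in mixture:
        clashes += len(fingerprints[frag] & seen)
        seen |= fingerprints[frag]
    return clashes


def total_overlap(mixtures, fingerprints):
    return sum(mixture_overlap(m, fingerprints) for m in mixtures)


def optimize_mixtures(fingerprints, mix_size, n_steps=20000,
                      temperature=0.5, seed=0):
    """Randomly swap fragments between mixtures, keeping the best layout seen."""
    rng = random.Random(seed)
    frags = list(fingerprints)
    rng.shuffle(frags)
    mixtures = [frags[i:i + mix_size] for i in range(0, len(frags), mix_size)]
    score = total_overlap(mixtures, fingerprints)
    best, best_score = [list(m) for m in mixtures], score

    for _ in range(n_steps):
        if best_score == 0 or len(mixtures) < 2:
            break
        a, b = rng.sample(range(len(mixtures)), 2)
        i = rng.randrange(len(mixtures[a]))
        j = rng.randrange(len(mixtures[b]))
        before = (mixture_overlap(mixtures[a], fingerprints)
                  + mixture_overlap(mixtures[b], fingerprints))
        mixtures[a][i], mixtures[b][j] = mixtures[b][j], mixtures[a][i]
        delta = (mixture_overlap(mixtures[a], fingerprints)
                 + mixture_overlap(mixtures[b], fingerprints)) - before
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            score += delta
            if score < best_score:
                best, best_score = [list(m) for m in mixtures], score
        else:
            # Rejected move: undo the swap.
            mixtures[a][i], mixtures[b][j] = mixtures[b][j], mixtures[a][i]
    return best, best_score
```

Calling `optimize_mixtures(fingerprints, mix_size=5)` on a dictionary of fragment name to fingerprint returns the mixtures and their overlap score. A real tool would also need to handle peak widths, intensities, and solubility or reactivity constraints, none of which this sketch attempts.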

However, after doing the initial POC on these, they did not have a library big enough to test for scalability, so they generated some virtual libraries: 500, 1000, 3000, and 5000 fragments.  Similar to discussion going on elsewhere, they clustered their fragments as strongly aromatic, strongly aliphatic, or balanced (shown here).  As would be expected, library size and peak distribution did not affect the algorithm, but the number of fragments per mixture did.  As shown here, for the optimized libraries the overlap rises only modestly as you increase the number of fragments per mix (from ~0% for 5 fragments to about 10-20% for 8-10 fragments).  This is a potentially huge increase in efficiency: simply increasing the number of compounds per mix from 5 (our poll found 5-7 to be the median number in mixes) to 10 would halve the number of spectra that need to be acquired, hence lowering the potential cost to companies (especially if they are outsourcing (shameless self-promotion)).
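
The arithmetic behind that halving is simple enough to check. A quick sketch, using the library sizes mentioned above (the function name is just mine), and counting one spectrum per mixture with no allowance for reference or deconvolution spectra:

```python
def spectra_needed(n_fragments, per_mix):
    """Number of mixtures (one spectrum each) needed to cover the library."""
    return -(-n_fragments // per_mix)  # ceiling division


for n in (342, 500, 1000, 3000, 5000):
    print(n, spectra_needed(n, 5), spectra_needed(n, 10))
# 342 69 35
# 500 100 50
# 1000 200 100
# 3000 600 300
# 5000 1000 500
```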

I have spoken to the authors and while, unlike the Beatson, their tool will not be available online, it is being incorporated into an upcoming release of Mnova's software. [Full disclosure: I have a business relationship with Mestrelab.]  Of the other software available, I believe only AMIX (Bruker) has built-in screening tools, but I am not entirely sure as I have never used AMIX.  NMRPipe would be the one most likely to also have such tools, but their availability would be based upon the kindness of strangers.  Frankie D (Mr. NMRPipe) is at Agilent (née Varian) now, so maybe vNMRJ will become more utile.  The last major software package, from ACDLabs, is not geared to this kind of work AFAIK.  Additionally, this approach of course is just as "easily" applied to 19F, which could mean an increase in compounds per mix from 10-15 to 25-30 routinely.

I of course will update this if information on other software becomes available in the comments or via email.

[UPDATE #1: Ben Davis (Vernalis) pointed out CCPN has tools for this.  
Anna Vulpetti (Novartis) points out that Python scripts for 19F have been published.
Arvin Moser (ACD) points out that ACD does offer screening tools.]


Ben Davis said...

The screening module for CCPN analysis has a very nice tool for optimising mixtures. I believe that at the moment it's only in beta release, but it will be included in the next release.

Dr. Teddy Z said...

Ben, thanks. I was hoping you would comment.

anna vulpetti said...

For those who use 19F-NMR: in 2010 (J. of Fluorine Chem. 131, 570-577, 2010) we proposed a novel method for predicting the fluorine chemical shift that is based on the recently introduced fluorine fingerprint descriptor (J. Am. Chem. Soc. 131, 12949-12959, 2009). The ability to predict the 19F chemical shifts can be used in the generation of large mixtures by reducing the likelihood of spectral overlap and for the virtual de-convolution of the identified active mixtures.
Python scripts are included in the J. of Fluorine Chem Supplementary material.

Anonymous said...

With respect to fragment mixtures, I was wondering whether anybody takes reactivity of the compounds into account?
See Hann et al Strategic Pooling of Compounds for High-Throughput Screening. J. Chem. Inf. Comput. Sci., 1999, 39 (5), pp 897–902. DOI: 10.1021/ci990423o