As most of you know, Teddy has retired from
active blogging, which is unfortunate not just for the loss of his wit but also
for the loss of his expertise – particularly regarding NMR. But you blog with
the army you have, not the army you want, so I'll take a stab at two recent
papers on the subject.
The first, published in J. Med. Chem. by Chen Peng and
colleagues at software maker Mestrelab in collaboration with Andreas Lingel and
colleagues at Novartis, describes an automated processing program for just
about any type of ligand-observed NMR data. After going into some detail on how
“Mnova Screen” works, the authors benchmarked the program on three experimental data sets
(on undisclosed proteins) that had previously been processed manually. The
first was 19F data from a collection of 19 mixtures of up to 30
fluorinated compounds each – 551 altogether. Here the program performed quite
well, identifying 56 of the 64 hits identified manually and misidentifying only
4 compounds as hits. Most of the false positives and false negatives were close
to the predetermined cutoff threshold, which can be set as stringent or lax as
desired.
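For readers who want to see how such a cutoff works in practice, here is a minimal Python sketch of threshold-based hit calling compared against a manual hit list. It is purely illustrative and not the actual Mnova Screen algorithm; the compound IDs, response values, and threshold are made up.

# Illustrative sketch only - not the actual Mnova Screen algorithm.
# Call hits by comparing a per-compound response (e.g. fractional 19F
# signal attenuation in the presence of protein) against a cutoff.

def call_hits(responses, threshold=0.2):
    """Return the set of compound IDs whose response exceeds the cutoff.
    Lowering the threshold is more lax (more hits); raising it is more stringent."""
    return {cid for cid, change in responses.items() if change >= threshold}

def compare_to_manual(auto_hits, manual_hits):
    """Summarize agreement between automated and manual hit lists."""
    return {
        "agreed":          len(auto_hits & manual_hits),
        "false_negatives": len(manual_hits - auto_hits),  # missed by the program
        "false_positives": len(auto_hits - manual_hits),  # extra automated calls
    }

# Made-up example:
# responses = {"F001": 0.45, "F002": 0.05, "F003": 0.19}
# auto = call_hits(responses, threshold=0.2)
# compare_to_manual(auto, manual_hits={"F001", "F003"})
# -> {'agreed': 1, 'false_negatives': 1, 'false_positives': 0}

As in the 19F benchmark above, compounds whose responses sit near the threshold are exactly the ones most likely to flip between hit and non-hit when the cutoff is moved.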
T1ρ and STD NMR experiments on 55
individual protein-compound complexes were also examined, and the results were
similarly positive. Of course, single-compound experiments are easy to analyze,
and the real test was with a set of 1240 compounds in 174 pools. Here the
program was not quite as good, missing 16 of the 50 manually identified hits
and coming up with 74 hits that had not been identified manually. Although most
of these were false positives, closer inspection revealed that 10 of them were
probably real. Moreover, some of the “false negatives” should perhaps not have been
classified as hits in the first place. Clearly the program isn’t perfect, but
it does seem to be a quick way to triage large amounts of data.
Of course, ligand-detected NMR provides at
best only limited information on binding modes, which is where the second paper
comes in, published in J. Biomol. NMR
by Mehdi Mobli (University of Queensland), Martin Scanlon (Monash University)
and collaborators at Bruker and La Trobe University. The researchers were
interested in finding inhibitors of the bacterial protein DsbA, and a previous screen had identified a weak fragment that initially proved recalcitrant to
crystallography.
One of the best methods to determine the
binding mode of a ligand is to look at intermolecular NOEs, NMR signals which
only show up when two atoms are in close proximity to one another. In theory
you can look at NOEs from ligands to the backbone amide protons in proteins,
but this is technically challenging for aromatic ligands, of which there are
many. A more practical target is protein methyl groups, whose signals fall in a
less crowded region of the spectrum. Proteins have plenty of methyl groups – so
many in fact that it can be difficult to correctly assign each methyl group to a
specific residue, leading some researchers to focus only on isoleucine, leucine,
and valine (ILV).
However, by carefully studying more than 5000 high-quality protein-ligand
complexes, the researchers found that looking at all the methyl groups in a
protein (i.e., including those found in alanine, threonine, and methionine)
greatly increases the number of protein-ligand complexes suitable for analysis.
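As a rough illustration of the kind of survey involved (this is a sketch of the general idea, not the authors' analysis pipeline), the snippet below counts how many methyl groups lie within a typical NOE-observable distance of a bound ligand in a PDB file, once for ILV alone and once for all methyl-bearing residues. The file name, ligand residue name, and 5 Å cutoff are assumptions, and methyl carbon positions stand in for the methyl protons.

# Rough sketch: compare ILV-only versus all-methyl coverage of a ligand site.
import math

METHYL_CARBONS = {  # methyl carbon atom names per residue type
    "ILE": {"CG2", "CD1"}, "LEU": {"CD1", "CD2"}, "VAL": {"CG1", "CG2"},
    "ALA": {"CB"}, "THR": {"CG2"}, "MET": {"CE"},
}
ILV = {"ILE", "LEU", "VAL"}

def parse_atoms(path):
    """Yield (record, residue_name, atom_name, xyz) from ATOM/HETATM lines."""
    with open(path) as fh:
        for line in fh:
            if line.startswith(("ATOM", "HETATM")):
                yield (line[:6].strip(), line[17:20].strip(), line[12:16].strip(),
                       (float(line[30:38]), float(line[38:46]), float(line[46:54])))

def methyl_contacts(path, ligand_resname, cutoff=5.0):
    """Count methyl carbons within `cutoff` angstroms of any ligand atom."""
    atoms = list(parse_atoms(path))
    ligand = [xyz for rec, res, name, xyz in atoms
              if rec == "HETATM" and res == ligand_resname]
    counts = {"ILV_only": 0, "all_methyls": 0}
    for rec, res, name, xyz in atoms:
        if rec == "ATOM" and name in METHYL_CARBONS.get(res, ()):
            if any(math.dist(xyz, lig) <= cutoff for lig in ligand):
                counts["all_methyls"] += 1
                if res in ILV:
                    counts["ILV_only"] += 1
    return counts

# e.g. methyl_contacts("complex.pdb", "LIG")
# might return {'ILV_only': 7, 'all_methyls': 11}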
The researchers were able to assign most of
the methyl groups in DsbA using several approaches, and this allowed them to identify
11 NOEs between their ligand and ILV methyl groups. Modeling was unable to
provide a unique binding mode, but including 8 more NOEs to threonine and
methionine methyl groups yielded a single binding mode for the ligand. Crystallography
came through in the end too and confirmed the NMR-derived model.
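For a flavor of how the extra NOEs help discriminate between candidate poses (a generic sketch, not the authors' modeling protocol), one can simply count how many observed ligand-to-methyl NOEs each model violates; the 5.5 Å upper bound and data structures below are illustrative assumptions.

# Generic illustration of NOE-guided pose selection; not the authors' protocol.

def restraint_violations(pose_distances, noes, upper_bound=5.5):
    """Count observed NOEs whose distance in a candidate pose exceeds the
    assumed upper bound (in angstroms).

    pose_distances: dict mapping (ligand_atom, protein_methyl) -> distance
                    measured in one candidate binding mode.
    noes: pairs (ligand_atom, protein_methyl) for which an intermolecular
          NOE was observed."""
    return sum(1 for pair in noes
               if pose_distances.get(pair, float("inf")) > upper_bound)

def best_pose(poses, noes):
    """Return the name of the pose with the fewest violated NOEs.
    poses: dict mapping pose name -> pose_distances (as above)."""
    return min(poses, key=lambda name: restraint_violations(poses[name], noes))

With only a handful of restraints several poses can tie on this kind of score; adding more NOEs, as the researchers did with the threonine and methionine methyls, is what breaks the degeneracy.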
Teddy would normally end his NMR posts by
stating – often forcefully – whether he thought the tools under discussion were
practical or not. NMR is one of the most popular fragment-finding methods out there, so new tools are clearly welcome. Since I'm no expert on the subject, I'll ask readers to weigh in –
what do you think?
We've been using MNova Screen for around the past 12 months. Overall I think it's an excellent package - with one or two things that could be improved a little (e.g. baseline corrections) - but this is something that the developers are aware of and I understand that they are working on it. There's a bit of effort required to get things into the correct format for running the analysis, but we've found it to be worthwhile.
One of the things that I regard as a real strength of the package is that it provides a framework for analysis which is less dependent on the individual. This is something that we tested with a cohort of undergraduate students. We gave 9 students a set of data from STD screening of 96 cocktails of 5 fragments each. These were 3rd year chemistry undergraduates - so familiar with 1D-NMR analysis but they had never seen any ligand-detected NMR data previously. Across this cohort there was ~80% agreement between the fragments that were called as hits, which was not the case when they performed the same analysis manually.
Automatic hit identification is quite hard to achieve, as can also be seen from the paper published in J. Med. Chem. It is crucial to avoid false negatives completely, because hits are rare, and if the algorithm misses several then I have to check all the negatives manually anyway to search for missed hits. As for 19F, I can analyse my library in less than an hour: 900 fragments in 30 cocktails, 30 spectra. Moreover, automatic hit identification fails even more often if spectra are not perfect, which is often the case (broad peaks, etc.).
The most important feature in analysis software is the data bookkeeping and setup for viewing NMR screening data. Ideally this is implemented within the processing and analysis software, as it is now in Topspin. I had the chance to beta-test the new Topspin-based Fragment screening tool and it is quite easy to set up and analyze data, with a really nice reporting function. Best of all, it is free of charge.