An efficient method for the detection and elimination of systematic error in high-throughput screening -- Makarenkov et al. 23 (13): 1648 -- Bioinformatics - http://bioinformatics.oxfordjournals.org/cgi...
CAUTION: Popular “Benchmark” Data Sets Do Not Distinguish the Merits of 3D QSAR Methods - Journal of Chemical Information and Modeling (ACS Publications) - http://pubs.acs.org/doi...
Funny they mention a 31-sized steroid set... in my PhD I studied a 42-sized steroid set, which was *way* *too* *small* to make *any* statistical comparison between models, let alone prove any effect of including 3D information... I wonder how they managed to do that with only 31 molecules!
- Egon Willighagen
One should probably ask the same question of all the people who have used the steroid data in the past.
- Rajarshi Guha
Rajarshi, nice summary. It sounds like you are dealing with a lot of quantitative signal screens. My experience is dealing with researchers on scoring based RNAi screens. These are more painful for them to do and the scoring is more subjective, so there is a strong tendency to err towards including more false positives. Elimination often occurs in secondary or tertiary screens.
- Brad Chapman
Also, I like STRING a lot, but as you mention it does suffer from incompleteness. It is good at giving back expected known positives, which gives confidence, but when you start doing sanity checks on learning methods with and without it, you'll find likely positives that it misses because they just haven't been studied. Just a few thoughts from my limited experience; looking forward to reading more as you get into things.
- Brad Chapman
Brad, thanks. Yes, right now the screens are based on a fluorescent reporter. Phenotypic screens are in the near future - is that what you mean by 'scoring based RNAi screens'? I can see how they'd be subjective - but it looks like there's lots of scope to try and develop some numerical methods to help out. WRT STRING and other PPI db's - that's been exactly my worry. It's useful in our pilots,...
- Rajarshi Guha
Rajarshi -- yes, that's exactly right: phenotypic screens with ranked categories, like 0-4, with 4 having the most prominent and 0 having none of the phenotype of interest. Definitely would love to hear how you approach these when you come to them. In my experience, scientists preferred easier-to-interpret mean/median style cutoffs. It's generally no problem to have a few extras in follow-up screens -- what is one extra plate when you've already screened hundreds? Best of luck with the research.
- Brad Chapman
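For readers unfamiliar with the "mean/median style cutoffs" mentioned above, here is a minimal sketch of one common variant: calling a well a hit when its score lies more than k robust standard deviations above the plate median. The function name `median_mad_hits`, the default `k=3.0`, and the use of the median absolute deviation (MAD) are assumptions for illustration, not something described in this thread.

```python
from statistics import median


def median_mad_hits(scores, k=3.0):
    """Return indices of wells scoring more than k robust SDs above the plate median.

    Hypothetical helper for illustration; the threshold k is an assumed default.
    """
    med = median(scores)
    # Median absolute deviation: a spread estimate that outliers barely affect.
    mad = median(abs(s - med) for s in scores)
    # 1.4826 rescales the MAD so it is comparable to a standard deviation
    # when the bulk of the data is roughly normal.
    cutoff = med + k * 1.4826 * mad
    return [i for i, s in enumerate(scores) if s > cutoff]


# One plate of made-up scores with a single obvious outlier at index 6.
plate = [1.0, 1.1, 0.9, 1.0, 1.2, 0.8, 5.0, 1.0]
hits = median_mad_hits(plate)
```

Lowering `k` admits more false positives into follow-up screens, which matches the trade-off described above. Note that with coarse ranked scores (0-4) the MAD is often zero, in which case this sketch simply flags every well above the median.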
Nice work. I wonder how many talks I will have to sit through where I hear people express surprise that two compounds are structurally very similar but display very different activities. I can now use "QSAR Cliff" in my irritated question. Thanks!
- Matthew Todd