Jeremy Leipzig › Comments

Jim Hardy
edgeR: a Bioconductor package for differential exp... [Bioinformatics. 2010] - PubMed result - http://www.ncbi.nlm.nih.gov/pubmed...
the vignette on this package indicates there is no clear path from sequence to count table within Bioconductor. You just roll that yourself. Very odd. - Jeremy Leipzig
Deepak Singh
Hadoop World video is out - http://mndoci.com/2009...
Only took, what, 2 months to post this... - Shiran Pasternak
Watching now. BTW, you've convinced me to try "hundreds of slides" approach some day - I wasn't sure it works in more scientific talks. - Pawel Szczesny
Pawel, I wasn't sure either, but I've had a chance to do deeper dives now (at Supercomputing) and it seems to work, especially as I get more comfortable talking science in this format. If I had to choose an alternate one it would be a pure storytelling/figure format, i.e. about some of your core results and work backwards if required. - Deepak Singh
that was a really great talk! - Jeremy Leipzig
Jeremy, thanks - Deepak Singh
Jeremy Leipzig
this employer spends half their time defending their job postings and the other half defending PHP - Jeremy Leipzig
Jason Stajich
6lane PE solexa on new machine, using 8GB RAM, 2hr run time. Got to fix parameters tho getting awful assembly. Next up euler-sr
is there a script that can take assemblies from velvet/abyss/euler/etc.. and generate a generic report for comparison? - Jeremy Leipzig
N50? or something else? - Jason Stajich
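On the N50 suggestion: a minimal sketch of what such a generic comparison report might look like, assuming each assembler's contigs are available as FASTA text. The names `fasta_lengths`, `n50`, and `assembly_report` are hypothetical helpers for illustration, not part of velvet, abyss, or euler.

```python
def fasta_lengths(lines):
    """Yield the length of each sequence in an iterable of FASTA lines."""
    length = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def n50(lengths):
    """Return the N50: the contig length L such that contigs of length >= L
    cover at least half of the total assembly size."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return 0  # empty assembly


def assembly_report(name, lengths):
    """Summarise one assembly as a dict of comparable statistics."""
    lengths = list(lengths)
    return {
        "assembly": name,
        "contigs": len(lengths),
        "total_bp": sum(lengths),
        "longest": max(lengths, default=0),
        "n50": n50(lengths),
    }
```

Calling `assembly_report` once per assembler output and printing the dicts side by side would give a crude comparison table; other statistics could be added to the dict in the same way.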
Deepak Singh
How-To Setup a Linux Server for Ruby on Rails - with Phusion Passenger and GitHub - Hack'd - http://hackd.thrivesmarthq.com/how-to-...
This document serves as a comprehensive how-to to setup and deploy your typical Ruby on Rails application on a blank new *nix (Unix, Linux, etc.) server. Rails deployment is hard, but it shouldn’t be. This how-to assumes you’ll want to use Phusion Passenger to serve your Ruby on Rails application, and that your application’s source code repository is in GitHub. Most of the defaults are targeted towards those using Ubuntu Linux on their server, but the same setup should be able to be applied to ’nix systems. - Deepak Singh
i am wondering if jruby on rails would be easier - Jeremy Leipzig
Jeremy Leipzig
How would one record to a file the maximum memory usage experienced by a process as part of a shell script?
just a vague idea: mixing 'watch' and 'top' and redirect this to a file ? - Pierre Lindenbaum
specifically I would like to add the max velveth RAM consumption to an assembly report - Jeremy Leipzig
looks helpful, thanks - Jeremy Leipzig
The 'time' command is pretty useful; I think the %M format string parameter gives what you want: http://unixhelp.ed.ac.uk/CGI... - Brad Chapman
Nice hack Brad, thanks - Pierre Lindenbaum
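For anyone who would rather collect the number from a script than from the 'time' command, here is a rough sketch using Python's standard `resource` module instead. It is an alternative to Brad's suggestion, not the same technique. The function name `run_and_record_peak_memory` and the report format are assumptions made for illustration; the `resource` module is only available on Unix-like systems.

```python
import resource
import subprocess


def run_and_record_peak_memory(command, report_path):
    """Run `command` (a list of arguments), wait for it to finish, and append
    its peak resident set size to `report_path`.

    Caveats: ru_maxrss is reported in kilobytes on Linux but in bytes on
    macOS, and RUSAGE_CHILDREN covers every child this process has already
    waited for, so the value is the largest peak among them, not
    necessarily the peak of the most recent command.
    """
    completed = subprocess.run(command, check=False)
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    with open(report_path, "a") as report:
        report.write(f"{' '.join(command)}\tmax_rss={peak}\n")
    return completed.returncode, peak
```

Running one command per Python process keeps the RUSAGE_CHILDREN figure attributable to that command alone.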
Duncan Hull
Unskilled and unaware of it: How difficulties in recognizing one's own incompetence lead to inflated self-assessments. - http://www.citeulike.org/user...
Really cool study from 1999! - Björn Brembs
Aye, found it via this http://www.cs.man.ac.uk/~hulld... Wonder how much of it applies to incompetent scientists?! - Duncan Hull
The more interesting question is the reverse argument: does it mean that in areas where we think we are good, we are just too incompetent to know better? :-) - Björn Brembs
one of my favourite papers of all time :) - Neil Saunders
also one of my all-time favorites :) - Lars Juhl Jensen
Duncan, I seem to remember that there was some study showing that 90% of university professors believe that they belong to the better half - Lars Juhl Jensen
People tend to hold overly favorable views of their abilities in many social and intellectual domains. The authors suggest that this overestimation occurs, in part, because people who are unskilled in these domains suffer a dual burden: Not only do these people reach erroneous conclusions and make unfortunate choices, but their incompetence robs them of the metacognitive ability to... more... - Duncan Hull
The question is "with respect to what"? I think these things are better spoken about in terms of standard deviations of clearly specified populations. - Mr. Gunn
who would rank themselves in the 12th percentile of anything? - Jeremy Leipzig
Am I missing out on anything by not having institutional access to this article, beyond the summary provided by Duncan? - Chris Lasher
I deal with a lot of undergrad students, and a big one here is "ability to find useful stuff for my assignments using Google." - John Dupuis
@Chris, a free pdf version is available from http://www.apa.org/journal... - Duncan Hull
Thanks, Duncan, yet again! - Chris Lasher
Pierre Lindenbaum
What is the most useful hack you have ever written ?
i didn't write it: for f in *txt; do mv $f `echo $f | sed -e 's/.txt/.fa/'`;done - Jeremy Leipzig
aptitude install nxclient - Egon Willighagen
a very short perl script called 'lst': #!/usr/bin/perl ($a,$b) = @ARGV; print join(' ',"$a".."$b"), "\n"; - Dan Gezelter
something I use almost every day: a tool I called 'verticalize': transforms a tab-delimited file into a vertical output ( just like the option -B in mysql ) - Pierre Lindenbaum
"push label" to remember the current working directory, and then later "pop label". - Noel O'Boyle
@Noel Do you mean pushd and popd? - Chris Lasher
There was this hack I had written to parse PDB headers once which was a lifesaver. Really badly written hack too - Deepak Singh
@Dan Have you ever heard of the Unix command 'seq'? It does the same as your hack 'lst' ;-) - Lars Juhl Jensen
seq isn't quite the same as lst. Try `seq 01 10` and `lst 01 10`. Also lst can do alpha strings like `lst a f`. I don't think seq does this. - Dan Gezelter
@Dan: seq -w 1 10 | xargs -d "\n" - Chris Lasher
Don't know about useful, but I did a horrible set of hacks so that Bioperl functions were available to a PHP-based website, via the Zend perl.so module. Why I didn't go with perl CGI, I'll never know :) - Neil Saunders
@Chris: It's always like that with me. An hour with a perl book just to save ten seconds with the man page. - Dan Gezelter
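A note on the rename one-liner quoted earlier in this thread: the sed pattern `.txt` has an unescaped dot and is not anchored to the end of the name, so it can match in the middle of a filename, and the unquoted `$f` breaks on names containing spaces. Below is a hedged sketch of the same idea in Python; `rename_suffix` and its parameters are hypothetical names, not an existing tool.

```python
from pathlib import Path


def rename_suffix(directory, old=".txt", new=".fa", dry_run=False):
    """Rename every file in `directory` whose final suffix is `old` so that it
    ends in `new` instead. Returns a list of (old_path, new_path) pairs.

    Only the final suffix is changed, so 'my.txt.notes.txt' becomes
    'my.txt.notes.fa'.
    """
    renamed = []
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and path.suffix == old:
            target = path.with_suffix(new)
            if target.exists():
                raise FileExistsError(f"refusing to overwrite {target}")
            if not dry_run:
                path.rename(target)
            renamed.append((path, target))
    return renamed
```

Passing `dry_run=True` returns the planned renames without touching any files, which is a cheap way to check the result first.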
Duncan Hull
Educating biologists in the 21st century: bioinformatics scientists versus bioinformatics technicians. - http://www.citeulike.org/user...
i think a biologist can be taught that BLAST performs heuristic alignments without going through an entire algorithms course - Jeremy Leipzig
Neil Saunders
it was another day of configuring and tweaking hardware; seriously considering mac as next purchase now
just don't say anything to Mr Gunn - Paulo Nuin
lol, you take your chances! To what degree did you tweak just because you could, though? - Mr. Gunn
Go for it :) - Deepak Singh
life is too short for a scientific programmer to do much sysadmin stuff - Jeremy Leipzig
Agree with Jeremy - Rajarshi Guha
Jeremy Leipzig
How exactly would you go about doing this: take two solexa-sequenced fungal transcriptomes, one wt and one mutant, and find SNPs between them?
btw no genomic reference sequence available - Jeremy Leipzig
Berci Mesko, MD
Analysts Say deCODE Genetics Headed for Bankruptcy Court - http://www.eyeondna.com/2008...
Like as in strong dislike if it turns out to be true. We need DeCode, we really do... - Nils Reinton
Iceland is for sale. I was thinking of buying the whole Sugar Cubes, but they are not around anymore. Deus does not exist. - Paulo Nuin
Can Björk bail out Iceland? Seriously,though, that sucks. They were one of the good ones. - Mr. Gunn
Not liking the story though - Deepak Singh
I was interviewed by one of the subsidiaries of deCODE in Seattle for a position a couple of years ago. - Paulo Nuin
it will be interesting to see who buys the deCODE intellectual property - Jeremy Leipzig
Pierre Lindenbaum
@Chris Lasher : "89 databases: 51 reported that they are struggling financially. Seven of these have closed; the rest are being updated sporadically in their owners' spare time." - http://www.nature.com/nature...
jeter.jpg
One of the DBs cited in the article, BIND, was operating next door to the lab I was working in 2005. They closed shop and the data was sold to Thomson Scientific. - Paulo Nuin
I remember when BIND went commercial... all sorts of web services suddenly broke (changed URLs) and were partly inaccessible. It's sad that we have free web email but no money for persistent hosting of these resources. Maybe we should assemble a bioinformaticians-sans-frontier to rescue these services and port them to the cloud (Google App Engine, Amazon and other such services) - Hari
@Hari Wouldn't it be nice? - Chris Lasher
How'd you pay for cloud resources? - Rajarshi Guha
It'd be good to repeat this kind of survey now (that data is from 2005). Maybe a good Biogang project? Get all the databases from six months of Bioinformatics (or pick a NAR databases issue) from, say, three years ago and see how they're doing? - Euan
@Euan Wasn't it done last year ? I don't remember who's done the job. - Pierre Lindenbaum
we started just to check if they were online and if they worked but we never finished. - Pedro Beltrao
NIH has an active PA for "Continued Development and Maintenance of Software" (http://grants.nih.gov/grants...) which even dates back to 2002. Wonder how many grants have been funded off of it... - Andrew Su
Great data point... Agreed with Euan that this kind of survey needs to be repeated. Ideally, one would re-contact the databases contacted in 2005 in order to update the data, and then also contact databases mentioned in a new collection of issues. - Hilary
One could create a shared spreadsheet on google-doc, with each database and its current status. - Pierre Lindenbaum
It's almost worse if a one-off database stays up abandoned with old data. I bear some major guilt here - the Alternative Splicing Gallery (my thesis project) is still running with ESTs from 2003 and consistently gets 130 unique visitors a month. There is simply no mechanism (guilt doesn't count) to maintain a program that is created for the purpose of getting a paper published. - Jeremy Leipzig
@Pierre @Hilary Something like Michael Barton's survey? http://tinyurl.com/6gs6pq - Chris Lasher
I guess more like http://bioinformatics.oxfordjournals.org/cgi..., 404 not found: the stability and persistence of URLs published in MEDLINE - Paulo Nuin
Chris Lasher
Duncan Hull | Semantic Matching of Bioinformatic Web Services - http://www.cs.man.ac.uk/~hulld...
Duncan Hull's thesis. - Chris Lasher
Duncan, is your thesis available somewhere as a pdf document ? I've the feeling it contains many things I'd like to learn about taverna, biomoby etc.... - Pierre Lindenbaum
Likewise, but at the bottom of the page Duncan says he'll publish it soon; he just has to make final corrections. - Chris Lasher
I'm also very interested in giving this a read. Please let us know when it's available :-) - Ricardo Vidal
*blushes* I can email you a copy - Duncan Hull
@Duncan , that would be great: plindenbaum yahoo fr - Pierre Lindenbaum
@Duncan chris DOT lasher <AT> gmail TOD com - Chris Lasher
wouldn't it be great if we created an email list? - Paulo Nuin
Excellent idea, Paulo. Any ideas (interested) FF'ers as to how best to take this forward? - Graham Steel
It is possible to create a simple mailing list and have full access by everyone of everyone's info. - Paulo Nuin
I've set up a Mailman server before. I don't know if that does what you want. It sounds more like you're looking for an email registry. FWIW most of us have our email listed somewhere on one of the services published through FriendFeed. - Chris Lasher
yes, we do, but if we need access to someone else's email we don't have it at hand. I have a mailing list in my ISP's server, but that's not handy. - Paulo Nuin
Google groups? - Rajarshi Guha
Nice one, Rajarshi, didn't know about Google groups http://groups.google.com/ - Graham Steel
Rajarshi may be on to something. Groups provides mailing list support, as well as support for wiki-like pages. We could choose to publish our emails on that. Still, my Spidey Sense says we're re-inventing the wheel. - Chris Lasher
Most of us are connected through LinkedIn. The e-mails are available to direct contacts. - Pierre Lindenbaum
maybe a Ning network would solve this problem - Jeremy Leipzig
Thomas Lemberger
What is better? Comma-separated values (csv) or tab-delimited text? Which one is easier/more robust to import anywhere?
tab-delimited is easier because your fields may already contain some commas - Pierre Lindenbaum
Shouldn't make a difference if you're using the appropriate libs. The same problem Pierre mentioned with commas can occur with tabs - Rajarshi Guha
this is where explicit semantics can be useful :) - Egon Willighagen
No difference at all - it just depends on the quirks of whatever you use to read the file. Spreadsheets have "merge delimiters" and might require quotes around fields, databases likewise, awk has the -F switch, etc. etc. I prefer tabs since they (usually) make the plain text file more readable. - Neil Saunders
can I say neither? Just use something like XML ;) - Allyson Lister
@Allyson absolutely - Duncan Hull
@Allyson: I was anticipating this one a little...But I was thinking of something 'wet biologist'-friendly. Also Google Doc chokes on XML... - Thomas Lemberger
XML is great - where it's available and where it's appropriate for the job. "Wet biologists" love files that they can open in spreadsheets. Delimited files are also good for database import and simple shell script munging using grep, awk, sed, sort etc. - Neil Saunders
(Thanks Duncan!) Yes, I also agree with Thomas and Neil about the right tool for the job. I'm actually involved both in FuGE (http://fuge.sf.net) and ISA-TAB (http://isatab.sf.net). FuGE is a fantastic UML/XSD/(and more) object model for experimental metadata structure & syntax. However, biologists love excel, and that's one of the reasons I'm also involved in the partnered ISA-TAB project, which builds on MAGE-TAB and also leverages FuGE. They are virtually interchangeable - so best of both worlds! - Allyson Lister
a table should stay a table (maybe with a header), not xml, otherwise you are just duplicating the header as xml tags for every single row - Jeremy Leipzig
I agree with Jeremy, if you have a fixed number of well-defined cols, why add more verbiage? Though if you do want to add it, VOTables is an XML format for tabular data (used in the astro community) - Rajarshi Guha
Do *.csv files open up in Excel without the whole 'choose which delimiter is being used' dialog thing? That'd be a minor edge over tab delimited text. - Euan
Don't use XML if you have something well-structured. XML is powerful when things are *not* well-structured. No XML for a vector of floats, do XML for web pages. No XML for protein sequence, do XML for web/cloud services. - Egon Willighagen
Yes, Euan, they do. - Mr. Gunn
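To illustrate Rajarshi's point about using the appropriate libs: a small sketch with Python's standard `csv` module, which quotes any field that contains the delimiter, so embedded commas or tabs survive a round trip under either format. The helper names `write_table` and `read_table` and the sample rows are made up for this example.

```python
import csv
import io


def write_table(rows, delimiter):
    """Serialise rows with the csv module. Fields containing the delimiter,
    a quote character, or a line break are quoted automatically
    (QUOTE_MINIMAL is the writer's default)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


def read_table(text, delimiter):
    """Parse text produced by write_table back into a list of rows."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


rows = [
    ["gene", "description"],
    ["abc1", "kinase, putative"],   # field with an embedded comma
    ["xyz2", "contains\ta tab"],    # field with an embedded tab
]

for delimiter in (",", "\t"):
    text = write_table(rows, delimiter)
    assert read_table(text, delimiter) == rows  # round-trips either way
```

The choice of delimiter only starts to matter when the file is read by something that splits on the delimiter naively (for example a plain `cut` or `awk -F`), which is where unquoted tab-delimited text tends to be the more forgiving option.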
Paulo Nuin
Programming in R makes serious statistics serious fun - http://www.reddit.com/r...
I don't normally expect "fun" and "R" in the same sentence, but appreciate their enthusiasm :) - Neil Saunders
Not fun? apply and friends make me jump with joy :) - Rajarshi Guha
For a few months I worked almost exclusively in R and I would often hear the following from our statistician: "I don't know if that's a feature or a bug" - Jeremy Leipzig
R looks strange to me. But that's because I've only 'looked' at it so far - I like stats, and so I'm gonna learn R :D - Yuvi
Pawel Szczesny
It is Computation Time for Bacteriology! - http://www.ncbi.nlm.nih.gov/entrez...
Tremendously naive, poorly-written and patronising to almost everyone, but the good news: J. Bact. are launching a computational biology section. - Neil Saunders
i think the editor's fifth-grade son wrote some of this, maybe drunk - Jeremy Leipzig
JBac has a long history of... controversial papers :) (so far my favourite is a paper showing that if you kill the ATPase complex, bacteria stop growing, from 2005 or so). Proxy server is down, so I cannot read that now, but the abstract sounds... intriguing. :) Jeremy, how old is fifth-grade in the US? - Pawel Szczesny
"In order to support this statement with some data, I have looked up 100 papers in the area of molecular and cellular biology published in 2008 in Journal Science and found that only 3 papers out of 100 were purely computational." Even a non-English speaker could spot the redundancies in that sentence. This is truly cringe-inducing. - Jeremy Leipzig
I sent this email to my former PI: I have stumbled onto an accepted manuscript that is so poorly written I fear for the entire scientific community. http://www.ncbi.nlm.nih.gov/pubmed... "It is Computation Time for Bacteriology!" by Igor B. Zhulin is intended to announce a new Computation Biology section to the Journal of Bacteriology. However, even in that limited... more... - Jeremy Leipzig
Bora Zivkovic
Love the Google logo today ;-)
electionday2008.gif
why the hell in Luxembourg I cannot see it??! :-) I can't vote, I know ;) - Luca Conti
I wish people in Luxembourg and all other countries could vote in the US elections today as well ;-) - Bora Zivkovic
Screw that - the USA stands tall against Luxembourg PM Jean-Claude Juncker and his sensible economic policies - Jeremy Leipzig
Will Google have a new logo tomorrow, something about the New Day in America? - Bora Zivkovic
Neil Saunders
there are people who use bioinformatics in their research and there are people who do bioinformatics; I think they're different people
You're on a roll with quotable quotes! - Chris Lasher
say I am a geeky biologist working with developers (bioinformaticians) on a software giving biological inputs, design ideas & testing, but not writing code, isn't it a 3rd category? - Attila Csordas
No, that would be the first category :) To rephrase: academic institutions want bioinformatics, but they don't want to fund bioinformaticians, they want to fund researchers who "do bioinformatics". Discuss. - Neil Saunders
Depends on the dept - I'd figure that CS groups would definitely want people who do bioinformatics (as in algo design and implementation). - Rajarshi Guha
Isn't that a function of your background? I guess a Computer Scientist would 'do' bioinformatics while a Biologist would 'use' bioinformatics - Yann Abraham
I don't think it's a function of the background. I am a biologist and I "do" bioinformatics. Is using a software bioinformatics at all? - Paulo Nuin
I think a better phrasing might be: bioinformatics user and bioinformatics researcher. The former will be a subset of the latter - Rajarshi Guha
It's more complicated than that. The CS types are the ones who develop algorithms. Then you have software types (who may or may not be computer scientists) who implement these algorithms. IMO a highly underrated and underappreciated group. Then you have users. If you agree that bioinformatics is the application of informatics to biological problems, then in a sense almost everyone will have to use bioinformatics methods/software at some point. Contd - Deepak Singh
However there are those who spend their time thinking about the methods which need to be applied, how they will be applied, and often applying them. That's a very complex task given the complexity of our data today, and again this is an underutilized/underappreciated role in academia (in industry, I would argue they are well used in many places) - Deepak Singh
neil it does seem most bioinformatics research professors need to link up with an established biologist with a big nih grant or a major informatics portal - that has left engineering schools out of the mix - Jeremy Leipzig
@Jeremy, this makes a lot of sense - unless the bioinformatics research profs have labs of their own. For a field like bioinformatics, theory with no application is not very helpful - Rajarshi Guha
i don't think there are enough grants for tool builders - Jeremy Leipzig
Jeremy nailed what I was thinking originally :) - Neil Saunders
getting back to the original comment I think this is pretty true for all techniques e.g. replace bioinformatics with 'crystallography' - Cameron Neylon
Pierre Lindenbaum
wondering if the JSF technology can be used to build a RDF tree
Sure, why not? - Egon Willighagen
run away from JSF as fast as you can - Jeremy Leipzig
I wish people were as strong with such statements with science too... - Egon Willighagen
@Egon: why not? Don't know... would it be the best solution? Maybe I'm wrong, but I see JSF as a GUI interface and I guess it is possible to extend/create some components. - Pierre Lindenbaum
@Jeremy: Yes, I heard many bad things about JSF. People seem to choose Struts or Spring, but I found both frameworks not so easy to learn and my tests with JSF worked fine so... - Pierre Lindenbaum
JSF is one of those frameworks that works well until you need to do something novel, then you are in deep woods. It tries to obfuscate JavaScript to the point it is difficult to even do things like "open new windows" or "use bookmarks" (no really). Kind of like welding the hood of your car shut. I've heard good things about Seam, maybe that makes JSF more bearable. - Jeremy Leipzig
At this time I don't want to play with JavaScript. At first glance, I would use GWT for this. - Pierre Lindenbaum
Ntino
Why should you compose workflows instead of scripting pipelines… - http://semanticlifescience.wordpress.com/2008...
i'm not sure what he means by "consumes resources" - does he mean humans or bandwidth? - Jeremy Leipzig
Bora Zivkovic
NO way! I thought you were going to say "April Fools"! eek. - Christina Pikas
will see - Paulo Nuin
As per Heather's link, see Suber's comments:- http://www.earlham.edu/~peters... - Graham Steel
I cannot access Peter Suber's blog for some reason - can someone copy the text here or, if too long, e-mail me, please? - Bora Zivkovic
check your inbox Bora - Graham Steel
Thank you! - Bora Zivkovic
i thought it was very strange that they could even buy PubMed Central. d'oh! - Jeremy Leipzig
Jan Aerts
What program/scripting language would you use to generate simple deBruijn graphs as at http://picasaweb.google.com/jan...? heavily tweaked graphviz?
If it's just that simple, I'd use Inkscape. - Pawel Szczesny
If you're on OS X, OmniGraffle would do it very nicely - Rajarshi Guha
Trouble is it'd have to be scriptable: read an input file and generate the picture automatically... Sorry; was not clear. - Jan Aerts
i've seen LEDA GraphML used for this - methinks that will generate LaTeX docs, so no dynamic web stuff - Jeremy Leipzig
Maybe processing might do the trick. Would make it interactive as well. Have to read up on it. - Jan Aerts
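On the scriptable Graphviz route raised in the question: a minimal sketch that reads sequences and emits DOT text for a de Bruijn graph, assuming the reads are already available as plain strings. The function names `de_bruijn_edges` and `to_dot` and the styling attributes are illustrative choices, not anything taken from the linked picture.

```python
from collections import Counter


def de_bruijn_edges(reads, k):
    """Count edges of the de Bruijn graph: each k-mer in a read contributes
    one edge from its (k-1)-base prefix to its (k-1)-base suffix."""
    edges = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            edges[(kmer[:-1], kmer[1:])] += 1
    return edges


def to_dot(edges, name="debruijn"):
    """Render the edge counts as Graphviz DOT text, labelling each edge with
    the number of times its k-mer was seen."""
    lines = [
        f"digraph {name} {{",
        "  rankdir=LR;",
        "  node [shape=box, fontname=monospace];",
    ]
    for (source, target), count in sorted(edges.items()):
        lines.append(f'  "{source}" -> "{target}" [label="{count}"];')
    lines.append("}")
    return "\n".join(lines) + "\n"
```

Writing the result of `to_dot(de_bruijn_edges(reads, k))` to a file such as `debruijn.dot` and running `dot -Tpng debruijn.dot -o debruijn.png` should then produce the picture automatically; any heavier tweaking would go into the node and edge attributes.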