Eric Jain
delicious
Eric Jain bookmarked a page on delicious
Monday at 9:18 pm - Link
Federal recreation, camping and tour reservation information. - Eric Jain
FriendFeed
Sunday at 8:30 pm - via Bookmarklet - Link
Haha! This wins my award for favorite article title of the year. Particularly apropos to Deepak's talk, on multiple levels. - Chris Lasher via Bookmarklet
Curious: Has anyone used BioMoby (especially the service discovery part) for real (TM) applications? - Eric Jain
Same as Eric: I was recently asked "Who uses BioMoby?" - Pierre
There are active communities of users in canada, spain, germany and the UK that I personally know of. Mark Wilkinson could provide you with more specific examples. See also http://pubmed.gov/17237074 and http://pubmed.gov/17496321 - Duncan Hull
Interestingly, neither of the two papers talks about discovering (or even reusing) existing BioMoby services. Are there any stats on the amount of reuse (e.g. how many services are used as part of how many workflows from different organizations)? - Eric Jain
FriendFeed
“It's question time again: Is there any scenario where using the arithmetic mean has an advantage over using the median ?”
Monday at 2:39 am - Link
at least 2: calculation is more transparent, and you don't have to explain what the median is ;-) More seriously, if you assume your data is normally distributed without outliers, then there is no reason to use the median. And if you return the median instead of the mean, then you should not use the standard deviation to describe the spread of the distribution but the MAD or the IQR. Hope this helps! - Yann Abraham
Yann: Thanks. I know that many things build upon the arithmetic mean, so I didn't want to completely question its use. I was just curious if there was any other reason (apart from the arithmetic mean being more precise in the absence of outliers in normally distributed data). - Daniel Jurczak
I believe that the mean better represents the distribution of data in large populations/sets of data, while the median is more representative in small data sets. But I am not a statistician. Go ask one. I bet their answer is a lot longer and more philosophical. - Jim Hardy
Daniel: the arithmetic mean is not necessarily more precise, but the statistical tests that build on it are, so it is all a question of assumptions - if you assume no outliers, use the mean and powerful stats; if not, use the median and non-parametric tests. I guess it all boils down to being consistent; it makes little sense IMHO to mix things. Again, hope this helps! - Yann Abraham
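To make the "be consistent" point concrete, here is a minimal Python sketch using only the standard library. It pairs the mean with the standard deviation and the median with the MAD, then adds one outlier. The sample values and the helper names `mad` and `summarize` are made up for illustration and do not come from the thread.

```python
import statistics


def mad(values):
    """Median absolute deviation: the median of |x - median(values)|."""
    center = statistics.median(values)
    return statistics.median([abs(x - center) for x in values])


def summarize(values):
    """Report both pairings: mean with stdev, median with MAD."""
    return {
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
        "median": statistics.median(values),
        "mad": mad(values),
    }


# Illustrative measurements (assumed values), then the same set plus one outlier.
clean = [9.8, 10.1, 10.0, 9.9, 10.2]
with_outlier = clean + [100.0]

clean_summary = summarize(clean)
outlier_summary = summarize(with_outlier)

for label, summary in (("clean", clean_summary), ("with outlier", outlier_summary)):
    print(label, {name: round(value, 3) for name, value in summary.items()})
```

With the outlier added, the mean and standard deviation shift a lot while the median and MAD barely move, which is the trade-off described above.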
Yep, thanks Yann :) - Daniel Jurczak
Just want to add some stupid ideas: it always comes down to what your dataset looks like. There is no point in calculating statistics when the data has no meaning, is heterogeneous or at least some form where you don't know what to do with it. Personally, I have never liked the median, since it feels so arbitrary and completely independent of the data you have at your hands. - dekay
…oh, and to "answer" your original question: Imagine a series of coin flips. Median or Mean? - dekay
Here's another (purely practical) reason: You want to compute the value at the database level, but (like most relational databases) your database doesn't have a median aggregate function... - Eric Jain
@eric jain nice one ;-) @dekay your coin flip example relates to the binomial distribution, doesn't it? So it approximates a normal distribution after 5 to 10 flips, and then the mean should be more appropriate than the median ;-) Thank you Wikipedia http://tinyurl.com/fwn4j ! - Yann Abraham
Nice Eric. - Daniel Jurczak
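The database point can be sketched with SQLite, assuming a build that only offers AVG and has no median aggregate. The query below is one common workaround (sort, then pick the middle row or the two middle rows), not the only one. The table name `measurements`, the column `value`, and the helper `sql_median` are hypothetical.

```python
import sqlite3


def sql_median(conn, table, column):
    """Median via ORDER BY with LIMIT/OFFSET; averages the two middle rows
    when the row count is even. Table and column names are interpolated
    directly, so they must come from trusted code, not from user input."""
    query = f"""
        SELECT AVG({column}) FROM (
            SELECT {column} FROM {table}
            WHERE {column} IS NOT NULL
            ORDER BY {column}
            LIMIT 2 - (SELECT COUNT({column}) FROM {table}) % 2
            OFFSET (SELECT (COUNT({column}) - 1) / 2 FROM {table})
        )
    """
    return conn.execute(query).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (value REAL)")
conn.executemany(
    "INSERT INTO measurements (value) VALUES (?)",
    [(v,) for v in (1, 3, 2, 100, 4)],
)

odd_median = sql_median(conn, "measurements", "value")
sql_mean = conn.execute("SELECT AVG(value) FROM measurements").fetchone()[0]

# Add a sixth row so the even-count branch is exercised as well.
conn.execute("INSERT INTO measurements (value) VALUES (5)")
even_median = sql_median(conn, "measurements", "value")

print(odd_median, sql_mean, even_median)
```

The mean is a single built-in aggregate call, while the median needs a sort plus two counting subqueries, which is the practical cost mentioned above.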
FriendFeed
November 13 at 6:35 pm - Link
There aren't enough expletives in the English language to express how deeply I agree with this. - Paul Davis
True. But I'd add: 1b. Reuse an existing format (even if it was meant for something entirely different) 4b. Overspecify (to ensure the format can't be used for anything else) 5b. Make it generic (because you're so smart and to ensure it's a pain to use for the intended purpose) 7b. Everyone has dedicated programmers to create tools to hide the complexity of the format from users. - Eric Jain
The format used by the statistical tools performing genetic association analyses has always been my problem. All those tools use a bunch of variations (HARG!) around a format describing the markers, the individuals and the genotypes. This information is split across various files (ARGHH!! again), sometimes lost (alleles are converted into numeric values) and is missing many fields (markers rs##, assembly version etc...). Too late: too many tools use this format, it would be hard to change this... - Pierre
Check out Law's laws (http://bioinformatics.roslin.a...): the classic rant about this. - Jan Aerts
Can I 'like' this twice? - Simon Cockell
FriendFeed
Sunday at 6:33 am - via Reshare - Link
I do that all the time. I can highly recommend it. Add some tagging (or branches) for versions you send around for review, and you're all set. Beats version tracking in Word. - Egon Willighagen
I have always preferred a service like Google Docs, at least for the rough draft stages of a paper that is. Cleaner and faster to get started, especially for people who are new to those things. - Daniel Jurczak
I'll second Egon - though it really shines if you're working in LaTeX. I've never tried to merge two Word docs via subversion. Would it work? - Rajarshi Guha
We are using Google Docs to write a grant, but people who are using it are a little bit skeptical to say the least. It's a slow progress. I never tried to merge Word via subversion, but I guess this is something to check. A quick search showed this: http://nicolas.lehuen.com/inde... - Paulo Nuin
I wish I had posted the whole series I wanted, dunno why I didn't. - Paulo Nuin
Multiple people editing a Word document can present some issues: as it is a binary file, it is difficult to merge changes via svn. I would recommend LaTeX - Frank
If LaTeX is an option then just go for svn. :-) - Daniel Jurczak
@Frank: most of the time, LaTeX is not an option, unfortunately. - Paulo Nuin
Google Docs doesn't seem promising - poor offline version (gears) + inability to tackle anything beyond standard documents (even emf/jpg images do not fit). - Yaroslav Nikolaev
@Rajarshi, Daniel & Frank: what's so special about SVN with LaTeX vs Word? I don't have any experience with the former, unfortunately... If it's only binary vs ascii - one still has the ability to track changes/comments from within the Word document, which might be an easier track for an average researcher ;) - Yaroslav Nikolaev
As far as free services are concerned, I've been happy with GoogleDocs for private sharing and Wikispaces for public sharing until the very last formatting step, where we generally use Word. Note that people in my field don't use LaTeX. - Jean-Claude Bradley
@Jean-Claude: what about schemes, images, etc - or you generally don't have a versioning issue on these? - Yaroslav Nikolaev
@Paulo: thanks for the plug, svn-time-lapse looks impressive! - Yaroslav Nikolaev
@Yaroslav, SVN+LaTeX is a nice combination because the latter is plain text. Therefore a simple diff allows you to quickly look at changes etc. I agree that it is not for everybody - certainly most people in chemistry dept's do not (will not?) use LaTeX. Word's track changes is nice, and indeed I do use that when my collaborators won't use LaTeX - but the lack of branching can be a pain. But in general, I passionately hate Word - anything beyond 5 pages with many figures is simply a pain - Rajarshi Guha
@Yaroslav: My personal caveat with Word+SVN is as you've already guessed the binary vs. ascii thing. Maybe it is just a personal bias. :-) - Daniel Jurczak
More generally, when one writes regularly, the value of focusing on content rather than presentation is a huge boon. And LaTeX documents look sexier than Word documents for the same amount of effort :) - Rajarshi Guha
It's the issue with multiple editors of a binary file in svn - not Word vs LaTeX. Suppose two different people check out the same version at 9:00am and work on it. If one person checks back in at 10:00am, the second person trying to check in at 11:00 will get a conflict, because svn is not clever enough to resolve or merge conflicts in a binary file. - Frank
LaTeX looks better with LESS effort than Word. Specially now that they abolished menus in Word. - Paulo Nuin
GoogleDocs for the non-Tekkis and LaTeX+VersionControl for the Tekkis, or for the web2.0 freaks a Wiki with RTF export option, this allows post-editing in OpenOffice or Word. One good example is XWiki (Java). - joergkurtwegner
@Frank: this issue is clear! However in non-geeky context it seems easier to resolve it using Word built-in tools, rather than forcing everyone to use LaTeX. - Yaroslav Nikolaev
Any opinions on Git? It sounds more promising in terms of speed & stability (distributed system), however does not seem to have a stable Windows port so far (apart from over-cygwin version)?! - Yaroslav Nikolaev
You could consider using a WYSIWYG LaTeX editor... - Egon Willighagen
Oh, the Git port to Windows is fine, the GUI seems a little bit rough but the command line with SSH included works just fine. The git Eclipse plugin also works fine with most commands, but I had problems pushing things to Github from it. - Paulo Nuin
@Yaroslav For the projects I've been on images and figures don't change enough that it's ever been a problem. On Wikispaces I would just fix the image and replace it. Text is where the massive editing takes place. - Jean-Claude Bradley
One major shortcoming of Google Docs (and most Wikis I've seen) for scientific publications is the lack of support for numbered references - Eric Jain
Eric: that's usually the place where Zotero with its drag and drop comes in. - Daniel Jurczak
Problem with Zotero drag/drop is that it doesn't insert named/numbered citations, just the reference list at the end. The Word/OO plugins are better in that respect, but lag behind the latest Zotero release. I'm sure it wouldn't take too much javascript to get data from Zotero->Google Doc as both citation and reference, if anyone fancies a nice coding project ;-) - Neil Saunders
Neil: Well that's true. To be honest I have never used anything other than LaTeX/BibTeX for "larger" documents, so for me manual adding of references was never a major problem. - Daniel Jurczak
@Eric: exactly the point! that's why file-sharing beats document-sharing in research-paper-writing perspective - one can share a reference library along with the docs.. - Yaroslav Nikolaev
Unfortunately Zotero also suffers from inability to share the library between users/computers..they promise multi-computer sync in Zotero 1.5, and multi-user social sharing in Zotero 2.0...However we're not there yet.. - Yaroslav Nikolaev
Yaroslav: I guess you could use DropBox and put the Zotero library into the shared folder. (??) Have never tried it, so just a guess ? - Daniel Jurczak
The Zotero Sync works pretty well and you can access your database online. - Paulo Nuin
The only thing is, I guess you can't access your papers? Does it also sync up attachments, or at least snapshots? - Chris Lasher
Tested Zotero 1.5 Sync (Preview): library metadata and text notes are synced over Zotero servers, while for attachments and snapshots one has to use a third-party WebDAV disk (in future, the developers plan to employ Amazon S3 for this purpose). The problem here is that attachments on WebDAV are stored in Zotero's own format (which appears as a collection of .zip and .prop files), so one would have to sync the full library with metadata to dig out the appropriate file. - Yaroslav Nikolaev
actually the idea of using Zotero for collaborative paper writing seems very interesting, however currently (v1.5) it only fosters single-user synchronization mode, and hacking for group sharing appears cumbersome...'d have to wait for Zotero 2.0.. - Yaroslav Nikolaev
and rethinking the above discussion (including the binary vs ascii part) - it appears that if Zotero or Mendeley extended their desktop clients with "2.0" text-editing functionality (multi-user + version control), they might actually win over the conventional workflow [Word/OpenOffice <> plugin <> reference repository]. Unless Neil mashes up Zotero with GDocs before that ;-) - Yaroslav Nikolaev
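The plain-text versus binary argument made earlier in this thread (SVN with LaTeX versus Word) can be illustrated with Python's standard `difflib`. This is only a sketch of the idea, not of how svn itself works internally. The LaTeX lines, the file labels, and the names `alice` and `bob` are invented for the example.

```python
import difflib

# Three revisions of a small LaTeX file (illustrative content).
base = ["\\section{Results}\n", "The yield was 40\\%.\n", "See Figure 1.\n"]
alice = ["\\section{Results}\n", "The yield was 42\\%.\n", "See Figure 1.\n"]
bob = ["\\section{Results}\n", "The yield was 40\\%.\n", "See Figure 2.\n"]


def show_diff(old, new, label):
    """Unified diff of two revisions, as a single string."""
    return "".join(
        difflib.unified_diff(old, new, fromfile="base.tex", tofile=label)
    )


def changed_lines(old, new):
    """Indices of lines in `old` that were replaced or deleted in `new`."""
    matcher = difflib.SequenceMatcher(a=old, b=new)
    return {
        index
        for tag, i1, i2, _j1, _j2 in matcher.get_opcodes()
        if tag != "equal"
        for index in range(i1, i2)
    }


alice_diff = show_diff(base, alice, "alice.tex")
bob_diff = show_diff(base, bob, "bob.tex")
print(alice_diff)
print(bob_diff)

# The two edits touch different lines, so a line-based merge can combine them.
print(changed_lines(base, alice) & changed_lines(base, bob))
```

Because each author's change maps to distinct lines, a line-based tool can show and combine them. A binary format offers no such line structure, which is why the second check-in in the 9:00/10:00/11:00 scenario ends in a conflict.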
FriendFeed
The Life Scientists: dekay posted a message
“Folks, is this too philosophical? What types of problems do we have in science? Optimization? Screening for relevant factors? Anything else?”
November 14 at 12:15 am - Link
ethics for sure - sofarsoshawn
unconscious bias - although it's about a historical, and not a current problem, a book like Gould's Mismeasure of Man illustrates this beautifully. And, I think that even current scientists need to keep the idea of unconscious bias in their heads at all times :D - Allyson Lister
Outreach. Not enough of it and what there is often not good. - Neil Saunders
true Allyson: an anthropomorphism of science especially, ie understanding how a rat would feel in maze through a human's eyes - sofarsoshawn
I second outreach, ethics and anthropomorphism. good discussion! - Allyson Lister
Definitely outreach from my point of view - so much good work but not clearly communicated to scientists as well as the public! - James Watson
Lack of reporting of negative results- they are usually just as important as positive results. - Shirley Wu
@Shirley -- I hear you, we really need a way for communicating negative results. But how do you really know when a negative result is really negative, and not just a fluke? ;) There's always a better way. You just need to find it. (Or stick with Edison's ways a lightbulb does not work) - dekay
problem to make it productive, profitable and beneficial for society - Alexey
Dekay, point taken - it's true that there are an infinite # of ways something can go wrong, and usually only 1 way it can be right. But I get the impression that a lot of "positive results" are not "typical" - how do we know that the other 100 trials that failed were all flukes? Biologically possible != biologically likely, though I suppose it is also important and cool to see what is possible too. - Shirley Wu
@dekay, @Shirley some sort of clearinghouse for negative results would be useful. You would never know if the negative result was a fluke or not ... but that is taken into account in your assessment (e.g., maybe you try the experiment yourself anyway with some tweaks, just in case. If your result is negative too, add it to the clearinghouse - maybe a trend would emerge). A well connected Open Notebook Science 'ecosystem' could play this role nicely. - Andrew Perry
@Andrew Perry I agree, you keep tweaking the experiment until there are negligible "negative results", otherwise you're stuck with a relativist rhetorical argument, i.e. w/ drug prescriptions there are always some negative side effects, but usually the cure outweighs the disease - sofarsoshawn
Allocating grant money? - Eric Jain
Information consumable by software. Exhibit A is every paper ever written on data mining of literature abstracts. - Paul Davis
@Andrew - your term of ONS ecosystem is a fairly good representation of how we're evolving. To truly be open ONS can never be reviewed as quickly as it is published but we're getting better with multiple judges commenting on the notebooks now - Jean-Claude Bradley
FriendFeed
November 13 at 7:25 pm - Link
also...lists of fungi, viruses etc... - Antony Williams
Someone needs to download the NCBI taxonomy database - Neil Saunders
Left a comment at your blog about getting species lists from NCBI. - Neil Saunders
Neil..great source of information. Thanks - Antony Williams
No worries; by the way, taxonomy files at NCBI FTP site are at: ftp://ftp.ncbi.nih.gov/pub/taxonomy/ - Neil Saunders
You can also get the (more or less same) taxonomy data here in tab-delimited format or RDF: <http://www.uniprot.org/taxonom...> - Eric Jain
Eric, cool! Thanks for that link. - Neil Saunders
FriendFeed
November 12 at 8:13 pm - Link
They propose a "Web 2.0-based Scientific Social Community (SSC) model for bioinformatics" whose adoption "may help catalyze paradigm-shifting advances in other fields of science"... - Eric Jain
FriendFeed
November 12 at 9:10 am - Link
I'd say the main impediment is still that the technological approaches don't take sociological factors into account enough... - Eric Jain
My argument is that we're in this state because of historical reasons. Much of my argument is here: http://www.davispj.com/bioinfo... There is cursing involved, so don't say I didn't warn you. - Paul Davis
@Paul I'll give it a read. - Chris Lasher
@Paul: Replacing XML with JSON (or YAML) is trivial, but fixes nothing -- unless you think the biggest problem is aesthetics? - Eric Jain
@Eric Not that at all. It's creating an interchange layer that is ubiquitous and easy. XML is the wrong answer for many reasons. Also, I'm a bit biased by working with CouchDB which is why I picked JSON over YAML. - Paul Davis
@Paul I read your article. I hadn't taken a look at JSON until you pointed it out, so thank you. I agree it's a lot less verbose, and aesthetically appealing compared to XML, but--correct me if I'm wrong--it's just an encoding, like XML. It doesn't in itself enforce a standard in the data encoded by it, or the exchange of it, which is the real issue at hand: How do we standardize the data for exchange? How do we provide protocols for the exchange? How do we encourage people to participate in data exchange? - Chris Lasher
@Chris I think the key will be to find some middle ground of RFC's that allow for ad hoc and sub-RFC extensions. Similar to the XMPP specs. Encouraging people to participate is easy. Make it dead simple and show them the benefits. - Paul Davis
It seems some very very smart people have not figured out how to make this all dead simple, though. This is what I have gathered from my readings thus far. This is a very difficult problem, not the least of which is, as the title of the post states, sociological. - Chris Lasher
Never underestimate the power of your own ideas. I don't believe it's as hard as we have been led to believe and I would like to believe that this problem is technology limited so that I can sleep at night. - Paul Davis
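The observation in this thread that JSON, like XML, is "just an encoding" can be shown with a short Python sketch. The same record round-trips through both, and neither checks it against an agreed set of fields. That check is a separate agreement, represented here by a tiny hand-written validator. The record fields and the `REQUIRED` table are hypothetical and do not follow any existing standard.

```python
import json
import xml.etree.ElementTree as ET

# A made-up annotation record; the field names are assumptions for illustration.
record = {"id": "feature-1", "chromosome": "Z", "start": 100, "end": 250}

# Encode the same record both ways.
as_json = json.dumps(record, sort_keys=True)

root = ET.Element("feature")
for key, value in record.items():
    ET.SubElement(root, key).text = str(value)
as_xml = ET.tostring(root, encoding="unicode")

# Decode again. Neither parser knows which fields or types were agreed on.
from_json = json.loads(as_json)
from_xml = {child.tag: child.text for child in ET.fromstring(as_xml)}

# The "standard" lives outside the encoding: a hypothetical field/type agreement.
REQUIRED = {"id": str, "chromosome": str, "start": int, "end": int}


def validate(data):
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for name, expected in REQUIRED.items():
        if name not in data:
            problems.append(f"missing field: {name}")
        elif not isinstance(data[name], expected):
            problems.append(f"{name} should be {expected.__name__}")
    return problems


print(as_json)
print(as_xml)
print(validate(from_json))
print(validate(from_xml))
```

Both parsers accept the data without complaint. Whether a record is acceptable for exchange depends on the agreed field list, whichever encoding carries it.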
FriendFeed
November 11 at 10:09 pm - Link
This appears to be nothing more than eye candy? That said, I want the screen saver version! :-) - Eric Jain
FriendFeed
The Life Scientists: dekay posted a link
The Political Compass: Away from 1D Left/Right to 2D of political views
November 11 at 12:16 am - via Reshare - Link
What are your coordinates? - dekay
That's one tedious survey! Could reduce the number of questions a lot by choosing them based on your previous answers. Might also be a good idea to allow people to not have an opinion on a topic. I'd add a third dimension: How much you care :-) - Eric Jain
FriendFeed
Science Online: Björn Brembs posted a message
“I'm looking for a CMS for a lab homepage. Should incorporate multi-user blogging, as well as categorized articles for both outside and inside use, about pages for users, etc. What are you using?”
November 4 at 6:12 am - Link
Tried a few in my time. Joomla for a while but fell from favour (security issues). TikiWiki looks great, but high admin overhead and learning curve. I think dokuwiki + plugins could be a great solution. Good resources for comparison: CMS matrix (http://www.cmsmatrix.org/), opensourcecms (http://www.opensourcecms.com/). - Neil Saunders
Wikispaces+Blogger+GoogleDocs - Jean-Claude Bradley
and as JC points out, plenty of free hosted services that relieve you of most admin and maintenance - Neil Saunders
I would try Google Apps http://www.google.com/apps/int... + blogger - Anders Norgaard
Don't forget Drupal. More work, but powerful. Also evaluate Socialtext and Mindtouch Deki - Deepak
Can't believe I'm beating Ricardo to this: have you considered OWW? - Bill Hooker
I'll also plug OWW, as I'm a user (http://www.openwetware.org/wik...) - but also, they are really amenable to adding features you might like to see (I've had much conversation about searchability). - Heather
I've tried a couple, and second the wiki+blog solution. Unfortunately, unless they have improved it, Google Docs don't integrate well into my favorite blog platform, Wordpress. - Mr. Gunn
If you want something that works out of the box, Confluence may be worth a look - Eric Jain
What value does a blog add that you can't get from a wiki? I find that blog posts have a habit of disappearing into the nether, while wiki pages are productively edited and have a useful outcome. - Donnie Berkholz
@Donnie - for me the value of a blog is that people will subscribe to your RSS feed and you can discuss milestones - RSS feeds on wikis are very messy, except perhaps for keeping on top of seldom used wikis (even there I would prefer email notification) - Jean-Claude Bradley
@Bill I agree OWW is definitely worth considering - Jean-Claude Bradley
I sometimes find it useful to distinguish between blog posts and wiki pages: Blog posts are like wiki pages, except they are timestamped and are not meant to be kept up to date and evolved. Don't want to have e.g. old announcements and reviews clutter your wiki namespace. - Eric Jain
Thanks, very helpful suggestions so far. Do any of these have the ability to configure the front page such that posts from the different user-blogs tagged with, e.g. "front page" get aggregated on the front page? - Björn Brembs
To show e.g. the last 10 posts tagged with "frontpage" in Confluence, you can add the following to the home page of the default space: {blog-posts:10|labels=frontpage|content=titles}. See http://confluence.atlassian.co... - Eric Jain
FriendFeed
The New Genomics - Software Development at Petabyte Scale
November 2 at 11:41 pm - Link
Matt Wood's interesting (though somewhat off-topic) talk at the recent Google Test Automation Conference in Seattle. - Eric Jain
Quite impressive. Cluster of 1000 cores and uses Perl and Ruby/Rails. - imabonehead
I got to see a preview in person ... and completely on topic there :) - Deepak
Shout out to CouchDB too. Woot. - Paul Davis
FriendFeed
October 26 at 10:14 am - via Reshare - Link
The Distributed Annotation System, a protocol for biological annotation exchange. - Chris Lasher
So does anybody use this? Until the 2008 Briefings in Bioinformatics review article by Zhang et al. http://tinyurl.com/5blwsc I wasn't aware such a thing was out there. This concerns me for two reasons: 1) the original paper for DAS was published in 2001 by Dowell et al. http://tinyurl.com/5jvysp and 2) our lab group's focus is in genome annotation. - Chris Lasher
To echo Chris, I'm a big fan of the rationale and goals behind DAS, but if you check out DASregistry.org, the fact that the vast majority of DAS providers are the same groups that wrote the specs doesn't speak to wide adoption. (Does the length of the 9k+ word spec http://biodas.org/documents/da... have something to do with that?) Any comments from DAS users/developers? - Andrew Su
I tried to use the DAS server at UCSC ( http://genome.ucsc.edu/cgi-bin... ) a few years ago. The result was not verbose enough for the information to be usable. - Pierre
The lofty goals are commendable and something I'm concerned about. I don't believe DAS is the right answer. - Paul Davis
It's not hard to imagine a simpler and more flexible mechanism for finding and retrieving annotations from multiple sources (especially for protein sequences). On the other hand it does seem to be working for some people, and what's the alternative if you don't want to spend a lot of effort creating (and advocating) your own spec? - Eric Jain
I actually believe that DAS is absolutely the right solution for some users. In particular, for example, large genome centers and model organism databases seem to use DAS effectively for data sharing. Someone from (or familiar with) those communities care to chime in? (Todd maybe?) Is DAS just not meant for smaller shops and individual users? - Andrew Su
@Andrew if BioDAS is a spec that requires me to hire a full time person to use it, then something is very, very wrong. - Paul Davis
Paul, I don't necessarily agree. There are specs that take a full-time person to implement/use (think FDA/FAA/OSHA compliance), and rightly so. The target users of those specs can and should devote a person or team to it. But it's always been unclear to me whether DAS is targeted only at big bioinformatics shops (where it does well), or do they actually intend for the average individual bioinformatician to participate (where I think they don't do so well)... - Andrew Su
Andrew, I can't boil my thoughts down to anything more concise than: if a distributed annotation system approaches the complexities of governmental regulation, it's fundamentally flawed. It's connected to: http://roy.gbiv.com/untangled/... - Paul Davis
Implementing the spec does look like a lot of work, but presumably you'd use an existing implementation (e.g. Dazzle or LDAS). There it seems all you need to do (once you managed to set up the software) is figure out how to fit your data into GFF. There may be simpler ways to serve such data, but if there is evidence that there are people who are using DAS-enabled clients and would benefit from having access to your data (or you need a buzzword for a grant application), this investment may make sense... - Eric Jain
Paul, touche... but I chose government as something we'd all recognize. Not sure it was the best analogy. Eric, LDAS seems like exactly what would be needed to bring the complex DAS to a longer tail of developers. It hasn't been updated since 2003 though (http://www.biodas.org/download...), so I wonder how widely it's actually used... - Andrew Su
I was expecting more DAS defenders/educators here. No? - Andrew Su
I've followed DAS quite closely since 2000. I think there's no doubt the spec is top-heavy, especially DAS 2.0. We are developing a genome wiki (http://genome.biowiki.org/) and DAS 2.0's writeback has been mooted as a possibility. Then again there is some investment in DAS (e.g. by Affymetrix) and I expect that when we reach a certain level of functionality, DAS might well look more appealing. - Ian Holmes
Ian, I'm not sure I understand. The demo at http://genome.biowiki.org/test..., does that use DAS to get genome features? Of all the applications, I would think this would be a best-case scenario for DAS use... - Andrew Su
Correct me if I'm wrong, but nowadays I would say "why don't they use RDF instead of DAS ?" - Pierre
@Pierre I'm trying to figure that out, myself. There must be a reason; I must keep reading, keep learning. - Chris Lasher
Can someone walk me through how RDF would apply to the DAS use case? My understanding is that DAS is great for defining genome features. e.g., Exon 4 of Gene X spans from position A to position B on Chromosome Z. How would one express this in RDF? How would you standardize the vocabulary between all the DAS servers providing this type of data? Just starting my RDF education now... - Andrew Su
@Chris: yes, all those tags genome/chromosome/start/end could/should(?) be described as a set of RDF statements... - Pierre
DAS defines URL conventions, and a bunch of XML documents. I'm not sure why they didn't separate these two aspects better, and go with RDF as the default format for representing data in DAS/2. This could have made custom extensions and ontology integration a lot more elegant. - Eric Jain
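As a rough answer to the "Exon 4 of Gene X" question in this thread, the following Python sketch writes such a feature as RDF statements in N-Triples syntax. Everything under `http://example.org/genome/` is a hypothetical vocabulary invented for the example, and the coordinates 1200 and 1450 stand in for "position A" and "position B". Agreeing on a real shared vocabulary between servers is exactly the open question raised above.

```python
# Hypothetical namespace for illustration only; not an existing ontology.
EX = "http://example.org/genome/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def iri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def integer(value):
    """Typed integer literal in N-Triples syntax."""
    return f'"{int(value)}"^^<{XSD_INTEGER}>'


def exon_triples(gene, exon_number, chromosome, start, end):
    """Describe one exon as (subject, predicate, object) statements."""
    exon = iri(f"{EX}gene/{gene}/exon/{exon_number}")
    return [
        (exon, iri(RDF_TYPE), iri(EX + "Exon")),
        (exon, iri(EX + "partOf"), iri(f"{EX}gene/{gene}")),
        (exon, iri(EX + "onChromosome"), iri(f"{EX}chromosome/{chromosome}")),
        (exon, iri(EX + "start"), integer(start)),
        (exon, iri(EX + "end"), integer(end)),
    ]


def to_ntriples(triples):
    """Serialize statements, one per line, each terminated by ' .'"""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# Made-up coordinates standing in for "position A" and "position B".
triples = exon_triples("X", 4, "Z", 1200, 1450)
output = to_ntriples(triples)
print(output)
```

Each property is its own statement, so a provider could add further predicates without changing a document schema. That is the kind of extension flexibility suggested for DAS/2 above.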
FriendFeed
The Life Scientists: Chris Lasher posted a message
“I'm trying to understand RESTful services. How does one go about doing non-trivial queries (e.g., not just a simple GET of a single resource)? Say we have a database with authors and publications. How would one perform a query to obtain all publications that two authors were on together?”
October 14 at 6:11 pm - Link
Also, if you have any suggested reading for understanding REST, providing RESTful APIs to databases, and the like, feel free to post them here, tag them to del.icio.us (er, delicious.com), or both. - Chris Lasher
Chris, presumably you know the mandatory "restful web services" - Deepak
Take a look at this rails tutorial. Should give you some ideas that should translate to Django? http://darynholmes.wordpress.c... - Deepak
@Deepak I was taking a look at that book, actually. I may pick up a print version of that book, even though VT has Safari access to it. - Chris Lasher
/me bookmarks useful blog post - Neil Saunders
A number of the web services I wrote at IU have REST interfaces - on the backend they do things like retrieve stuff from a DB, calculate molecular descriptors, get 3D structures etc. http://www.chembiogrid.org/pro.... I don't know whether this answers the question, since the input is restricted to a simple identifier - but the descriptor service lets you 'dig' through the various descriptors to get at actual values - Rajarshi Guha
GET requests can still use query string parameters. Something like: http://hostname/by_author?auth..."Author One"&author="Author Two" - Paul Davis
See http://xml.nig.ac.jp/tutorial/... for examples using DDBJ and http://www.myexperiment.org/wo... for a simple workflow that uses it - Duncan Hull
double author query is just a simple GET, e.g. http://www.ncbi.nlm.nih.gov/si...] for Kell + Oliver - Duncan Hull
If you want to avoid name-value parameter pairs in favour of something more natural and resty, how about http://hostname/authors/smith/... where your app doesn't actually care what order smith & jones come in? - Andrew Clegg
Friendfeed truncated that URL but basically /authors/smith/jones or /authors/jones/smith can be interchangeable - Andrew Clegg
e.g. http://www.uniprot.org/citatio.... Note that the path is the name of the collection (/citations/), not a verb. The constraints are formulated in a custom (Google-like) query language, and are put inside a single parameter. For simpler cases, per-field parameters may be fine. In any case, what matters is to have a URL that can be GET. Bonus points for supporting machine-readable formats (specified e.g. via a "format" parameter), and offset/limit parameters. - Eric Jain
Don't get too lost in the technicalities or conventions. The HTTP verbs should match the 'intent' of query: for example a PUT translates to a write, a GET nicely translates to a read. With that in mind, GET requests with multiple parameters may retrieve your author list, as may a URL with multiple levels of /author/author. - Matt Wood
Before following the suggestions to integrate all query parameters into the path (especially if their order doesn't matter), read http://googlewebmastercentral..... - Eric Jain
I think there's more of a dynamic-static continuum than that Google post suggests. /authors/smith/jones ought to be *fairly* static even if it's served from a database (assuming unique smith and jones) - Andrew Clegg
Okay, I'm glad someone else is commenting on the Google Webmaster article on dynamic and static URLs. That blog post implies Google promotes what the Django community dubs "ugly URLs". The tutorials and material I've read indicate the "Django way" is to have "clean URLs", which to me look like what that article claims are dynamic URLs rewritten as static URLs. Hmm. - Chris Lasher
I think the main point in the Google post is that you don't need to hide query strings in order to get a page indexed (common misconception) and that doing so can in fact be counterproductive in some cases (e.g. Google may be smart enough to know that there is no point requesting URLs that are identical, except for the order of the parameters, more than once). Of course there are other practical and aesthetic criteria as well. - Eric Jain
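The suggestions in this thread (a collection path, repeated query parameters, optional format/offset/limit, and a URL that does not depend on parameter order) can be sketched with Python's `urllib.parse`. The host `example.org`, the `/publications` path, and the helper names are assumptions for illustration and are not part of any service mentioned here.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical collection endpoint; the path names a collection, not a verb.
BASE = "http://example.org/publications"


def coauthor_query_url(authors, fmt=None, offset=None, limit=None):
    """Build a GET URL asking for publications shared by all given authors.

    Author names are sorted so the same question always yields the same URL,
    whatever order the caller supplies them in.
    """
    params = [("author", name) for name in sorted(authors)]
    if fmt is not None:
        params.append(("format", fmt))
    if offset is not None:
        params.append(("offset", offset))
    if limit is not None:
        params.append(("limit", limit))
    return f"{BASE}?{urlencode(params)}"


def authors_from_url(url):
    """Server-side view: recover the repeated 'author' parameter values."""
    return parse_qs(urlsplit(url).query).get("author", [])


url_a = coauthor_query_url(["Author Two", "Author One"])
url_b = coauthor_query_url(["Author One", "Author Two"])
paged = coauthor_query_url(["Author One", "Author Two"], fmt="json", offset=0, limit=10)

print(url_a)
print(paged)
print(authors_from_url(url_a))
```

Sorting the names gives one canonical URL per question. This addresses the concern about URLs that differ only in parameter order, without moving the parameters into the path.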
delicious
Eric Jain bookmarked a page on delicious
September 23 at 11:21 pm - Link
Group of entrepreneurs in the Seattle area who give and seek advice on running technology startups. - Eric Jain
FriendFeed
BioJobs: Duncan Hull posted a link
September 23 at 6:47 am - via Bookmarklet - Link
quote "The ideal applicant would have knowledge of databases such as Linux/Unix." !?! - Duncan Hull via Bookmarklet
the same way that Chrome is an OS - Paulo Nuin
If your data is stored in a bunch of text files, then I guess Linux is your "database" :-) - Eric Jain
Slightly worrying that recruiting types at the EBI don't understand the difference between a database and an Operating System though? - Duncan Hull
FriendFeed
BioJobs: Paulo Nuin posted a message
“Question: I have applied in the past to many positions in the US but got one phone interview only. Would employers in the US consider the immigration requirements to have a negative impact in hiring a foreigner?”
September 23 at 9:27 am - Link
you mean applying in the past 2-3 years? I have the impression that was a downtime for H1Bs... after realizing how much talent they lost to Europe due to the stricter immigration laws, I think it's becoming relaxed again... - Ntino
Obtaining academic visas doesn't seem to be a problem (at least from what I've seen). But such places often don't have a budget for flying in candidates for in-person interviews (which most people still prefer). So you'll get invited to a lot more interviews if you "happen to be in town"... - Eric Jain
the visa restrictions make it *very* difficult for companies to consider candidates who don't already have work authorization. First, there is an extremely limited number, so you really have to hit the queue in the first day or two the applications are open. Second, there are no guarantees, either of getting the visa or of the applicant following through. (Got burned by one candidate who took another job right after we paid his fees... ugh...) - Andrew Su
Also should mention that at my company, we have hired foreigners at the PhD level, but I can't think of any at a level more junior than that... - Andrew Su
Yes, I have applied to several positions in the US until I found the job I am currently in. I once got a telephone interview, but after that the guy never wrote me back. I don't know if the processing time for an H1B visa might scare employers (or maybe no one in the US thinks I have a good CV). - Paulo Nuin
If the hiring manager is a scientist, then I assure you the last thing s/he wants to think about is the finer points of US immigration law. The system sort of sucks... - Andrew Su
That's what I guessed. But at the same time it is quite difficult to understand the "rejection", at least initially, because I was getting interviews in Canada and Europe. - Paulo Nuin
http://bit.ly/2n1PKl... bringing back bad memories. In short, 65K H1B slots, 163K applicants in the first week. Applications open in April for October start. So basically, you have a 40% chance of getting someone who can start in six months... - Andrew Su
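The 40% figure follows directly from the two numbers quoted above. A quick check in Python, assuming every applicant has an equal chance at a slot (an assumption for this sketch, not something the comment states):

```python
slots = 65_000        # H1B slots, as quoted above
applicants = 163_000  # applications in the first week, as quoted above

# Chance that any one application gets a slot under a uniform draw.
chance = slots / applicants
print(f"{chance:.1%}")
```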
more hits in Europe and Canada because caps are looser? Don't really know, just a guess... - Andrew Su
I don't think Canada has a cap for work permits (what they are called here). Usually hiring post-docs is faster than hiring a regular employee, because you need a confirmation letter from HR Canada for the latter. This letter can take up to six months, and a work permit itself usually takes an extra month. Regarding Europe, I have an EU passport so that wouldn't be a problem, but even with the distance barrier people kept inviting me for interviews. - Paulo Nuin
Same for me - about 80+ applications and one phone interview only. - Björn Brembs
@Bjorn -- I've had friends with similar response rates -- and they are U.S. citizens, so no immigration barriers and lower overhead for in-person interview. - Todd Harris
delicious
Eric Jain bookmarked a page on delicious
September 22 at 8:54 pm - Link
The Java Open Source Graph Drawing Component - Eric Jain
FriendFeed
The Life Scientists: Sutee Dee posted a message
“Tufte's name pops up here once in a while so I thought I would ask-which of Tufte's ideas do you think are most relevant to biologists? What kinds of graphical depictions (in biology) do you think are the most misleading? How do you think we can better represent specific types of biological data?”
September 19 at 10:44 am - Link
I am putting together an informal presentation summarizing Tufte's ideas. The audience is a room of software engineers and computational biologists. I always enjoy hearing people's interpretation of Tufte's work. - Sutee Dee
I do hope it's not a PowerPoint presentation with bullet points ;-) - Eric Jain
I will probably use ppt to call out key themes. Definitely no bullet points. It's hard to appreciate a lot of Tufte's figures in digitized form and the audience is fairly small so a lot of the presentation will be me showing figures from his books and passing them around. - Sutee Dee
Correct me if I'm wrong, but I don't think there are a lot of ideas on how to apply Tufte's rules to biology. When I compare NYT infographics to finalists in scientific visualization challenges I get the impression that the two kinds of images talk to me in completely different languages... - Pawel Szczesny
Biologists tend to be reluctant to generalize/simplify, and therefore often cram a lot of data into their graphs or tables, which makes Tufte happy... Btw does anyone know of a review-type paper that looks at graphics commonly used in biology? - Eric Jain
Eric, I've recently found this: Visual software tools for bioinformatics http://dx.doi.org/10.1016/j.jv... It doesn't necessarily look like what you've described, but things like this paper are rarely submitted to journals indexed by PubMed. - Pawel Szczesny
I like his small multiples concept - rather powerful. - Chris Cotsapas
My own MO in graphics is to apply Tufte's aesthetics to Cleveland's ideas of data presentation. - Chris Cotsapas
FriendFeed
September 8 at 12:26 pm - Link
To summarize: No. - Chris Miller
Nyet. - Paulo Nuin
Definitely not - Sally Church
Depends on how much money you have? - Eric Jain
I'd still rather buy (another) Aston Martin or Bugatti - Chris Cotsapas
I don't think I'd even pay $1000 for 500K typing - Chris Cotsapas
I'd like to generalize this question: How much would you pay for sequencing 6 billion base pairs in 1 cell type of yours now relative to your current budget? I'd pay, say $2000. - Attila Csordas
For me, it all boils down to interpretation. Since I don't know what most of it means, why should I pay anything for it? I'd rather pony up the cash (or my genotype, for that matter) to a cohort collection so we can figure out what this stuff means. - Chris Cotsapas
I guess it's $399 (for genotyping at least) now and you can download the results. IMO you are part of a big cohort, and as an opt-in. As long as I maintain ownership, I don't mind. $350K though is too high - Deepak
I don't want other people's interpretations or choice of markers--I want my whole sequence (every base pair) and I'll interpret it myself. Still, I think it's worth about $20K. - Ruchira S. Datta
I look forward to the day, perhaps some decades hence, when every student in college or even high school gets their personal genome sequence and learns in class how to find information about themselves. - Ruchira S. Datta
@ruchira Noble thoughts - but how exactly do you interpret our probabilistic understanding of risk modulation by common variants? What does it mean, for you, to be told you have a 14% chance of diabetes compared to a population mean of 10%? For me the perennial medical advice would still hold: quit smoking, exercise, eat well, drink red wine in moderation. - Chris Cotsapas
Not 350K, definitely not, a lesser sum may be ok. - Aarthy
FriendFeed
The Life Scientists: Deepak posted a message
“Functional programming anyone? Just curious to see how many of you are using or thinking about using functional programming in your work, or know of bioinformatics applications? I can only think of some DSL work, but otherwise, not too much that I have seen.”
September 8 at 9:23 am - Link
I suspect I'm not the only person who sees DSL as 'digital subscriber line' not 'domain specific language', and has been confused by ff references over the last couple of days! - Daniel Swan
There's a bunch of stuff for bioinformatics in Common Lisp. I play with CL when I'm bored - it's very elegant, but I don't know enough of it to code fast enough. Also, there isn't a whole lot of library support in cheminfo - I'd love to be able to use it though - Rajarshi Guha
Rajarshi, you should switch to Erlang or Haskell. You're likely to see much better support over time :) - Deepak
Deepak, yes, I had thought of those two. But a lot of the Haskell-related pages I've seen seem to be focused on theoretical CS constructs (though I can see it's getting good library support). As for Erlang, I thought its claim to fame was concurrency. On the flip side, there are so many dialects of Lisp that it can be a little painful to write portable code. Right now FP is just a hobby for me :) - Rajarshi Guha
Functional programming (especially with "pure" functional languages) can seem like a bit of a pain. But it makes sense for algorithm-heavy stuff (or so I'm told), and the strict separation of functions with and without "side-effects" is great when you need to write thread-safe code (which I guess we'll have to do a lot more in future). Scala is a nice (non-pure) functional language that conveniently runs in a JVM. Great for learning, but perhaps not stable & mature enough for everyone. - Eric Jain
In Barcelona there is a small group of OCaml enthusiasts http://arxiv.org/pdf/0801.3675 - Anders Norgaard
OCaml is my favorite programming language, but I haven't been using it in my bioinformatics work. There was already a Babel of programming languages in use in the lab when I got here, to which I don't want to add. I do make use of functional programming in Python. - Ruchira S. Datta
For me personally I'm satisfied with the possibilities for functional style programming in Python. - Anders Norgaard
Anders, seconded! I love list comprehensions :) (though proper closures would be nice) - Rajarshi Guha
List comprehensions in Python are great. I never understood closures enough to like them or not. - Paulo Nuin
A simple example of the utility of proper closures would be non-trivial (i.e., multi-line, if-then etc) lambdas - Rajarshi Guha
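For readers who, like Paulo, have not used closures much, here is a rough Python sketch of the two features mentioned in this thread: a list comprehension, and a closure written as a nested function where a single-expression lambda would not do. The names `make_counter` and `increment` are made up for illustration, and `nonlocal` requires Python 3.

```python
# List comprehension: squares of the even numbers below 5.
evens_squared = [x * x for x in range(5) if x % 2 == 0]

def make_counter():
    """Return a function that remembers and updates `count` (a closure)."""
    count = 0

    def increment(step=1):
        # A lambda is limited to one expression; a nested def can hold
        # statements such as this check and the assignment below.
        nonlocal count  # rebind the enclosing variable (Python 3 only)
        if step < 0:
            raise ValueError("step must be non-negative")
        count += step
        return count

    return increment

counter = make_counter()
first = counter()    # uses the default step of 1
second = counter(2)  # adds 2 to the remembered count
print(evens_squared, first, second)
```

Each call to `make_counter` yields an independent counter, since every closure captures its own `count`.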