False choice. Calling it a "service" might be more fitting (it has both a protocol and a data format).
- Shiran Pasternak
Shiran, then you would say DAS (service) = HTTP/SOAP (protocol) + XML (format)?
- Andrew Su
That's pretty much what all web services are, aren't they? REST/SOAP + XML/JSON
- Deepak Singh
DAS is a layer on top of these standards. The protocol describes how a client defines object types (e.g., genes) and coordinate slices; the data format describes how the results are represented in XML.
- Shiran Pasternak
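To make the protocol/format split concrete, here is a minimal sketch in Python. It assumes a DAS 1.x-style `features` request (a segment slice plus a `type` filter) and a trimmed, hand-written response; the server `das.example.org`, the source name `mysource`, and the feature values are hypothetical, and the element layout is an assumption based on the DAS 1.x conventions rather than anything stated in this thread.

```python
import xml.etree.ElementTree as ET


def das_features_url(server, source, segment, start, stop, feature_type):
    """Protocol side: ask for features of one type on one coordinate slice."""
    return (f"{server}/das/{source}/features"
            f"?segment={segment}:{start},{stop};type={feature_type}")


# Format side: a hand-written, trimmed sample response (illustrative only).
SAMPLE_RESPONSE = """<DASGFF>
  <GFF>
    <SEGMENT id="1" start="1000" stop="2000">
      <FEATURE id="gene0001" label="example gene">
        <TYPE id="gene">gene</TYPE>
        <START>1200</START>
        <END>1800</END>
      </FEATURE>
    </SEGMENT>
  </GFF>
</DASGFF>"""


def parse_features(xml_text):
    """Return (feature id, type id, start, end) for every FEATURE element."""
    root = ET.fromstring(xml_text)
    return [
        (f.get("id"), f.find("TYPE").get("id"),
         int(f.findtext("START")), int(f.findtext("END")))
        for f in root.iter("FEATURE")
    ]


if __name__ == "__main__":
    print(das_features_url("http://das.example.org", "mysource", "1", 1000, 2000, "gene"))
    print(parse_features(SAMPLE_RESPONSE))
```

Note that the `type` value is just a string the provider chooses, which is the point made further down the thread.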
@Shiran, does DAS really define the type of object? I mean, is "gene" just a word, or is it a term in an ontology?
- Pierre Lindenbaum
@Pierre DAS only defines "feature" as a type. There is no sequence ontology (at least not the last time I looked). But it's up to the provider to define the interface so that a client can request meaningful objects like "genes" (or "pierres," for all it cares).
- Shiran Pasternak
Thanks all for the comments. Personally I think this is largely semantics, but I suppose if we talk about it in a paper the semantics should be right...
- Andrew Su
"I haven't bought into Wave yet, although it has more to do with the current UI than anything else. In addition to what's mentioned above, there are plugins for LaTeX, for rendering molecules, etc which make collaborative editing very powerful, potentially at least."
- Deepak Singh
I'm still not convinced by Wave (slow, messy, ...), but OK, it is still in beta. However, I found this comment interesting (http://www.slate.com/id...): "The core feature of Wave is that it is a real time communication PROTOCOL. Right now that manifests itself as live time chatting, but Wave is not meant to be just a fancy new IM client like you seem to have been using...
- Pierre Lindenbaum
I am still not convinced either, but the Slate article is right. The problem is the current implementation. Every time I see it, my reaction is: put some of those features into Etherpad and you have one powerful system. Wave is not an application, so we shouldn't be thinking of it as one, at least in the long term.
- Deepak Singh
I haven't bought into Wave yet either, although I'm hoping I will at some stage. I suspect that will be when Wave becomes interesting by slinging data rather than just text around. A new form of bibliographic manager or universal blog commenting system is a lot more interesting than a new slant on email.
- AJCann
Wave is very promising and interesting, but I think it will be a couple of years until it matures into a truly useful platform. I'm not giving up on email and wikis for collaboration just yet.
- Matt Leifer
Wave is really useful in its current incarnation for collaborative editing of structured documents a la wikis. For free-form discussion a la IM, Twitter, or FriendFeed, not so much; at least not currently.
- Shiran Pasternak
Keep your competition from doing real science?
- Egon Willighagen
Searching Google Wave with "tag:the-life-scientists" will get you to "Research collaborations in Wave", a good starting point for life scientists.
- Martin Fenner
I don't get how you search in public waves. I've tried searching for tag:the-life-scientists and it gets no hits -- I think it's just searching my own waves
- Andrew Clegg
There was a thread by Kol about Wave usernames, but I couldn't find the link.
- ffcode
Aha -- with:public. They really should include a button for that.
- Andrew Clegg
An undergraduate student in our lab, Caleb, just got his wave invite. I told him to look at this thread for possible people to connect with.
- Steve Koch
Afternoon all. I've written my first robot, which hopefully will embed an interactive mass spectrum into a blip whenever a UniProt name is encountered in the text, and corresponding mass spec data is found for this protein. I say "hopefully", as I've not been able to test it for real, as, alas, I have no account. When are the next batches released? If it's not for ages, does anyone fancy testing it anyway?
- Neil Swainston
More "like"ing that you gave a talk on this. Slides are very contextual, rather than textual, so I don't think many will get more than the gist of what was going on. I can certainly guess, though. Also, some of these look familiar. Like that you still have the computer with the password written on it (from Matt Wood, right?). Hope they changed the password by now.
- Chris Lasher
Deepak, is there a way to listen to your talk?
- Pierre Lindenbaum
Cloudera is working on putting up videos of all the talks from the conference, but didn't give an ETA. Chris, you're right: the slides don't do the talk justice. They go very well with the delivery. Seeing the phrase 'c. elegans' in a purely software conference gets me as excited as seeing a word like 'Java' in a bioscience conference.
- Shiran Pasternak
Video should be up soon. Got great feedback from the attendees and on Twitter. Next place to talk about this stuff - Supercomputing
- Deepak Singh
wonder how the ideas displayed on 41 and on 104 will develop together :-)
- Claudia Koltzenburg
Interesting take, but still too "one-dimensional" in my opinion. If genomes are not linear, then should they be drawn left-to-right at all?
- Shiran Pasternak
Nearly 1 billion base pairs? That's huge! [Preemptive "That's what she said."]
- Chris Lasher
1 billion base pairs and imagine 4 or 5 copies of that in the same nucleus!
- Paulo Nuin
Imagine the poor guys genotyping those organisms :-)
- Pierre Lindenbaum
You mean rich guys, right? They need a ton of money if they want to sequence it.
- Paulo Nuin
Wheat is not being sequenced yet. For maize, I can say it's a total pain: not because of its length (2.5 Gbp), but because of its repetitiveness (75% of it is transposable elements).
- Shiran Pasternak
I mean: most (all?) of the methods are based on the fact that the organisms are diploid.
- Pierre Lindenbaum
Yes, but still they are very large chromosomes, and as Shiran pointed out, very repetitive.
- Paulo Nuin
@Shiran 75% TEs? Is it known why there's such a high percentage in maize, or is this one of those mysteries currently under investigation?
- Chris Lasher
Don't know exactly myself. I believe it's a file where each line is identified by an index? I've got >6 million lines in a file where the first column is the base-pair position (ordered from small to large). I want to be able to very quickly extract a chunk of the data (i.e., part of a chromosome). Although I'm all for databases, they can become very slow at this point. From what I understand (and what mza has told me), indexed files should be much faster for this.
- Jan Aerts
Not sure if it can fulfill your needs, but have a look at Oracle/BerkeleyDB.
- Pierre Lindenbaum
You mean something similar to a Nexus file, where the information is broken into blocks?
- Paulo Nuin
With Nexus you have to define the blocks beforehand, don't you? What I need is quick on-the-fly search. I actually think the point of the index is that I can use binary search, for example.
- Jan Aerts
Yes, with Nexus you would need the blocks predefined, but your parsing will be fast if you have a short list of possible blocks.
- Paulo Nuin
SRS worked by indexing flat files for years - don't know about now - but there must be a paper out there somewhere.
- Allyson Lister
Jan, your bsearch algorithm seems to work only with data in memory. An alternative for searching a big dataset with binary_search/lower_bound/upper_bound would be to build a (binary) file of your data (where all records have the same size) and scan the file using a random-access strategy.
- Pierre Lindenbaum
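A minimal Python sketch of the fixed-size-record idea above, for illustration only. The record layout (an int64 base-pair position followed by a float64 value) and all function names are assumptions; the thread does not say what the other columns of Jan's file contain.

```python
import os
import struct

# Assumed record layout: little-endian int64 position + float64 value (16 bytes).
RECORD = struct.Struct("<qd")


def write_records(path, rows):
    """Write (position, value) rows, already sorted by position, as fixed-size records."""
    with open(path, "wb") as out:
        for pos, value in rows:
            out.write(RECORD.pack(pos, value))


def _position_at(fh, i):
    """Read only the position of record number i, using a seek instead of a scan."""
    fh.seek(i * RECORD.size)
    return RECORD.unpack(fh.read(RECORD.size))[0]


def lower_bound(fh, n_records, target):
    """Index of the first record with position >= target (binary search on disk)."""
    lo, hi = 0, n_records
    while lo < hi:
        mid = (lo + hi) // 2
        if _position_at(fh, mid) < target:
            lo = mid + 1
        else:
            hi = mid
    return lo


def fetch_range(path, start, stop):
    """Return all records with start <= position <= stop."""
    n_records = os.path.getsize(path) // RECORD.size
    rows = []
    with open(path, "rb") as fh:
        first = lower_bound(fh, n_records, start)
        fh.seek(first * RECORD.size)
        for _ in range(first, n_records):
            pos, value = RECORD.unpack(fh.read(RECORD.size))
            if pos > stop:
                break
            rows.append((pos, value))
    return rows
```

Because every record has the same size, record `i` always starts at byte `i * RECORD.size`, so the search touches only about log2(n) records before reading the requested slice.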
Why not create an index by noting the file positions of, say, certain base-pair positions? If you're working in C you could simply fseek to that position, based on a binary search of the indexed values. Very crude, though, and it seems a DB would be a much better solution.
- Rajarshi Guha
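The same idea sketched in Python rather than C, with `seek` standing in for `fseek`. This assumes a tab-separated text file sorted by a base-pair position in the first column; the function names and the `every` sampling interval are made up for the example.

```python
import bisect


def build_index(path, every=1000):
    """Note the byte offset of every `every`-th line, keyed by its position."""
    keys, offsets = [], []
    with open(path, "rb") as fh:
        line_no = 0
        while True:
            offset = fh.tell()
            line = fh.readline()
            if not line:
                break
            if line_no % every == 0:
                keys.append(int(line.split(b"\t", 1)[0]))
                offsets.append(offset)
            line_no += 1
    return keys, offsets


def fetch_slice(path, keys, offsets, start, stop):
    """Return the lines whose position lies in [start, stop]."""
    if not keys:
        return []
    # Binary search the in-memory index for the last indexed line before `start`,
    # then jump there and scan forward.
    i = max(bisect.bisect_left(keys, start) - 1, 0)
    rows = []
    with open(path, "rb") as fh:
        fh.seek(offsets[i])
        for line in fh:
            pos = int(line.split(b"\t", 1)[0])
            if pos > stop:
                break
            if pos >= start:
                rows.append(line.decode().rstrip("\n"))
    return rows
```

With 6 million lines and `every=1000`, the index holds about 6,000 entries, and each query scans at most roughly 1,000 lines before reaching the requested slice.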
Nice. While I like and appreciate good visualizations, there are too many cases (networks are my current bugbear) where it seems that the goal is just to make pretty pictures. In those cases, it has seemed to me that unless I know what I'm looking for, the visualization is worse than a summary (for practical purposes). Of course, a combination of vis and summary is ideal.
- Rajarshi Guha
Man, great thinking on a very frequently occurring problem!
- joergkurtwegner
I'm not sure how much Processing can help with ad hoc data visualization. There is still a learning and programmatic curve, albeit smaller. How can you easily load the data and provide a graphical UI to it? I played around with Spotfire a few years ago, which can readily transform any tabular data into particle plots and other agnostic yet intuitive visualizations. But Spotfire is super-expensive. Google and NYTimes are taking steps towards providing open visualization APIs.
- Shiran Pasternak
I agree. I looked into Processing briefly, and while I could not imagine doing my daily work without Spotfire, I can ignore Processing. Besides, the interface in Processing doesn't seem practical or flexible enough for scientific data.
- joergkurtwegner
@Shiran: "not sure how much Processing can help with ad hoc visualization". That's _exactly_ what I use it for. The scripts are often so small that they are merely "sketches" (in Processing-lingo). Even though a tool like Spotfire is very useful, it falls short when your data contains many types of information. For example: clone read pairs mapped on a genome: there are the actual mappings, whether or not they map where they should, quality values, inversions, ... Any visualization has to be custom.
- Jan Aerts
In that case, do you have any example scripts you can share? I didn't mean to rag on Processing. It's a great tool (especially coming from a Java and OpenGL background), and the applications out there that use it are fascinating.
- Shiran Pasternak
Jan, I'd also be interested in some examples. The one and only Processing sketch I've attempted is http://tinyurl.com/5r7543 to demonstrate a simple algorithm. When visualizing data I tend towards plots and molecular graphics. There's also ruby-processing http://github.com/jashken... which you might find easier as a Ruby programmer.
- Adam Kraut
@Shiran, Adam: As they're sketches of current work, I'm not allowed to put them online. But I will probably generalize one of my current visualizations and put it on GitHub. Will let you know when that has happened. My very first hack was a proof-of-concept for a comparative map viewer (http://tinyurl.com/5gkt85).
- Jan Aerts
It is so cool to see Processing used to display scientific data! It can really lead to wider usage of advanced open source data vis libraries and APIs, including those from Google, NYT, and others. Check out the JavaScript port, Processing.js, which nicely expands the cross-platform compatibility. If you get a chance to put more examples on GitHub it would be appreciated. Looking forward to seeing what else you are working on =)
- Mike Chelen
"I totally understand that for many situations (such as self-employment) it's necessary at certain times to shift accounts and rebalance cash reserves. But if cash is not urgently needed on hand, disciplined investing can pay off. Even though we're currently in a down-market, we hope that the market will return. Incremental contributions to investment accounts result in dollar-cost averaging, which in turn offsets volatility."
- Shiran Pasternak
"Great post. I'll take your challenge. For my next meeting (International Professional Poster Printers Conference, held in Kathmandu), I'm going to go back to basics and use a modularized poster. This is part of what I think is a grassroots effort to reclaim our focus and our creativity. It's very reminiscent of the Paper Web. Enthusiastic about my work? Now that's a whole other issue."
- Shiran Pasternak
Nature "weighs in" on doping in sports. Money line: "The anti-doping authorities have fostered a sporting culture of suspicion, secrecy and fear."
- Shiran Pasternak
Ticks me off when alarmists cite this so-called "extinction rate," without acknowledging a species "emergence rate." After all, evolution didn't stop with the Industrial Revolution. Not sure if the study includes data over the last 50-100 years, but overall it shows an interesting trend.
- Shiran Pasternak