Cameron Neylon
Data is what you do with it (or what you enable others to do)
...and not how you store it - Cameron Neylon
Was trying to capture the sentiment from neilfws' post last week in a soundbite - Cameron Neylon
Are the 'how you store it' and the 'what you can do with it' not somewhat related? - Jan Wessnitzer
I see 'how you store it' more related to 'what you enable others to do with it'. - Kubke
Definitely related then? ;) - Jan Wessnitzer
Jan, the point I think I'm getting to is that they shouldn't be. What matters is the interface layer, the API, and not the data format itself. I think this is Deepak's point about data warehousing versus data transport, and Neil's about the importance of APIs - Cameron Neylon
The thing about pdfs though is that you're throwing a lot of the information away. But maybe it's a good example - pdf is fine as long as it is the interface through which people see the data (and they want it that way) but as a backend data format it is a problem? Why? Because it makes it impossible to present it in other ways (through other interfaces). So the problem is in a sense mistaking the use case and the use of the pdf as an "API for the human eyeball client" vs a data format. - Cameron Neylon
Looking at that I'm not sure it makes sense. Second try: the problem with pdfs is the things you can't do with them. If you could pull out, via some sort of interface, the data that is usually thrown away when a pdf is made, it would be fine. I have to say I was thinking more in terms of spreadsheets and relational databases rather than pdfs though. Anyway - working out whether it was a useful thought was the reason for putting it out - Cameron Neylon
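[Editorial illustration] The interface-versus-storage point above can be sketched in a few lines of Python. This is only a sketch under stated assumptions, not anything proposed in the thread: the `ReadingStore` interface, the `CsvStore` and `SqliteStore` classes, the `mean_value` function, and the sample values are all hypothetical names and data invented for the example. It shows the same client code working unchanged whether the data sits in a spreadsheet-style CSV or in a relational database, because it only ever talks to the interface.

```python
import csv
import io
import sqlite3
from typing import Iterator, Protocol, Tuple


class ReadingStore(Protocol):
    """Hypothetical interface: the only thing client code depends on."""

    def readings(self) -> Iterator[Tuple[str, float]]: ...


class CsvStore:
    """Backend 1: data kept as spreadsheet-style CSV text."""

    def __init__(self, text: str) -> None:
        self._text = text

    def readings(self) -> Iterator[Tuple[str, float]]:
        for row in csv.DictReader(io.StringIO(self._text)):
            yield row["sample"], float(row["value"])


class SqliteStore:
    """Backend 2: the same data kept in a relational database."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def readings(self) -> Iterator[Tuple[str, float]]:
        query = "SELECT sample, value FROM readings ORDER BY rowid"
        yield from self._conn.execute(query)


def mean_value(store: ReadingStore) -> float:
    """Client code: knows the interface, not the storage format."""
    values = [value for _, value in store.readings()]
    return sum(values) / len(values)


def demo_stores() -> Tuple[CsvStore, SqliteStore]:
    """Build both backends from the same made-up sample data."""
    csv_store = CsvStore("sample,value\na,1.0\nb,3.0\n")
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (sample TEXT, value REAL)")
    conn.executemany("INSERT INTO readings VALUES (?, ?)", [("a", 1.0), ("b", 3.0)])
    return csv_store, SqliteStore(conn)
```

Calling `mean_value` on either object returned by `demo_stores()` gives the same answer. A pdf, by contrast, offers no equivalent of `readings()`: the values were discarded when the chart was rendered, so there is nothing for an interface to hand back.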
Not a million miles away from this is this video of Jan Velterop from the Berlin 7 Conference: http://www.canalc2.tv/video... Slides are here: http://www.berlin7.org/IMG... albeit in .pdf format!! Oh the irony ;-) - Graham Steel
Here's what I think Cameron is saying with "Second try": PDF is a great way to *present* charts, text, etc., based on data so you'll know they appear the way you intended--but there should always be accessible *data* (in a widely-translatable spreadsheet or database format if not in XML or some neutral form) behind that presentation. (As a non-scientist, that's what I'd do if preparing a data-heavy document that others could reasonably build on.) - Walt Crawford
Or, crudely: PDF is a presentation format. It is not a data format. - Walt Crawford
Walt, that captures it quite well. Treating a pdf like a data format is like treating a tap like water. Bound to end in tears. - Cameron Neylon
Thanks. As Dorothea can tell you, I'm a big user of/believer in PDF--as a presentation format. But I understand that that's what it is. If my library blog studies were worth building on, I'd make the data available in .xlsx or .xls form (since those are widely translatable). - Walt Crawford
Personal anecdote on the pdf front: My talks have now got so big that slideshare won't take them so I often have to upload as pdf. This is of course a pain because people can't then re-use the bits from the slides anywhere near as easily as if I put up ppt or keynote files. But pdf at least means people can see what I did and ask if they want the originals. Of course - this doesn't scale... P.S. Make the data available anyway - someone somewhere may want to build on them, and you'll never know if they can't get to first base. - Cameron Neylon
Cameron: Yabbut. This was entirely independent research, with no paycheck, no sponsorship, and the *hope* of maybe earning half of minimum wage through book sales. Given that, I'd at least like to know before someone else turns it into a consulting gig or otherwise. Maybe that's selfish. I dunno: The community of wholly unpaid/unsponsored researchers isn't that coherent. - Walt Crawford
Neil: As a simple humanist, your friend's statement puzzles me. Does this mean that text does not constitute information? And that you can't export a table from Word into a spreadsheet? In fact, a Word document is not a graphical representation of anything, and there are those of us who regard words, sentences and paragraphs as primary sources of information. - Walt Crawford
Walt: as someone who's drunk the OA koolaid, the counter argument would be that you're better off using that research as advertising, with the aim that someone will then pay you to do more. But it's a classic personal against global thing: by making data available you make it more possible for more people to do effective research, raise profile and bring more money in. Doesn't mean that you personally will get that money. - Cameron Neylon
Knowledge=People+Information; in other words, Information=Data has little meaning without a Person interpreting and making sense of it. Let me know if you need more references on this. - joergkurtwegner
Can anyone point me to neilfws' post that the soundbite is based on, and any other references which might be of interest? Fight information fragmentation ;-) - joergkurtwegner