Jan, the point I think I'm getting to is that they shouldn't be. What matters is the interface layer, the API, and not the data format itself. I think this is Deepak's point about data warehousing versus data transport, and Neil's about the importance of APIs.
- Cameron Neylon
The thing about pdfs though is that you're throwing a lot of the information away. But maybe it's a good example - pdf is fine as long as it is the interface through which people see the data (and they want it that way), but as a backend data format it is a problem. Why? Because it makes it impossible to present the data in other ways (through other interfaces). So the problem is, in a sense, mistaking the use case: treating the pdf as a data format when it is really an "API for the human eyeball client".
- Cameron Neylon
Looking at that I'm not sure it makes sense. Second try: the problem with pdfs is the things you can't do with them. If you could pull out the data that is usually thrown away when a pdf is made, via some sort of interface, it would be fine. I have to say I was thinking more in terms of spreadsheets and relational databases rather than pdfs though. Anyway - working out whether it was a useful thought was the reason for putting it out.
- Cameron Neylon
Here's what I think Cameron is saying with "Second try": PDF is a great way to *present* charts, text, etc., based on data so you'll know they appear the way you intended--but there should always be accessible *data* (in a widely-translatable spreadsheet or database format if not in XML or some neutral form) behind that presentation. (As a non-scientist, that's what I'd do if preparing a data-heavy document that others could reasonably build on.)
- Walt Crawford
Or, crudely: PDF is a presentation format. It is not a data format.
- Walt Crawford
Walt, that captures it quite well. Treating a pdf like a data format is like treating a tap like water. Bound to end in tears.
- Cameron Neylon
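To make the presentation-versus-data distinction in the comments above concrete, here is a minimal Python sketch. Everything in it is hypothetical and for illustration only: the `RECORDS` sample data and the function names `render_for_reading`, `export_csv`, and `load_csv` come from nobody in this thread. It shows one set of records exposed through two interfaces, a fixed layout for human readers (the role a pdf plays) and a CSV export that another program can read back without loss.

```python
import csv
import io

# Hypothetical records standing in for the data behind a document.
RECORDS = [
    {"sample": "A", "yield_mg": 12.5},
    {"sample": "B", "yield_mg": 9.0},
]


def render_for_reading(records):
    """Presentation interface: a fixed layout meant for human eyes."""
    lines = [f"Sample {r['sample']}: {r['yield_mg']:.1f} mg" for r in records]
    return "\n".join(lines)


def export_csv(records):
    """Data interface: the same records in a machine-readable form."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["sample", "yield_mg"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


def load_csv(text):
    """Read the exported CSV back into records, restoring numeric types."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {"sample": row["sample"], "yield_mg": float(row["yield_mg"])}
        for row in reader
    ]


if __name__ == "__main__":
    print(render_for_reading(RECORDS))
    print(export_csv(RECORDS))
```

The CSV export survives a round trip through `load_csv`, so someone else can build on it. The rendered text is fine to read, but recovering the records from it means reverse-engineering the layout.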
Thanks. As Dorothea can tell you, I'm a big user of/believer in PDF--as a presentation format. But I understand that that's what it is. If my library blog studies were worth building on, I'd make the data available in .xlsx or .xls form (since those are widely translatable).
- Walt Crawford
Personal anecdote on the pdf front: My talks have now got so big that slideshare won't take them, so I often have to upload them as pdf. This is of course a pain because people can't then re-use the bits from the slides anywhere near as easily as if I put up ppt or keynote files. But pdf at least means people can see what I did and ask if they want the originals. Of course - this doesn't scale... P.S. Make the data available anyway - someone somewhere may want to build on them, and you'll never know if they can't get to first base.
- Cameron Neylon
Cameron: Yabbut. This was entirely independent research, with no paycheck, no sponsorship, and the *hope* of maybe earning half of minimum wage through book sales. Given that, I'd at least like to know before someone else turns it into a consulting gig or otherwise. Maybe that's selfish. I dunno: The community of wholly unpaid/unsponsored researchers isn't that coherent.
- Walt Crawford
Neil: As a simple humanist, your friend's statement puzzles me. Does this mean that text does not constitute information? And that you can't export a table from Word into a spreadsheet? In fact, a Word document is not a graphical representation of anything, and there are those of us who regard words, sentences and paragraphs as primary sources of information.
- Walt Crawford
Walt: as someone who's drunk the OA koolaid, the counter argument would be that you're better off using that research as advertising, with the aim that someone will then pay you to do more. But it's a classic personal against global thing - by making data available you make it more possible for more people to do effective research, raise profile and bring more money in. Doesn't mean that you personally will get that money.
- Cameron Neylon
Knowledge=People+Information; in other words, Information=Data has little meaning without a Person interpreting and making sense of it. Let me know if you need more references on this.
- joergkurtwegner
Can anyone point me to neilfs's post on soundbite, and to any other references that might be of interest? Fight information fragmentation ;-)
- joergkurtwegner