Bora Zivkovic
Excellent: Another reason data services need librarians by @TheRepoRat #scibling
Alright, I'll bite. I am not quite sure I understand the role(s) being discussed here - Deepak Singh
I am not sure that is possible. Heck even the lab people have trouble keeping up with the change. Unless we are talking about different types of data. Are you talking about data coming off instruments or the types of information that ends up at NCBI? - Deepak Singh
I'd argue instrument data does not have a role for librarians since you don't catalog that in the classic sense. The latter, I completely buy, and the question's a valid one. Also, I always wonder about the boundary between libraries and biocuration - Deepak Singh
1. the data changes every so often as instruments and techniques change. 2. Many times you are going to throw it away since you can regenerate it. 3. The data need to be reconstructed and it's the latter that are relevant. But I'll play devil's advocate. Most high scale data operations do not use librarians and are extremely effective with smart data management and good analytical tools - Deepak Singh
You aren't convincing me at all :). I've worked in enough places where data classification has not been a problem or something that comes in the way. Now good data management on the other hand is another problem. Now if you're telling me that they are one and the same I'd like to understand how and why - Deepak Singh
It seems that in the end it should be the domain experts that finally deal with domain specific data - likely, utilizing and modifying generic tools/techniques. To what extent should a librarian become a domain expert? - Rajarshi Guha
Not necessarily so, Rajarshi. To what extent did geneticists become bioinformaticians? - Mr. Gunn
domain scientists, data scientists, computer scientists, and librarians < all living together happily... unfortunately, what we've got are domain scientists....... and librarians (of various flavors) who want to help - Christina Pikas
Christina, I still don't understand what you want to help with especially with "live data"? Sounds too much like a solution looking for a problem. Happy to be proven wrong. - Deepak Singh
I haven't been in academic labs in a long time :). I see two aspects. One is the archival and cataloging side, which is the library challenge. But even more, what you are describing is a data management problem, i.e. a software challenge, not necessarily a library challenge, with one constant: data types and data handling are going to change all the time. Of course, there's a third one ... just good practices. - Deepak Singh
Therein lies the problem. You can pretty much count on all your early decisions to be wrong. Things are going to change and you can count on it. I completely get your point on the end state, but not on the data in process part. Libraries do not deal well with constant change and experimentation. - Deepak Singh
Libraries may not, Deepak, but wouldn't someone who knows a bit about organization systems have some useful input? I know as well as you how variable data types and formats are, but scientists aren't born knowing good practices to manage data in a changing environment. - Mr. Gunn
Perhaps. But I don't know any librarians who work in highly changing environments who are either, especially high throughput environments. We are not talking about repositories here. What you need are data scientists and engineers who are used to working in such environments. I am sure there is a lot of input librarians can provide as data curators and custodians of the information and metadata that comes out of high throughput labs, but I just don't see what real value they can add in the data production and primary analysis phase. - Deepak Singh
Agreed about primary analysis - Mr. Gunn
I don't think the fact that things are changing obviates the value of curation and good data management process. Software changes, but the management procedures for best practice are built around that assumption; you still keep versioning your code. The problem in most data production environments (on the small scale I work on) is that people never think about cataloguing, metadata collection, or archival at all. This means that not only are mistakes made, but that unnecessary mistakes are made. And I don't buy the idea that librarians don't deal with changing environments - they've just been through a ten year period in which their central function has been absolutely changed. Yes they're struggling with the consequences, but so are we as data producers. Best practice for data management needs support and training and it seems to me that is a function that libraries could fill if they chose to do so - someone in the research institution needs to and PIs aren't stepping up to the plate as far as I can see. - Cameron Neylon
Deepak, I think you're missing a fundamental shift in where librarians contribute within the process. If you think of librarians only dealing with organizing, keeping, preserving *complete* work, then yeah, I see your points. However, in many places - including MPOW - librarians have been asked to contribute in finding information prior to data gathering and to help design and implement collaboration tools to be used during the data gathering and analysis process. (pt 1) - Christina Pikas
So librarians - because they know about how people deal with information - can be part of the process earlier so that the data is saved with useful/usable metadata so it can be transitioned to other systems when complete. But to help in the earlier stages you have to understand a bit about the "epistemic culture" of the research area AND of the organization AND of the lab.... in academic libraries I think probably the liaison librarians are closer to this point, but they aren't the ones dealing with systems. - Christina Pikas