Attila Csordas
If bioinformatics is a profession on its own it should have some core basics. Would you help me to delineate the current core IT/programming knowledge and skillset that is needed for any bioinformatician wannabe? Just concentrate on the basics and don't talk about the biology side this time.
I'm teaching programming to bioinformatics MSc students at the moment and I wish it was that easy to define :-) I'm starting to think that the 'core' skills are things like version control, IDEs, unit testing, debuggers, and design principles, and not any one language -- because there are so many languages in use (often within just one lab). SQL and XML are always helpful though. - Andrew Clegg
Add to that, R + some basic stats and modelling skills, and an understanding of some of the fundamental algorithms like dynamic programming. - Andrew Clegg
In the case of programming languages we may use the logical disjunction form: Perl or Python or Java or Ruby or... - Attila Csordas
Not all bioinformaticians are programming, and not all should. Are you asking about programming or non-programming type? - Pawel Szczesny
Learn a scripted language and a compiled language (any languages, as long as they get your job done). Learn them, master them. Learn about a database engine. And write a LOT of code. - Pierre Lindenbaum
@Pierre, yeah, I like that. We currently teach them Perl and Java. Not sure what I'd pick with a blank slate though. - Andrew Clegg
@Pawel, if a bioinformatician doesn't (can't??) code, what's the difference between him and a biologist who learned how to press the buttons in some tool after reading the manual? In other words, if a bioinformatician can't create a new solution, what's the value add? - Rajarshi Guha
@Rajarshi, I know a few "bioinformaticians" who spend their days in front of the screen, looking for SNPs in the traces or outlining spots. They accurately use tools developed by others. - Pierre Lindenbaum
@Andrew: today I would recommend learning Python and C. But if your students are not certain to find a job in science, then without any doubt: Java. - Pierre Lindenbaum
Rajarshi, what's the value add of a programming bioinformatician who uses his skills to reinvent the wheel and redo solutions that have been available for years in a new programming language (bioinformatics journals are full of examples of re-developed tools)? I'm being tongue-in-cheek here, but I think "bioinformatician=programmer" is an extremely skewed view of the field. - Pawel Szczesny
@Pierre There has been talk of dropping Java but apparently they like having a Java course on their CVs. For that reason I guess. (Also I teach that one so I'm happy for it to stay :-)) @everyone-else Maybe there's a useful distinction between bioinformatician (hate that word) and computational biologist? - Andrew Clegg
We had a similar discussion a couple of months ago, and I was advocating the same thing as Neil is right now: all courses should prepare people to learn new things quickly when they need them. Depending on preferences there should be more or less programming involved, but I don't think putting everybody into one basket is a good idea. - Pawel Szczesny
@Pawel, re reinvention, fully agree. That's pointless. But why have a degree in bioinformatics if you're just going to look at SNPs using somebody else's tools? I also agree that if bioinformatics == programming and nothing else, that's a waste. Anybody can program. The value add is bioinformatics = programming + biology. I still think 'computational thinking' and its realizations are what will differentiate a bioinformatics person from a programmer or a biologist, in the ability to whip up ... - Rajarshi Guha
@Pawel it's a little bit hard for me to imagine a bioinformatician who has never done a little bit of coding, it's like a molecular biologist who has never run a gel before. (Also, would you give me the link to that similar discussion?) - Attila Csordas
...computational solutions (hopefully new) to new problems. - Rajarshi Guha
@Rajarshi "computational thinking"++ I spent a while working in medical informatics with people from librarian/information-management backgrounds. It was a nightmare: they would keep doing repetitive, scriptable tasks by hand cos they didn't stop to think "I can automate this". Teaching people programming gives them a more productive mindset. - Andrew Clegg
++Andrew: I've heard people use the phrase "computational thinking" a lot, but usually in too vague a way for it to really be all that helpful. I'm guilty of this. "Automate everything" is nice and precise. - Michael Nielsen
Of course I'm sure many of us have been guilty of spending a day writing some code to automate an interesting task that we could have done by hand in a morning ;-) - Andrew Clegg
Attila, I didn't say that bioinformaticians don't code, only that not all of them should do it full time. Automation is one thing, producing applications is another. (I cannot find the respective discussion right now, will look for it later.) - Pawel Szczesny
Rajarshi, my distinction between programming and non-programming bioinformaticians is about people who build tools and people who do mashups of these tools. I'm a lousy coder in any language, but at least I can glue things together. There's some coding background needed for, as you say, "computational thinking", but it's different for different groups. So, my first question in this thread was about which group we're talking about. - Pawel Szczesny
@Pawel I'm only talking about "the current core IT/programming knowledge and skillset that is needed for any bioinformatician wannabe". Isn't that clear enough? It's an educational/practical question for beginners. - Attila Csordas
@Michael - re vagueness of 'comp thinking' - I was puzzled by that the first few times I heard it. But after reading http://www.cs.cmu.edu/afs... it got a little clearer - Rajarshi Guha
@Andrew, yes 'automate everything' is a good summary. But like Neil wrote, the ability to map natural (physical?) objects into a comp framework is a very desirable skill for anybody working with comp tools - Rajarshi Guha
@Pawel, aah, yes, I agree with the distinction. But given the development of pipelining tools etc., the ability to do mashups slowly becomes available to everybody with a modicum of programming skills. If one wants to be considered a 'bioinformatician', I think it'd require more than just the ability to mash things up (in addition to biological knowledge). I can manipulate sequences etc., but I'm certainly not going to pass myself off as a bioinformatician :) - Rajarshi Guha
Attila, I was going to suggest something along the lines of the posts Paulo has linked, but then you mentioned Java, and I wasn't that sure anymore. For me Java doesn't really fit the core skill set for "any bioinformatician wannabe" (that's just my opinion, Pierre would disagree :) ). - Pawel Szczesny
If it helps bring it back onto more concrete ground, this is what we give em: http://mscb.cryst.bbk.ac.uk/modules... Not saying this list is perfect by any stretch though. But it is one of the longest-running bioinf courses in the UK (I did it myself 6-7 years ago) - Andrew Clegg
@Neil this is not an introspection on my part because I am a traditionally trained (wet lab) molecular biologist (with an extended software testing experience) who has the ambition to work as a bioinformatician in the near future so I am asking all practicing bioinformatician/computational biologists here to help delineate the necessary core skillset. That's all. - Attila Csordas
wrt language, I think any competent computational scientist should be expert in one language and be able to pick up any other language (of the same type) in a week or two. Python FTW :) - Rajarshi Guha
I would define a bioinformatician as someone who applies quantitative methods to understand and simplify biological data. Therefore the core skills would consist of statistics (lots and lots of it), computer science and information theory (lots of it), and some language or platform as a glue to hold it all together. The programming language is necessary as a glue, but should never ever be the focus of learning the approaches. The bioinformatician should be able to abstract the process from the tools! - Hari
@Pawel Hey! Read my comment! :-) I didn't say that Java should be taught! :-) - Pierre Lindenbaum
I tend to think that there are two types of bioinformaticians: Programmers and Biologists. Other than that, I think the key for bioinformatics is not a specific toolset (languages or otherwise) but a great deal of knowledge. - Paul J. Davis
There are many ways to divide what I think many of us perceive to be two camps under the "bioinformatics" umbrella. My take: people who write code for others to use/reuse, and people who write code to get their own stuff done... - Andrew Su
If I took a bioinformatics course and a lot of time was spent teaching programming I'd want my money back :-) I'd expect to learn about different sequence search and alignment algorithms -- up to the point of being able to choose the most appropriate implementation for a task (or even write my own). To keep such a course practical it may be a good idea to use e.g. pseudo-code level Python (which can be taught in an optional introduction course). - Eric Jain
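As an illustration of the "pseudo-code level Python" idea for alignment algorithms, here is a minimal sketch of global alignment scoring by dynamic programming (the Needleman-Wunsch recurrence). It is not taken from any course mentioned in this thread; the function name `global_alignment_score` and the default scores (match +1, mismatch -1, linear gap penalty -1) are assumptions chosen for illustration.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of sequences a and b.

    Illustrative sketch only: uses a linear gap penalty and keeps just
    two rows of the dynamic-programming table, so it reports the score
    but does not reconstruct the alignment itself.
    """
    # Row 0: aligning an empty prefix of a against b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] against an empty prefix of b costs i gaps.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            # Align a[i-1] with b[j-1] (match or mismatch).
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Align a[i-1] against a gap in b.
            up = prev[j] + gap
            # Align b[j-1] against a gap in a.
            left = curr[j - 1] + gap
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "ACT"))
```

Each cell depends only on its three neighbours (diagonal, up, left), which is why two rows are enough when only the score is needed; recovering the alignment would require keeping the full table or traceback pointers.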
@Eric It's pretty hard teaching people to evaluate or design/implement algorithms if they don't already know how to program a bit, and have the faculty of computational thinking as discussed above... Just like it's hard to choose the most appropriate statistical test if you don't have a bit of a stats background. - Andrew Clegg
Ruby | Python, R, Databases, makefiles, testing, linux, latex, classical statistics, machine learning + data mining. - Michael Barton
just to add to Michael's list: some kind of version control system. - pn
Doh! How could I forget git... - Michael Barton
I didn't mention git. Any one would be fine :-) - pn
I totally agree with what Neil Saunders and Eric Jain say. It's sad when people think bioinformatics is just programming. I think programming is a distant third as a bioinformatics skill. It's got to be more about statistics and algorithms. The only way to learn programming is, well, to write programs, and what better way to learn programming than understanding and implementing algorithms and applying statistical methods to data? Bioinformatics is more Computer Science than IT (information technology). - Hari
@Hari, nicely put, and I agree with bioinformatics being CS rather than IT - Rajarshi Guha
or perhaps rather IS than CS? - Adam Kraut
Second the programming concerns. Yes, scripting is useful, no doubt, but I tend to distinguish between the tool builders and the users. A good bioinformatician has an overview of several fields, knows and understands the tools at his disposal (including parameters), when to apply which solution, what standards are out there etc. I pay very little attention to code background when hiring a bioinformatics analyst. Different story when hiring a developer. - Oliver Hofmann
Knowing more about version control and such topics would no doubt be beneficial to a lot of bioinformaticians. But I don't see much that would be bio-specific about a course teaching such topics? Unless of course branding it as such gets the course more interest and funding (and this way you don't need to compete with programming courses offered by other departments)... - Eric Jain
Great topic! I agree with the suggestion of one scripted and one compiled language (Perl or Python work admirably here), and Java is a good choice. IMO, just as essential as languages is being able to handle often massive datasets. For this, familiarity with a real RDBMS (Oracle, Postgres, or I guess MySQL) is *essential*. Add some R and stats for good measure. - John Major
John (Mr. Prime Minister?) I would extend that to a general understanding of data structures and data models, especially in an age where you might not always need to go with RDBMS for everything. - Deepak Singh