Bioinformatics

This room is for bioinformaticians to post research ideas related to the application of bioinformatics tools and algorithms in a clinical setting.
Mike Chelen
"I’ve gone back to using another wonderful visualization package, PyMol. I find that it hits the sweet spot between easy setup of the scene I’d like and generating nice figures. The specific feature that I’ve come to rely on quite heavily is the built-in ray tracer. There are three available ray tracing modes in addition to the default, each of which has its uses. Mode 1 will place a black outline around your structure, which can help make the secondary structure elements visually distinct. Mode 2 is really interesting, in that it only renders the outline. I find this especially helpful if I want to show something in an overlay without obscuring what is behind it. Mode 3 produces “quantized” color in addition to the outline, giving your figure a very cartoonish appearance. I find that this one has to be used with care :) " - Mike Chelen from Bookmarklet
Mike Chelen
"The OBO Foundry is a collaborative experiment involving developers of science-based ontologies who are establishing a set of principles for ontology development with the goal of creating a suite of orthogonal interoperable reference ontologies in the biomedical domain. The groups developing ontologies who have expressed an interest in this goal are listed below, followed by other relevant efforts in this domain." - Mike Chelen from Bookmarklet
Mike Chelen
"The Gene Ontology project is a major bioinformatics initiative with the aim of standardizing the representation of gene and gene product attributes across species and databases. The project provides a controlled vocabulary of terms for describing gene product characteristics and gene product annotation data from GO Consortium members, as well as tools to access and process this data." - Mike Chelen from Bookmarklet
sourceforge project page: https://sourceforge.net/project... - Mike Chelen
Mike Chelen
National Biomedical Computation Resource - Tools - http://www.nbcr.net/tools.php
"Analytical Services, Databases and Software" - Mike Chelen from Bookmarklet
imabonehead
Accelerating bioinformatics searching and dot plotting using a scalable FPGA cluster - http://www.embedded.com/columns...
"This paper presents an FPGA-based accelerated solution for DNA sequencing and dot plotting. It describes how multiple FPGA devices can be deployed to create a scalable cluster dedicated to the task of analyzing large amounts of data, and how this clustered hardware application can be connected to a software application for visualization and analysis." - imabonehead from Bookmarklet
Mike Chelen
"PhyLIS is a user-friendly, free linux distribution for phylogenetics. Install it and you have an instant phylogenetics workstation. No downloading packages or messing with compilers, no configuring software, no worrying about small differences between systems that mess up your scripts. Simply install, sit down, and work. PhyLIS started during a period when I was acquiring several new computers for a large phyloinformatic project and grew tired of installing general purpose linux distributions, and then having to spend an hour or two reconfiguring everything and adding software on each new computer. I began developing scripts that would do some of this work for me, eventually they became overly-complicated, and I still had to carry around thumbdrives full of software or needlessly re-download everything. Eventually it just became more desirable to have an operating system that was specifically geared towards doing phylogenetics. PhyLIS is based on Ubuntu linux, a widely used... more... - Mike Chelen from Bookmarklet
Mike Chelen
trying to determine if the information can be pulled in any way besides HTML? maybe they will move to SMW soon :) - Mike Chelen
Mike Chelen
Bioinformatics applications for next generation sequencing on SEQwiki - http://seqanswers.com/wiki...
Mike Chelen
"fpocket is a very fast open source protein pocket (cavity) detection algorithm based on Voronoi tessellation. It was developed in the C programming language and is currently only available as command line driven program. A GUI is in development. fpocket includes two other programs (dpocket & tpocket) that allow you to extract pocket descriptors and test own scoring functions respectively. As the algorithm is very fast it can be used on a large scale level (PDB size for instance)." - Mike Chelen from Bookmarklet
Mike Chelen
Accessing #NCBI Entrez web services with Yahoo #YQL Open Data Table - Esearch example https://login.yahoo.com/config...
the open data table adapts the query URL parameters and specifies the series of elements to return. source code: http://github.com/mchelen... - Mike Chelen
all three tables are available using a custom environment, for example with esearch: http://bit.ly/27tTVn - Mike Chelen
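A rough sketch of the kind of Esearch request that such a table adapts, assuming the public NCBI E-utilities endpoint and its `db`, `term` and `retmax` parameters. This is not the open data table itself; the helper names are hypothetical, and the sample response is hand-written with made-up IDs so the example runs offline.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Public E-utilities endpoint (assumed; the post itself goes through YQL).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(db, term, retmax=20):
    """Build an Esearch query URL from the database name and search term."""
    query = urlencode({"db": db, "term": term, "retmax": retmax})
    return f"{ESEARCH_URL}?{query}"


def parse_id_list(xml_text):
    """Return the record IDs listed under IdList in an Esearch response."""
    root = ET.fromstring(xml_text)
    return [node.text for node in root.findall("./IdList/Id")]


# Hand-written sample in the shape of an Esearch response; IDs are invented.
SAMPLE_RESPONSE = (
    "<eSearchResult><Count>2</Count>"
    "<IdList><Id>111</Id><Id>222</Id></IdList>"
    "</eSearchResult>"
)

if __name__ == "__main__":
    print(build_esearch_url("pubmed", "brca1 human", retmax=5))
    print(parse_id_list(SAMPLE_RESPONSE))
```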
Mike Chelen
Bio-Linux 5.0 — NERC Environmental Bioinformatics Centre - http://nebc.nox.ac.uk/tools...
"A dedicated bioinformatics workstation - install it or run it live. Bio-Linux provides more than 500 bioinformatics programs on an Ubuntu Linux base." - Mike Chelen from Bookmarklet
Neat idea, but how much of the 4 GB USB stick remains for holding data and analyses? Need a bigger stick? - Richard Badge from Nambu
This was one of the first (and probably best) of these distributions (think there was a BioKnoppix at one time?) It's been around at least 6 years. But a software suite is only half the battle. The biologist needs to know how to use the packages, store and interpret the output. Which is why we have bioinformaticians and IT staff. I've never been convinced that "bioinformatics on a stick" is much use to biologists compared with expert advice/support, but I may be wrong. - Neil Saunders
Are these targeted towards biologists, or informaticians? Don't see biologists getting much use from such a distro, but do see computational types making good use - Deepak Singh
Target market is what has always confused me. I'd assume that bioinformaticians are happy to install their own software locally and for biologists with limited tech skills, a live CD doesn't help much. But I'm happy to be proven wrong by success stories. - Neil Saunders
I gave it to one of my students - we'll see what he has to say. - Björn Brembs
Neil: making the software easier can help reduce repetitive tasks, allowing more efficient use of expert advice and support, which is definitely the most valuable and scarcest resource. newcomers can often manage to boot the OS and start playing with some software, and experienced users can check if their software of choice is included, and save a little time when setting up new machines - Mike Chelen
Deepak: looking at the package list http://nebc.nox.ac.uk/tools... some favorites of both fields stand out, for example a biologist may run a BLAST search on a DNA sequence they are studying, while a bioinformatician could develop applications with Bio-Java and Eclipse IDE - Mike Chelen
Richard: data could be stored on a network drive, or a larger flash disk could be used, since there are 8, 16, and 32gb USB sticks available now pretty inexpensively. also, additional USB drives can be plugged in limited only by the number of USB ports on the machine - Mike Chelen
Björn: cool, would love to hear how useful others find it. the software packages can also be installed in current Ubuntu systems by adding their repository http://nebc.nox.ac.uk/tools... - Mike Chelen
Follow on question? Would a VM be equally useful? For example, I use VMs a lot to learn stuff and configure environments. - Deepak Singh
Deepak: yes absolutely! for exactly the reasons you mention, experimentation and reliability. found a VirtualBox VDI image: http://friendfeed.com/bioinf... any more formats such as VMware or EC2 AMI would be great too :) - Mike Chelen
there's a bunch of good EC2 AMIs that I will be highlighting either here or somewhere else soon (from familiar names), but the more the merrier - Deepak Singh from IM
@bjorn please do get your student to feed back to NEBC, a long time ago Bio-Linux was my baby and my full time job. It's come a long way since I left it and I'm very happy to see that it's still going. It has its rough edges, and things which could be done better, but out of the box it's a well set up system ready to go. It's already been used as the base for other more focused... more... - Daniel Swan
@Neil with Bio-Linux I can happily say that we turned a few biologists into informaticians, and one into a programmer when I was with the team! Even last week a biologist walked into my office and asked me to help get it up and running in a VM on his laptop so that he could do some work. The Live-CD version was really just a distribution method; we used to send out a bootable CD-ROM that would netinstall a Linux image from our servers. Inefficient at best :) - Daniel Swan
Good to hear. I remember when NEBC were setting up many years ago, Dawn contacted me regarding compilation of Phred/Phrap under Cygwin after I mentioned it on Nodalpoint. The early days of the bioinformatics social network! - Neil Saunders
Deepak: thinking about combining the Bio-Linux packages with some of the standard Ubuntu EC2 AMIs from http://alestic.com/ since they are optimized already, and contain other common tools - Mike Chelen
Mike, that would be brilliant. Lots of our customers ask for starting points in this space and being able to point to something that they might be familiar with would be great. Let me know when you do that. I am thinking about writing up a post on all the available bioinformatics AMIs on AWS - Deepak Singh
Mike - you might want to talk to Tony Travis about this (ajt@rri.sari.ac.uk) he has interests in taking the Bio-Linux base in a more 'cloudy' direction, and I'm sure Dawn and Bela and co. at NEBC would be happy with any feedback along those lines. - Daniel Swan
Deepak, getting the bio-linux packages installed can be done with a bash script http://github.com/mchelen... and maybe used with runurl http://alestic.com/2009... or to generate an AMI - Mike Chelen
Daniel, almost all the packages install okay, are there any particular applications that would be important to test? here is how the desktop looks on ec2: http://ff.im/8fb8j - Mike Chelen
Great idea. Anything to reduce the tedium of wget; ./configure; make; make install is welcome and helps lower the entry barrier for people into the field. - Todd Harris
Todd: it would be nice to start an instance with the least manual input, especially when running a particular application. for example software set up to use a biology AWS dataset http://developer.amazonwebserv... - Mike Chelen
<-- pure biologist, willing to install on top of current Ubuntu to give it a try. Agree with Todd on extra installs. I essentially only use a java-based app to chomp on big files, also Bioconductor on rare occasion. Neil's point about need for expert advice is well-taken, but I find that a biologist, willing to use a linux platform, right away gets more targeted feedback when asking for help. - Heather
Heather: that's great, did you find it easy enough to add the repository? here's a script that can save a little time, it is only a few lines though: http://github.com/mchelen... - Mike Chelen
The ideal way to install the repository would probably be a .deb containing the apt sources and signing key. This is used for example with Ubuntu One https://one.ubuntu.com/support... and PlayDeb http://www.playdeb.net/updates... (expand the instructions). Maybe someone could prepare this given the existing NERC repository info? - Mike Chelen
Deepak: the Bio-Linux image from JCVI http://www.jcvi.org/cms... really looks great, anything that helps this project can benefit other researchers interested in running bioinf software on EC2. maybe a repository mirror within EC2/AWS cloud, or a public AMI, would help too? - Mike Chelen
Mike that is a public AMI, ami id is ami-6953b200 - Deepak Singh from IM
Deepak: found the wiki page now http://sourceforge.net/apps... thanks! - Mike Chelen
anytime - Deepak Singh from IM
does this image include a desktop environment? the screenshot sort of looks like remote x. in most cases that is probably best, it could be nice to have a desktop version as well. thinking about slower internet connections, where some compression (such as freenx) is usually needed for remote desktop - Mike Chelen
Not sure, you'll have to ask Bioinfo ... My guess is no, since I think it's built on a server image, but I could be wrong - Deepak Singh from IM
Deepak: that's good, been trying to find a way to get the bio-linux packages installed on 64bit instance. desktop could be handy for learning and testing - Mike Chelen
Mike Chelen
Ubuntu -- Details of package octave-pdb in karmic - http://packages.ubuntu.com/karmic...
"This package contains function for reading and displaying PDB-files from the Brookhaven protein databank in Octave, a scientific computation software." - Mike Chelen from Bookmarklet
coming soon to a koala near you! - Mike Chelen
Mike Chelen
ZenPDB: The Noble Eightfold Path in PDB file processing - http://code.google.com/p...
"ZenPDB is a Python module to process and analyze macromolecular structures. Macromolecular structures are represented as hierarchically nested python dictionaries, which allows to traverse and manipulate them in a pythonic way and implement structural biology algorithms compactly. This module is also useful to establish processing pipelines. ZenPDB is currently capable of parsing and writing PDB files, but PDBML input is also planned. ZenPDB provides fast and versatile Cython implementations of important algorithms in structural biology, namely accessible surface area (ASA) and distance contact calculations by using kd-trees for nearest neighbour look-up. It is crystallography-aware and can construct crystal lattices, unit cells and biological units." - Mike Chelen from Bookmarklet
Mike Chelen
"This package contains some ribosomal RNA BLAST databases distributed as part of the NCBI C Toolkit that are too large and specialized to include in ncbi-data. Specifically, it contains the databases Combined16SrRNA_2-12-2008, LSURef_93.fasta, and SSURef_93.fasta, along with an alias file to facilitate searching all three of them in conjunction with the 16SCore database included in ncbi-data." - Mike Chelen from Bookmarklet
OpenSci Info
Mike Chelen
Fwd: XMPP for cloud computing in bioinformatics supporting discovery and invocation of asynchronous Web services - http://www.citeulike.org/user... (via http://friendfeed.com/dullhun...)
Mike Chelen
"Nesoni is a high-throughput sequencing data analysis toolset, which the VBC has developed to cope with the flood of Illumina, 454, and SOLiD data now being produced. Our work is largely with bacterial genomes, and the design tradeoffs in nesoni reflect this. Nesoni focusses on analysing the alignment of reads to a reference genome. We use the SHRiMP read aligner, as it is able to detect small insertions and deletions in addition to SNPs." - Mike Chelen from Bookmarklet
Mike Chelen
"The Rotamerically Induced Perturbation (RIP) method generates local perturbations that are capable of inducing several Ångstroms of conformational change in just picoseconds of a Molecular-Dynamics (MD) simulation. It is particularly useful for identifying potentially mobile loops and helices in a protein structure." - Mike Chelen from Bookmarklet
The RIP code is released under GPL, however it does require the closed-source AMBER program. - Mike Chelen
Mike Chelen
"The available amount of data in bioinformatics is huge. One way to characterize for example protein function is to perform large-scale analyses on the available data. These analyses are to an increasing extent dependent on fast computers. The demands are similar for most bioinformatics groups within the Nordic countries and it therefore makes sense to coordinate common efforts. Presently, bioinformatics resources are available within each country but with limited coordination between the countries. With a Nordic bioinformatics grid infrastructure heavy computational tasks can be performed faster and more efficiently." - Mike Chelen from Bookmarklet
Mike Chelen
"Cytoscape is an open source bioinformatics software platform for visualizing molecular interaction networks and integrating these interactions with gene expression profiles and other state data. URL: http://cytoscape.org/" - Mike Chelen from Bookmarklet
Mike Chelen
Virtual Proteomics Data Analysis Cluster | proteomics.mcw.edu - http://proteomics.mcw.edu/vipdac
"ViPDAC uses Amazon Web Services to analyze proteomics data." - Mike Chelen from Bookmarklet
Mike Chelen
"The Distributed Annotation System (DAS) defines a communication protocol used to exchange annotations on genomic or protein sequences. It is motivated by the idea that such annotations should not be provided by single centralized databases, but should instead be spread over multiple sites. Data distribution, performed by DAS servers, is separated from visualization, which is done by DAS clients." - Mike Chelen from Bookmarklet
Mike Chelen
Sculpted Proteins in Second Life - import from .PDB file to sculpted prim - http://slusage.com/sculpties/
Bell Eapen
Applied Bimatics - A Bioinformatics Blog - http://gulfdoctor.net/bioblog...
"Applied Bimatics - A Bioinformatics Blog" - Bell Eapen from Bookmarklet