Physical Biology, Vol. 8, No. 5. (01 October 2011), 055011. The reverse engineering of metabolic networks from experimental data is traditionally a labor-intensive task requiring a priori systems knowledge. Using a proven model as a test system, we demonstrate an automated method to simplify this process by modifying an existing or related model (suggesting nonlinear terms and structural modifications) or even constructing a new model that agrees with the system's time series observations. In certain cases, this method can identify the full dynamical model from scratch without prior knowledge or structural assumptions. The algorithm selects between multiple candidate models by designing experiments to make their predictions disagree. We performed computational experiments to analyze a nonlinear seven-dimensional model of yeast glycolytic oscillations. This approach corrected mistakes reliably in both approximated and overspecified models. The method performed well to high levels of...
- Benjamin Good
Bioinformatics, Vol. 27, No. 13. (1 July 2011), pp. i111-i119. Motivation: Discovering useful associations between biomedical concepts has been one of the main goals in biomedical text-mining, and understanding their biomedical contexts is crucial in the discovery process. Hence, we need a text-mining system that helps users explore various types of (possibly hidden) associations in an easy and comprehensible manner. Results: This article describes FACTA+, a real-time text-mining system for finding and visualizing indirect associations between biomedical concepts from MEDLINE abstracts. The system can be used as a text search engine like PubMed with additional features to help users discover and visualize indirect associations between important biomedical concepts such as genes, diseases and chemical compounds. FACTA+ inherits all functionality from its predecessor, FACTA, and extends it by incorporating three new features: (i) detecting biomolecular events in text using a machine...
- Benjamin Good
Genome Biology, Vol. 12, No. 6. (2011), R57. We present BioGraph, a data integration and data mining platform for the exploration and discovery of biomedical information. The platform offers prioritizations of putative disease genes, supported by functional hypotheses. We show that BioGraph can retrospectively confirm recently discovered disease genes and identify potential susceptibility genes, outperforming existing technologies, without requiring prior domain knowledge. Additionally, BioGraph allows for generic biomedical applications beyond gene discovery. BioGraph is accessible at http://www.biograph.be. Anthony Liekens, Jeroen De Knijf, Walter Daelemans, Bart Goethals, Peter De Rijk, Jurgen Del Favero
- Benjamin Good
BMC Neuroscience, Vol. 12, No. 1. (2011), 55. BACKGROUND: How oscillatory brain rhythms alone, or in combination, influence cortical information processing to support learning has yet to be fully established. Local field potential and multi-unit neuronal activity recordings were made from 64-electrode arrays in the inferotemporal cortex of conscious sheep during and after visual discrimination learning of face or object pairs. A neural network model has been developed to simulate and aid functional interpretation of learning-evoked changes. RESULTS: Following learning, the amplitude of theta (4-8 Hz), but not gamma (30-70 Hz), oscillations was increased, as was the ratio of theta to gamma. Over 75% of electrodes showed significant coupling between theta phase and gamma amplitude (theta-nested gamma). The strength of this coupling was also increased following learning, and this was not simply a consequence of increased theta amplitude. Actual discrimination performance was significantly...
- Benjamin Good
AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium, Vol. 2010 (2010), pp. 797-801. Advanced statistical methods used to analyze high-throughput data (e.g. gene-expression assays) result in long lists of "significant genes." One way to gain insight into the significance of altered expression levels is to determine whether Gene Ontology (GO) terms associated with a particular biological process, molecular function, or cellular component are over- or under-represented in the set of genes deemed significant. This process, referred to as enrichment analysis, profiles a gene-set, and is relevant for and extensible to data analysis with other high-throughput measurement modalities such as proteomics, metabolomics, and tissue-microarray assays. With the availability of tools for automatic ontology-based annotation of datasets with terms from biomedical ontologies besides the GO, we need not restrict enrichment analysis to the GO. We describe RANSUM - Rich Annotation...
- Benjamin Good
A community-curated consensual annotation that is continuously updated: the Bacillus subtilis centred wiki SubtiWiki
Database, Vol. 2009, No. 0. (1 January 2009), bap012. Bacillus subtilis is the model organism for Gram-positive bacteria, with a large amount of publications on all aspects of its biology. To facilitate genome annotation and the collection of comprehensive information on B. subtilis, we created SubtiWiki as a community-oriented annotation tool for information retrieval and continuous maintenance. The wiki is focused on the needs and requirements of scientists doing experimental work. This has implications for the design of the interface and for the layout of the individual pages. The pages can be accessed primarily by the gene designations. All pages have a similar flexible structure and provide links to related gene pages in SubtiWiki or to information in the World Wide Web. Each page gives comprehensive information on the gene, the encoded protein or RNA as well as information related to the current investigation of the gene/protein. The wiki has been seeded with information from...
- Benjamin Good
BMC Bioinformatics, Vol. 11, No. 1. (17 August 2010), 426. BACKGROUND: Many protein structures determined in high-throughput structural genomics centers, despite their significant novelty and importance, are available only as PDB depositions and are not accompanied by a peer-reviewed manuscript. Because of this they are not accessible by the standard tools of literature searches, remaining underutilized by the broad biological community. RESULTS: To address this issue we have developed TOPSAN, The Open Protein Structure Annotation Network, a web-based platform that combines the openness of the wiki model with the quality control of scientific communication. TOPSAN enables research collaborations and scientific dialogue among globally distributed participants, the results of which are reviewed by experts and eventually validated by peer review. The immediate goal of TOPSAN is to harness the combined experience, knowledge, and data from such collaborations in order to enhance the impact of...
- Benjamin Good
BMC Bioinformatics, Vol. 12, No. 1. (2011), 218. BACKGROUND: Ontologies have become an essential asset in the bioinformatics toolbox and a number of ontology access resources are now available, for example the EBI Ontology Lookup Service (OLS) and the NCBO BioPortal. However, these resources differ substantially in mode, ease of access, and ontology content. This makes it relatively difficult to access each ontology source separately, map their contents to research data, and much of this effort is being replicated across different research groups. RESULTS: OntoCAT provides a seamless programming interface to query heterogeneous ontology resources including OLS and BioPortal, as well as user-specified local OWL and OBO files. Each resource is wrapped behind easy to learn Java, Bioconductor/R and REST web service commands enabling reuse and integration of ontology software efforts despite variation in technologies. It is also available as a stand-alone MOLGENIS database and a Google App...
- Benjamin Good
Genome Biology, Vol. 12, No. 5. (30 May 2011), R50. BACKGROUND: Understanding the normal temporal variation in the human microbiome is critical to developing treatments for putative microbiome-related afflictions such as obesity, Crohn's disease, inflammatory bowel disease and malnutrition. Sequencing and computational technologies, however, have been a limiting factor in performing dense time series analysis of the human microbiome. Here, we present the largest human microbiota time series analysis to date, covering two individuals at four body sites over 396 timepoints. RESULTS: We find that despite stable differences between body sites and individuals, there is pronounced variability in an individual's microbiota across months, weeks and even days. Additionally, only a small fraction of the total taxa found within a single body site appear to be present across all time points, suggesting that no core temporal microbiome exists at high abundance (although some microbes may be present but...
- Benjamin Good
BMC Bioinformatics, Vol. 12, No. 1. (27 May 2011), 212. Applications of Natural Language Processing (NLP) technology to biomedical texts have generated significant interest in recent years. In this paper we identify and investigate the phenomenon of linguistic subdomain variation within the biomedical domain, i.e., the extent to which different subject areas of biomedicine are characterised by different linguistic behaviour. While variation at a coarser domain level such as between newswire and biomedical text is well-studied and known to affect the portability of NLP systems, we are the first to conduct an extensive investigation into more fine-grained levels of variation. Using the large OpenPMC text corpus, which spans the many subdomains of biomedicine, we investigate variation across a number of lexical, syntactic, semantic and discourse-related dimensions. These dimensions are chosen for their relevance to the performance of NLP systems. We use clustering techniques to analyse...
- Benjamin Good