It's somewhat amusing that I still recognize almost all the names in current issues of Proteins: Structure, Function, Bioinformatics. Are no new people doing structural bioinformatics? And we really need a good "open" journal in this space.
It doesn't surprise me that it's tough for new people to get into these kinds of fields. High entry costs and high project burn rates looking for the big hits. Really tough for new entrants.
- Cameron Neylon
Jean-Claude - Not quite although I haven't been tracking content that closely
- Deepak Singh
Cameron, agreed. The field hasn't really progressed that much either for a while, which really bugs me
- Deepak Singh
There's also the concern that once certain pathways have become standard, and it is too expensive to rebuild them, they limit the way people can think about the subject
- Cameron Neylon
from twhirl
A. Deepak: I disagree that the field hasn't really progressed (frowning smiley); it keeps progressing all the time. B. Is "Proteins" really considered such a high-end publication target for structural bioinformatics?
- Nir London
Nir, with few exceptions (and your stuff is in that short list), it hasn't. I find it difficult being drawn in and the improvements have been painfully slow. I should add that my bias is towards more "physical" methods
- Deepak Singh
from IM
Well, you could say I'm somewhat biased ;) I actually revisited a poll I published a while ago (http://bit.ly/617Geh) regarding the preferred journal for computational structural biologists and was amazed to find that indeed Proteins ranks much higher than, let's say, "Structure", and by the way PLoS CB ranks as high as Proteins..
- Nir London
I think a relevant question is how to define "structural bioinformatics" nowadays. There are a lot of gray areas that result from different reasons: 1) there are a lot of relevant papers to the field (as I see it) that come from people in CS or Chemistry. Many of these people don't think they are doing "structural bioinformatics". 2) There are no conferences that bring the majority of...
- Mickey Kosloff
@Nir. "Proteins" might not have the highest impact factor of the relevant options, but it's a very good journal that is read by many people in the field (with the caveats of this definition as I wrote above). As for your comparison with "Structure", what percentage of the papers there do you find *really* relevant to your research vs. "Proteins"? For me Proteins is one of the very few...
- Mickey Kosloff
Going to be.. interesting. Very, very fast talks. Still trying to recruit folks to help :)
- Oliver Hofmann
Using Cameron's 'blogging permitted' slides and a simple sign at the podium that says 'blogging' or 'no blogging'. So far no speaker declined coverage
- Oliver Hofmann
Morning session, chaired by Ron Shamir, actually starts on time :)
- Oliver Hofmann
First synthetic biology talk I've actually been convinced by (Jef D Boeke on Yeast 2.0)
- Oliver Hofmann
Oliver do you mean there are physical versions of the signs that people can choose from? If so could you take a picture?
- Cameron Neylon
from Android
@Cameron, sorry, no -- I just grabbed your slides, added them to the blogging announcement slide that is on rotation during breaks, and made them available to presenters. The physical sign is just a big green/red cardboard that says blogging/no blogging. I'll try the physical signs next time :)
- Oliver Hofmann
That's still very cool. Well done on raising visibility of the issue!
- Cameron Neylon
from Android
Cool - have you got any interesting responses to it? I would guess it's probably a pretty open community in that regard at least.
- Cameron Neylon
Manolis (Kellis) keeps advertising. Lots of good feedback, particularly from folks stuck in the lab while their PIs are downstairs attending the talks ;) So far not a single person has declined coverage despite lots and lots of unpublished results
- Oliver Hofmann
That's really good to hear! Some sectors aren't so precious as others
- Cameron Neylon
First two groups making use of the 'no blogging' side of the sign. About time, I needed a break :)
- Oliver Hofmann
Thanks for all the great coverage, it must be exhausting all by yourself!
- Ruchira S. Datta
@Ruchira, thanks! Lots of fun, though, and getting good feedback even at the conference. Well worth it.
- Oliver Hofmann
What not to do: be in the process of starting a new lab, giving a neat talk, looking for great postdocs.. and asking for the talk not to be blogged.
- Oliver Hofmann
Different dimensions of interaction networks: time (evolution), condition, space (organelle, process, pathway, complex, protein, domain, amino acid)
- Oliver Hofmann
Generation and analysis of double mutant strains (with their own screening system) for genetic interaction maps measured by colony size
- Oliver Hofmann
Combine with the PPI maps [yep, it's dense]
- Oliver Hofmann
Identify modules a biologist can work with, form hypothesis
- Oliver Hofmann
Abstraction level (collapse into a functional module network that lists processes, can drill down)
- Oliver Hofmann
Complexes as genetic interactions, modules of functional interactions (e.g. different complexes along a linear pathway)
- Oliver Hofmann
Complexes can be comprised of functionally distinct submodules, reflected in genetic interaction map split
- Oliver Hofmann
Information on individual proteins based on their context
- Oliver Hofmann
Proteins with different function: multiple point mutations hitting different functionalities, behave differently in genetic interaction screen
- Oliver Hofmann
PolII with 71 mutants in four subunits, crossed against 1100 mutants of essential genes (picked to represent major biological processes)
- Oliver Hofmann
Ten replicates. How well does genetic clustering predict PPI? AUC 0.73, similar to previously conducted experiments
- Oliver Hofmann
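The notes do not say how the AUC of 0.73 was computed. As a minimal sketch, assuming each protein pair has a genetic-profile similarity score and a yes/no label for a known physical interaction, the AUC is the probability that a randomly chosen interacting pair scores higher than a randomly chosen non-interacting pair. The function name and the toy numbers below are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: similarity score per protein pair (assumed input)
    labels: True/1 if the pair is a known physical interaction, else False/0
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative pair")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data, not from the talk.
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 0, 1, 0]
toy_auc = auc(toy_scores, toy_labels)
```

An AUC of 0.5 would mean the clustering is no better than chance at separating interacting from non-interacting pairs; 1.0 would be perfect separation.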
Negative interaction with splicing factors, cryptic initiation, positive with peroxisome biogenesis
- Oliver Hofmann
13 different mutants in active site of RPB1 that increase or decrease the transcription rate
- Oliver Hofmann
Expectation: positive interactions of TF mutations with increased rate version (and vice versa)
- Oliver Hofmann
Found RMD8, RIC1, SUB1, CDC73 with this behaviour
- Oliver Hofmann
Started looking at other machines (nucleosome, HSP90, Ribosome)
- Oliver Hofmann
Time axis: S.pombe vs S.cerevisiae, complex of interactions conserved, genetic interactions less so
- Oliver Hofmann
Kinase complexes conserved, but not the targets / interactors
- Oliver Hofmann
Mouse map of 150x150, focus on chromatin factors and matched to yeast data
- Oliver Hofmann
Cross-species E-Maps (Host/virus interactions) as the next step, work in progress
- Oliver Hofmann
Problem to identify molecular pathways targeted by a compound (drug mechanism of action or MoA)
- Oliver Hofmann
Also useful for drug repositioning (novel applications for known drugs)
- Oliver Hofmann
Identify by analyzing genome-wide transcriptional responses
- Oliver Hofmann
Connectivity map data set as starting point (1000+ compounds for 5 cell lines), up/down-regulated genes by rank
- Oliver Hofmann
Novel distance measure between drugs. Previously profile-wise GSEA, profile-wise Spearman. Experimental noise causes profiles obtained from the same experiment to be very similar
- Oliver Hofmann
Distance between drugs A and B: combine drug effect from all cell lines into single list (Kru-Bor merging method, Iorio, Journal of Comp Biol 2009), GSEA on up/down-regulated genes in combined list, determines weighted edge between drugs [Unclear on how GSEA similarity is calculated]
- Oliver Hofmann
Identify community in drug-similarity network (rich club structure, partitioning method), turning pair-wise distances into 'landscape' of drugs
- Oliver Hofmann
Communities enriched for similar drugs (based on drug annotation)
- Oliver Hofmann
"rich clubs" reflect similarity hierarchies. Overall group of response to unfolded protein inducers, subgroups of UPS modulators, proteasome inhibitors, HSP90 inhibitors etc.
- Oliver Hofmann
Test with novel anti-cancer agents classification, classified / clustered with similar drugs
- Oliver Hofmann
Example of drug re-positioning. Neighbours of 2DOG, known autophagy inducer (present in cMap). Four closest neighbours include FDA-approved drug XD (Rho-kinase inhibitor, vasodilator) -- does it induce autophagy as well?
- Oliver Hofmann
Experimental validation confirms increase in autophagy (by LC3-I/II measurements)
- Oliver Hofmann
Conserved interaction patterns help to understand regulatory mechanisms, evolution
- Oliver Hofmann
Requires similarity measure for molecules in different organisms, integrate molecular similarity with network analysis
- Oliver Hofmann
Alignment time complexity usually exponential to network size, restrict insertions/deletions
- Oliver Hofmann
Query pathways using HMM. Given known linear pathway and a reliable network find matching path
- Oliver Hofmann
One-to-many similarity relationships between query, network
- Oliver Hofmann
HMMs integrate protein similarity in the scoring scheme, have linear complexity and (given reliable biological information) the guarantee of optimality
- Oliver Hofmann
Transition probabilities reflect interaction reliability, emission probabilities depend on sequence similarity
- Oliver Hofmann
HMM with same graph structure as network, each state can emit any node in the query set (dependent on sequence similarity)
- Oliver Hofmann
Allows insertions/deletions (via additional nodes or null emission)
- Oliver Hofmann
Fix maximum number of allowed deletions
- Oliver Hofmann
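The HMM query described above can be illustrated with a Viterbi-style search. This is only a sketch under stated assumptions, not the speaker's implementation: transition scores are taken directly from interaction reliabilities, emission scores from a precomputed similarity table, insertions/deletions are left out, and paths are allowed to revisit nodes. All names (`best_matching_path`, the toy proteins) are hypothetical.

```python
import math


def best_matching_path(query, edges, similarity):
    """Find the network path that best matches a linear query pathway.

    query: list of query protein names, in pathway order
    edges: dict (u, v) -> interaction reliability in (0, 1], treated as undirected
    similarity: dict (query_protein, network_node) -> similarity in (0, 1]
    Returns (path, log_score), or (None, -inf) if no path matches.
    """
    nodes = sorted({n for edge in edges for n in edge})
    trans = {}
    for (u, v), p in edges.items():
        trans[(u, v)] = p
        trans[(v, u)] = p

    def log(x):
        return math.log(x) if x > 0 else float("-inf")

    # score[n]: best log-score of any path ending at n that emits the query so far
    score = {n: log(similarity.get((query[0], n), 0.0)) for n in nodes}
    back = []
    for q in query[1:]:
        new_score, pointers = {}, {}
        for v in nodes:
            best_u, best = None, float("-inf")
            for u in nodes:
                cand = score[u] + log(trans.get((u, v), 0.0))
                if cand > best:
                    best_u, best = u, cand
            new_score[v] = best + log(similarity.get((q, v), 0.0))
            pointers[v] = best_u
        back.append(pointers)
        score = new_score

    end = max(nodes, key=lambda n: score[n])
    if score[end] == float("-inf"):
        return None, float("-inf")
    # Trace back the best path from the final node.
    path = [end]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, score[end]


# Toy network, not from the talk.
toy_edges = {("a1", "b1"): 0.9, ("b1", "c1"): 0.8, ("a1", "x"): 0.9, ("x", "c1"): 0.9}
toy_sim = {("A", "a1"): 0.9, ("B", "b1"): 0.8, ("B", "x"): 0.1, ("C", "c1"): 0.9}
toy_path, toy_score = best_matching_path(["A", "B", "C"], toy_edges, toy_sim)
```

For simplicity the inner loop scans all node pairs; iterating over the edges instead would make each step linear in the number of interactions, which is presumably what the linear-complexity claim refers to.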
More general: align two protein interaction networks
- Oliver Hofmann
Find pairs of similar paths, add a virtual query path. Find pair of paths which emits the virtual query with the highest probability
- Oliver Hofmann
[Plug for Science Signaling spinoff magazine]
- Oliver Hofmann
Toolkit for systems analysis of signaling pathways: PCA/PLSR and model breakpoint analysis. How do signals flow through the network?
- Oliver Hofmann
Classical genetics: perturb system, survey phenotype, order pathways by epistasis
- Oliver Hofmann
Network perturbation: vary multiple inputs to signaling pathway, densely sample signals, correlate signals with responses _and test causation vs correlation_
- Oliver Hofmann
Model system cell death vs rescue, TNF or TNF/Insulin/EGF treatment with altered response
- Oliver Hofmann
Problem of univariate approaches to signal-response mapping. JNK involvement with apoptosis depends on the context (signals from other molecules)
- Oliver Hofmann
A view of a single molecule insufficient, needs distributed sampling of signaling networks
- Oliver Hofmann
Combine kinase assays, quantitative western blots, other assays (systematic network-level signaling measurements), keep track of activity, levels
- Oliver Hofmann
19 signals, 3 replicates, 13 time periods, 9 conditions result in 7000 protein measurements
- Oliver Hofmann
But many different ways to measure the phenotypic endpoint of apoptosis. Measured four different indices in 1400 FACS
- Oliver Hofmann
Relate 7000 state measurements with 1400 apoptotic responses
- Oliver Hofmann
also included additional data points (change of activity, max/min/mean)
- Oliver Hofmann
Transfer 760 dimensional input space into 12 dimensional response space. Dimensionality reduction (PCA)
- Oliver Hofmann
3 components in the PCA-PLS model connect signaling network state to apoptotic response, 2 components predict 95% of cell death responses
- Oliver Hofmann
Model building vs biological meaning; we can predict response with two components, but what biological information do they contain?
- Oliver Hofmann
First component the stress axis, second component a survival axis (TNF projects along first axis, Insulin/EGF along the second axis)
- Oliver Hofmann
But: combinatorial stimuli are NOT the linear combination
- Oliver Hofmann
Same molecular signal can send different messages depending on time point
- Oliver Hofmann
Early IKK signaling is direct and directs towards survival, late IKK is indirect and towards stress/death
- Oliver Hofmann
Reinforce or modulate signals from previous time points
- Oliver Hofmann
Given model and biological insight ask additional questions. Do cells utilize the full dynamic range of the signals?
- Oliver Hofmann
Discretize signal by binning (analog -> digital) and observe the model fitness (how well the model continues to predict the outcome, 1 - no change)
- Oliver Hofmann
IL-1 circuit is analog; at 20 or fewer bins the model fails
- Oliver Hofmann
TGF-a is digital, on/off (2 bins) and the model continues to predict perfectly
- Oliver Hofmann
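The binning test can be pictured with a small sketch. It assumes equal-width bins over the observed range, with each value replaced by the midpoint of its bin; the talk did not specify the binning scheme, so this choice and the function name are assumptions.

```python
def discretize(signal, n_bins):
    """Quantize a signal into n_bins equal-width levels (analog -> digital).

    Each value is replaced by the midpoint of the bin it falls into.
    """
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    lo, hi = min(signal), max(signal)
    if hi == lo:
        return list(signal)  # flat signal: nothing to quantize
    width = (hi - lo) / n_bins
    binned = []
    for x in signal:
        # The maximum value would land in bin n_bins, so clamp it into the last bin.
        idx = min(int((x - lo) / width), n_bins - 1)
        binned.append(lo + (idx + 0.5) * width)
    return binned


# Toy signal, not from the talk: two bins give an on/off version.
toy_signal = [0.0, 0.2, 0.6, 1.0]
toy_binned = discretize(toy_signal, 2)
```

Feeding the binned signal back into the fitted model and comparing predictions against the unbinned case would give the fitness measure mentioned in the notes.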
Suggestion: principal components are a fundamental basis set for molecular signals in apoptosis
- Oliver Hofmann
Second test: warp the linear distribution of the dynamic range. Desensitize or saturate the signal (flatten, amplify signal)
- Oliver Hofmann
Result depends on the signal loop. IL-1ra handles flattened signal fine, does not handle amplified signal. Vice versa for C225
- Oliver Hofmann
Failure is not a gradual degradation, but an abrupt failure as signal distortion is increased
- Oliver Hofmann
Theoretical concept or relevant in biology? Transfect cells with WT MK2 or mutated version (dead, always active). Can be used to flatten or amplify response
- Oliver Hofmann
Model makes counter-intuitive predictions for MK2; confirmed in vivo.
- Oliver Hofmann
Model breakpoint analysis can reveal new biological signaling mechanisms. Full dynamic range of signals may be more important than the absolute value of the signal strength for controlling cellular outcomes
- Oliver Hofmann
Applied same approach to DNA damage response to doxorubicin treatment
- Oliver Hofmann
3500 measurements for cells with different drug dosages over multiple timepoints
- Oliver Hofmann
Stepwise regression analysis (alternative to PLSR)
- Oliver Hofmann
Result: dual role for Erk in DNA damage response
- Oliver Hofmann
Erk stops cell cycle in response to drug; if cell is in S state of cell cycle Erk drives apoptosis
- Oliver Hofmann
Survival map-kinase sends cell death signal depending on the cellular context
- Oliver Hofmann
Reminder: anyone in the world can now attend Science Online 2009 London without leaving the comfort of their pyjamas or computer on August 22nd: http://network.nature.com/people.... Please pass the word!
High quality: curated; KEGG/Reactome in addition, high-volume yeast-2-hybrid and other interactions at lower level, can be filtered by quality
- Oliver Hofmann
Data can be shared via XML [no info on format used]
- Oliver Hofmann
Evolvability: the capacity to change, flexibility of gene expression (in this case) encoded in the promoter structure
- Oliver Hofmann
Flexibility: the propensity for changing expression
- Oliver Hofmann
Short term: regulation. Long term: Evolution
- Oliver Hofmann
Promoter structure: Expression divergence vs the fraction of genes with TATA box
- Oliver Hofmann
Genes with high divergence have a higher fraction of TATA boxes (in four different yeast species); repeated for different eukaryotic species with similar results
- Oliver Hofmann
Distinct patterns for low, high flexibility genes. Low: no TATA box, nucleosome free region with TF binding sites. High: no clear nucleosome free region, nucleosome binding distributed, compete with TF binding
- Oliver Hofmann
Implications: evolvability vs robustness
- Oliver Hofmann
House keeping genes: should be robust. Response to environment, pathogens: should be able to 'respond' quickly
- Oliver Hofmann
Speculation: house keeping -> without TATA box, low dynamic range. Opposite for the environmental genes
- Oliver Hofmann
Distinguish cis and trans-effects with regards to nucleosome positioning: test allele specific effect in (yeast) hybrids
- Oliver Hofmann
If the hybrid alleles match the effect was in trans; if they remain different it is a cis effect (e.g., one allele no longer has a functional binding site)
- Oliver Hofmann
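The hybrid test reduces to a simple decision rule. The sketch below assumes a single numeric occupancy value per allele and an arbitrary tolerance for calling two values "the same"; both the function name and the threshold are hypothetical and not taken from the talk.

```python
def classify_effect(parent_a, parent_b, hybrid_a, hybrid_b, tol=0.1):
    """Classify a between-species nucleosome occupancy difference.

    parent_a, parent_b: occupancy at the site in the two original strains
    hybrid_a, hybrid_b: occupancy of each allele measured in the hybrid
    tol: assumed threshold below which two values count as equal
    """
    if abs(parent_a - parent_b) <= tol:
        return "no divergence"
    if abs(hybrid_a - hybrid_b) <= tol:
        # Both alleles share one trans environment and now behave alike.
        return "trans"
    # The difference persists in a shared environment, so it is sequence-linked.
    return "cis"
```

For example, `classify_effect(0.9, 0.2, 0.85, 0.25)` reports a cis effect because the alleles stay different in the hybrid.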
Three observed classes: loss or gain of nucleosome occupancy and shift of position
- Oliver Hofmann
Measure occupancy in original strains, hybrid can identify cis/trans effect for the three events. 70% of the changes in cis
- Oliver Hofmann
cis-dependent losses can be predicted by the sequence
- Oliver Hofmann
Prediction mostly based on AT-richness of the sequence
- Oliver Hofmann
Low predictive power for shift events. No change in sequence as a shift event can extend from the initial site to neighbouring nucleosomes
- Oliver Hofmann
Most changes in nucleosome position between species neutral, not correlated to expression. Largest change observed between cell types
- Oliver Hofmann
Question: mechanism of the TATA box, and causal relationship of the TA-rich regions for the nucleosome differences? Speculate that TATA increases re-initiation of transcription (more RNA for one opening instance). Causality of TA-regions: follow-up in the lab
- Oliver Hofmann
Deepak: This is not normal personal cloud storage. Please see http://tr.im/Gstx on why this costs slightly more than other services.
- Dan Cohen
Dan, I read that and am well aware of the challenges of high availability/high durability storage (I manage business development for Amazon EC2), hence the question, especially since you're using S3. Perhaps more information than you can share in public. Call it intellectual curiosity
- Deepak Singh
Haven't tried it in ages, but when I last looked there was an option to specify the storage directory. Put that in Dropbox/ and you're done. Would give you backup and sync, if not share.
- Neil Saunders
@Neil I do the same, straight into dropbox, job done
- Frank
Not exactly the same thing, but my Papers library resides on dropbox
- Deepak Singh
You've always been able to bring your own storage to Zotero, for free. There's a pref for that. This new storage, in addition to providing extra personal storage, provides real-time sync of files across groups, among other things. That means if you are in a science Zotero group with a thousand people, your 2 GB dataset will propagate to all of those people upon upload. When someone new joins the group, they'll get all the data too. As Deepak knows, those syncs are tricky and also use a lot of bandwidth.
- Dan Cohen
Dan, thanks for the clarification. So this is more of an "advanced" feature and not targeted at personal repositories. This would be a really cool feature for companies.
- Deepak Singh
Neil, great post. And you're right, we do make things too complicated sometimes, but do we do that at the level at which we ask questions, or at the software implementation level? My take is the latter, cause you need to ask questions the way you want to, but that doesn't mean what makes it all come together has to be one complex mess
- Deepak Singh
Glad you like it. One of those that bubbled up out of frustration at inability to achieve! I feel that science is the business of turning complex (real-world) things into simple models - and that we've moved away from that idea.
- Neil Saunders
I'm a sucker for this kind of ambitious thinking. Go Neil!
- Bill Hooker
I think it's a good sign that things like this are now obvious. Things start out as a complex mess of disconnected things, overlapping complicated ways of connecting them are devised, then it becomes obvious what the simpler thing to do is.
- Mr. Gunn
Great ! But aren't you re-inventing something like RDF Neil ? feature/probe/value is nothing but a RDF statement...
- Pierre Lindenbaum
No, I don't want to reinvent anything. If RDF will work for me, I'll use it. I'll also use SQL, NoSQL, key-value pairs, document-oriented or whatever it takes. I just think that trying to integrate data by combining other people's large, complex representations is not working. We need to simplify the whole business.
- Neil Saunders
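To make the feature/probe/value idea concrete, here is a minimal sketch of how such records might be stored and regrouped. It assumes nothing about Neil's actual implementation; the record type, the `integrate` helper and the example values are all hypothetical.

```python
from collections import defaultdict, namedtuple

# One observation: what was measured, by which assay, and the number obtained.
Measurement = namedtuple("Measurement", ["feature", "probe", "value"])


def integrate(measurements):
    """Group values by feature so different assays can be viewed side by side."""
    by_feature = defaultdict(dict)
    for m in measurements:
        by_feature[m.feature][m.probe] = m.value
    return dict(by_feature)


# Toy records, not real data.
records = [
    Measurement("geneA", "expression_array", 2.4),
    Measurement("geneA", "gel_band", 1.0),
    Measurement("geneB", "expression_array", 0.3),
]
table = integrate(records)
```

The same three-field record maps directly onto an RDF triple or a row in a key-value store, which is the point Pierre raises above.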
I think there is a middle road here - we need high level generic descriptions like what Neil is proposing (and like my "We have stuff, we do stuff to it, which makes stuff"), but also a way of pointing to more sophisticated information that might be useful in specific contexts. I think we can have the best of both worlds as long as the data representation is separated from the metadata and the organization of each can be described in a machine readable (and agreed!) form
- Cameron Neylon
I'm too old school, leaving comments on blogs... who does that any more. I’m sure you’re aware that you’ve just described a model using *triples*. Which means you could start storing these kinds of simple relationships in a triple store like virtuoso etc. As you say, you don't have to reinvent anything, just simplify the use (conventions) of existing approaches (e.g. RDF). I would like...
- Greg Tyrelle
I like blog comments :-) Yes, my example looks like RDF triples. No, that was not really my intention. Let's ask these questions: (1) what data relationships would make sense to a biologist? (2) what are the commonalities in the data, which a biologist may not have considered at an abstract level? As I wrote in the post, many datasets that look different are really different ways of looking at the same thing.
- Neil Saunders
The joys of data modelling :-) For (1): I'm afraid asking for a definition of some data relationships is building an(other?) ontology.
- Pierre Lindenbaum
Let's put it another way. What we have, presently, are quite complete, often large and complex, but useful and usable descriptions of individual experiment types. "Integration" essentially means "parse them individually and mash-up the results". That's what makes it difficult. Perhaps we need an "ontology of integration" :-) But let's keep it really, really minimal.
- Neil Saunders
I actually think you will struggle to find data commonalities across bioscience. Even the simple proposal of target, measurement, value could break down in many cases e.g. we tried ages ago to get some intensity data from a bunch of microarray experiments and we gave up because we couldn't get across what we needed. What are you really measuring? Does it mean the same thing to different...
- Cameron Neylon
I think there's a good case for storing, in the first instance, raw values. Figure out how to process them later (that's statistics). Focus on trends (up, down, stayed the same). Focus on well-defined variables that do mean the same to everyone (intensity, in theory = amount of transcript, regardless of the very real difficulties). And I think more experiments fall into...
- Neil Saunders
@Pierre freebase is exactly what I had in mind, however the web client (the best part) is not open. @Neil Store the data first, ask questions later. Nice. One of my hopes for semantic web technology was that it could become a universal mashup system (RDF+ontologies+triplestores). But you start down that path, and you suddenly realise that the semweb is asking you to get your data...
- Greg Tyrelle
But for me your example of a gel isn't raw data. The raw data is the image. Which might have several targets or assays on it. Up/down stayed the same is only really of interest in particular types of science. And I challenge you to find any well defined variables :-) Intensity to me is a measure of optical density but questions of background, object size, masking, averaging algorithm...
- Cameron Neylon
from twhirl
But agree with what you and Greg are saying, first thing get the data somewhere, with all the metadata you can automatically collect. Then worry about capturing more metadata as people do stuff with the data. Writing this grant proposal right at the moment.
- Cameron Neylon
from twhirl
And in microarrays, "raw" data is the image of the slide. But aside from a cursory inspection to ensure that it isn't complete rubbish, nobody much cares about that. I'd argue that there's a point in the preprocessing at which a numerical value emerges which could be called "useful" and which encapsulates the object being measured. It needs more work (e.g. normalization) to get information from it, but it's the "value" in feature/reporter/value.
- Neil Saunders
To me this is about finding something a bit like an upper ontology that describes the general category that objects (targets, assay, value, inputs, outputs, data, process, sample) fall into. That lets you do the general integration, and the more detailed local data structures become more useful as you can agree more and more on what details are important. So I absolutely agree with what...
- Cameron Neylon
Heh heh It was exactly that image that we did care about - which was the problem :-) I will admit to being an edge case, but in some ways we're all edge cases, they're just different edges...
- Cameron Neylon
Neil, may I link to this FF thread from Book of Trogool?
- D0r0th34
:-) Sure, different questions, different "levels" of data. I guess my angle is more a statistical one: how do I compare (seemingly) quite different datasets - what numbers can I extract and crunch? Less interested in the capture and description of data at every stage in the process.
- Neil Saunders
Sure, and those are very complete descriptions of experimental components. But what I want is: "I saw A on my gel, B in my LC/MS, C on my expression array and D on my SNP array and when I plug all that into some Bayesian predictor, it says cancer" :-)
- Neil Saunders
Ontologies are not the issue, it's more low level than that. I also work with microarrays, proteomics, metabolomics, and numerous physiological data sets. To keep all the data in one place I use a relational database, in this case postgresql because I like to store raw intensity values in array datatypes, along with pylons based web interfaces to display various views of the data to my...
- Greg Tyrelle
My argument would be that the reason you're less productive is not because of the RDF and ontologies per se, but because the ontologies aren't really built for what we want to do. They're for describing certain types of outcomes, not for integrating data in a discovery phase. But Neil's (entity, probe, value) is still an ontology of sorts. It is just a higher level one. My belief is...
- Cameron Neylon
But keep the discussion going - this is exactly the problem that e.g. the SAGE project will have - http://sagebase.org - and as a notional member of the data working group I could do with all the ideas and help that's out there...
- Cameron Neylon
We are thinking too much in terms of data representation here. In the end what you are looking at is a data warehousing problem. You have different front end systems and you want to be able to pull data in for offline processing into a warehouse. That's pretty much what you do at any company doing a lot of analytics/business intelligence. Different types of data being collected in...
- Deepak Singh
Neil, I was under the impression that normalization across arrays and labs wasn't actually a solved problem, yet. Surely that would have to come first before stripping things down to just assay-key-value?
- Mr. Gunn
Normalization ... aaargh! Most definitely not a solved problem
- Rajarshi Guha
Normalizing within your own experiments is hard enough, never mind across unrelated datasets. It's something we have to solve though, to make the most of public data.
- Neil Saunders
Neil, you may be interested in looking at the Ontology-Based eXtensible Data Model (OBX) that was developed by Richard Scheuermann's group at UT Southwestern. It is being used for the ImmPort database (www.immport.org). The OBX model utilizes the BFO / OBI ontology as guides in creating a data model that is robust to new datatypes. You can see a presentation about it here:...
- Burke Squires
Thanks Burke. ImmPort looks very impressive, I must say.
- Neil Saunders
This reminds me of what the TCGA is starting to do, by defining "data levels". For microarray data, Level 1 might be the raw images, Level 2, the intensity calls, Level 3, the normalized intensities, and Level 4 information on whether it's up or down regulated across multiple samples. For people like me, doing integrative analyses, it's easy to focus just on the higher level data and...
- Chris Miller
which is exactly why you need separation of the layers and tools to bring data together for the downstream stuff
- Deepak Singh
from IM
Neil, I think you have just explained why tab-delimited files are often more useful than complex XML representations of the same data ;-)
- Lars Juhl Jensen
Tab-delimited files would be grrrreat for me in my lab. If any of the rest of you would like to share our data, however, then you're completely screwed. Is the problem not that we're all duplicating each other's work by writing the same kind of parsers for the same kind of data? Proteomics (for example) has a standard (http://www.ebi.ac.uk/pride/). Is it really so hard to use / develop the community-based tools that are being generated around this standard?!?
- Neil Swainston
Well, the ratio of usable tools to schemas/ontologies is a whole other debate :-) But sure, in principle the tools are there - for individual types of data. What I highlight in the post is the difficulty of genuine data integration, as opposed to the current "write a parser for everything and mash it up" approach.
- Neil Saunders
#1 rule of data integration - if a format exists, it will be used
- Deepak Singh
...and if it doesn't exist there is a 70% chance someone will create it :-)
- Cameron Neylon
Chris M makes an important point wrt data levels, analogous to trace archives vs sequence dbs. Extending the sequence analogy, obsoleting levels will become important (it will rapidly become cheaper to resequence rather than store sequence).
- Chris Cotsapas
If I have my story right I think this came out of a criticism from a review panel that the structures and computational bio department was not collaborating enough. They came up with the mycoplasma collaboration that Luis Serrano in particular was very excited about. 3 Science papers is not a bad way to show results :). I still have to read them.
- Pedro Beltrao
Terrific. Are we still maintaining that list of "outputs resulting from FriendFeed"?
- Neil Saunders
I was planning on doing a demo of annotation at PLoS before the end of the year - perhaps this article would be a good candidate. As always, anyone willing to join is welcome.
- Daniel Mietchen
i added a note once, but now it won't let me add any other notes :( I don't see a rule about one note per person. I should have held off for a good one.
- Christina Pikas
I also just noticed that my "annotation" - provided the link to StackOverflow - shows up in the general discussion, where the title "Link" certainly is not helpful, and there is no way I can edit it.
- Daniel Mietchen
maybe something is broken, my note appears in general comments but also in that portion of the text as a comment. maybe that's why I couldn't add other notes?
- Christina Pikas
Not sure why you can't add more notes. Certainly been able to in the past. I see both notes where they are supposed to be I think. But they will also appear in the general comments as well I think.
- Cameron Neylon
Great article! I really need to add some comments or notes, just to prove the authors' point :-)
- Björn Brembs
BTW, when does PLoS finally get karma? I've been asking for proper 'show off' userprofiles for like ever :-)
- Björn Brembs
Cameron, et al. - What's the most useful thing I could do to nurture and support this renewed interest in article level metrics? (not from a competing data product point of view, but a let's get some good technologies out there with good visibility)
- Mr. Gunn
@Cameron: Exactly! I even think having a profile where you can post a pic and see how many papers and comments were published, papers edited, etc. was the very first thing I asked for when I signed up :-)
- Björn Brembs
But it needs to be federated across publishers... :-)
- Cameron Neylon
if authors put in their 'customer' weight, this will go faster, so why not go syndicate :-)
- Claudia Koltzenburg
I think I'll use this paper in my spring thesis class -- this is the main one where I discuss publishing models -- and maybe I'll demo Diigo with this as a class project next to an article that discusses IF.
- Mickey Schafer
While we're on the subject of functionality wish lists, I would also like an embed functionality for PLoS papers. I'm collecting my publications together but don't want to duplicate copies and reduce googlejuice for the journal - at least not for the OA papers anyway...
- Cameron Neylon
BTW, why isn't there a way to register this thread with the article? Why are we posting here and not on the article? There's got to be a lesson to be learned from this :-)
- Björn Brembs
I've included a link to this thread in a blog post: Article-level metrics getting attention http://ff.im/bGuNY
- Jim Till
+1 Bjoern :-) another question along these lines would be: why does Cameron's initial FF message link to CiteULike and not to http://www.plosbiology.org/article..., or plainly doi:10.1371/journal.pbio.1000242 ?
- Claudia Koltzenburg
Because that was the way I brought the link in. I think that that pointer is appropriate. It is a pointer to the fact that I bookmarked it. Other people linked to the paper directly. Perhaps the issue is that we accidentally aggregated around the "wrong" item to talk about the paper. I'm not sure this is a problem as long as the referral works - it's a UI irritation not a problem with...
- Cameron Neylon
well, not directly, maybe in this ff-thread we're just providing some material for what you say in your paragraph "Technical Solutions to Social Problems", namely: "approaches that gather information from processes that are already part of the typical research workflow are also much more likely to succeed." - even though ff may not be part of 'the typical research workflow' (yet?) - and...
- Claudia Koltzenburg
That's true, and it is certainly conversation sparked by the paper. But how to capture that in a way that is useful further down the line might be tough...
- Cameron Neylon
There are hundreds of papers on normalization techniques and gene selection methods. And each one claims to be better than the others. But in most cases, the improvements seem incremental. Is the difference really significant? It’s not always clear.
- Deepak Singh
Help! My Mendeley has gone haywire. It seems every time I open it, it thinks I have 20,000 more documents to add? Now it's up to 63,000. I think I only have 1 thousand something actually. This happened after I'd noticed that "folder watching" had somehow stopped working. I reselected the folder, and now this is happening. Any ideas?
More than odd, to say the least! I'll ask support to check... sorry for that!
- Victor / Mendeley Team
Thanks, Victor. Also, FYI I just reopened it, and it only incremented to 62,976, which is good. FYI: I checked my web account, and there are only 98 pages (about 1960 files), which seems correct. If I reinstall Mendeley, will I lose all of the information I've added to fix citations that were generated from PDFs in my library?
- Steve Koch
Steve, you won't lose anything, because you should be able to get the correct info from Mendeley Web. I assume someone from support has gotten in touch with you by now.
- Mr. Gunn
@Mr Gunn, no tech support yet. To clarify: If I reinstall, and then watch those folders again: will Mendeley recognize that the PDFs represent the articles already in my online database? Or will it essentially double the size of my library with the newly recognized PDFs?
- Steve Koch
Whoops: just noticed a tweet from Mendeley support from 11 hours ago. I don't check Twitter that often.
- Steve Koch
I would uninstall the-software-that-should-not-be-named and use some actual good service/program, like CiteULike or Zotero.
- Paulo Nuin
I'm neutral on Zotero (tried it too early, probably). CiteULike is very useful and quite different than Mendeley. The folder watching / autodetect feature of Mendeley is great, especially for extremely lazy people (99% of everyone). I suppose Zotero is similar in that regard. Mendeley looks like they will be the first service to offer simultaneous markup of shared PDFs by multiple...
- Steve Koch
The only problem with Mendeley is that it sucks. And nothing can change that.
- Paulo Nuin
Proceedings of the National Academy of Sciences (16 November 2009) 10.1073/pnas.0901989106 The protein kinase haspin/Gsg2 plays an important role in mitosis, where it specifically phosphorylates Thr-3 in histone H3 (H3T3). Its protein sequence is only weakly homologous to other protein kinases and lacks the highly conserved motifs normally required for kinase activity. Here we report structures of human haspin in complex with ATP and the inhibitor iodotubercidin. These structures reveal a constitutively active kinase conformation, stabilized by haspin-specific inserts. Haspin also has a highly atypical activation segment well adapted for specific recognition of the basic histone tail. Despite the lack of a DFG motif, ATP binding to haspin is similar to that in classical kinases; however, the ATP γ-phosphate forms hydrogen bonds with the conserved catalytic loop residues Asp-649 and His-651, and a His651Ala haspin mutant is inactive, suggesting a direct role for the catalytic loop in...
- Neil Saunders