Egon Willighagen
can you post some simple queries? My attempts yield no hits :( - Rajarshi Guha
SELECT DISTINCT ?protein WHERE { ?protein <> "Isomerase" } - Egon Willighagen
Or, to select the molecules with an activity against these isomerases: SELECT DISTINCT ?activity ?compound WHERE { ?protein <> "Isomerase" . ?activity <> ?assay . ?assay <> ?protein . ?activity <> ?compound . } LIMIT 5 - Egon Willighagen
This Gist has it more readable: - Egon Willighagen
thanks. So the properties that you've specified (as seen in the SNORQL view) were defined by you? In other words, could somebody else who puts up an RDF version of Chembl come up with a different set of properties? - Rajarshi Guha
find articles on assays targeting isomerases in mice - Rajarshi Guha
Yes, very likely. It is actually recommended to use Open Specifications (which do not really exist yet, though initiatives are working on this), or otherwise something under a controlled domain (hence But that's not the end of it. With OWL we can define equivalence of instances. ... - Egon Willighagen
Yes, just realized that owl:sameAs will let me jump from one class of identifier (a chembl target) to another class of identifier (UniProt ID). But, if I understood correctly, you have actually defined all these equivalences (i.e. chembl target -> UniProt, chembl target -> EC num etc.) by hand? - Rajarshi Guha
no, unique identifiers :) uniprot refs are in the ChEMBL db... - Egon Willighagen
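To make the owl:sameAs jump discussed above concrete, here is a minimal Python sketch that uses plain tuples as triples instead of a real RDF library. Every identifier in it (`chembl:target42`, `uniprot:P00001`, `ex:hasActivity`, `ex:locatedIn`) is made up for illustration and is not taken from the actual ChEMBL RDF.

```python
# Toy triple store: each triple is a (subject, predicate, object) tuple.
# All identifiers below are hypothetical, for illustration only.
SAME_AS = "owl:sameAs"

triples = {
    ("chembl:target42", "ex:hasActivity", "chembl:activity7"),
    ("chembl:target42", SAME_AS, "uniprot:P00001"),
    ("uniprot:P00001", "ex:locatedIn", "ex:mitochondrion"),
}


def same_as_closure(node, triples):
    """Return every node reachable from `node` via owl:sameAs, in either direction."""
    seen = {node}
    stack = [node]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if p != SAME_AS:
                continue
            # owl:sameAs is symmetric, so follow the link both ways.
            for a, b in ((s, o), (o, s)):
                if a == current and b not in seen:
                    seen.add(b)
                    stack.append(b)
    return seen


def facts_about(node, triples):
    """Collect (predicate, object) pairs stated about `node` or any of its aliases."""
    aliases = same_as_closure(node, triples)
    return {(p, o) for s, p, o in triples if s in aliases and p != SAME_AS}


facts = facts_about("chembl:target42", triples)
```

Asking about the ChEMBL-style identifier also returns the fact that was only stated about the UniProt-style one, which is the "jump" between identifier classes. A real triple store with OWL reasoning would do this for you; the loop here only imitates that behaviour.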
Aah, got it. OK, so it's all quite nice. But from my 30 min of playing, all my examples could be rewritten as traditional SQL. Can you give an example where SPARQL lets me do something that SQL will not (or will be very difficult)? Some things that come to mind: connect to another triple store and use owl:sameAs to automatically map fields and connect say UniProt ids from Chembl to some values in the other DB. Also, the owl:sameAs certainly seems very powerful and I have a feeling that might be the key to a lot of RDF magic :) But can't come up with a concrete example :( - Rajarshi Guha
Well, you're thinking in the right direction... SPARQL does not allow much more than SQL, but it is embedded in the power of having RDF all over the internet (e.g. embedded in individual HTML pages)... it is not so much that it can do more than SQL, it just defines a clean, simple interface to query all of your RDF. So, it's more like a reformulation... - Egon Willighagen
I think I even saw that SPARQL end points can even query other end points... but have not tried something like that. - Egon Willighagen
that's what seems worrisome to me. Essentially what you're saying is that all RDF data will be of a 'standard format' (?) Now, in reality that may not be the case, hence owl:sameAs and its friends let us make equivalence associations in lieu of a standard set of properties. So either you somehow have to force everybody into a fixed set of identifiers, or else have a massively connected graph, mapping out equivalences (which looks like it must be done manually) between various properties from different sources - Rajarshi Guha
No, RDF is not a format... nor is it forcing you to use a particular ontology with classes and/or properties. Defining OWL axioms allows you to 'remove' links, and SPARQL also allows you to rewrite patterns. - Egon Willighagen
if that is the case, how does "SPARQL does not allow much more than SQL, but it is embedded in the power of having RDF all over the internet" become an advantage? If you can have arbitrary RDF all over the place, you still have to define a set of OWL axioms (equivalences, constraints, ...) manually right? Doesn't this lead, over time, to the second scenario I mentioned above (i.e., a massively connected graph)? - Rajarshi Guha
The use of standards is needed, but linking to relational databases is at least equally hard, and they do not even explicitly define field and/or table equivalence... a 'massively connected graph' in RDF is easily rewritten into a graph using a single ontology... - Egon Willighagen
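The rewriting of a cross-linked graph into one that uses a single ontology can be sketched roughly as follows. This assumes the equivalences are simple one-to-one property mappings (in the spirit of owl:equivalentProperty); the `a:` and `b:` vocabularies and all property names are invented for the example.

```python
# Hypothetical property equivalences: each source property maps to one
# property in the chosen target vocabulary ("b:").
EQUIVALENT_PROPERTY = {
    "a:inchi": "b:hasInChI",
    "a:targets": "b:hasTarget",
}

# Data from two imaginary sources that use different vocabularies.
triples = [
    ("a:mol1", "a:inchi", "InChI=1S/CH4/h1H4"),
    ("a:assay1", "a:targets", "a:protein1"),
    ("b:mol2", "b:hasInChI", "InChI=1S/H2O/h1H2"),
]


def rewrite_to_single_vocabulary(triples, mapping):
    """Replace every mapped predicate by its equivalent; leave others untouched."""
    return [(s, mapping.get(p, p), o) for s, p, o in triples]


unified = rewrite_to_single_vocabulary(triples, EQUIVALENT_PROPERTY)

# After rewriting, a single pattern matches data from both sources.
molecules_with_inchi = sorted(s for s, p, o in unified if p == "b:hasInChI")
```

The mapping table still has to come from somewhere, so this does not settle the question of who writes the equivalences; it only shows that once they are stated explicitly, applying them is mechanical.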
re RDBMS equivalences, I absolutely agree. But isn't the process of defining OWL axioms for different RDF sources the same as mapping RDBMS fields? - Rajarshi Guha
one cool thing I'd like to see is a link between scientific facts and properties to people. One way I can see this happening would be to have an RDF dump of PubMed such that scientist names, locations are available. I could see some non-obvious inferences coming up from queries merging say ChEMBL and such a dump - Rajarshi Guha
OWL axioms are a rather powerful way of mapping... you can say (natural language): anything with :hasInChI = foo:Molecule ... in other words, you can define new tables... - Egon Willighagen
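As a rough illustration of the rule quoted above ("anything with :hasInChI is a foo:Molecule"), the sketch below derives class membership from the presence of a property, again with tuples standing in for triples. The subjects `ex:thing1` and `ex:thing2` and the `ex:label` property are hypothetical, and a real setup would hand the axiom to an OWL reasoner rather than a dictionary lookup.

```python
RDF_TYPE = "rdf:type"

# Rule from the comment above: anything that has :hasInChI is a foo:Molecule.
# Expressed here as a predicate -> class lookup (hypothetical, one rule only).
CLASS_FOR_PROPERTY = {":hasInChI": "foo:Molecule"}

triples = {
    ("ex:thing1", ":hasInChI", "InChI=1S/CH4/h1H4"),
    ("ex:thing2", "ex:label", "not a molecule"),
}


def infer_types(triples, rules):
    """Return the original triples plus the rdf:type triples the rules entail."""
    inferred = {(s, RDF_TYPE, rules[p]) for s, p, o in triples if p in rules}
    return triples | inferred


closed = infer_types(triples, CLASS_FOR_PROPERTY)

# The inferred class behaves like a new "table" that can be queried directly.
molecules = {s for s, p, o in closed if p == RDF_TYPE and o == "foo:Molecule"}
```

The set `molecules` plays the role of the "new table": nobody asserted that `ex:thing1` is a molecule, yet it can be selected by class after the rule is applied.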
Yes, there is a very serious risk of mash-ups in all this :) - Egon Willighagen
So would you say that the main thing holding back interesting results is the availability of RDF data sources? - Rajarshi Guha
Yes, with the API now clearly defined, it is back to Open Data in chemistry :) - Egon Willighagen
It is interesting to see that something like XML (think CML) will likely be obsoleted by RDF... - Egon Willighagen
Why should that be? XML is just a format - it wasn't designed to impose semantics. - Rajarshi Guha
Indeed. At the same time, N3 seems to be the more popular syntax to format RDF. And a properly crafted ontology for chemical graphs could be more readable than CML... - Egon Willighagen
Do you know whether anybody (Bio2RDF?) has created an RDF'ized version of PubMed? - Rajarshi Guha
Aah, OK. What if one wanted to add extra information? I could see an RDF resource that takes institution information from PubMed IDs and links it to geography (via GeoNames?). So you could find research on a topic being done in a certain region - Rajarshi Guha
That's where the mashing up comes in... ? - Egon Willighagen
hmm, got it. interesting things to do, but I need the Earth's rotation to slow down - Rajarshi Guha
Got pointed to which seems to have aggregated a lot of DB's. I'd assume that their ontology, say for the ChEMBL stuff differs from yours - Rajarshi Guha
the SPARQL endpoint for this is really helpful. When an organization wants to federate a set of ontologies, say their own internal model ("myCompunds") and chembl, then [I think] it would be necessary to have access to the two RDF ontologies for the datasets. For example, both chembl.rdf and myCompunds.rdf would be read into a modelling tool such as TopBraid or Protégé, and the organization can create a federated ontology. Does this mean that the RDF for the ontology should be made available wherever a SPARQL endpoint is made available? - Derek S
I think ideally, the used ontologies should be available via the SPARQL end point too - Egon Willighagen
is the ontology for chembl accessible from for owl:imports - Derek S
No, not at this moment... and not as .owl file either. please file a bug report at: - Egon Willighagen