SWAT4LS2009 – Barend Mons: The meta-analysed semantic web, getting rid of ambiguity and redundancy
Introducing Concept Wiki – a semantic wiki and insulting his audience repeatedly.
Problems with getting the community to do annotation:
- everybody wants structured data, but nobody wants to do structured data entry. Not working.
- Everybody likes free text and cut and paste.
Now shows suggestion of ontology terms in authoring tools for introduction of structure in unstructured data.
Now talking about redundancy? Is it a problem? His point:
- no reviewer would accept the exact same paper twice let alone several times
- But same assertions are published over and over
- Oh dear – hopeless confusion between names, people, identifiers etc…..they are all “concepts” according to Barend Mons.
- The “essence of a nanopublication” is an annotated triple…i.e an assertion together with metadata about it (provenance, time etc…)
- Now points out that human language grammar is kind of similar to triples….subject predicate object…
- An assertion should only be accepted if it has value and advances human knowledge. The mind boggles….who decides what is interesting when….
- Triples vs “smart triples” apparently “smart triples” are curated/observed/hypothetical
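To make the "annotated triple" idea a little more concrete, here is a minimal sketch in Python. It is my own illustration and not the Concept Wiki data model: the class names, the `status` field and the `deduplicate` helper are all hypothetical. It shows an assertion carrying its provenance, and how the same assertion published several times collapses onto one key.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Triple:
    """A bare assertion: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


@dataclass
class AnnotatedTriple:
    """An assertion plus metadata about it (hypothetical field names)."""
    assertion: Triple
    source: str             # provenance, e.g. the paper making the claim
    asserted_at: datetime   # when the claim was made
    status: str = "observed"  # curated / observed / hypothetical


def deduplicate(items):
    """Group annotated triples by assertion, so repeats share one key."""
    grouped = {}
    for item in items:
        grouped.setdefault(item.assertion, []).append(item)
    return grouped
```

Because `Triple` is frozen it is hashable, so the same claim made in two papers ends up as one dictionary key with two provenance records attached.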
Mentions deposit of ChemSpider Content into concept wiki.
Now shows some screenshots of use cases.
SWAT4LS2009 – A.L. Lamprecht: Semantics-Based Composition of EMBOSS Services with Bio-jETI
Bio-jETI: framework for model-based graphical design execution and management of bioinformatics processes
PROPHETS Plugin: visual semantic domain modeling, loose specification within the process model, non-formal specification of constraints using natural language templates, automatic generation of model checking formulae.
SWAT4LS2009 – James Eales: Mining Semantic Networks of Bioinformatics eResources from Literature
eResource Annotations could help with
- making better choices: which resource is best?
- which is available?
- reduce curation
- help with service discovery
Approach: link bioinformatics resources using semantic descriptors generated from text mining….head terms for services can be used to assign services to types..e.g. applications, data sources etc.
SWAT4LS2009 – Michael Schroeder: Prediction of Drug Target Interactions from Literature by Context Similarity
Typical researcher spends 12.4 hours a week searching for information. Why not use Google? ‘Cause Google is not semantic.
Go PubMed – Filter PubMed contents against all the terms in the Gene Ontology. If you use simple categorisation for information retrieval, you potentially increase the search burden due to compartmentalisation. However, it works the other way round too…useful filtering.
Showing some examples of faceted browsing of PubMed content and systematic drilldown into search results. Not easy to blog, but literature exploration in this way is always fascinating. Examples include the analysis of research trends, networks of collaborators etc..new tool in Go PubMed also allows the discovery of indirect links or inferred links.
Have developed a similar system for the web: Go Web (works on the top yahoo search results).
Remarks on Ontology Generation: have developed a plugin for OBO Edit…search for term and plugin makes suggestions for terms that might be included in new ontologies. Points out terms in existing ontologies. Also helps with the generation of definitions for terms…wow this is extremely useful in SO many ways….
Now let’s talk about drugs and targets….
Try and mine for gene mentions in text…find a gene term and then use context to decide what it is we are talking about. Once a gene has been found, look for statistically significant co-occurrences. The results have been made available in GoGene. Again can do bibliometric trend analysis – genes are ranked by community interest.
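The talk did not say which statistic is used to decide that a co-occurrence is significant, so the following is only a sketch under my own assumption: a one-sided hypergeometric test over document counts. The function names and the toy representation of documents as sets of mentions are hypothetical.

```python
from math import comb


def count_mentions(docs, gene, term):
    """Count documents mentioning the gene, the term, and both.

    `docs` is an iterable of sets of mentions (a toy representation)."""
    docs = list(docs)
    n_gene = sum(1 for d in docs if gene in d)
    n_term = sum(1 for d in docs if term in d)
    n_both = sum(1 for d in docs if gene in d and term in d)
    return len(docs), n_gene, n_term, n_both


def cooccurrence_p_value(n_docs, n_gene, n_term, n_both):
    """Probability of at least n_both co-mentions if gene and term
    were spread independently over n_docs documents."""
    upper = min(n_gene, n_term)
    total = comb(n_docs, n_term)
    tail = sum(
        comb(n_gene, i) * comb(n_docs - n_gene, n_term - i)
        for i in range(n_both, upper + 1)
    )
    return tail / total
```

A small p-value suggests the gene and the term appear together more often than chance would explain. A real pipeline would also need to correct for testing many gene/term pairs at once.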
From drugs to genes..what is the link between a gene and a drug using context profiles: what are the disease terms related to a given drug…then to genes.
Gotta stop blogging…enjoying this talk far too much…….
SWAT4LS2009 – Sonja Zillner: Towards the Ontology Based Classification of Lymphoma Patients using Semantic Image Annotation
(Again, these are notes as the talk happens)
This has to do with the Siemens Project Theseus Medico – Semantic Medical Image Understanding (towards flexible and scalable access to medical images)
Different images from many different sources: e.g. X-ray, MRI etc…use this and combine with treatment plans, patient data etc and integrate with external knowledge sources.
Example Clinical Query: “Show me the CT scans and records of patients with a Lymph Node enlargement in the neck area” – at the moment query over several disjoint systems is required
Current Motivation: generic and flexible understanding of images is missing
Final Goal: Enhance medical image annotations by integrating clinical data with images
This talk: introduce a formal classification system for patients (ontological model)
Used Knowledge Sources:
- Ann-Arbor Staging System – particularly suitable for lymphoma patients
- RadLex
- Foundational Model of Anatomy
- Semantic Image Annotation
Requirements of the Ontological Model
- Capture the rationale of the Ann Arbor Staging system
- Integrate external ontologies
- Ontology must describe the patient record
Now showing an example axiomatisation for the counting and location of lymphatic occurrences and discusses problems relating to extending existing ontologies….
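To illustrate why counting and locating the involved lymph node regions matters, here is a heavily simplified rule-based sketch of the main Ann Arbor stages in Python. This is not the OWL axiomatisation shown in the talk. The `Involvement` class and its fields are hypothetical, and the A/B, E, S and X modifiers are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Involvement:
    """One annotated finding (hypothetical structure)."""
    region: str                    # name of the lymph node region
    above_diaphragm: bool          # which side of the diaphragm it lies on
    disseminated_extranodal: bool = False


def ann_arbor_stage(findings):
    """Return the main Ann Arbor stage (I to IV), or None without findings."""
    if not findings:
        return None
    if any(f.disseminated_extranodal for f in findings):
        return "IV"    # disseminated involvement of extranodal organs
    sides = {f.above_diaphragm for f in findings}
    if len(sides) == 2:
        return "III"   # regions on both sides of the diaphragm
    regions = {f.region for f in findings}
    if len(regions) >= 2:
        return "II"    # two or more regions, same side of the diaphragm
    return "I"         # a single lymph node region
```

The point of the ontological version is that the stage depends on how many distinct regions are found and where they sit, which is exactly what the image annotations have to supply.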
Now talking about annotating patient records: typical problems are abbreviations, clinical codes, fragments of sentences etc…difficult for NLP people to deal with….
Now showing detailed patient example where application of their classification system led to reclassification of patient in terms of staging system.
SWAT4LS: Demo Preview NeuroLex.org.
- online wiki-based ontology for neuroscience
- built on top of mediawiki
- domain scientists can make contributions and a curation process turns this into formal representations
SWAT4LS2009 – Keynote Alan Ruttenberg: Semantic Web Technology to Support Studying the Relation of HLA Structure Variation to Disease
(These are live-blogging notes from Alan’s keynote…so don’t expect any coherent text….use them as bullet points to follow the gist of the argument.)
The Science Commons:
- a project of the Creative Commons
- 6 people
- Science Commons specializes CC to science
- information discovery and re-use
- establish legal clarity around data sharing and encourage automated attribution and provenance
Semantic Web for Biologists because it maximizes the value of scientific work by removing repeat experimentation.
ImmPort Semantic Integration Feasibility Project
- Immport is an immunology database and analysis portal
- Goals: meta-analysis
- Question: how can ontology help data integration for data from many sources
Using semantics to help integrate sequence features of HLA with disorders
Challenges:
- Curation of sequence features
- Linking to disorders
- Associating allele sequences with peptide structures with nomenclature with secondary structure with human phenotype etc etc etc…
Talks about elements of representation
- pdb structures translated into ontology-based representations
- canonical MHC molecule instances constructed from IMGT
- relate each residue in pdb to the canonical residue if exists
- use existing ontologies
- contact points between peptide and other chains computed using JMOL following IMGT. Represented as relation between residue instances.
- Structural features have fiat parts
Connecting Allele Names to Disease Names
- use papers as join factors: papers mention both disease and allele – noisy
- use regex and rewrites applied to titles and abstracts to fish out links between diseases and alleles
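A rough idea of what "regex and rewrites" over titles and abstracts might look like, as a Python sketch. The pattern, the single rewrite rule and the tiny disease vocabulary are my own assumptions for illustration and not the ones used in the project.

```python
import re

# Assumed allele pattern, matching e.g. HLA-DRB1*0401 or HLA-B*27:05.
ALLELE_RE = re.compile(r"\bHLA-([A-Z]{1,3}\d?)\*(\d{2,4}(?::\d{2,3})*)")

# Hypothetical toy vocabulary; a real system would use a disease ontology.
DISEASES = ("rheumatoid arthritis", "ankylosing spondylitis", "type 1 diabetes")


def normalise(text):
    """Rewrite step: turn 'HLA DRB1*0401' into 'HLA-DRB1*0401'."""
    return re.sub(r"\bHLA\s+", "HLA-", text)


def extract_links(text):
    """Return (allele, disease) pairs co-mentioned in a title or abstract."""
    text = normalise(text)
    alleles = {f"HLA-{gene}*{fields}" for gene, fields in ALLELE_RE.findall(text)}
    lowered = text.lower()
    diseases = {d for d in DISEASES if d in lowered}
    return {(a, d) for a in alleles for d in diseases}
```

Every pair is only a candidate link: a paper can mention an allele and a disease without asserting any relation between them, which is the noise mentioned above.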
Correspondence of molecules with allele structures is difficult.
- use blast to find closest allele match between pdb and allele sequence
- every pdb and allele residue has URI
- relate matching molecules
- relate each allele residue to the canonical allele
- annotate various residues with various coordinate systems
This creates massive map that can be navigated and queried. Example queries:
- What autoimmune diseases can be indexed against a given allele?
- What are the variant residues at a position?
- Classification of amino acids
- Show alleles perturbed at contacts of 1AGB
Summary of Progress to Date:
Elements of Approach in Place: Structure, Variation, transfer of annotation via alignment, information extraction from literature etc…
Nuts and Bolts:
- Primary source
- Local copy of source
- Scripts transform to RDF
- Exports RDF Bundles
- Get selected RDF Bundles and load into triple store
- Parsers generate in memory structures (python, java)
- Template files are instructions to format these into owl
- Modeling is iteratively refined by editing templates
- RDF loaded into Neurocommons, some amount of reasoning
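The "scripts transform to RDF" step can be pictured with a small Python sketch that turns parsed records into N-Triples lines. The record layout, the `http://example.org/` base URI and the function names are assumptions of mine, not the actual Neurocommons scripts or templates.

```python
def escape_literal(value):
    """Escape the characters that commonly break an N-Triples string literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriples(records, base="http://example.org/"):
    """Serialise dict records as N-Triples, one literal-valued triple per field.

    Each record needs an 'id' key; every other key becomes a predicate
    under a made-up vocabulary namespace."""
    lines = []
    for record in records:
        subject = f"<{base}{record['id']}>"
        for key, value in record.items():
            if key == "id":
                continue
            predicate = f"<{base}vocab/{key}>"
            literal = f'"{escape_literal(str(value))}"'
            lines.append(f"{subject} {predicate} {literal} .")
    return "\n".join(lines)
```

The resulting text can be written to a file and loaded into a triple store, which is roughly the role an RDF bundle plays in the list above.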
RDFHerd package management for data
neurocommons.org/bundles
Can we reduce the burden of data integration?
- Too many people are doing data integration – wasting effort
- Use web as platform
- Too many ontologies…here’s the social pressure again
Challenges
- have lawyers bless every bit of data integration
- reasoning over triple stores
- SPARQL over HTTP
- Understand and exploit ontology and reasoning
- Grow a software ecosystem like Firefox
Licences for Ontologies

One of the things that I have been grappling with for quite some time is the whole notion of licences for ontologies. Of course, neither I – nor anybody else for that matter, should have to worry about this. But the world is the way it is and so the question is: what would an appropriate licence for an ontology be? The answer to that question would mainly depend on what an ontology actually is. Is it a piece of software? Is it a database? A structured document (whatever that means in the context of licensing)?
I have spent quite some time talking to my colleagues about this and we haven’t been able to come up with a satisfactory answer. Even emailing the good folks at the Open Knowledge Foundation did not elicit a response. Now, it seems that the Science Commons have made an attempt to provide some answers on their website.
They state that whether an ontology is protected by copyright law will mainly depend on whether the ontology “contains a sufficient degree of creative expression” or whether it draws entirely on fact. In the latter case, it might not be protected. Now such a statement in itself is intriguing – in the communities in which I and many of the Science Commons people tend to spend most of our time, ontologies are usually understood to be representational artefacts, “whose representational units are intended to designate universals in reality and the relations between them.” Just how much “creative expression” that would allow is an interesting debate in itself, which is probably best had in the pub. But I digress.
Science Commons then goes on to quote some legal precedent in which US courts have upheld copyright in medical ontologies. So really, we don’t know. Science Commons then counsels “pre-emptive” licencing: if in doubt, slap a Creative Commons licence on your ontology (CC0 is explicitly recommended) – if it is later found that copyright cannot subsist in ontologies and that your licence is therefore invalid, you haven’t lost anything, but if it turns out that copyright does indeed subsist in an/your ontology, your bottom is covered. Small surprise, too, that the Science Commons would wish to promote the licences of their sister organisation the Creative Commons.
Again, I am not convinced that Creative Commons Licences are an appropriate form of licence for ontologies any more than I am convinced that the GPL licence attached to ChemAxiom is an entirely appropriate licence for an ontology. I would be interested in what the OKF experts have to say about this. The bottom line, for now at least, seems to be that we just won’t know until someone does a lot of deep thinking or it will be tested in court.
Any comments and opinions would be extremely welcome!