Visualisation of Ontologies and Large Scale Graphs


For a number of reasons, I am currently looking into the visualisation of large-scale graphs and ontologies. To that end, I have made some notes on tools and concepts which might be useful for others. Here they are:

Visualisation by Node-Link and Tree

jOWL: jQuery Plugin for the navigation and visualisation of OWL ontologies and RDFS documents. Visualisations mainly as trees, navigation bars.

OntoViz: Protege plugin. At the moment it supports Protege 3.4 and does not seem to work with Protege 4.

IsaViz: Much the same as OntoViz really. The last stable version dates from 2004 and it does not seem to be under active development.

NeOn Toolkit: The NeOn Toolkit also has some visualisation capability, but not independent of the editor. Under active development with a growing user base.

OntoTrack: OntoTrack is a graphical OWL editor and as such has visualisation capabilities. They are meagre though, and it does not seem to be supported or developed anymore either…the current version seems about 5 years old.

Cone Trees: Cone trees are three-dimensional extensions of 2D tree structures and have been designed to allow a greater amount of information to be visualised and navigated. I have not found any software for download at the moment, but the idea is so interesting that we should bear it in mind. Examples are here and here, and the key reference is Robertson, George G., Mackinlay, Jock D. and Card, Stuart K., Cone Trees: animated 3D visualizations of hierarchical information, CHI ’91: Proceedings of the SIGCHI conference on Human factors in computing systems, 1991, ISBN 0-89791-383-3, pp. 189-194. (DOI here)
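To make the cone tree idea a little more concrete, here is a minimal layout sketch in Python: each parent sits at the apex of a cone and its children are spread evenly around the circular base one level below. This is a deliberate simplification and not the algorithm from the Robertson et al. paper (which, among other things, sizes cones to fit their subtrees). The function name, the parameters and the fixed shrink factor are assumptions made purely for illustration.

```python
import math


def cone_tree_layout(tree, root, radius=4.0, level_height=2.0, shrink=0.5):
    """Assign (x, y, z) coordinates to every node reachable from `root`.

    `tree` maps a node to the list of its children. The y axis points up,
    so each level of the hierarchy sits `level_height` below its parent.
    `radius` is the base radius of the root's cone; every level further
    down uses a radius multiplied by `shrink` (an illustrative choice).
    """
    positions = {root: (0.0, 0.0, 0.0)}
    stack = [(root, radius)]
    while stack:
        node, r = stack.pop()
        children = tree.get(node, [])
        px, py, pz = positions[node]
        for i, child in enumerate(children):
            # Spread the children evenly around the base circle of the cone.
            angle = 2 * math.pi * i / len(children)
            positions[child] = (
                px + r * math.cos(angle),
                py - level_height,
                pz + r * math.sin(angle),
            )
            stack.append((child, r * shrink))
    return positions


# Hypothetical toy hierarchy, just to exercise the function.
toy_tree = {"Thing": ["Animal", "Plant"], "Animal": ["Cat", "Dog", "Bird"]}
layout = cone_tree_layout(toy_tree, "Thing")
```

The resulting coordinates could be handed to any 3D renderer; rotating a cone to bring a child to the front then amounts to adding an offset to `angle`.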

PhyloWidget: PhyloWidget is software for the visualisation of phylogenetic trees, but should be repurposable for ontology trees. Javascript – so appropriate for websites. Student project as part of the Phyloinformatics Summer of Code 2007.

The JavaScript Information Visualization Toolkit: Extremely pretty JS toolkit for the visualisation of graphs etc. Dynamic and interactive visualisations too…just pretty. I have spent some time hacking with it and I am becoming a fan.

Welkin: Standalone application for the visualisation of RDF graphs. Allows dynamic filtering, colour coding of resources etc…

Three-Dimensional Visualisation

Ontosphere3D: Visualisation of ontologies on 3D spheres. Does not seem to be supported anymore and requires Java 3D, which is just a bad nightmare in itself.

Cone Trees (see above) with their extension of Disc Trees (for an example of disc trees, see here).

3D Hyperbolic Tree as exemplified by the Walrus software. Originally developed for website visualisation, it results in stunning images. Not under active development anymore, but the source code is available for download.

Cytoscape: The 1000 pound gorilla in the room of large-scale graph visualization. There are several plugins available for interaction with the Gene Ontology, such as BiNGO and ClueGO. Both tools treat ontologies as annotation rather than as knowledge bases in their own right, and can be used to identify GO terms which are overrepresented in a cluster/network. In terms of visualising the ontologies themselves, there is the RDFScape plugin.

Zoomable Visualisations

Jambalaya – Protege Plugin, but can also run as a browser applet. Uses Shrimp to visualise class hierarchies in ontologies and arrows between boxes to represent relationships.

CropCircles (link is to the paper describing it): CropCircles have been implemented in the SWOOP ontology editor which is not under active development anymore, but where the source code is available.

Information Landscapes – again, no software, just papers.


SWAT4LS2009 – Michael Schroeder: Prediction of Drug Target Interactions from Literature by Context Similarity

Typical researcher spends 12.4 hours a week searching for information. Why not use Google? ‘Cause Google is not semantic.

Go PubMed: filters PubMed contents against all the terms in the Gene Ontology. If you use simple categorisation for information retrieval, you potentially increase the search burden due to compartmentalisation. However, it works the other way round too…the categories are useful filters.

Showing some examples of faceted browsing of PubMed content and systematic drilldown into search results. Not easy to blog, but literature exploration in this way is always fascinating. Examples include the analysis of research trends, networks of collaborators etc. A new tool in Go PubMed also allows the discovery of indirect or inferred links.

Have developed a similar system for the web: Go Web (works on the top Yahoo search results).

Remarks on Ontology Generation: have developed a plugin for OBO Edit…search for term and plugin makes suggestions for terms that might be included in new ontologies. Points out terms in existing ontologies. Also helps with the generation of definitions for terms…wow this is extremely useful in SO many ways….

Now let’s talk about drugs and targets….

Try and mine for gene mentions in text…find a gene term and then use context to decide what it is we are talking about. Once a gene has been found, look for statistically significant co-occurrences. The results have been made available in GoGene. Again, one can do bibliometric trend analysis: genes are ranked by community interest.
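As a rough illustration of what "statistically significant co-occurrence" could mean here, the sketch below scores a gene/term pair with a one-sided hypergeometric test over document counts. The talk did not say which test is actually used, so the choice of test, the function names and the toy corpus are all assumptions.

```python
from math import comb


def count_mentions(docs, gene, term):
    """Count documents mentioning the gene, the term, and both.

    `docs` is a list of sets of concept labels, i.e. a hypothetical corpus
    that has already been annotated; real abstracts would need gene and
    term recognition first.
    """
    n_gene = sum(1 for d in docs if gene in d)
    n_term = sum(1 for d in docs if term in d)
    n_both = sum(1 for d in docs if gene in d and term in d)
    return n_gene, n_term, n_both


def cooccurrence_p_value(n_docs, n_gene, n_term, n_both):
    """Probability of at least `n_both` co-mentions arising by chance.

    Upper tail of the hypergeometric distribution: out of `n_docs`
    documents, `n_gene` mention the gene and `n_term` mention the term.
    """
    upper = min(n_gene, n_term)
    tail = sum(
        comb(n_gene, k) * comb(n_docs - n_gene, n_term - k)
        for k in range(n_both, upper + 1)
    )
    return tail / comb(n_docs, n_term)


# Toy corpus (invented): 3 documents mention both concepts, 7 mention neither.
corpus = [{"TP53", "apoptosis"}] * 3 + [{"unrelated"}] * 7
n_gene, n_term, n_both = count_mentions(corpus, "TP53", "apoptosis")
p = cooccurrence_p_value(len(corpus), n_gene, n_term, n_both)
```

A small `p` suggests the pair is mentioned together more often than chance would predict. In practice one would also correct for the large number of pairs being tested.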

From drugs to genes…what is the link between a gene and a drug? Use context profiles: what are the disease terms related to a given drug, and from there go on to genes.

Gotta stop blogging…enjoying this talk far too much…….


SWAT4LS2009 – Sonja Zillner: Towards the Ontology Based Classification of Lymphoma Patients using Semantic Image Annotation

(Again, these are notes as the talk happens)

This has to do with the Siemens Project Theseus Medico – Semantic Medical Image Understanding (towards flexible and scalable access to medical images)

Different images from many different sources: e.g. X-ray, MRI etc…use this and combine with treatment plans, patient data etc and integrate with external knowledge sources.

Example Clinical Query: “Show me the CT scans and records of patients with a lymph node enlargement in the neck area”. At the moment, this requires querying several disjoint systems.

Current Motivation: generic and flexible understanding of images is missing
Final Goal: Enhance medical image annotations by integrating clinical data with images
This talk: introduce a formal classification system for patients (ontological model)

Used Knowledge Sources:

Requirements of the Ontological Model

Now showing an example axiomatisation for the counting and location of lymphatic occurrences and discussing problems relating to extending existing ontologies….

Now talking about annotating patient records: typical problems are abbreviations, clinical codes, fragments of sentences etc…difficult for NLP people to deal with….

Now showing detailed patient example where application of their classification system led to reclassification of patient in terms of staging system.


SWAT4LS2009 – Keynote Alan Ruttenberg: Semantic Web Technology to Support Studying the Relation of HLA Structure Variation to Disease

(These are live-blogging notes from Alan’s keynote…so don’t expect any coherent text….use them as bullet points to follow the gist of the argument.)

The Science Commons:

  • a project of the Creative Commons
  • 6 people
  • Science Commons specializes CC to science
  • information discovery and re-use
  • establish legal clarity around data sharing and encourage automated attribution and provenance

Semantic Web for biologists, because it maximizes the value of scientific work by removing repeat experimentation.

ImmPort Semantic Integration Feasibility Project

  • ImmPort is an immunology database and analysis portal
  • Goals: meta-analysis
  • Question: how can ontology help data integration for data from many sources?

Using semantics to help integrate sequence features of HLA with disorders
Challenges:

  • Curation of sequence features
  • Linking to disorders
  • Associating allele sequences with peptide structures with nomenclature with secondary structure with human phenotype etc etc etc…

Talks about elements of representation

  • PDB structures translated into ontology-based representations
  • canonical MHC molecule instances constructed from IMGT
  • relate each residue in pdb to the canonical residue if exists
  • use existing ontologies
  • contact points between peptide and other chains computed using JMOL following IMGT. Represented as relation between residue instances.
  • Structural features have fiat parts

Connecting Allele Names to Disease Names

  • use papers as join factors: papers mention both disease and allele – noisy
  • use regex and rewrites applied to titles and abstracts to fish out links between diseases and alleles
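A minimal sketch of the regex idea in Python: pull allele-like strings and disease names out of a title or abstract and pair up everything that appears together. The pattern, the tiny disease list and the function name are illustrative assumptions, not the expressions or rewrites used in the project, and pairing by simple co-mention is exactly as noisy as the first bullet warns.

```python
import re

# Illustrative pattern for names such as "HLA-B*2705" or "HLA-DRB1*04:01".
# It is an assumption for this sketch and does not cover the full nomenclature.
ALLELE_RE = re.compile(r"HLA-[A-Z]{1,3}\d?\*\d{2}(?::?\d{2})*")

# Hypothetical disease vocabulary; a real system would draw on an ontology.
DISEASES = ["ankylosing spondylitis", "rheumatoid arthritis", "coeliac disease"]


def extract_links(text):
    """Return (allele, disease) pairs co-mentioned in one title or abstract."""
    lowered = text.lower()
    alleles = sorted(set(ALLELE_RE.findall(text)))
    diseases = [d for d in DISEASES if d in lowered]
    return [(allele, disease) for allele in alleles for disease in diseases]


title = "HLA-B*2705 is strongly associated with ankylosing spondylitis."
links = extract_links(title)
```

Each pair would then become a candidate link between an allele and a disease, with the paper acting as the join.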

Correspondence of molecules with allele structures is difficult.

  • use BLAST to find the closest allele match between PDB and allele sequence
  • every PDB and allele residue has a URI
  • relate matching molecules
  • relate each allele residue to the canonical allele
  • annotate various residues with various coordinate systems

This creates a massive map that can be navigated and queried. Example queries:

  • What autoimmune diseases can be indexed against a given allele?
  • What are the variant residues at a position?
  • Classification of amino acids
  • Show alleles perturbed at contacts of 1AGB

Summary of Progress to Date:
Elements of Approach in Place: Structure, Variation, transfer of annotation via alignment, information extraction from literature etc…

Nuts and Bolts:

  • Primary source
  • Local copy of source
  • Scripts transforms to RDF
  • Exports RDF Bundles
  • Get selected RDF Bundles and load into triple store
  • Parsers generate in-memory structures (Python, Java)
  • Template files are instructions to format these into OWL
  • Modeling is iteratively refined by editing templates
  • RDF loaded into Neurocommons, some amount of reasoning
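The parse-then-template steps above can be sketched roughly as follows, under stated assumptions: the records, the `http://example.org/` namespace and the template itself are invented for illustration, and the output is plain N-Triples rather than the OWL the project's templates produce.

```python
from string import Template
from urllib.parse import quote

# One template line per statement; editing this string changes the modelling
# without touching the parser (namespace and URI layout are placeholders).
TRIPLE = Template('<${base}allele/${subject}> <${base}prop/${prop}> "${value}" .')


def escape_literal(value):
    """Escape backslashes, double quotes and newlines for an N-Triples literal."""
    return str(value).replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def records_to_ntriples(records, base="http://example.org/"):
    """Turn parsed in-memory records (dicts with an 'id' key) into N-Triples."""
    lines = []
    for record in records:
        subject = quote(record["id"], safe="")
        for prop, value in record.items():
            if prop == "id":
                continue
            lines.append(
                TRIPLE.substitute(
                    base=base,
                    subject=subject,
                    prop=quote(prop, safe=""),
                    value=escape_literal(value),
                )
            )
    return "\n".join(lines)


# Hypothetical output of a parser run over a local copy of the source.
parsed = [{"id": "A*0201", "locus": "HLA-A"}]
ntriples = records_to_ntriples(parsed)
```

Keeping the modelling in the template is what makes the "iteratively refined by editing templates" step cheap: the parser output stays the same while the RDF shape changes.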

RDFHerd package management for data

neurocommons.org/bundles

Can we reduce the burden of data integration?

  • Too many people are doing data integration – wasting effort
  • Use web as platform
  • Too many ontologies…here’s the social pressure again

Challenges

  • have lawyers bless every bit of data integration
  • reasoning over triple stores
  • SPARQL over HTTP
  • Understand and exploit ontology and reasoning
  • Grow a software ecosystem like Firefox
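For readers who have not seen "SPARQL over HTTP" in practice: the SPARQL Protocol lets a query be sent as the `query` parameter of an ordinary HTTP GET. The sketch below only builds such a request and does not send it; the endpoint URL is a placeholder and the query is a generic example, neither taken from the talk.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_sparql_request(endpoint, query):
    """Build (but do not send) an HTTP GET request carrying a SPARQL query."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard JSON serialisation of SPARQL results.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


# Placeholder endpoint and a generic query, for illustration only.
request = build_sparql_request(
    "http://example.org/sparql",
    "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10",
)
```

Passing `request` to `urllib.request.urlopen` would execute the query against a real endpoint.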

Just a quick note from the International Conference on Biomedical Ontology

I am normally pretty noisy these days when it comes to blogging or tweeting conferences….but haven’t produced anything for ICBO so far. This has mainly to do with the fact that I am far too busy learning, thinking and absorbing people’s ideas and yet again realising just how far ahead biology/biomedicine is in thinking how to deal with data properly. In any case, all I wanted to say was that ICBO has its own friendfeed group where Robert Hoehndorf and others are doing a sterling job documenting the conference and discussing what is said during the tutorial sessions that are currently going on. The friendfeed page is here:

http://friendfeed.com/icbo

So do read along if you want to follow what is going on from afar!