about semantic web, software architecture and life in general

This is an archived version of my blog.
See also my homepage.

Post details: SIOC, SPARQL and TimeLine

2006-07-14

13:21:28, Categories: Semantic Web

Putting Blogs on TimeLine.

Following the release of the excellent SIMILE TimeLine visualisation tool, here is what can be done with some of the SIOC data that's out there:


Fig 1. Blog posts on a TimeLine

Different icons show what blog the post belongs to (Christoph Görn, Harry Chen, John Breslin from DERI, ...).
If your posts do not have an icon and show a blue bullet instead, please let me know what icon to use.

You can go back in history (see below). Notice the posts from Danny Ayers (watch out for cats!) appearing - they date from the time when he was using WordPress + the WP SIOC plugin.


Fig 2. December 2005 on the same TimeLine.

Taking it to extremes: the SIOC TimeLine from the year 1997, showing the origins of John Breslin's blog.

Technical Details:

  1. Took the master list for the crawler from the SIOC-enabled sites list and converted it to RDF using Alex Passant's Wiki->RDF script.
  2. Gathered the RDF data (using my crawler.py) and added Danny's SIOC data from the archive.
  3. Stored the data in a Joseki 3.0 beta server (the SPARQL endpoint is here).
  4. Ran a SPARQL query (script), then used XSLT (sparql-tline.xslt) to convert the SPARQL XML result set to the TimeLine XML format (converted XML file). Icons for blogs are also added via XSLT (adding statements to the RDF store is another option).
  5. ... and here is the result - TimeLine doing its AJAX-y magic. :)
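
For readers who prefer Python to XSLT, here is a minimal sketch of the conversion in step 4. It is not the sparql-tline.xslt stylesheet itself, just one way the same idea could look. The variable names (?post, ?title, ?created, ?blog), the icon mapping and the sample data are assumptions made for illustration. The output is meant to follow TimeLine's XML event format (a data element holding event elements with start, title, link and icon attributes), with dates assumed to be ISO 8601 strings.

```python
import xml.etree.ElementTree as ET

# Namespace of the W3C SPARQL Query Results XML Format.
SPARQL_NS = "{http://www.w3.org/2005/sparql-results#}"

# Hypothetical result set: one dated post and one post without a date.
SAMPLE = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head>
    <variable name="post"/><variable name="title"/>
    <variable name="created"/><variable name="blog"/>
  </head>
  <results>
    <result>
      <binding name="post"><uri>http://example.org/blog/1</uri></binding>
      <binding name="title"><literal>Hello SIOC</literal></binding>
      <binding name="created"><literal>2006-07-14T13:21:28Z</literal></binding>
      <binding name="blog"><uri>http://example.org/blog</uri></binding>
    </result>
    <result>
      <binding name="post"><uri>http://example.org/blog/2</uri></binding>
      <binding name="title"><literal>Undated post</literal></binding>
    </result>
  </results>
</sparql>"""


def sparql_xml_to_timeline(sparql_xml, icons=None, default_icon=None):
    """Turn a SPARQL XML result set into a TimeLine XML event document."""
    icons = icons or {}
    root = ET.fromstring(sparql_xml)
    # Assumption: dates are ISO 8601, so the event source is flagged as such.
    data = ET.Element("data", {"date-time-format": "iso8601"})
    for result in root.iter(SPARQL_NS + "result"):
        # Map each variable name to the text of its <uri> or <literal> child.
        row = {}
        for binding in result.findall(SPARQL_NS + "binding"):
            if len(binding):
                row[binding.get("name")] = (binding[0].text or "").strip()
        if "created" not in row:
            continue  # an event without a date cannot be placed on the band
        attrs = {"start": row["created"],
                 "title": row.get("title", "(untitled)")}
        if "post" in row:
            attrs["link"] = row["post"]
        # Pick the icon by blog URI, like the icon lookup done in the XSLT.
        icon = icons.get(row.get("blog"), default_icon)
        if icon:
            attrs["icon"] = icon
        ET.SubElement(data, "event", attrs)
    return ET.tostring(data, encoding="unicode")


if __name__ == "__main__":
    blog_icons = {"http://example.org/blog": "icons/example.png"}
    print(sparql_xml_to_timeline(SAMPLE, icons=blog_icons))
```

Keeping the icon mapping in a plain dictionary mirrors the "icons via XSLT" approach; storing icon statements in the RDF store and selecting them in the query would make the dictionary unnecessary.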

For more info, read How to Create Timelines on the SIMILE site.

See Also:
- Danny Ayers: SPARQL Timeline ps.
- Alex Passant: SPARQL/JSON into Timeline

Notes:

There are some things that need to be improved.

  • Visual bugs - icons are getting cropped, and text labels are wrapped to the next line and cropped at the bottom. Making icons smaller makes them unrecognizable, so that is not a solution (unless there's an icon graphics wizard who can teach me how to make good, small icons).
  • Visual appearance - after the bugs are fixed there are improvements that can be made - e.g. make the posts (small vertical lines) appear on the monthly band in one line, creating a "feel" for the density of posts in time; ...
  • Performance / data volume - the amount of data, including the full text of blog posts, can be quite large (it is amazing how much a small group of people can write in a couple of months). Loading data on demand can be a solution - both for the metadata of posts that are outside the viewable area and for post contents.
  • Reliability - Joseki 3.0 is still in beta and crashes from time to time. It is not a problem now since the data for TimeLine are queried only once, in "attended" mode. But it will be a problem when loading data on demand. If you notice the store is down, please let me (captsolo @ gmail.com) know.
  • Dynamic updates - currently data are crawled and stored in one batch. How to do incremental crawling of new SIOC data?
  • Use SIOC - the timeline as it is now does not use many of the relations available in SIOC. There is more information in the data store - posts linking to comments, sites pointing to the posts they contain, topics of posts, etc. It would be good to visualize this richer data, although I am not 100% sure a timeline is the best fit for that.
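
As a rough illustration of the "loading data on demand" idea, the sketch below builds a SPARQL query restricted to one time window, so that only posts inside the visible part of the timeline would be fetched. It is not the query script used for this post. The property choices (dc:title, dcterms:created, sioc:has_container) and the function name window_query are assumptions, and so is the idea that creation dates are stored as ISO 8601 strings in UTC, which is what makes the plain string comparison in the FILTER work.

```python
from datetime import datetime

# Assumed vocabulary: SIOC posts with a Dublin Core title and creation date.
QUERY_TEMPLATE = """PREFIX sioc: <http://rdfs.org/sioc/ns#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT ?post ?title ?created ?blog
WHERE {{
  ?post a sioc:Post ;
        dc:title ?title ;
        dcterms:created ?created ;
        sioc:has_container ?blog .
  FILTER (str(?created) >= "{start}" && str(?created) < "{end}")
}}
ORDER BY ?created"""

ISO_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def window_query(start, end):
    """Build a query for posts created in the half-open window [start, end).

    Both arguments are datetime objects that are treated as UTC.
    """
    if start >= end:
        raise ValueError("start must be earlier than end")
    return QUERY_TEMPLATE.format(start=start.strftime(ISO_FORMAT),
                                 end=end.strftime(ISO_FORMAT))


if __name__ == "__main__":
    # Example: everything posted in July 2006.
    print(window_query(datetime(2006, 7, 1), datetime(2006, 8, 1)))
```

Comparing str(?created) lexically sidesteps the question of whether the dates are typed as xsd:dateTime in the store, at the price of requiring one consistent timestamp format across all crawled sites.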

Comments, Pingbacks:

Comment from: captsolo [Member]
See "Timeline feedback and questions" for David Huynh's suggestions on how to solve some of the problems described here.
2006-07-25 @ 21:05
