Post details: Extended WordPress SIOC Exporter
2009-02-18
Claudia has extended our WordPress SIOC exporter and written up a nice report about it: Extended WP SIOC Exporter
She gave her permission to republish this report over here and would be happy to get your feedback.
I extended Uldis Bojars' WP SIOC exporter so that it also exports semantic metadata embedded in the HTML content of a posting.
Why is this useful?
Semantic metadata embedded in the HTML of a posting’s content can reveal more information about the topic of a posting (i.e. what the posting is about). Tools such as Structured Blogging (http://structuredblogging.org) or the Semantic Reblog prototype I am working on embed semantic metadata directly into the HTML content of postings.
At the moment, the WP SIOC Exporter relates the whole plain text of a post’s content to the resource representing the post via the sioc:content property. The HTML representation of the post’s content is related to that resource via the content:encoded property. Additionally, links are extracted from the post’s content and related to the post via the sioc:links_to property.
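To make this mapping concrete, here is a minimal sketch in Python (the exporter itself is a PHP plugin, so this is an illustration and not its actual code). The function name post_triples and the tuple representation of triples are my own assumptions; only the three properties come from the description above.

```python
from html.parser import HTMLParser

# Namespace URIs for the SIOC and RSS 1.0 content vocabularies.
SIOC = "http://rdfs.org/sioc/ns#"
CONTENT = "http://purl.org/rss/1.0/modules/content/"


class _PostParser(HTMLParser):
    """Collects the text nodes and the link targets of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        self.text_parts.append(data)


def post_triples(post_uri, html):
    """Hypothetical helper: returns (subject, predicate, object) tuples."""
    parser = _PostParser()
    parser.feed(html)
    parser.close()
    # Join the text nodes and collapse runs of whitespace.
    plain = " ".join("".join(parser.text_parts).split())
    triples = [
        (post_uri, SIOC + "content", plain),     # plain text of the post
        (post_uri, CONTENT + "encoded", html),   # original HTML
    ]
    # One sioc:links_to statement per link found in the content.
    triples += [(post_uri, SIOC + "links_to", href) for href in parser.links]
    return triples
```

Calling post_triples with a post URI and its HTML content gives one sioc:content statement, one content:encoded statement and one sioc:links_to statement per link. A real exporter would serialise these as RDF/XML instead of returning tuples.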
My extended version of the exporter also extracts images from a post’s content and relates them to the resource representing the post via the sioc:embeds property. If the image is a Flickr image, an rdfs:seeAlso link is generated that points to the RDF description of the image obtained via Masahide Kanzaki’s wrapper. Furthermore, semantic metadata embedded in the HTML content of a post is extracted and related to the post via a sioc:embeds property. I am not sure whether sioc:embeds is the best property for relating the embedded entities to their container post. Maybe something like sioc:topic would be better.
However, the URIs of the embedded resources are related to the post URI, and the parts of the resource description that have been embedded are also exposed. There are two reasons for this. First, if only parts of a resource’s description are reused or embedded in a post’s content, it might be interesting for machines to know which parts have been reused in the posting. Second, if the embedded resource is described via microformats, it might not have a URI that identifies it.
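The image handling could look roughly like the following Python sketch (again an illustration, not the PHP code of the exporter). The names image_triples, is_flickr and rdf_url_for are hypothetical. In particular, how the address of the RDF description is built from a Flickr image is not spelled out here, so the caller has to pass that in as a function. The seeAlso property is written with its full RDF Schema namespace (rdfs:seeAlso).

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

SIOC = "http://rdfs.org/sioc/ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


class _ImageParser(HTMLParser):
    """Collects the src attribute of every img element."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.images.append(src)


def is_flickr(url):
    """Assumption: an image counts as a Flickr image if its host is on flickr.com."""
    host = urlparse(url).hostname or ""
    return host == "flickr.com" or host.endswith(".flickr.com")


def image_triples(post_uri, html, rdf_url_for=None):
    """Hypothetical helper: rdf_url_for maps an image URL to the URL of
    its RDF description (e.g. one served by a wrapper service)."""
    parser = _ImageParser()
    parser.feed(html)
    parser.close()
    triples = []
    for src in parser.images:
        # Every image is related to the post via sioc:embeds.
        triples.append((post_uri, SIOC + "embeds", src))
        # Flickr images additionally get a seeAlso link.
        if rdf_url_for is not None and is_flickr(src):
            triples.append((src, RDFS + "seeAlso", rdf_url_for(src)))
    return triples
```

Passing the URL builder in as a parameter keeps the sketch independent of any particular wrapper service. Without it, only the sioc:embeds statements are produced.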
I use the ARC2 library (the version from 2009-02-12; it is important to use this version or a later one), which provides a parser for extracting different embedded semantic metadata formats such as RDFa, eRDF and microformats (MF). I modified the declaration of the toRDFXML method in the ARC2_Class.php file. That is why, at the moment, “my” version of ARC2_Class.php must be placed in the SIOC Exporter’s arc folder. Benjamin has already told me that the modification will be included in the next ARC2 version.
If you would like to test this version of the WP SIOC Exporter, download it here.
Any thoughts are of course welcome!

