2007-10-04

FOAF for Social Network Portability

When Dopplr announced that it could import your social network from other sites, the single most requested feature in the comments was FOAF import (with CSV import being second).

Note: Please add your feedback. Comments are working, but may be held for moderation and will appear after being approved.

So - what’s the use of FOAF for Social Network Portability?

1. Existing FOAF data

LiveJournal is one of the largest sources of FOAF data, and every user automatically gets a FOAF profile (http://username.livejournal.com/data/foaf/; the same works for communities). This creates a large amount of data that other applications can make use of. Exporting FOAF is a simple task, and other community sites may start exporting it should they see a good reason to do so (which is what this article aims to address).

You don’t have to look at “raw” data, though. Let a computer do the job and take a look at a more human-friendly rendering of FOAF (and navigate the data by following links indicated with arrows) using one of the many RDF browsers. This will show you what data are in there, but can still be quite nerdy.

In the end it will be other applications and sites like Dopplr that can use FOAF and other structured data to provide better services to their users. E.g., the RSS reader in Apple Safari uses FOAF to display a list of a person’s friends when browsing her LiveJournal feed (a screenshot here).

P.S. If you want to be aware of web pages containing FOAF, SIOC and DOAP machine-readable information while browsing the web, the Semantic Radar extension for Firefox may be for you.

2. What about hCard + XFN?

Many may think that vCard or hCard + XFN is the only / best choice for this. As formats for representing structured data they are a good thing, but they are only one option and not always the best solution.

Most of the entries in a list of Services with hCard+XFN supporting friends lists start with “login and …”.

A public hCard+XFN usually will not contain enough data to identify your friends. It would need to provide information for linking their identities across sites, such as an email address. hCard expresses it in clear text and this is something that spammers will like but your friends may not.

You could require a user to enter a login name and a password to access a private profile with all the required details. (An alternative is to only use public information, but this is mainly limited to exact nickname matches.) Will you always trust a site enough to give it passwords for all other sites you want to port your data from? Really?

Also, once the sites are talking directly, they don’t necessarily need an HTML-based data format and can just as well use the APIs provided by these sites.

3. Useful properties of FOAF

FOAF has some useful properties that make it particularly interesting for social network portability.

FOAF was created with identification of objects in mind. It is built on RDF, which is a generic format for linked data on the web. As such, it gives us some flexibility in how pieces of data can be distributed across the web. We’ll use that later. (But enough about RDF.)

Take a look at any LiveJournal FOAF profile (or a human-readable rendering of it). You will notice some basic information about the page (title, feed, …) and quite a lot of information about its author. This includes core FOAF properties (name, homepage, birth date, image, interest(s), …) and LiveJournal extensions to it (city, country, school, bio, …). foaf:knows links people to their friends.

These vocabularies may use different namespaces, but FOAF + RDF allows us to use them together freely and make sense of this data. What matters is that this is rich data that applications can work with.
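
To make the structure concrete, here is a minimal sketch of reading such a profile with Python's standard library. The sample document, the `ex:` extension namespace and the `read_profile` helper are all made up for illustration and are not LiveJournal's actual output. Treating RDF/XML as plain XML is a shortcut that only works for this one serialization shape; a real RDF parser would be more robust.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    # Hypothetical extension vocabulary, standing in for site-specific properties.
    "ex": "http://example.org/profile-extensions#",
}
RDF_RESOURCE = "{%s}resource" % NS["rdf"]

# Hypothetical profile: core FOAF properties, one extension property, one friend.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:ex="http://example.org/profile-extensions#">
  <foaf:Person>
    <foaf:nick>alice</foaf:nick>
    <foaf:name>Alice Example</foaf:name>
    <ex:city>Riga</ex:city>
    <foaf:knows>
      <foaf:Person>
        <foaf:nick>bob</foaf:nick>
        <rdfs:seeAlso rdf:resource="http://bob.livejournal.com/data/foaf/"/>
      </foaf:Person>
    </foaf:knows>
  </foaf:Person>
</rdf:RDF>"""


def read_profile(rdf_xml):
    """Return the profile owner's basic data and a list of friends."""
    root = ET.fromstring(rdf_xml)
    owner = root.find("foaf:Person", NS)
    friends = []
    for person in owner.findall("foaf:knows/foaf:Person", NS):
        see_also = person.find("rdfs:seeAlso", NS)
        friends.append({
            "nick": person.findtext("foaf:nick", namespaces=NS),
            # None when the friend entry carries no hash.
            "sha1": person.findtext("foaf:mbox_sha1sum", namespaces=NS),
            "see_also": see_also.get(RDF_RESOURCE) if see_also is not None else None,
        })
    return {
        "nick": owner.findtext("foaf:nick", namespaces=NS),
        "name": owner.findtext("foaf:name", namespaces=NS),
        "city": owner.findtext("ex:city", namespaces=NS),
        "friends": friends,
    }
```

Properties from both namespaces sit side by side in the same `foaf:Person` element and are read in exactly the same way.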

4. Identifying objects in FOAF

We need a way to identify people (your friends) based on the public FOAF profile. While you can express an email address in FOAF we don’t want to do that on a public web page. Take a look at foaf:mbox_sha1sum instead.

It is a hash generated from a person’s email address using the SHA1 algorithm (per the FOAF specification, it is computed over the mailto: URI of the address). The mapping is one-way: anyone who knows the address can compute its hash, but there is no practical way to recover the address from the hash.

This makes foaf:mbox_sha1sum very useful when you need to identify a person by her email address, but do not want to put this email on the web in clear text. Many LiveJournal profiles (but not all) contain a foaf:mbox_sha1sum. This information can be used by other sites to show you if your friends are already registered there. All they have to do is compare the email SHA1 hashes of their registered users (which can be easily calculated) with those of your friends.
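
The comparison can be sketched in a few lines of Python. The function names and the example addresses below are made up for illustration. The sketch hashes the mailto: URI as it is given, without changing case, so it assumes both sites store addresses in the same form.

```python
import hashlib


def mbox_sha1sum(email):
    """Hex SHA1 digest of the mailto: URI for an email address."""
    uri = "mailto:" + email.strip()
    return hashlib.sha1(uri.encode("utf-8")).hexdigest()


def find_registered_friends(friend_hashes, registered_emails):
    """Return the registered addresses whose hash appears among the friends' hashes."""
    # Hypothetical in-memory index; a real site would query its user database.
    index = {mbox_sha1sum(email): email for email in registered_emails}
    return [index[h] for h in friend_hashes if h in index]
```

The importing site only ever sees hashes for your friends. It learns nothing about a friend unless that friend has already registered there with the same address.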

P.S. There are other ways to identify a person (e.g., via a homepage URL), but let’s concentrate on email and its hash now.

5. What’s stopping Dopplr?

If people are asking for FOAF import and if this format is worth considering, what is stopping Dopplr and other social media / network sites?

It would be best if they shared their insights into what the real problems are. While waiting for comments, here’s what I think: the issue is practical data access. It makes sense to start with the large sources of FOAF data and look at how they are structured.

LiveJournal is probably the largest of them. Notice that an LJ FOAF profile contains a lot of information about the person, but not much about their friends. Fear not - the reference to every friend contains a link (rdfs:seeAlso) to that friend’s full FOAF profile, and you can follow it to retrieve all the details required.

And that is the problem. To check all your friends, a site needs to make (n+1) HTTP requests, where n is the number of your friends: one for your profile and one for each friend. LiveJournal’s policy for bots requires them to make no more than 5 requests per second. Making that many requests takes time and bandwidth, and may be something that the sites you are migrating to want to avoid.
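
A rough sketch of what this costs, in Python. With 200 friends a site needs 201 requests, which at 5 requests per second takes at least about 40 seconds for a single user. The function names are made up, and `fetch` is a placeholder for whatever HTTP client the importing site uses.

```python
import time

MAX_REQUESTS_PER_SECOND = 5  # the limit for bots mentioned above


def requests_needed(friend_count):
    """One request for your own profile plus one per friend."""
    return friend_count + 1


def min_crawl_seconds(friend_count, rate=MAX_REQUESTS_PER_SECOND):
    """Lower bound on crawl time when staying within the rate limit."""
    return requests_needed(friend_count) / rate


def crawl_friend_profiles(profile_url, friend_urls, fetch,
                          rate=MAX_REQUESTS_PER_SECOND, sleep=time.sleep):
    """Fetch the profile and every friend's profile, pausing between requests.

    `fetch` is a caller-supplied function (hypothetical here) that takes a URL
    and returns the document. Pausing 1/rate seconds after each request keeps
    the crawler at or below `rate` requests per second.
    """
    delay = 1.0 / rate
    documents = {}
    for url in [profile_url, *friend_urls]:
        documents[url] = fetch(url)
        sleep(delay)
    return documents
```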

6. Solution

How to make LiveJournal FOAF data more useful?

Let’s just take all the properties needed (foaf:mbox_sha1sum in this case) from your friends’ FOAF profiles and copy them to the places where those friends are referenced in your FOAF profile. Remember the flexibility of FOAF and RDF? It makes this perfectly valid - just copy’n’paste.

This would require a simple change on LiveJournal’s side, but would make using this data much more efficient - just 1 HTTP request would then be needed to move the network of all your friends to a new site.
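
As an illustration of the idea, here is a minimal Python sketch that copies each friend's hash into the friend entry of a profile. The sample document, the `inline_friend_hashes` helper and the lookup table keyed by nickname are all made up; how LiveJournal would actually do this on its side is not something this post can say.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
FOAF = "http://xmlns.com/foaf/0.1/"

# Keep readable prefixes when the document is written back out.
ET.register_namespace("rdf", RDF)
ET.register_namespace("rdfs", RDFS)
ET.register_namespace("foaf", FOAF)

# Hypothetical profile in which the friend entry has only a nick and a link.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person>
    <foaf:nick>alice</foaf:nick>
    <foaf:knows>
      <foaf:Person>
        <foaf:nick>bob</foaf:nick>
        <rdfs:seeAlso rdf:resource="http://bob.livejournal.com/data/foaf/"/>
      </foaf:Person>
    </foaf:knows>
  </foaf:Person>
</rdf:RDF>"""


def inline_friend_hashes(rdf_xml, hashes_by_nick):
    """Add foaf:mbox_sha1sum to every friend entry that has a known hash."""
    root = ET.fromstring(rdf_xml)
    for person in root.iterfind(f".//{{{FOAF}}}knows/{{{FOAF}}}Person"):
        nick = person.findtext(f"{{{FOAF}}}nick")
        sha1 = hashes_by_nick.get(nick)
        already_there = person.find(f"{{{FOAF}}}mbox_sha1sum") is not None
        if sha1 and not already_there:
            ET.SubElement(person, f"{{{FOAF}}}mbox_sha1sum").text = sha1
    return ET.tostring(root, encoding="unicode")
```

The rdfs:seeAlso link stays in place, so a consumer that wants the full profile of a friend can still follow it.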

What about other FOAF data sources? Many already contain the information needed to identify their contacts. For example, every person in the FOAF profile of Tim Berners-Lee has an email address, its SHA1 hash or a unique identifier (a URI, which is another way to identify objects) assigned. These sources are rich enough for our needs, but may not provide the critical mass needed to get the ball rolling.

That’s why we need to get better FOAF data from large social media sites. And to get a clear understanding of how FOAF can be best used for social network portability.

Related links

“Thoughts on the Social Graph” by Brad Fitzpatrick

… will add suggested links here …

Comments

I would love to hear from you - is this information interesting or useful? Is it all wrong, perhaps? Do you want to add something or ask a question? Go ahead! :)

P.S. This is not an attack on microformats. Some of the things described (e.g., a hash of the email address) can be easily added to them, if needed. Data can also be converted from one data format to another. The goal of this article is to provide some information about using FOAF for social network portability. I hope you will find that it has some power we can use.

Comments, Pingbacks:

Comment from: cu [Visitor] · http://monkeyseemonkeydo.lv
This post really inspires me to look around for where to apply FOAF goodness - both acting as FOAF content creator and as consumer. I currently use FOAF in my klab.lv (livejournal clone) friendship change monitor (http://monkeyseemonkeydo.lv/ieraksts/1497/cibin-draudzibu-izmainu-rss). But maybe I can do more with this data, like mash it together with other FOAF sources e.g. blogiem.lv or something. Brainstorm brainstorm.
2007-10-04 @ 17:40

Comment from: Tom Heath [Visitor] · http://kmi.open.ac.uk/people/tom/html
Hey captsolo. Nice post, and great to see someone keeping this issue high on the agenda :D I have a few comments, so will take them one at a time.

First off I think it's great that LiveJournal does FOAF export. However, there are a few shortcomings of the FOAF export on LJ (no URIs for people/loads of Blank Nodes, no use of foaf:PersonalProfileDocument to link the FOAF file itself to the person it's about), which means that something like MyOpera makes a better example.

Secondly, I really strongly disagree with your suggestion in 6. that we replicate other people's data in our own FOAF files. This is totally anti-Web, and makes for a data management headache for you if I change my data. In contrast I would say the solution is that we build apps that are better at aggregating, managing and using data from across the Web, and do so in ways that don't bring down the sites publishing the data. Alternatively, middle-people that can do this for us may be the solution.

Thirdly, (what restraint!), the shameless plug... If you're looking for a social media site that both consumes and publishes FOAF data then Revyu at http://revyu.com/ is your answer. OK, it may not be the size of LiveJournal, or have the startup funding of Dopplr, but it allows you to create a "network" which is published in FOAF, consumes your existing FOAF file to supplement your profile page with more data about you, and does all the right Linked Data stuff with URIs etc. It's not the ultimate demo of what can be done with portable social networks (we know we could do so much more if time allowed), but it's a start in the right direction. My Revyu network is fast becoming at least as rich as that defined in my hand-crafted FOAF file.

Cheers, Tom.
2007-10-04 @ 18:03

Comment from: Julian Bond [Visitor] · http://www.ecademy.com
Ecademy has huge quantities of FOAF available. Here's mine
http://www.ecademy.com/module.php?mod=network&op=foafrdf&uid=1
2007-10-05 @ 17:16

Comment from: Stephanie Booth [Visitor] · http://climbtothestars.org
I'm a bit wiped out at the moment, so I didn't manage to follow all the technicalities. One thing I'm concerned about when it comes to Portable Social Networks is that they should also be structured: http://climbtothestars.org/archives/2007/08/16/we-need-structured-portable-social-networks-spsn/.

This structure should be embedded in the portable information about the network in some way, and:

- be highly personalized and flexible (tagging people, basically)
- be private (for reasons explained in that post -- it's too sensitive from a human relations point of view to show)

Does what you're describing allow for that?
2007-10-09 @ 18:46

Comment from: Ozgur Cem Sen [Visitor] · http://SemanticWebFeeds.com/
Thanks for this resourceful site. Recently I've been trolling search engines on good semantic web related sites so that I can put together a neat list of feeds from them.

The site that I've been working on should appear next to my name.

I don't want to pollute the comments with a link to my site, but at least I am not selling ED pills, right :)

Kindest regards

2007-10-25 @ 08:16

Comment from: Nouveau Riche [Visitor] · http://nouveauricheuniversity.blogharbor.com
Although some people may still think that internet offers some degree of privacy through some sites that are said to offer privacy and anonymity this thing is a lie and a big one... if you search google, you can easily find ways to get your hand on private information, tutorials, software, hardware etc.
2008-03-14 @ 22:00