8 Sep 2009
Nepomuk is a name that’s thrown around a lot. It’s claimed to be the answer to just about anything and at the same time is bashed for ‘not having a real effect’ on anything. With all of this buzz (for lack of a better term) surrounding the framework, I felt it might be time for me to figure out what’s going on myself. The following is a recollection of my learning process. It might not all work for you, heck it might not even be the correct way to do things. But it’s how I pieced the documentation together to comprehend what’s happening behind the scenes and why it matters.
First off, some explanations. From what I’ve gathered, Nepomuk is a database of relations and a framework for accessing that database. On a conceptual level, the entries in the database (the triples) are simple: each one is “something - verb - something else”, for example “My Task - is due - tomorrow”. If I understand it correctly, these relations (the verb parts) create a graph of all of the things contained in the database, because they link all of the subjects together. Nepomuk/Soprano then allows you to search that graph and glean useful information from it.
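To make the triple idea a bit more concrete, here is a minimal sketch in plain Python. It does not use the Nepomuk or Soprano APIs at all; the triples, the verb names and the helper functions `build_graph` and `reachable` are all made up for illustration. It only shows how a pile of “something - verb - something else” statements ends up forming a graph you can walk.

```python
from collections import defaultdict

# Each triple is (subject, verb, object). These values are illustrative
# assumptions, not data or vocabulary taken from Nepomuk itself.
triples = [
    ("My Task", "is due", "tomorrow"),
    ("My Task", "is assigned to", "Sydney"),
    ("Sydney", "is a member of", "Lab Group"),
]


def build_graph(triples):
    """Index the triples as a graph: subject -> list of (verb, object)."""
    graph = defaultdict(list)
    for subject, verb, obj in triples:
        graph[subject].append((verb, obj))
    return graph


def reachable(graph, start):
    """Return everything linked to `start`, directly or through other nodes."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for _verb, obj in graph.get(node, []):
            if obj not in seen:
                seen.add(obj)
                stack.append(obj)
    return seen


graph = build_graph(triples)
print(sorted(reachable(graph, "My Task")))
# prints ['Lab Group', 'Sydney', 'tomorrow']
```

Starting from the task, the walk reaches the lab group even though no single triple mentions both, which is roughly what I mean by the relations linking everything together.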
Ok, so what does that mean to the application developer? To the end user? Let me try and explain with an example. One thing that I’ve always disliked about tags is their simplicity. A tag is a piece of text, no more. The only thing a computer knows about it is that it is in this huge general concept of ‘tag’ and that it contains some characters that may or may not mean something to humans. So let’s assume, for example, that I have a task (todo) to go and finish my engineering lab report with my lab group, which consists of Bob, Sydney and myself. If I only have tags at my disposal, the best I could do would be to create a task with a subject such as “Finish Engin Lab” and then tag it with “Engineering”, “Lab”, “Sydney”, “Bob”, “Room 312”. If I’m looking at the task it’s reasonably clear what I meant: I need to do it with those two people and it’s located in room 312. However, what does the computer know about the lab? Almost nothing. I might have other labs that I’ve tagged with Sydney and Bob (seeing as they are my lab partners for the entire semester), but I might also have photos of a recent trip to Sydney, Australia, also tagged with “Sydney”. Uh oh, now we have a problem. How can the computer figure out that my photos are not related to my lab? Nepomuk provides an answer to this.
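The tag problem is easy to reproduce in a few lines of Python. This is only a sketch: the photo item and its “Holiday” tag are hypothetical, and `find_by_tag` is a helper I made up for the occasion. The point is just that a plain-text tag gives the computer nothing to tell the two Sydneys apart.

```python
# Items mapped to their plain-text tags. The photo entry and the "Holiday"
# tag are hypothetical, added only to show the collision.
items = {
    "Finish Engin Lab": {"Engineering", "Lab", "Sydney", "Bob", "Room 312"},
    "Opera House photo": {"Sydney", "Holiday"},
}


def find_by_tag(items, tag):
    """Return the names of all items carrying the exact tag text."""
    return sorted(name for name, tags in items.items() if tag in tags)


matches = find_by_tag(items, "Sydney")
print(matches)
# prints ['Finish Engin Lab', 'Opera House photo']
```

Both the lab task and the holiday photo come back, because as far as the lookup is concerned “Sydney” is six characters and nothing more.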
Nepomuk contains a number of types that let you describe things more accurately. These types are included in things called Ontologies, but don’t worry about that just yet, just get familiar with the word. Anyway, instead of tagging my todo with just the text “Sydney”, I could tag it with the PIMO::Person “Sydney” (when I use PIMO I mean Personal Information Model Ontology, one of a handful of namespaces for Nepomuk). Now the computer knows that I’ve associated my lab with a person rather than just some text. Additionally, I could give it a location and so on. The real power in this is twofold. It opens up a whole new world for data visualization, and it provides a good system for locating data, two things that we’re doing more and more of. Let’s start with the data visualization. Instead of just seeing a list of tasks and hovering over/selecting them to see a list of tags, I could design an application that shows the task and then shows the people assigned to that task (and their faces or email addresses/phone numbers), the location it will take place in on the map (Marble!) and the due date on a calendar instead of just a series of numbers/letters.
This leads into finding my data later on. Search is becoming a critically important part of the desktop and web experience. However, in a lot of ways our search is still quite primitive. It’s a keyword search that looks through the text of documents and filenames. Imagine a search client that let you specify that you wanted to find PIMO::Task objects that are assigned to a certain PIMO::Person and are due before a certain time. This is already way more useful than just searching for “Sydney” because I won’t get any of my photo results, and I won’t get any results that are of the wrong type (i.e. not tasks).
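Here is a rough sketch of what such a typed search could look like, again in plain Python rather than with the real Nepomuk or Soprano APIs. The classes `Person`, `Place`, `Task` and `Photo`, the `tasks_for` helper, the extra “Buy groceries” task and all of the dates are assumptions of mine, loosely modelled on the PIMO idea of typed things; they are not the actual ontology classes.

```python
from dataclasses import dataclass, field
from datetime import date


# Hypothetical stand-ins for typed resources; not the real PIMO classes.
@dataclass(frozen=True)
class Person:
    name: str


@dataclass(frozen=True)
class Place:
    name: str


@dataclass
class Task:
    title: str
    due: date
    assignees: list = field(default_factory=list)


@dataclass
class Photo:
    title: str
    taken_at: Place


def tasks_for(resources, person, due_before):
    """Find Task objects assigned to `person` and due before `due_before`."""
    return [
        r for r in resources
        if isinstance(r, Task) and person in r.assignees and r.due < due_before
    ]


# Two different things that happen to share the text "Sydney".
sydney_person = Person("Sydney")
sydney_city = Place("Sydney")

# Made-up sample data.
resources = [
    Task("Finish Engin Lab", date(2009, 9, 15), [sydney_person, Person("Bob")]),
    Task("Buy groceries", date(2009, 9, 10), []),
    Photo("Harbour", sydney_city),
]

found = tasks_for(resources, sydney_person, date(2009, 10, 1))
print([t.title for t in found])
# prints ['Finish Engin Lab']
```

The photo never shows up, because it is neither a task nor linked to the person, even though the word “Sydney” is attached to it.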
So, without touching any code, I’ve tried to walk through one use case showing why Nepomuk might be useful. Hopefully it is a decent introduction, but if you’ve noticed errors or flaws in my understanding of Nepomuk, feel free to point them out in the comments. I’m still learning about it myself. In my next post I’ll go into some details on my first experiences with actually working with the Nepomuk and Soprano APIs and how I created a (very) mini example program.