8 Sep 2009

Conceptualizing Nepomuk

Posted by astromme

Update: Whoops, the WordPress cache was disabled. Not such a good idea, as I’ve now found out. All should be working again.

Nepomuk is a name that’s thrown around a lot. It’s claimed to be the answer to just about anything and at the same time is bashed for ‘not having a real effect’ on anything. With all of this buzz (for lack of a better term) surrounding the framework, I felt it might be time for me to figure out what’s going on myself. The following is a recollection of my learning process. It might not all work for you; heck, it might not even be the correct way to do things. But it’s how I pieced the documentation together to comprehend what’s happening behind the scenes and why it matters.

First off, some explanations. From what I’ve gathered, Nepomuk is a database of relations and a framework for accessing that database. On a conceptual level, the entries in the database, called triples, are simple: each one is “something, verb, something else”, for example “My Task, is due, tomorrow”. If I understand it correctly, these relations (the verb parts) create a graph of all of the things contained in the database, because every triple links one thing to another. Nepomuk/Soprano then allows you to search that graph and glean useful information from it.
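To make the triple idea a bit more concrete, here is a tiny sketch in plain Python. It is not the Nepomuk or Soprano API; the names `triples` and `match` and all of the data are made up for illustration. It only shows how a pile of “something, verb, something else” statements behaves like a graph you can walk.

```python
# A toy triple store: NOT the Nepomuk/Soprano API, just an illustration.
# Each statement is a (subject, predicate, object) tuple.
triples = [
    ("My Task", "is due", "tomorrow"),
    ("My Task", "is assigned to", "Sydney"),
    ("My Task", "is assigned to", "Bob"),
    ("Sydney", "is a", "Person"),
]


def match(store, subject=None, predicate=None, obj=None):
    """Return every triple that fits the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in store
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


# Who is the task assigned to?
assignees = [o for (_, _, o) in match(triples, "My Task", "is assigned to")]
print(assignees)

# Follow the links one step further: what do we know about each assignee?
for person in assignees:
    print(person, match(triples, subject=person))
```

The second loop is the “graph” part: the object of one triple (“Sydney”) is the subject of another, so you can hop from the task to the person and keep going.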

Ok, so what does that mean to the application developer? To the end user? Let me try and explain with an example. One thing that I’ve always disliked about tags is their simplicity. A tag is a piece of text, no more. The only thing a computer knows about it is that it is in this huge general concept of ‘tag’ and that it contains some characters that may or may not mean something to humans. So let’s assume, for example, that I have a task (todo) to go and finish my engineering lab report with my lab group, made up of Bob, Sydney and myself. If I only have tags at my disposal, the best I could do would be to create a task with a subject such as “Finish Engin Lab” and then tag it with “Engineering”, “Lab”, “Sydney”, “Bob”, “Room 312”. If I’m looking at the task it’s reasonably clear what I meant: I need to do it with those two people and it’s located in room 312. However, what does the computer know about the lab? Almost nothing. I might have other labs that I’ve tagged with Sydney and Bob (seeing as they are my lab partners for the entire semester), but I might also have photos of a recent trip to Sydney, Australia, also tagged with “Sydney”. Uh oh, now we have a problem. How can the computer figure out that my photos are not related to my lab? Nepomuk provides an answer to this.

Nepomuk contains a number of types that let you describe things more accurately. These types are included in things called Ontologies, but don’t worry about that just yet, just get familiar with the word. Anyway, instead of tagging my todo with just the text “Sydney”, I could tag it with the PIMO::Person “Sydney” (when I use PIMO I mean Personal Information Model Ontology, one of a handful of namespaces for Nepomuk). Now the computer knows that I’ve associated my lab with a person rather than just some text. Additionally, I could give it a location and so on. The real power in this is twofold. It opens up a whole new world for data visualization and it provides a good system for locating data: two things that we’re doing more and more of. Let’s start with the data visualization. Instead of just seeing a list of tasks and hovering over or selecting them to see a list of tags, I could design an application that shows the task and then shows the people assigned to that task (with their faces, email addresses, phone numbers, or site), the location it will take place in on a map (Marble!) and the due date on a calendar instead of just a series of numbers and letters.
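Here is a rough sketch of why the type matters, again in plain Python rather than anything Nepomuk-specific. The `Person`, `City` and `Item` classes are hypothetical stand-ins that only echo the idea of types like PIMO::Person, and the photo title is invented. The point is that two resources with the same name but different types are not the same thing.

```python
from dataclasses import dataclass, field


# Hypothetical stand-ins for ontology types; these class names only echo
# PIMO::Person and friends, they are not real Nepomuk bindings.
@dataclass(frozen=True)
class Person:
    name: str


@dataclass(frozen=True)
class City:
    name: str


@dataclass
class Item:
    title: str
    kind: str  # e.g. "task" or "photo"
    related: list = field(default_factory=list)


sydney_partner = Person("Sydney")
sydney_city = City("Sydney")

# Made-up example data.
items = [
    Item("Finish Engin Lab", "task", [sydney_partner, Person("Bob")]),
    Item("Opera House at dusk", "photo", [sydney_city]),
]


def related_to(things, resource):
    """Titles of all items linked to exactly this resource (type and name)."""
    return [t.title for t in things if resource in t.related]


# Comparing by text alone cannot tell the two Sydneys apart...
text_hits = [t.title for t in items if any(r.name == "Sydney" for r in t.related)]
print(text_hits)

# ...but comparing typed resources can.
print(related_to(items, sydney_partner))
print(related_to(items, sydney_city))
```

The text comparison returns both the lab and the photo, while the typed lookups return only the item that is linked to that particular kind of “Sydney”.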

This leads into finding my data later on. Search is becoming a critically important part of the desktop and web experience. However, in a lot of ways our search is still quite primitive. It’s a keyword search that looks through the text of documents and filenames. Imagine a search client that let you specify that you wanted to find PIMO::Task objects that are assigned to a certain PIMO::Person and are due before a certain time. This is already way more useful than just searching for “Sydney” because I won’t get any of my photo results, and I won’t get any results that are of the wrong type (i.e. not tasks).
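As a sketch of the difference, here is a toy comparison in plain Python between a keyword search and a typed search. Everything here is an assumption for illustration: the record layout, the function names `keyword_search` and `find_tasks`, and the sample data. A real Nepomuk query would run against the RDF store through Soprano, not over a list of dictionaries.

```python
from datetime import date

# Toy records; the "type" strings echo PIMO::Task etc., but this is plain
# Python, not a Nepomuk query. All data below is made up.
resources = [
    {"type": "Task", "title": "Finish Engin Lab",
     "assigned_to": ["Sydney", "Bob"], "due": date(2009, 9, 15)},
    {"type": "Task", "title": "Start next lab",
     "assigned_to": ["Sydney"], "due": date(2009, 10, 20)},
    {"type": "Photo", "title": "Harbour bridge", "tags": ["Sydney"]},
]


def keyword_search(store, word):
    """Old-style search: match the word anywhere in the record's text."""
    hits = []
    for r in store:
        text = [r["title"], *r.get("tags", []), *r.get("assigned_to", [])]
        if any(word.lower() in t.lower() for t in text):
            hits.append(r["title"])
    return hits


def find_tasks(store, person, due_before):
    """Typed search: only tasks, assigned to this person, due before a date."""
    return [
        r["title"]
        for r in store
        if r["type"] == "Task"
        and person in r["assigned_to"]
        and r["due"] < due_before
    ]


print(keyword_search(resources, "Sydney"))
print(find_tasks(resources, "Sydney", date(2009, 10, 1)))
```

The keyword search drags in the photo and the task that is not due yet; the typed search returns only the lab that is actually due before the cutoff.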

So, without touching any code I’ve tried to abstract away one use-case showing why Nepomuk might be useful. Hopefully it is a decent introduction, but if you’ve noticed errors or flaws in my understanding of Nepomuk, feel free to point them out in the comments. I’m still learning about it myself. In my next post I’ll go into some details on my first experiences actually working with the Nepomuk and Soprano APIs and how I created a (very) mini example program.

Tags: KDE4 , Nepomuk


7 Responses to “Conceptualizing Nepomuk”

Nice post. I’m looking forward to your post about your use of the APIs!

Socceroos

September 8th, 2009 at 11:42 pm

nice post thank you
I like that approach of discovering, trying yourself at it, building stuff around… while documenting all along for community benefits.

myselfhimself

September 9th, 2009 at 2:22 am

Well I don’t learn Ontologies. I want my task manager app to provide nepomuk with data on who is involved and where it happens.

I think all KDE4 apps should finally start providing this information. It has been nearly 4 releases and still no app is doing anything useful with nepomuk. You can’t rely on the user to do all this complicated stuff.

At this rate of adoption Nepomuk won’t be useful before KDE 4.16 in 2014.

I thought KDE agreed on the pillars, so why is nobody using them?

Tom

September 9th, 2009 at 3:52 am

I don’t expect end users to ‘learn’ or even know what an ontology is. But that doesn’t mean that they aren’t a useful tool for application developers.

You’re absolutely right, the applications should be providing nepomuk with as much data as they can. It’s not happening (much) now, and I’d love to see that change. It might be a good goal to have for 4.5 where almost every app is plugged into nepomuk giving some data.

astromme

September 9th, 2009 at 10:52 am

IMHO nepomuk is a solution searching for problems. On AmigaOS 3.1 (about 12 years ago) almost every application used the filesystem’s “comment field” for tagging files. For example, if it was a JPEG attachment saved from a mail, the comment held the sender’s name and email address. If it was a downloaded file, it held the source URL it came from, etc. OK, it wasn’t searchable, but there was some really useful information in there, and this old technology is still found nowhere else today.

simca

September 9th, 2009 at 11:00 am

The user should not have to choose that they want to search for a contact when entering the contact’s name; that would just take too long and would be avoided as often as possible, so if no pictures are tagged with Sydney it would be avoided. Instead it should work on a trial-and-error basis, like when you explain something to someone: you talk about “Sydney” and later say “no, not this Sydney, that Sydney who knows this person …”:

Enter “Sydney”. Now, as you’ve mentioned, you are provided with anything that has “Sydney” anywhere, be it a file name, a tag, etc.

Now the search-dialog should provide you with something like a sidebar, where all occurrences of Sydney are mentioned, something like “Contact”, “Files”, “Tag” …
Choosing “Contact” would provide you with a link to that contact and everything that is associated with that contact, so choosing “Task” in the sidebar now would show you all tasks that are associated with “Sydney”.

It could look similar to the new sorting system that has been created for Amarok with breadcrumbs:
(Picture of person or contact icon) Sydney > (Task Icon) Tasks

So baseline is

mat69

September 10th, 2009 at 8:59 am

Oops, forgot the last paragraph.

Baseline is that users are lazy and thus will most likely only enter a keyword. The same is true for tagging: a lot of users won’t use it that much.

Imo the only way for Nepomuk to succeed is automatic getting of semantic information (a lot of programs use that already), making it easy for the user to associate information to a file, like associating a place and a contact to a task and then making it easy to find the stuff.

mat69

September 10th, 2009 at 9:08 am