Another year, another Google Summer of Code, another 4 (yes, four!) semantic desktop projects. It is amazing. After two very successful projects in 2009 we now take it one step further with three Nepomuk projects and one Strigi project. Without further ado I give you the Nepomuk Google Summer of Code 2010 projects:

Metadata Backup, Sync and Sharing by Vishesh Handa

Ever since we started to create meta data on the desktop (by this I mean tags, ratings, and relations between resources that cannot be recreated easily) we have also needed a way to back up and sync this data. So far this area is lacking in Nepomuk. Vishesh sets out to change this situation and develop ways to sync meta data between different clients (imagine syncing your laptop with the desktop computer or the phone) or simply to back it up. This does not simply mean coding some backup GUI: it actually includes changes on the ontology level, that is, on the level of the data itself. When syncing data between two clients (or between a client and a backup; the principle is the same) the two most complicated matters are: 1. identifying the resources which need to be merged on both ends and 2. deciding which data needs to be removed and which needs to be added.
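
To make those two matters a bit more concrete, here is a minimal sketch in Python. It is not Vishesh's design and does not use the Nepomuk API. The store layout (a dict of resource id to a set of property/value pairs), the `IDENTIFYING` set, and all function names are assumptions made up purely for illustration.

```python
# Hypothetical model: a store maps a local resource id to a set of
# (property, value) pairs. Nothing here is the real Nepomuk data model.
IDENTIFYING = {"url"}  # assumption: properties that identify a resource across clients


def fingerprint(props):
    """Return the identifying (property, value) pairs of a resource."""
    return frozenset((p, v) for p, v in props if p in IDENTIFYING)


def match_resources(source, target):
    """Step 1: map source resource ids to target ids with the same fingerprint."""
    by_fingerprint = {}
    for rid, props in target.items():
        fp = fingerprint(props)
        if fp:  # resources without identifying data cannot be matched
            by_fingerprint[fp] = rid
    matches = {}
    for rid, props in source.items():
        fp = fingerprint(props)
        if fp in by_fingerprint:
            matches[rid] = by_fingerprint[fp]
    return matches


def diff(source_props, target_props):
    """Step 2: return (to_add, to_remove) so that the target catches up with the source."""
    src = {(p, v) for p, v in source_props if p not in IDENTIFYING}
    dst = {(p, v) for p, v in target_props if p not in IDENTIFYING}
    return src - dst, dst - src


laptop = {"res:1": {("url", "file:///a.jpg"), ("tag", "holiday"), ("rating", "8")}}
desktop = {"res:9": {("url", "file:///a.jpg"), ("tag", "holiday"), ("rating", "5")}}

matches = match_resources(laptop, desktop)
to_add, to_remove = diff(laptop["res:1"], desktop[matches["res:1"]])
```

Even this toy version shows why the problem is hard: it blindly treats the source as the truth, while a real sync has to decide which side changed and what to do when both did.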

Well, it suffices to say that Vishesh has an ambitious project ahead of him. But looking at his enthusiasm and his early involvement in KDE (he is already committing one patch after the other) I am very confident that he will succeed.

Web Metadata Extractor Framework and Service by Artem Serebriyskiy

In Nepomuk we use the Strigi system to extract meta data from files and store it in the Nepomuk database, allowing the user to search files based on their meta data. This is very useful. However, there are certain types of files that provide little or no meta data at all. Typical examples are video files. It would be very interesting to be able to search for video files by title, actors, directors, or release year. All this information is available on the Internet. So why not make use of it?

This is exactly what Artem’s project is about: extract meta data from the web and associate it with local files. Of course he will implement this as a Nepomuk service with a plugin system that allows for different types of extractors and handles uncertainties and duplicate information as smoothly as possible. Look out for more cool information at your fingertips.
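
To give a rough idea of what such a plugin system could look like, here is a small Python sketch. It is an illustration only, not Artem's design and not a Nepomuk API: the `Candidate` type, the `register` decorator, the confidence threshold, and the two fake extractors (which return canned data instead of querying the web) are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Candidate:
    """One piece of extracted meta data plus how much the plugin trusts it."""
    prop: str
    value: str
    confidence: float  # 0.0 (pure guess) to 1.0 (certain)


Extractor = Callable[[str], List[Candidate]]
_REGISTRY: Dict[str, Extractor] = {}  # hypothetical plugin registry


def register(name: str):
    """Decorator that adds an extractor plugin to the registry."""
    def decorator(fn: Extractor) -> Extractor:
        _REGISTRY[name] = fn
        return fn
    return decorator


def extract(filename: str, threshold: float = 0.5) -> List[Candidate]:
    """Run all plugins, drop uncertain results, and merge duplicates."""
    best = {}
    for extractor in _REGISTRY.values():
        for cand in extractor(filename):
            if cand.confidence < threshold:
                continue  # too uncertain to store
            key = (cand.prop, cand.value.strip().lower())  # duplicates share a key
            if key not in best or cand.confidence > best[key].confidence:
                best[key] = cand
    return sorted(best.values(), key=lambda c: (c.prop, c.value))


@register("fake_db_a")
def fake_db_a(filename: str) -> List[Candidate]:
    return [Candidate("title", "Big Buck Bunny", 0.9), Candidate("year", "2008", 0.4)]


@register("fake_db_b")
def fake_db_b(filename: str) -> List[Candidate]:
    return [Candidate("title", "big buck bunny", 0.7), Candidate("year", "2008", 0.8)]


results = extract("bbb.avi")
```

Here both plugins report the same title with different spelling, so only the more confident one survives, and the low-confidence year from the first plugin is dropped in favour of the second.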

Nepomuk Dedicated Desktop Search GUI by Oszkar Ambrus

Let’s face it: today desktop search is still the number one use case for Nepomuk (although it was not the original motivation, but that is another story). So having a good and convenient user interface is essential for the success of the system. We have several interfaces in KDE including the search bar in Dolphin and the search runner. But all are lacking in at least two main areas: 1. the query building: so far one has to know a lot about the underlying data structures to write powerful queries; and 2. the presentation of the search results: currently the results are presented like any other folder, leaving out interesting information like a hit score or details on why the result was returned. (Actually there is a number three which I hope Oszkar will have the time to attack: since we have more than file results we need a good way to open and present these resources.)

Oszkar sets out to improve this situation and create reusable components to let the user create powerful queries without much knowledge of the data and to present the results in a convenient way. An important project that will undoubtedly yield great results.

Strigi: Stream Analyzer based on Data Structure Descriptions by Yulia Medvedeva

Jos was kind enough to write a paragraph on the Strigi project:

Yet another project has been granted. Yulia Medvedeva will work on a new type of file analyzer for Strigi. The goal of the project is to write the structure of files down in a grammar file and generate code from the grammar or parse the grammar at runtime. Writing analyzers usually involves quite a bit of repetitive error-prone code. It also requires knowledge of C++. By writing the format in a grammar language, coding errors are avoided. In addition to that, the independence of the programming language allows the grammars to be shared with other projects.
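
The idea of describing a file format as data instead of code can be sketched in a few lines of Python. This is only an illustration of the "parse the description at runtime" variant, not Strigi's grammar language: the description format (a list of field names and `struct` format codes) and the `analyze` function are assumptions. The field layout follows the start of a PNG file (8 byte signature, then the IHDR chunk).

```python
import struct

# Declarative description of the first bytes of a PNG file.
# Each entry is (field name, struct format code); this list format is hypothetical.
PNG_HEADER = [
    ("signature", "8s"),
    ("ihdr_length", "I"),
    ("ihdr_type", "4s"),
    ("width", "I"),
    ("height", "I"),
    ("bit_depth", "B"),
    ("color_type", "B"),
]


def analyze(data, description, byte_order=">"):
    """Generic analyzer: reads the fields named in the description from data."""
    fmt = byte_order + "".join(code for _, code in description)
    if len(data) < struct.calcsize(fmt):
        raise ValueError("data is shorter than the described structure")
    values = struct.unpack_from(fmt, data)
    return dict(zip((name for name, _ in description), values))


# Build a synthetic header for a 640x480, 8 bit, truecolor image.
sample = b"\x89PNG\r\n\x1a\n" + struct.pack(">I4sIIBB", 13, b"IHDR", 640, 480, 8, 2)
fields = analyze(sample, PNG_HEADER)
```

The analyzer itself knows nothing about PNG. Supporting another fixed-layout format would mean writing another description, not another parser, which is the appeal of the approach.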

Well, that is it for the four projects that should give Nepomuk a good push forward. I am very happy about the selection and have to say thank you to Google and the rest of the KDE mentor team for giving us this much support. It will be legendary!