For years, KDE software has included a semantic (relationship-based) search infrastructure. KDE's Semantic Search was built around concepts developed in NEPOMUK, a European Union-funded research project that explored the use of relationships between data to improve search results. Based on these ideas, KDE's implementation of Semantic Search made it possible to search for all pictures taken in a particular place. On top of that, it added text search and tagging.

Incremental improvements

Since its implementation, our developers have received and digested a lot of feedback. Application developers requested and received easier-to-use APIs (Application Programming Interfaces, the glue for integration) and widgets (such as the star rating and tagging user interfaces). For end users, stability and performance were crucial. Much work went into improving the speed of indexing, keeping it out of the users' way and making Search more reliable.



Vishesh Handa talking about relationships* at conf.kde.in

(*the technical kind)

What is coming

The upcoming release of KDE Applications (version 4.13) will introduce the next step in the effort to improve the performance and stability of search features in KDE software. The improved Semantic Search is lighter on resources and more reliable than before, yet, thanks to considerable reuse of existing code, it is already mature and offers a complete feature set. Users will find that features such as search are exposed in the same, familiar manner, but searching in a variety of applications will be faster and more reliable.

To accomplish this, developers looked at how Search was being used in practice. The major use-cases they identified are:

Finding objects (like files) based on their content, requiring a full text index of files on a system

Storing and retrieving simple objects such as tags, ratings, activities etc.

Storing and searching through relationships such as "this file is related to this contact"

With the better understanding of use cases gained over years of deployment and development, the improved Search technology was specifically designed to do these three things and do them well.
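To make the three use cases concrete, here is a minimal in-memory sketch in Python. It is not KDE's code: the class names, method names, file paths and the "contact:alice" identifier are all hypothetical, and a real indexer would be far more careful about tokenizing, ranking and persistence.

```python
import re
from collections import defaultdict


def words_in(text):
    """Split text into lowercase words (a deliberately crude tokenizer)."""
    return re.findall(r"\w+", text.lower())


class FullTextIndex:
    """Use case 1: find files by content, via a toy inverted index."""

    def __init__(self):
        self._files_by_word = defaultdict(set)

    def add(self, path, text):
        for word in words_in(text):
            self._files_by_word[word].add(path)

    def search(self, query):
        """Return the files that contain every word of the query."""
        words = words_in(query)
        if not words:
            return set()
        matches = [self._files_by_word.get(word, set()) for word in words]
        return set.intersection(*matches)


class TagStore:
    """Use case 2: store and retrieve simple objects (here, only tags)."""

    def __init__(self):
        self._tags_by_file = defaultdict(set)

    def add_tag(self, path, tag):
        self._tags_by_file[path].add(tag)

    def files_tagged(self, tag):
        return {path for path, tags in self._tags_by_file.items() if tag in tags}


class RelationStore:
    """Use case 3: store and search relationships between entities."""

    def __init__(self):
        self._relations = set()

    def relate(self, subject, relation, obj):
        self._relations.add((subject, relation, obj))

    def subjects(self, relation, obj):
        return {s for s, r, o in self._relations if r == relation and o == obj}


# Hypothetical sample data, for illustration only.
index = FullTextIndex()
index.add("/home/user/notes.txt", "Meeting notes about the search rewrite")
index.add("/home/user/todo.txt", "Rewrite the todo list")

tags = TagStore()
tags.add_tag("/home/user/notes.txt", "work")

relations = RelationStore()
relations.relate("/home/user/notes.txt", "related_to", "contact:alice")

print(index.search("search rewrite"))
print(tags.files_tagged("work"))
print(relations.subjects("related_to", "contact:alice"))
```

Each of the three lookups returns only notes.txt. The point of the sketch is that each use case gets a small data structure shaped for its own kind of query.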

Advancements for end users

The improved Semantic Search brings our users a number of tangible benefits. Its design is more robust, delivering search results more quickly and with less overhead. The simplicity of the design will not only reduce failures, but will also make it easier for current and new contributors to add and improve functionality.

Improvements that users will notice:

Faster searching and indexing

More accurate searching

More reliability

Faster software development

Applications like the Kontact Suite, Dolphin and Gwenview, as well as the Plasma Desktop itself, already benefit from the changes.

For developers

The changes are small enough to make it relatively easy for application developers to move their applications over to the improved Semantic Search, which many have already done for the 4.13 release of KDE Applications. Instead of having a single RDF-based database for all information, Semantic Search now provides separate data stores and search interfaces. This allows it to store and search each type of content in an optimal way. Under the hood, the Semantic Search infrastructure uses SQLite and Xapian to index and retrieve data. More information about the information retrieval architecture can be found on the Community Wiki.
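As a rough illustration of the "separate data stores" idea, the following Python sketch keeps tags and ratings in two independent SQLite databases, each with a schema suited to its own queries. This is an assumption-laden sketch, not the actual schema or API of KDE's Semantic Search: the table names, function names, file paths and the 0 to 10 rating range are all made up for the example, and the full-text side (handled by Xapian in the real system) is left out.

```python
import sqlite3

# One store per kind of data (in-memory here; hypothetical schemas).
tags_db = sqlite3.connect(":memory:")
tags_db.execute(
    "CREATE TABLE tags ("
    " path TEXT NOT NULL,"
    " tag TEXT NOT NULL,"
    " PRIMARY KEY (path, tag))"
)
# Index chosen for the query this store must answer: "which files have tag X?"
tags_db.execute("CREATE INDEX tags_by_tag ON tags (tag)")

ratings_db = sqlite3.connect(":memory:")
ratings_db.execute(
    "CREATE TABLE ratings ("
    " path TEXT PRIMARY KEY,"
    # The 0..10 range is an assumption made for this sketch.
    " rating INTEGER NOT NULL CHECK (rating BETWEEN 0 AND 10))"
)


def add_tag(path, tag):
    """Attach a tag to a file; adding the same tag twice is a no-op."""
    with tags_db:
        tags_db.execute(
            "INSERT OR IGNORE INTO tags (path, tag) VALUES (?, ?)", (path, tag)
        )


def set_rating(path, rating):
    """Set or overwrite the rating of a file."""
    with ratings_db:
        ratings_db.execute(
            "INSERT OR REPLACE INTO ratings (path, rating) VALUES (?, ?)",
            (path, rating),
        )


def files_tagged(tag):
    rows = tags_db.execute(
        "SELECT path FROM tags WHERE tag = ? ORDER BY path", (tag,)
    )
    return [row[0] for row in rows]


def files_rated_at_least(minimum):
    rows = ratings_db.execute(
        "SELECT path FROM ratings WHERE rating >= ? ORDER BY path", (minimum,)
    )
    return [row[0] for row in rows]


# Hypothetical sample data.
add_tag("/photos/beach.jpg", "holiday")
add_tag("/photos/beach.jpg", "holiday")  # duplicate, ignored
add_tag("/photos/city.jpg", "holiday")
set_rating("/photos/beach.jpg", 8)
set_rating("/photos/city.jpg", 4)

print(files_tagged("holiday"))
print(files_rated_at_least(6))
```

Because each store is independent, its schema and indexes can be tuned for the one kind of query it serves, rather than forcing every kind of data through a single general-purpose database.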

As of today, Semantic Search offers developers:

An API for searching

A way of storing relations between entities

File indexing

Email and contact indexing

Timeline KIO slave

Developers can find more information on the Baloo wiki page.

KDE Platform 4 and KDE Frameworks 5

When upgrading to KDE Platform 4.13, existing tags, ratings and comments will be transparently migrated to the new storage system. Looking forward even further, Semantic Search is in the process of being ported to Frameworks 5. This Frameworks 5 version will use the same storage system as the version included in Platform 4.13 (and newer) and will be fully compatible with it.

Learn more about Frameworks 5 in the tech preview announcement.

Conclusion

The change to Semantic Search in KDE is a natural next step in the process of taking technology that came out of an academic research project and adapting it to real world use cases. KDE's Semantic Search is at a point where it has become a core part of our infrastructure. It is now well positioned to provide the required robustness and functionality.

Article contributed by Vishesh Handa (KDE Search project maintainer), Stuart Jarvis, Aaron Seigo and Jos Poortvliet