{August 1, 2011} Beyond Activities: Cross-Device Sessions

Update: I’ve scheduled the BoF session for Thursday, 14:00 in room 1.404/1. I’ll pop into the systemd BoF on Wednesday, too.

Update 2: It turns out systemd is solving an entirely different problem. :) There’s no overlap between it and sessions-as-in-restoring-windows.

I’ve been trying to write this blog post for months, and I keep hitting writer’s block or other distractions. So, screw it – I’m just going to start writing and see what comes out. :)

This is about sessions, and XSMP, and Wayland. Activities use XSMP, the X Session Management Protocol, to save and restore groups of windows. Before that, it was only used for the login session. It’s actually a better protocol than people give it credit for, and it served well for a few decades – but times have changed. If we want to move forward and do awesome things like sharing sessions between devices, we need something new. Even Activities as they are now push the limits of XSMP – there are a few ugly little hacks hiding in there that I’d prefer not to have. :)

Now, the key here is, if we’re going to replace an ancient-but-reliable technology with something new, we want the new one to be not just better but a lot better. Something worth switching to. Small issues like XSMP’s lack of autosave will be easy to correct; the big one is that it’s process-based. Session keys are handed out per-process, and when things get restored, it’s done by calling that particular binary with the session key as an argument. This is the source of two major pains. The first is that when a process has multiple windows (like konsole), you can’t properly put those windows in different sessions – so Activities have to do ugly things to hide this. The second, and larger, problem is that it’s ridiculously non-portable and opaque. What if the program’s binary gets moved elsewhere? What if the user switches to a competing app? And of course, their phone is not going to have the same programs installed as their laptop! Even if they have a MeeGo phone and use Calligra on all their devices, the binary for the phone will be named differently. :P

So, what do we do? We make it resource-based! :) Store data in a standard place and standard format, so that the session manager, instead of seeing “this session had /usr/bin/firefox, /usr/bin/konsole and /usr/bin/okular” sees “this session has web pages X, Y and Z, which this device last used firefox to open, with X and Y in the same window; two terminals, which this device last used konsole for; and foo.pdf on page 3, which this device last used okular for”. Then if okular isn’t available it can ask the system for a program that handles PDF, and when the user sends the session to their phone, it doesn’t matter that there’s a different PDF reader, or even that the phone browser doesn’t support tabs. Not only that, but if there were 20 URLs open instead of three, the phone could go “whoa, that’s a bit much; I’ll just open the first three and display links to the rest somewhere.” And I could go “hmm, when the heck did I create an activity called ‘shambles’? wtf is in it?” and get an answer without actually having to load it. ;)
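To make that a bit more concrete, here’s a minimal sketch of how a device might decide what to open from a resource-based session. Everything in it is hypothetical: the `Resource` fields, the `pick_app` and `plan_restore` names, and the idea that the system hands us a table of handlers per MIME type are assumptions for illustration, not an existing API.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    uri: str        # what was open
    mime_type: str  # what kind of thing it is
    last_app: str   # program this device (or the sender) last used to open it


def pick_app(resource, installed, handlers):
    """Prefer the last-used program; otherwise any installed handler for the type."""
    if resource.last_app in installed:
        return resource.last_app
    for app in handlers.get(resource.mime_type, []):
        if app in installed:
            return app
    return None  # nothing on this device can open it


def plan_restore(resources, installed, handlers, max_open=None):
    """Split a session into (resource, app) pairs to open now and resources to just list."""
    to_open, deferred = [], []
    for res in resources:
        app = pick_app(res, installed, handlers)
        over_limit = max_open is not None and len(to_open) >= max_open
        if app is None or over_limit:
            deferred.append(res)  # shown as a link instead of being opened
        else:
            to_open.append((res, app))
    return to_open, deferred
```

A phone could call `plan_restore(session, installed, handlers, max_open=3)` and show the deferred resources as links. Because the plan is built from plain data, listing what’s in a session never requires starting any of the apps.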

Another side of this is, well, the small stuff does add up. I just heard that OS X gained some decent session support; they’ve taken lessons from the iPhone and applied them to the desktop, making a system where it doesn’t matter if an app crashes – the OS may even kill it whenever it pleases – the session support is so good it’ll come back just as it was. I’d love to see that sort of solid session support in Linux; KDE apps with XSMP are pretty good right now, but they could be better (and an autosaving protocol would go a long way towards fixing that).

Now, replacing XSMP isn’t an entirely new idea. GNOME people have been sick of XSMP for quite a while, although I don’t think anyone stepped forward with an alternative (and for their purposes, it does work, anyways). When I talked to the Nepomuk guys many months ago, about cross-device sessions, they had already thought of it – but decided the political side of it was too hard. They might still be right – we’ll see. But it’s worth a try. :)

So, what is the hard part, you ask? It’s the reason I stuck with XSMP for activities: a new protocol means persuading apps to support said protocol. XSMP’s support, while patchy in places, is at least decently widespread and uncontroversial. A new protocol… well, I’m fairly sure KDE would embrace it, but I want it to be a part of GTK apps too, and pure Qt apps, and MeeGo apps, and even the weird proprietary fringe ones (like Maple) someday. That is a challenge. :)

So how do we meet this challenge? First, we explain the benefits it brings – I hope I’ve done that well enough above. Second, we offer app developers a well-designed, easy-to-use API that’s had some feedback from actual app developers; just the other day I saw someone on IRC complaining about XSMP being confusing. :) Third, once we’ve got enough support to know this is worth trying, we show them a solid implementation that works (without any fatal bugs). Seeing is believing, right? Nobody wants to spend precious time writing code for a system before it exists, so we get the minimal feature set done, port a couple of apps, and show it off. Fourth – the secret weapon. ;) Wayland is gaining popularity, and when apps start porting to it, they may as well port to this too, right? Wayland doesn’t have any session protocol of its own, so this could be it. :)

So, who’s with me? :) The Desktop Summit is only a few days away, and surprisingly I will make it (yay!) so I’d love to have a conversation about this there. It’s the perfect place, after all, for something that aims to be cross-device, cross-desktop and all.

TL;DR: a resource-based session protocol would let us do really awesome things, so it’s worth the effort to replace XSMP

Now, I’m going to go into technical details below the cut (I have enough material to write a paper on this but for that damn writer’s block…)



…

So, technical details. Right. What actually needs to be implemented? What API might it need?

There are three parts to it, all interlocking. First, the common storage area. This can actually come before the rest, because XSMP doesn’t say anything about how session data is stored; apps can start using it and still be started by an XSMP server, which is great for the transition time (it’ll be long; nothing this big changes fast).

Anyways… I haven’t done much research on storage. It could be stored in Nepomuk/Tracker/Zeitgeist, or in text files like KConfig, or even in a registry (eek) – all that matters is that the system does its job well, and that all the apps on the device are using the same one. It might turn out that different backends are better for different devices – I don’t know, and I would love to hear about the pros and cons of various approaches. Apps will be updating certain data on a regular basis – page number, position in a movie, etc – so I’d like numbers on efficiency too.

I do at least have a rough sketch for the format of the data, though. The vast majority of resources will be a URI of some kind (usually local or http) and a position within the document. The URI becomes the key, the location one of the fields. Other fields will be the window it’s in (after all, windows are still the unit that window managers work with, and their position and size are important session data too), the program last used to open it, custom data fields for that program (preferably kept to a minimum), and perhaps data from other devices (either to help with decisions where this device is unsure, or in case the session is transported to a third device where that info could matter). Session data for windows would be written out, perhaps by the window manager… Really I need a whiteboard and other programmers to come up with good data structures, so I want to go over this in Berlin, and I’m sure it’ll change once again during implementation. :) The most important thing is just to settle on some resource-based structure that works.
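Just to have something to point at on the whiteboard, here’s a rough sketch of that record, keyed by URI and written out as JSON. The field names, the `ResourceEntry` class and the choice of JSON are all assumptions made for illustration; the real format is exactly what still needs deciding.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ResourceEntry:
    position: str = ""   # location within the document, e.g. "page=3"
    window: str = ""     # id of the window the resource is shown in
    last_app: str = ""   # program last used to open it on this device
    custom: dict = field(default_factory=dict)         # app-specific extras, kept small
    other_devices: dict = field(default_factory=dict)  # hints recorded by other devices


def dump_session(entries):
    """Serialise a {uri: ResourceEntry} mapping to JSON text."""
    data = {uri: asdict(entry) for uri, entry in entries.items()}
    return json.dumps(data, indent=2, sort_keys=True)


def load_session(text):
    """Rebuild the {uri: ResourceEntry} mapping from JSON text."""
    return {uri: ResourceEntry(**fields) for uri, fields in json.loads(text).items()}
```

Window geometry is deliberately left out of this sketch, since it might be written by the window manager rather than the app.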

Second is the session manager. This is a program – likely a daemon, but not necessarily – that loads up the apps when a session is opened, and tells them to close when appropriate. If there is a user-interaction feature (“would you like to save this document?”), it would handle that too. This is what ksmserver currently does for XSMP. It’s actually not that much work, mostly reading and writing config files, but it’s complicated by the apps’ ability to request user interaction (stalling session-close) or even cancel it entirely.

Third is the application API, the part most developers will see. I want this to be good. I’m also wondering about backwards and forwards compatibility here; the XSMP protocol hasn’t changed in forever, but a new protocol will have changes, if only to fix bugs or correct design mistakes. Most apps will be sharing the same dynamically linked library, but should we worry about statically linked ones? Hopefully we can come up with a design that is fairly robust against such cases. I’m reminded of dbus here; it specifies its wire format, which has the advantage that all apps speak the same language forever, but the disadvantage that they can never improve that format. :/

Speaking of dbus, one of the things I’m wondering is, how should the session manager and apps communicate? I think the storage should be written to as directly as possible, for efficiency (write-write conflicts will be exceedingly rare, if they happen at all, and I expect the same for read-write conflicts). We might want some sort of throttling if the filesystem’s buffers aren’t enough; as a worst-case scenario, consider a dozen apps each updating their state every 10 seconds, out of sync; that could be more than a write a second. Hopefully there won’t be more than one or two apps doing regular updates like that (you can only watch one movie at a time, and even when reading there’ll only be one source of background music) but it’s something to consider later.
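For the record, the worst case above works out to 12 apps / 10 seconds = 1.2 writes per second. If that ever turns out to matter, one possible form of throttling is to coalesce updates in the library and flush them at most once per interval. The sketch below assumes a hypothetical `CoalescingWriter` that is handed a `flush` callback doing the actual write; neither name exists anywhere yet.

```python
import time


def writes_per_second(apps, interval_seconds):
    """Worst-case write rate when every app updates once per interval, out of sync."""
    return apps / interval_seconds


class CoalescingWriter:
    """Buffers state updates and flushes them at most once per min_interval seconds."""

    def __init__(self, flush, min_interval=1.0, clock=time.monotonic):
        self._flush = flush              # callable that receives a dict of updates
        self._min_interval = min_interval
        self._clock = clock
        self._pending = {}
        self._last_flush = None

    def update(self, key, value):
        self._pending[key] = value       # a newer value for a key replaces the older one
        self.maybe_flush()

    def maybe_flush(self):
        """Flush buffered updates if enough time has passed; return True if it flushed."""
        now = self._clock()
        due = self._last_flush is None or now - self._last_flush >= self._min_interval
        if self._pending and due:
            self._flush(dict(self._pending))
            self._pending.clear()
            self._last_flush = now
            return True
        return False
```

Note that an update which arrives too soon after the last flush just sits in the buffer, so something (a timer in the event loop, say) still has to call `maybe_flush()` later, and again before the app exits.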

Whoops, I got side-tracked; I was meaning to talk about communication with the session manager. While upgrading ksmserver and kwin to support sub-sessions (aka activities), I discovered the hell of dbus timeouts. I’d rather not go down that road again, but I’m not aware of the alternatives. XSMP uses an ancient protocol called ICE (Inter-Client Exchange), which seems to have close ties to X11; it’s not the only protocol with that name, though, so googling is a bit of a pain. In any case, I’d like to sever the ties to X11 so that this can be easily used with Wayland too – what do they use for communication?

Wandering onwards (my wrist’s getting a bit tired now despite breaks), the API offered is partially defined by the features offered. I think that the first version should omit all but the most vital features; some, like user-interaction and cancellation of the session-close, I’m leaning towards abolishing altogether. This is 2011, not 1990; apps ought to be capable of behaving sensibly when it’s time to quit. Even Kate now has swap files to recover data after a crash; those can be used just as well to restore a session. The only downside is they’re not so portable… that’s a problem for later, I think. :) Heck, I might even leave out the ability to put a window in N sessions at once; it complicates things a lot, I’m not sure if it’s actually a worthwhile feature in activities, and it would still be possible to have the feature in activities by creating extra sessions under the hood.

So, what features are really needed?

-either the app or the window manager ought to record the size and position of windows. If it ends up being done in the app, it should be entirely automatic, within the library, not something the app developer needs to fuss with at all.

-applications need to record that they’re displaying a resource in a certain window.

-they need to record that they’ve stopped displaying it too. :)

-either the window or resource needs to be associated with a particular session when it shows up.

-apps need to be told when to close.

-apps need to be able to restore themselves from session data.

-apps need a method to store custom session data.

That’s just the most basic of basics, mirroring XSMP’s abilities. To make the thing actually cool, we’ll also need to:

-tell the app exactly which windows to close, in case it’s spread across sessions.

-tell the app whether it should restore the whole session or just a part of it.

-allow and encourage apps to store common, portable session data (like the position in a document) in a standard place that any app can use.

I’m sure I’ve missed a thing or two, but you get the idea. :)
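Here’s a toy, in-memory sketch of what the app-facing side of those lists could look like, mostly to show how per-window bookkeeping makes “close only the windows in this session” straightforward. The `SessionClient` class and all of its method names are made up for illustration; a real library would also talk to the storage and to the session manager.

```python
class SessionClient:
    """Toy stand-in for the library an app would link against."""

    def __init__(self, app_name):
        self.app_name = app_name
        self.shown = {}       # (session, window) -> set of URIs shown there
        self.custom = {}      # uri -> app-specific data
        self.on_close = None  # optional callback(session, windows) set by the app

    def resource_shown(self, session, window, uri):
        """Record that `uri` is now displayed in `window`, which belongs to `session`."""
        self.shown.setdefault((session, window), set()).add(uri)

    def resource_hidden(self, session, window, uri):
        """Record that `uri` is no longer displayed in `window`."""
        uris = self.shown.get((session, window), set())
        uris.discard(uri)
        if not uris:
            self.shown.pop((session, window), None)

    def set_custom_data(self, uri, data):
        self.custom[uri] = data

    def windows_in(self, session):
        return sorted(w for (s, w) in self.shown if s == session)

    def request_close(self, session):
        """Called on behalf of the session manager; names exactly the windows to close."""
        windows = self.windows_in(session)
        if self.on_close is not None:
            self.on_close(session, windows)
        return windows
```

With this shape, a multi-window process like konsole can have one window in each of two sessions, and closing one session only names that session’s window.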

Actually, there’s a fourth part to this, too: the device sync. How will two devices share their session data? Sending a list of resources and associated data sounds easy, but there are plenty of details to figure out – which resources need copying to the other device, how to manage the change of URI (a file at /home/chani/Documents/foo.pdf on my laptop will end up somewhere else on a phone), whether to try and resolve conflicts… :) I’m pretty sure there are people who have given this part more thought than me, though. And as a bonus, the sync code will come in handy for migration should a distro change to a vastly different storage backend someday.
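As one small piece of that puzzle, here’s a sketch of the URI change for local files: anything under the sending device’s home directory is rewritten relative to it, and the receiving device puts it under whatever root it uses. The `home:` prefix and the function names are invented for this example, and it ignores the harder questions (which files to copy, conflicts) entirely.

```python
from pathlib import PurePosixPath
from urllib.parse import quote, unquote, urlsplit

PREFIX = "home:"  # hypothetical marker for "relative to the user's home directory"


def to_portable(uri, home):
    """Rewrite a file URI under `home` as 'home:<relative path>'; leave others alone."""
    parts = urlsplit(uri)
    if parts.scheme != "file":
        return uri  # http and friends are already device-independent
    path = PurePosixPath(unquote(parts.path))
    try:
        relative = path.relative_to(home)
    except ValueError:
        return uri  # outside the home directory; no sensible mapping here
    return PREFIX + quote(str(relative))


def to_local(portable, home):
    """Map a 'home:' URI back to a file URI under this device's `home`."""
    if not portable.startswith(PREFIX):
        return portable
    relative = unquote(portable[len(PREFIX):])
    return "file://" + quote(str(PurePosixPath(home) / relative))
```

For example, `to_portable("file:///home/chani/Documents/foo.pdf", "/home/chani")` gives `home:Documents/foo.pdf`, which a phone could map to wherever its own documents live.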

All this is going to take time to implement, and even more time to be adopted. It’s a multi-year project. But if the people want it to happen, it can be done. :)