In this article, I will show just how little effort it takes to specify graphics and window management sufficient to provide features that surpass kmscon and the ‘regular’ Linux console, with a DirectFB-like API for clients to boot. It comes with the added bonus that it should work on OSX, FreeBSD, OpenBSD, and within a normal Arcan, X or Wayland desktop as well. Thus, it is not an entry in the pointlessly gimmicky ‘how small can you make XYZ’ genre, but rather something useful yet instructional.

Two things motivated me to write this. The first is simply that some avid readers asked for it after the article on approaching feature parity with Xorg. The second is that this fits the necessary compatibility work needed with the TUI API subproject – a ‘terminal emulator’ (but in reality much more) that will be able to transition from a legacy terminal to handling terminal-freed applications.

The final git repository can be found here: https://github.com/letoram/console

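The steps that we will go through are: Prelude (Build / Setup), Part 1 (Hello Terminal), Part 2 (Workspaces and Keybindings), Part 3 (Clipboard and Pasteboard), Part 4 (External Clients), Part 5 (Audio) and Part 6 (Extra Font Controls).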

Each Part starts by referencing the relevant Git commit, and elaborates on some part of the commit that might be less than obvious. To get something out of this post, you should really look at both side by side.

Prelude: Build / Setup

For setup, we need two things – a working Arcan build and a directory tree that Arcan can interpret as the set of scripts (‘appl’) to use. Building from source:

    git clone https://github.com/letoram/arcan arcan
    cd arcan/external ; ./clone.sh ; mkdir ../build ; cd ../build
    cmake -DVIDEO_PLATFORM=XXX ../src ; make

There are a lot of other options, but the important one here is marked with XXX. Some of the others are for embedded purposes, some for debugging or for performance versus security trade-offs.

Even though Arcan supports many output methods, the choice of video platform has far-reaching effects, and it is hardcoded into the build and thus into the supporting tools and libraries. Why it is kept like this and not as, say, “dynamically loadable plugins” is a long topic, but suffice it to say that it saves a lot of headache in deal-breaking edge cases.

There are two big options for video platform (the other platforms, like input and audio, are derived from the needs of the video platform). Option one is sdl or sdl2, which is a lot simpler as it relies on an outer display server to do much of the job, which also limits its features quite a bit.

Option two is the ‘egl-dri’ platform which is the more complicated beast needed to act the part of a display server. The ‘egl-dri’ platform build is the one packaged on voidlinux (xbps-install arcan). There is a smaller ‘egl-gles’ platform hacked together for some sordid binary-blob embedded platforms, but disregard that one for now.

The directory structure is simple: the name of the project as a folder, a .lua file with the same name inside that folder, with a function with the same name. The one we will use will simply be called ‘console’, so:

    mkdir console
    echo "function console() end" > console/console.lua

This is actually enough to get something that would be launchable with:

arcan /path/to/console

But it won’t do much, and if you use the native egl-dri platform, you need to somehow kill the process for the resources to be released, or rely on the normal keybindings to switch virtual terminal.

The first entry point for execution will always be named the same as the appl, tied to the name of the script file and the name of the folder. This is enforced to make it easy to find where things ‘start’. Any code executed in the scope outside of that will not have access to anything but a bare minimum Lua API.

All event hooks (like input, display hotplug, …) are implemented as suffixes to these naming rules, e.g. the engine will look for a console_input when providing input events.
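To make the naming rule concrete, here is a small plain-Lua simulation of how an engine could resolve entry points by name. It is a sketch of the idea only, not Arcan's actual lookup code; the `dispatch` helper and the `clock_pulse` suffix used in the last call are assumptions made for illustration.

```lua
-- Simulation only: mimics resolving "<applname>" and "<applname>_<suffix>"
-- as global functions. dispatch() is a hypothetical helper.
local function dispatch(applname, suffix, ...)
  local name = suffix and (applname .. "_" .. suffix) or applname
  local fn = _G[name]
  if type(fn) ~= "function" then
    return false -- hook not implemented by the appl, the event is dropped
  end
  return true, fn(...)
end

-- The appl side: globals that follow the naming rule.
function console() return "started" end
function console_input(iotbl) return "got " .. iotbl.kind end

print(dispatch("console"))                                -- true  started
print(dispatch("console", "input", { kind = "digital" })) -- true  got digital
print(dispatch("console", "clock_pulse"))                 -- false
```

The point is that an appl opts in to an event simply by defining a global function with the right name; anything it leaves undefined is silently skipped.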

Part 1: Hello Terminal

Git Commit #1

Breaking down the first function:

    function console()
        KEYBOARD = system_load("builtin/keyboard.lua")()
        system_load("builtin/mouse.lua")()
        KEYBOARD:load_keymap(get_key("keymap") or "devmaps/keyboard/default.lua")
        selected = spawn_terminal()
        show_image(selected)
    end

The words in bold are reserved Lua keywords and the cursive ones refer to Arcan-specific functions. The Arcan-specific functions are documented in the /doc/*.lua files, with one file per available function.

The first call, system_load, is used to pull in other scripts or native code (.dll/.so). The way it searches for resources is a bit special, as the engine works on a hierarchy of ‘namespaces’, basic file system paths that are processed in a specific order. Depending on what kind of resource you are looking for, different namespaces may be consulted. The ones relevant here are system scripts, shared resources and appl.

System scripts are shared scripts for common features that should be usable to most projects but are not forced in. These are things like keyboard map translation, mouse state machine and gestures and so on. ‘appl’ is the namespace of our own scripts here.

The get_key function is used for persistent configuration data storage. This is stored in a database with a table for Arcan-specific configuration, one for shared launch targets (programs that are allowed to be executed by the engine itself) and one for appl-specific configuration (that is what we want here). These can be handled in script, but there is also a command-line tool, arcan_db, with which you can modify these keys yourself.

The show_image function takes a vid and sets its opacity to fully opaque (visible). A ‘vid’ is a central part in Arcan, and is a numeric reference to a video object. We will use these throughout the walkthrough as they also work as a ‘process/resource’ handle for talking to clients.

When created, VIDs start out invisible to encourage animation to ‘fade them in’, something that can be achieved by switching to blend_image and providing a duration and, optionally, an interpolation method. The fancy animations and visuals are out of scope for now though.
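As a minimal sketch of that fade-in, assuming blend_image takes the vid, a target opacity and a duration (counted in engine ticks, as far as I can tell), the show_image call could be swapped for something like the following. The `fade_in` helper, the default of 20 ticks and the vid value 42 are made up for illustration, and the stand-in at the top only exists so the snippet also runs in a plain Lua interpreter.

```lua
-- Stand-in so the sketch runs outside Arcan; inside Arcan the engine's own
-- blend_image is used and this fallback is never taken.
local calls = {}
blend_image = blend_image or function(vid, opacity, time)
  calls[#calls + 1] = { vid = vid, opacity = opacity, time = time }
end

-- fade_in is a hypothetical helper, not part of the Arcan API.
local function fade_in(vid, ticks)
  -- animate from the current (invisible) state to fully opaque
  blend_image(vid, 1.0, ticks or 20)
end

fade_in(42) -- 42 is a made-up vid; in console() this would be `selected`
```

An interpolation method can be passed as an extra argument to blend_image; it is left out here to keep the sketch to what the text describes.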

Next we add the missing function referenced here, spawn_terminal:

    function spawn_terminal()
        local term_arg = get_key("terminal") or "palette=solarized-white"
        return launch_avfeed(term_arg, "terminal", client_event_handler)
    end

The get_key part has already been covered, and if we don’t find a ‘terminal’ key to use as an argument to our built-in terminal emulator, the palette=solarized-white argument will be selected.

launch_avfeed(arg, fsrv_type, handler) is where things get interesting. It is used to spawn one of the frameservers, bundled external programs that we treat as special ‘one-purpose’ clients. There is one for encoding things, one for decoding things, one for networking and so on. There is also one that is a terminal emulator, which is what we are after here. The argument order is a bit out of whack due to legacy and how the function evolved; in hindsight, arg and type should have been swapped. Oh well.

Time for a really critical part, the event handler for a client. This can of course be shared between clients or be unique to individual clients based on something like authenticated identity or type, and it can also be swapped out at runtime with target_updatehandler.

    function client_event_handler(source, status)
        if status.kind == "terminated" then
            return shutdown()
        elseif status.kind == "resized" then
            resize_image(source, status.width, status.height)
        elseif status.kind == "preroll" then
            target_displayhint(source, VRESW, VRESH, TD_HINT_IGNORE, {ppcm = VPPCM})
        end
    end

There are a ton of possible events that can be handled here, and you can read launch_target for more information. Most of them are related to more advanced opt-in desktop features and can be safely ignored. The ones we handle here are:

‘terminated’ means that the client has, for some reason, died. You can read the last_words field from the status table for a user-presentable motivation. The backing store, that is, the video object and the last submitted frame, is kept alive so that we can still render, compose or do other things with the data. Normally you would call delete_image here, but we chose to just shut down.

‘resized’ is kind of important. It means that the size of the backing store has changed, and the next frame drawn will actually be scaled unless you do something. That is, there is a separation between the presentation size that the scripts set with a resize_image call and whatever size the backing store has. Here we just sync the presentation to the store.

‘preroll’ is a construct specific to Arcan’s client communication design. Basically, the client is synchronously blocking, waiting for you to tell it as much as you care to about its parameters. Instead of the client reacting to each one as part of an event loop, they get collected into a structure of parameters like presentation language, display density and so on. Here we only use target_displayhint to tell it about the preferred output dimensions, focus state and specific display properties like density.
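The split between presentation size and backing store size in the ‘resized’ case means the WM is free to pick something other than a 1:1 sync. As a sketch of one alternative, the plain-Lua helper below computes an aspect-preserving size that never exceeds a given area and never upscales. The `fit` function is a hypothetical helper, not part of the Arcan API or of the console repository.

```lua
-- Hypothetical helper: shrink (never enlarge) a store size so that it fits
-- inside max_w x max_h while keeping the aspect ratio.
local function fit(store_w, store_h, max_w, max_h)
  local scale = math.min(max_w / store_w, max_h / store_h, 1.0)
  return math.floor(store_w * scale), math.floor(store_h * scale)
end

print(fit(1920, 1080, 960, 1080)) -- 960  540
print(fit(640, 480, 1920, 1080))  -- 640  480

-- Inside the handler, the 'resized' branch could then read:
--   resize_image(source, fit(status.width, status.height, VRESW, VRESH))
```

For a fullscreen console the straight sync used in the commit is the simpler choice; a helper like this only starts to matter once clients are allowed to pick sizes larger than the display.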

Finally, we need some input:

    function console_input(input)
        if input.translated then
            KEYBOARD:patch(input)
        end
        target_input(selected, input)
    end

This is the input event hook that was talked about before, and it is one of the more varied ones, as it pertains to all possible input states. It grabs everything it can from the lower system levels, and the sources can be as diverse as sensors, game controllers, touch screens, etc.

The two more common ones from the perspective here, though, are ‘translated’ devices (your keyboard) and mice. Here we just apply the keyboard translation table (map) that we loaded earlier and forward everything to the selected window. The target_input function is responsible for that, with the possibility to forego routing a source table at all and instead synthesise the input that you want to ‘inject’.

Part 2: Workspaces and Keybindings

Git Commit #2

While this commit is meatier than the last one, most of it is the refactoring needed to go from one fullscreen client to multiple workspaces and workspace switching, and barely any of it is Arcan-specific. The two small details of note here would be the calls to valid_vid and decode_modifiers.

The decode_modifiers function is trivial, though its context is not. Keyboards are full of states, and they are transmitted as a bitmap. This function call helps decompose that bitmap into a more manageable type. There is much more to be said about the input model itself, as it is much more refined and necessarily complex, and it will span multiple articles.
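To illustrate what decomposing a modifier bitmap means, here is a plain-Lua sketch. The bit values and names in `MODS` are invented for the example and are not the ones the engine uses, and `decode` is a stand-in for the idea rather than a reimplementation of decode_modifiers.

```lua
-- Hypothetical bit assignments, chosen for illustration only.
local MODS = {
  { bit = 1, name = "lshift" },
  { bit = 2, name = "rshift" },
  { bit = 4, name = "lctrl" },
  { bit = 8, name = "lalt" },
}

-- Turn a bitmap into a readable string such as "lshift_lctrl".
local function decode(bitmap)
  local out = {}
  for _, m in ipairs(MODS) do
    -- arithmetic bit test, so no bit library or 5.3 operators are needed
    if math.floor(bitmap / m.bit) % 2 == 1 then
      out[#out + 1] = m.name
    end
  end
  return table.concat(out, "_")
end

print(decode(5)) -- lshift_lctrl
```

A string like this can be prefixed to a key name and used directly as the lookup key in a keybinding table.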

The valid_vid function will be used a lot due to a ‘Fail-Early-Often-Hard’ principle in the API design. A lot of functions have a ‘terminal state transition’ note somewhere, meaning that if the arguments you provide mismatch with what is expected, the engine will terminate, generate a snapshot, a traceback and so on. Depending on how the program was started, it is likely that it will also switch to one of the crash recovery strategies.

This is to make things easier to debug and preserve as much relevant program state as possible. Misuse of VIDs is the more common API mistake, and valid_vid calls can be used as a safeguard against that. It is also the function you need to distinguish a video object with say, a static image source, from one with an external client tied to it.
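A sketch of how such a safeguard could look in a workspace switch follows. The `workspaces` layout and `switch_workspace` are assumptions for illustration and are not copied from the commit; the stand-ins at the top pretend that vids 1 and 2 are valid so the snippet can also be run in a plain Lua interpreter.

```lua
-- Stand-ins for running outside Arcan; inside Arcan the engine functions
-- are used instead of these fallbacks.
local shown = {}
valid_vid = valid_vid or function(vid) return vid == 1 or vid == 2 end
show_image = show_image or function(vid) shown[vid] = true end
hide_image = hide_image or function(vid) shown[vid] = false end

-- Hypothetical layout: one table per workspace, the third one is empty.
local workspaces = { { vid = 1 }, { vid = 2 }, {} }
local current = 1

local function switch_workspace(index)
  local dst = workspaces[index]
  if not dst or not valid_vid(dst.vid) then
    return false -- nothing usable there, stay where we are
  end
  local src = workspaces[current]
  if src and valid_vid(src.vid) then
    hide_image(src.vid)
  end
  show_image(dst.vid)
  current = index
  return true
end

print(switch_workspace(2)) -- true
print(switch_workspace(3)) -- false
```

Checking before calling show_image or hide_image keeps a stale workspace entry from turning into a terminal state transition.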

Part 3: Clipboard and Pasteboard

Git Commit #3

More interesting things in this one and it is a rather complex feature to boot. In fact, it is the most complex part of this entire story. In the client_event_handler you can spot the following:

    elseif status.kind == "segment_request" and status.segkind == "clipboard" then
        local vid = accept_target(clipboard_handler)
        if not valid_vid(vid) then
            return
        end
        link_image(vid, source)
    end

The event itself is that the client (asynchronously) wants a new subsegment tied to its primary one. If we do not handle this event, a reject will be sent instead and the client will have to go on without one.

By calling accept_target we say that the requested type is something we can handle. This function is context sensitive and only valid within the scope of an event handler processing a segment request. Since all VID allocations can fail (there is a soft configurable limit defaulting to a few thousand, and a hard one at 64k) we verify the result.

The link_image call is also vastly important, as it ties properties like coordinate space and lifecycle management of one object to another. When building more complex server-side UIs and decorations, this is typically responsible for hierarchically tying things together. Here we use it to make sure the newly allocated clipboard resources are destroyed automatically when the client VID is deleted.

Looking at the clipboard handler:

    elseif status.kind == "message" then
        tbl, _ = find_client(image_parent(source))
        tbl.clipboard_temp = tbl.clipboard_temp .. status.message
        if not status.multipart then
            clipboard_last = tbl.clipboard_temp
            tbl.clipboard_temp = ""
        end
    end

This is a simple text-only clipboard. There are many facilities for enabling more advanced type and data retrieval, both directly from client to client and intercepted. Here we stick to just short UTF-8 messages. On the lower levels, every event is actually transmitted and packed as a fixed size in a fixed ring buffer in shared memory.

This serves multiple purposes. One is to avoid the copy-in-copy-out semantics that the write/read calls over a socket would incur, as in X or Wayland. Other reasons are to allow ‘peek/out-of-order’ event processing as a heavy optimisation for costly event types like resize, but also to act as a rate limit and punish noisy clients that try to saturate event queues to stall, delay or provoke race conditions in other clients or in the WM itself. For this reason, larger paste events need to be split up into multiple messages, and really large ones will likely stall this part of the client in favour of saving the world.
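To show what splitting a paste into multipart messages involves, here is a plain-Lua sketch that cuts a UTF-8 string into chunks of at most `limit` bytes without breaking a multi-byte sequence. The `split_utf8` helper and the limit used below are assumptions for illustration; the real per-event payload size, and whether the engine does this splitting for you, is not covered here.

```lua
-- Hypothetical helper: split msg into parts of at most `limit` bytes,
-- never cutting inside a UTF-8 sequence. Each part mirrors the
-- message/multipart fields seen in the clipboard handler.
local function split_utf8(msg, limit)
  local parts = {}
  local pos = 1
  while pos <= #msg do
    local stop = math.min(pos + limit - 1, #msg)
    -- back off while the byte after `stop` is a continuation byte (10xxxxxx)
    while stop < #msg and stop >= pos do
      local b = msg:byte(stop + 1)
      if b >= 0x80 and b < 0xC0 then
        stop = stop - 1
      else
        break
      end
    end
    if stop < pos then
      error("limit is smaller than a single UTF-8 sequence")
    end
    parts[#parts + 1] = {
      message = msg:sub(pos, stop),
      multipart = stop < #msg, -- more parts follow
    }
    pos = stop + 1
  end
  return parts
end

-- "h\195\169llo" is "héllo", where the é takes two bytes
for _, part in ipairs(split_utf8("h\195\169llo", 2)) do
  print(#part.message, part.multipart)
end
```

The receiving side then concatenates until it sees a part with multipart set to false, which is exactly what the clipboard handler does.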

Long story short, the WM thus has to explicitly concatenate these messages, and, optionally, say when enough is enough. Here we just buffer indefinitely, but a normal approach would be to cut at a certain length and just kill the per-client clipboard as punishment. For longer data streams, we have the ability to open up asynchronous pipes, either intercepted by the WM or by sending the read end to one clipboard, and the write end to another.
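A sketch of the ‘enough is enough’ variant could look like the following. The `clipboard_append` helper and the 64 KiB cap are assumptions picked for illustration, not values from the commit.

```lua
local CLIPBOARD_LIMIT = 64 * 1024 -- hypothetical cap, in bytes

-- Hypothetical helper. Returns the complete message once the last part has
-- arrived (nil before that), plus false if the client exceeded the cap.
local function clipboard_append(tbl, status)
  local buf = (tbl.clipboard_temp or "") .. status.message
  if #buf > CLIPBOARD_LIMIT then
    tbl.clipboard_temp = ""
    return nil, false
  end
  if status.multipart then
    tbl.clipboard_temp = buf -- more parts expected, keep buffering
    return nil, true
  end
  tbl.clipboard_temp = ""
  return buf, true
end

local client = {}
print(clipboard_append(client, { message = "hello ", multipart = true }))
print(clipboard_append(client, { message = "world", multipart = false }))
```

In the handler, a false second return value would be the place to call delete_image on the clipboard segment as the punishment mentioned above.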

The call to image_parent simply retrieves the parent object of the clipboard, which is the one we linked to earlier. The code itself just looks up the per-client table in which we build up the clipboard message that the client wants to add.

Lastly, pasting. In the clipboard_paste function we can spot the following:

    if not valid_vid(dst_ws.clipboard) then
        dst_ws.clipboard = define_nulltarget(dst_ws.vid, "clipboard",
            function(source, status)
                if status.kind == "terminated" then
                    delete_image(source)
                end
            end
        )
    end

The important one here is define_nulltarget. There are a number of define_XXXtarget functions depending on how information is to be shared. The common denominator is that they are about allocating and sending data to a client, while most other functions deal with presentation or data coming from a client.

The nulltarget is simple in that it only really allocates and uses the event queues, with no costly audio or video sharing. It allocates the IPC primitives and forces them onto the client, saying ‘here is a new window of a certain type, do something with me!’. Here we use that to create a clipboard inside the recipient (if one doesn’t exist for the purpose already).

We can then target_input the VID of this nulltarget to act as our ‘paste’ operation.

Part 4: External Clients

Git Commit #4

There are quite a few interesting things in this one as well. In the initial console() function, we added a call to target_alloc. This is a special one, as it opens up a connection point, a way for an external client to connect to the server. All our previous terminals have been spawned on the initiative of the WM itself, using a special code path to ensure that it is the terminal we are getting.

With this connection point, a custom socket name is opened up that a client can access using the ARCAN_CONNPATH environment variable (or by specifying it explicitly with a low-level API). Otherwise it behaves just like normal, with the addition of an event or two.

Thus in the client_event_handler, we add a handler for “registered” and “connected”.

“Connected” simply means that someone has opened the socket and that it is now consumed and unlinked. No other client can connect using it. This is by design, to encourage rate limiting, tight resource controls and segmenting the UI into multiple different connection groups with a different policy based on the connection point used. For old X-like roles, think of having one for an external wallpaper, statusbar or launcher.

Our behaviour here is simply to re-open it by calling ‘target_alloc’ again in the connected stage of the event handler.

“Registered” means that the client has now provided some kind of authentication primitive (optional) and a type. This type can also be used to further segment the policy that is applied to a connection. In the ‘whitelisted’ function in the commit, we select the ones we accept and assign a relevant handler. If the type is not whitelisted, the connection is killed by deleting the VID associated with the connection.
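The shape of such a type filter can be sketched in a few lines of plain Lua. This is not the function from the commit: the handler names and the ‘application’ entry are placeholders, and only ‘terminal’ and ‘tui’ are type names taken from elsewhere in this article.

```lua
-- Placeholder handlers, standing in for real client event handlers.
local function terminal_handler(source, status) end
local function default_handler(source, status) end

-- Map from accepted type to the handler that should take over.
local allowed = {
  terminal = terminal_handler,
  tui = terminal_handler,
  application = default_handler, -- illustrative entry
}

-- Returns the handler for an accepted type, nil for everything else.
local function whitelisted(segkind)
  return allowed[segkind]
end

print(whitelisted("tui") ~= nil)   -- true
print(whitelisted("bogus") ~= nil) -- false

-- In the "registered" branch of the event handler this could be used as:
--   local handler = whitelisted(status.segkind)
--   if handler then
--     target_updatehandler(source, handler)
--   else
--     delete_image(source)
--   end
```

Keeping the policy in a table makes it easy to have a different whitelist per connection point later on.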

Lastly we add a new event handler, _adopt (so console_adopt). This one is a bit special.

function console_adopt(vid, kind, title, have_parent, last) ... end

We will only concern ourselves with the prototype here. Adopt is called just after the main function entry point, if the engine is in a recovery state. It covers three use cases:

- Crash Recovery on Scripting Error
- ‘Reset / Reload’ feature for a WM (using system_collapse)
- Switch WMs (also using system_collapse)

The engine will save / hide the VIDs of each externally bound connection, and re-expose them to the scripts via this function. The ‘last’ argument will be set on the last one in the chain, and have_parent if it is a subsegment linked to another one, like clipboards.

It is possible for the WM to tag more state in a vid using image_tracetag which can also be recovered here, and that is one way that Durden keeps track of window positions etc. so that they can survive crashes (along with store_key and get_key to get database persistence).
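As a sketch of how state could be packed into such a tag, the plain-Lua helpers below serialise a few window properties into a string and parse them back. The field names and the `key=value:` format are made up for illustration (this is not Durden's format), and the usage comments assume that image_tracetag both sets a tag and returns the current one.

```lua
-- Hypothetical helpers: pack and unpack window state as "x=..:y=..:ws=..".
local function encode_state(st)
  return string.format("x=%d:y=%d:ws=%d", st.x, st.y, st.ws)
end

local function decode_state(tag)
  local st = {}
  for key, val in string.gmatch(tag or "", "(%a+)=(%-?%d+)") do
    st[key] = tonumber(val)
  end
  return st
end

local tag = encode_state({ x = 10, y = -4, ws = 2 })
print(tag) -- x=10:y=-4:ws=2

-- When the state changes:  image_tracetag(vid, encode_state(state))
-- In console_adopt:        local state = decode_state(image_tracetag(vid))
```

Since the tag travels with the vid, it survives the script-level crash that wiped out the Lua tables describing the window.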

In the handler for this WM, we keep only the primary segment, and we filter type through whitelisted so that we do not inherit connections from a WM switch that we do not know what to do with.

Part 5: Audio

Git Commit #5

Time for a really short one. Arcan is not just a Display Server, and there are reasons why it is described as a Multimedia Server or a Desktop Engine. One such reason is that it also handles audio. This goes against the grain and traditional wisdom (or lack thereof) of separating the display server from the audio server, and then spending a ton of effort on getting broken synch, device routing and meta-IPC as a result.

With every event, you can extract the ‘source_audio’ field, which gives an AID, the audio identifier that matches a VID. Though the interface is currently much more primitive, as advanced audio is later on in the roadmap, the basics are being able to pair an audio source with a video one and being able to control the volume.

In this patch, we simply add a keybinding to call audio_gain, and extract the AID to store with other state in the workspace structure.
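A sketch of what such a volume keybinding could call is shown below. The `step_gain` helper, the `gain` field on the workspace and the assumption that gain lives in the 0.0 to 1.0 range are mine; only audio_gain and the idea of storing the AID in the workspace come from the text. The stand-in lets the snippet run in a plain Lua interpreter.

```lua
-- Stand-in for running outside Arcan; it records the last call.
local last_call
audio_gain = audio_gain or function(aid, gain)
  last_call = { aid = aid, gain = gain }
end

-- Hypothetical helper: nudge the gain of a workspace and clamp it to 0..1.
local function step_gain(ws, delta)
  local gain = math.max(0.0, math.min(1.0, (ws.gain or 1.0) + delta))
  ws.gain = gain
  if ws.aid then -- workspaces without an audio source are left alone
    audio_gain(ws.aid, gain)
  end
  return gain
end

local ws = { aid = 7, gain = 0.5 } -- 7 is a made-up AID
print(step_gain(ws, 0.25))  -- 0.75
print(step_gain(ws, -2.0))
```

Binding one key to a positive delta and another to a negative one gives per-workspace volume control with very little code.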

Part 6: Extra Font Controls

Git Commit #6

Concluding with another short one. To get into the details somewhat on this one, you should read the Font section in the Arcan vs Xorg article.

We simply add calls to target_fonthint during the ‘preroll’ stage in the client event handler:

    local font = get_key("terminal_font")
    local font_sz = get_key("font_size")
    if font and (status.segkind == "tui" or status.segkind == "terminal") then
        target_fonthint(source, font, (tonumber(font_sz) or 12) * FONT_PT_SZ, 2)
    else
        target_fonthint(source, (tonumber(font_sz) or 12) * FONT_PT_SZ, 2)
    end

The main quirk is possibly that the size is expressed in cm (as density is expressed in ppcm and not imperial garbage), thus a constant multiplier (built-in) is needed to convert from the familiar font PT size.
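As a worked example of that conversion, taking the centimetre unit stated above at face value: a typographic point is 1/72 of an inch and an inch is 2.54 cm. The `CM_PER_PT` name below is a stand-in computed from that definition; it is not claimed to be the exact value of the built-in FONT_PT_SZ, which is what real scripts should use.

```lua
-- Stand-in multiplier derived from 1 pt = 1/72 inch and 1 inch = 2.54 cm.
-- Prefer the engine's built-in FONT_PT_SZ in actual appl code.
local CM_PER_PT = 2.54 / 72

local function pt_to_cm(pt)
  return pt * CM_PER_PT
end

print(string.format("%.4f", pt_to_cm(12))) -- 0.4233
```

Together with a density in ppcm, a physical size like this is enough for the client to work out how many pixels a glyph should cover on a given display.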

That is it for now. As stated before, this is supposed to be a small (not minimal) viable WM model that would support many ‘I just need to do XYZ’ users, while also providing most of the building blocks needed for our intended terminal emulator replacement project. For that case, we will revisit this WM once more a little later.

At this stage, most kinds of clients should be working. The fsrv_game, fsrv_decode etc. frameservers can be used for libretro cores and video decoding. Other Arcan scripts can be built and tested with the arcan_lwa binary, aloadimage can be used for image viewing and Xarcan for running an X server.

The “only” thing missing client-support-wise is the arcan-wayland bridge for Wayland client support, due to the additional complexity from the Wayland allocation scheme and because specific window management behaviour has such a strong presence in the protocol.