Hello once again! If you’ve read my previous post, you should know that I’m on a quest to analyze keyboard shortcuts in most major DEs so that I can provide insight into what the best defaults would be for KDE Plasma.

Now let’s start our analysis of GNOME keyboard shortcuts.

Preparations

For testing GNOME, I installed vanilla-gnome-desktop on my work machine, which runs Kubuntu 19.04 (with backports). Later on I removed it and installed ubuntu-desktop, first with --no-install-suggests and --no-install-recommends and then without them. I didn’t see much difference GUI-wise aside from a few tweaks made by Canonical.

I also used a Fedora 31 live USB on both my work and home machines. It feels significantly snappier than GNOME on Ubuntu 19.04 and vastly better than on 18.04.

I used every source I could find on GNOME shortcuts:

https://www.cheatography.com/frieser/cheat-sheets/gnome/

https://wiki.gnome.org/Design/OS/KeyboardShortcuts

https://help.gnome.org/users/gnome-help/stable/keyboard-shortcuts-set.html.en

https://developer.gnome.org/hig/stable/keyboard-input.html.en

https://help.gnome.org/users/gnome-help/stable/shell-keyboard-shortcuts.html

https://help.gnome.org/users/gnome-help/stable/keyboard-nav.html.en

Moving around the place

As mentioned before, GNOME uses a 1D workspace layout, meaning that movement between workspaces is only allowed along one axis, the vertical one. This solves an issue typical of 2D layouts: keeping workspace movement separate from screen movement.

If workspace movement can only go up and down and window/screen movement can only go left and right, then it means that all movement in the desktop fits perfectly with the most important combination of keys, the arrow keys.

Under a 2D model, that is, one where workspace movement is allowed along two axes, all four arrow keys are by default taken up by a single set of actions, which is why it’s so problematic to find defaults for them.

Interestingly, this is not what GNOME does, because GNOME found optimal solutions to most of its challenges.

Meta+Up maximizes windows, Meta+Down restores windows, Meta+Left tiles windows to the left, Meta+Right tiles windows to the right. Workspace movement occurs by Meta+PgUp and Meta+PgDown or Ctrl+Alt+Up and Ctrl+Alt+Down. Screen movement occurs by Meta+Shift+Left and Meta+Shift+Right.

Several things in the use of modifiers here are interesting:

Meta+arrow does not match a full set of tiling like Plasma (in which Meta+Up/Down tiles up and down), that is, it does not match semantically, because GNOME acknowledges that tiling up and down is a waste of vertical space; instead, the more commonly-used maximize+restore is default.

One thing of note is this linguistics paper, which explains the kinds of directional association that we find in English, particularly that of Up being preferable to Down (pp. 82-83). Notice how Up for maximize and Down for restore feel intuitive; Up takes precedence and maximize essentially means turning the current window into the most important one (which should use all screen space available), whereas restore is less important, thus fitting the Up/Down relationship. Also consider that western cultures, or at least those with left-to-right languages, have implicit design rules related to reading. When “reading” a screen, information is first processed at the top and then downwards, and from left to right, in accordance with typical writing.

Meta+PgUp/PgDown are also a fitting combination that compensates for the absence of Meta+Up/Down for workspace movement. It’s the only other set of keys that provides the semantic association of direction aside from the arrow keys. In addition to that, the other default of Ctrl+Alt+Up/Down follows the guidelines that I proposed, namely those of “Using both hands to do a keyboard shortcut is preferable as it prevents RSI” and “The key combo should fit the position/format of the hand”.

I say these keyboard shortcuts are interesting because they make great use of available keys, and yet two combinations are left out, namely Ctrl+Alt+Left/Right. My suggestion would be that they be bound to left/right tiling or left/right screen movement, which would render a perfect distinction between workspace movement and window/screen movement, while also retaining the defaults with maximize and restore assigned to Meta+Up/Down.
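To make the gap easier to see, here is a minimal sketch that models the bindings as a plain table and lists the arrow combos with nothing assigned. The table is typed in by hand from the list above, not read from a live GNOME session, and it only covers the two modifier sets discussed here (Meta and Ctrl+Alt); the names BINDINGS and unbound_arrow_combos are made up for illustration.

```python
from itertools import product

# Bindings as listed in this post (hand-typed, not queried from GNOME).
BINDINGS = {
    "Meta+Up": "maximize window",
    "Meta+Down": "restore window",
    "Meta+Left": "tile window left",
    "Meta+Right": "tile window right",
    "Ctrl+Alt+Up": "workspace up",
    "Ctrl+Alt+Down": "workspace down",
}

# Assumption: only the two modifier sets compared in the text are checked.
MODIFIERS = ["Meta", "Ctrl+Alt"]
ARROWS = ["Up", "Down", "Left", "Right"]


def unbound_arrow_combos(bindings):
    """Return every modifier+arrow combo that has no action assigned."""
    return [
        f"{mod}+{arrow}"
        for mod, arrow in product(MODIFIERS, ARROWS)
        if f"{mod}+{arrow}" not in bindings
    ]


free = unbound_arrow_combos(BINDINGS)
print(free)  # ['Ctrl+Alt+Left', 'Ctrl+Alt+Right']
```

Adding my suggested bindings to the table (say, "Ctrl+Alt+Left": "tile window left") would make the list come back empty, which is the "perfect distinction" I have in mind.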

Put it on my tab!

Another rather complex but interesting thing GNOME does is window management with Tab.

Alt+Tab and Meta+Tab switch between applications, Alt+Shift+Tab and Meta+Shift+Tab switch to the previous application, Alt+Esc switches between windows without animation feedback, Meta+’ and Alt+’ (key above Tab) switch between windows of the same application, Meta+Shift+’ and Alt+Shift+’ switch to the previous window of the same application, Alt+F6 switches between windows of the same application without animation feedback, Ctrl+Alt+Tab switches between system controls, Ctrl+Alt+Esc switches between system controls without animation feedback.

Oof! Quite a lot of them, but let’s see them a bit more in depth.

Alt+Tab is canon behavior in all desktop environments I’ve seen, and it makes sense to have Meta+Tab as an alternative: it improves precision, since if the user misses Alt and presses Meta, they’ll get the same behavior regardless; it accommodates different hand sizes, which is a plus for typing comfort; and it is consistent.

GNOME allows for window switching without animation feedback, which is great since animations, while pleasant and useful, are inherently slower than direct switching simply because they take time to play. The most efficient keyboard-driven workflows have zero or a minimal amount of animations for every combo, despite being a niche (absolute time economy) within a niche (keyboard-driven workflows) within a niche (Linux users). That said, I see absolutely no discoverability in Esc and F6. I did not manage to find any relation between those keys that would allow a user to find one merely by knowing the other.

The use of the key above Tab to allow physically close and parallel behavior to that of Alt is genius. Indeed, if one were to analyze how many layers of switching there exist in computers, they would find 8 layers: between elements of the same application (1), between tabs of the same application (2), between windows of the same application (3), between applications (4), between screen areas a.k.a. tiling (5), between screens (6), between workspaces (7) and between Plasma Activities (8). A decent keyboard-driven application requires at least 1 and 2, a minimal keyboard-driven workflow should be able to at least use 1-4, while a generally complete workflow would require 1-7, and the maximum that can be reached would be 1-8.
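The tiers above can be sketched as a small classifier. This is only an illustration of the idea: the Layer names, the coverage function and the tier labels are my own shorthand, and the rule that layers must be covered contiguously from 1 upwards is an assumption drawn from how I phrased the ranges (1-2, 1-4, 1-7, 1-8).

```python
from enum import IntEnum


class Layer(IntEnum):
    """The eight layers of switching, numbered as in the text."""
    ELEMENTS = 1
    TABS = 2
    APP_WINDOWS = 3
    APPLICATIONS = 4
    SCREEN_AREAS = 5
    SCREENS = 6
    WORKSPACES = 7
    ACTIVITIES = 8


def coverage(supported):
    """Highest N such that layers 1..N are all supported (0 if layer 1 is missing)."""
    highest = 0
    for layer in Layer:  # iterates in definition order, 1 through 8
        if layer not in supported:
            break
        highest = layer.value
    return highest


def classify(supported):
    """Map a set of supported layers to the tiers described above."""
    n = coverage(supported)
    if n >= 8:
        return "maximum"
    if n >= 7:
        return "generally complete"
    if n >= 4:
        return "minimal workflow"
    if n >= 2:
        return "decent application"
    return "insufficient"


# Hypothetical desktop that supports everything except Activities.
print(classify(set(Layer) - {Layer.ACTIVITIES}))  # generally complete
```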

Wait. There’s actually a 9th layer: switching between system controls. Indeed, the key combo is not the best, as I’ll mention in the next paragraph, but the concept itself is something that is quite desirable for Plasma. A clearer way to describe it would be switching between desktop elements. For instance, in the Overview, this allows transferring focus between the top panel (including the application menu and the tray), the dock, the search, the windows and the workspaces. While on the desktop, it will switch focus primarily between the desktop itself and the top panel. While the implementation may still not be optimal, this is genius: it allows controlling absolutely every interactive element available on the desktop. This is desirable in keyboard-driven workflows insofar as the tray is useful: the tray is unarguably useful for mouse-driven workflows (otherwise it wouldn’t exist), so users with keyboard-driven workflows need a way to reach it if no alternative is provided. This key combo fills that gap. In addition to that, key combos also exist for managing tray elements individually, namely Meta+M for showing the message/notification tray and Meta+N for giving keyboard focus to the active notification.

Now, I have some harsh criticism concerning some of those key combos. Meta+Shift+’, Alt+Shift+’ and Ctrl+Alt+Tab are completely unnatural to the format of the hand. The fact that keyboards are inherently narrower than the average shoulder width means that the hands will always point inwards when typing, so much so that even an alternative to the “home row” has been suggested that accommodates the hand’s angle; these GNOME key combos push the hand outwards, thus breaking the “The key combo should fit the position/format of the hand” guideline, which leads to strain, which in turn increases the chances of injury and RSI. In addition, since those keys are quite concentrated on the left side of the keyboard, users tend to accomplish the combo with one hand, as opposed to using the left hand for Ctrl and Tab (or Shift and ’) while pressing Alt with the right hand, which is also unnatural/uncommon.

That said, overall, GNOME makes great use of keyboard shortcuts and it does so in a manner fitting to its own environment.

Ctrl, Alt, Ctrl+Alt, Shift, Ctrl+Shift+Alt, Meta, Meta+Shift, huhhhh

The following will be quite opinionated: I personally like the use of different combinations of Ctrl, Alt, Shift and Meta. Once memorized, all modifier keys can have specific semantics that are easy to remember, some of which can be seen in both GNOME and i3: Shift indicates transfer/movement, so adding Shift to a navigation key combo will move the focused window elsewhere; Alt indicates alternation, so the canon Alt+Tab alternates between windows while not moving them; Ctrl indicates, well, control, or management if you will, and as such, Ctrl+Tab allows you to manage tabs in a browser, and Ctrl+W allows you to close them. However, GNOME makes such extensive use of modifier keys that it can be quite overwhelming. The aforementioned navigation- and Tab-related keys work generally well because of those semantics, but the end result is huge, and screen capturing can be confusing in that it’s consistent but not discoverable. Screen capturing itself is quite the interesting case too:

Print opens a popup to save the screenshot of the whole desktop as a file (or optionally put in the clipboard), Alt+Print works the same but for the focused window, Shift+Print likewise but with a selected area; then by adding Ctrl to the second and third cases, instead of a popup, the screenshot will automatically be saved in the clipboard.

It is consistent, as can be seen, but it’s not possible to discover these combos other than by memorizing them or looking them up beforehand. Unlike the navigation and Tab sets, Alt does not have any semantic association with windows, nor does Shift have any semantic association with areas. This is a minor issue, however.
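The consistency is easy to see once the scheme is written down as a function of the modifiers held with Print. The sketch below follows my description above rather than GNOME’s source code; the function name is made up, and Ctrl+Print on its own is not covered in my description, so treating it like the other two Ctrl cases is an assumption.

```python
def screenshot_action(modifiers):
    """Map the modifiers held together with Print to (target, destination)."""
    mods = set(modifiers)

    # Alt and Shift choose WHAT gets captured.
    if "Alt" in mods:
        target = "focused window"
    elif "Shift" in mods:
        target = "selected area"
    else:
        target = "whole desktop"

    # Ctrl chooses WHERE it goes (assumed to apply to plain Print as well).
    destination = "clipboard" if "Ctrl" in mods else "save popup"
    return target, destination


print(screenshot_action([]))                 # ('whole desktop', 'save popup')
print(screenshot_action(["Alt"]))            # ('focused window', 'save popup')
print(screenshot_action(["Ctrl", "Shift"]))  # ('selected area', 'clipboard')
```

Two independent switches, one for the target and one for the destination: that is what makes the set consistent, even though nothing about Alt or Shift hints at "window" or "area".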

The caveats

The first small caveat I see is the use of Alt+Space for activating the window menu. There is barely any differing functionality there which isn’t available elsewhere as a keyboard shortcut. Alt+Space is an incredibly useful keyboard shortcut merely due to the physical proximity between keys, and its pressing is optimal: two fingers together, one press. Plasma assigned Alt+Space to one of its most useful features, namely KRunner; I’d like to see the same happening in GNOME.

A minor issue that is also worth mentioning is the key combo Ctrl+Alt+Shift+R. It breaks one of the guidelines I propose: “However, avoid three or more keystrokes whenever possible”. A 3-stroke key combo would be justified here, since you really don’t want to activate it by accident, but four is too many. Avoid 4-stroke key combos like the plague, I’d say; they lead to more strain and more effort, and in this particular case the fourth key is unnecessary.
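Checking a list of shortcuts against this guideline is trivial to automate. Here is a rough sketch; the sample list is just a handful of the combos mentioned in this post, the function names are hypothetical, and splitting on "+" assumes no combo uses the + key itself.

```python
def stroke_count(combo):
    """Number of keys in a combo written as 'Mod+Mod+Key'."""
    return len(combo.split("+"))


def too_long(combos, limit=4):
    """Return the combos that need `limit` or more keys pressed at once."""
    return [combo for combo in combos if stroke_count(combo) >= limit]


# A few combos taken from this post, not a complete list.
COMBOS = [
    "Alt+Tab",
    "Meta+Shift+Left",
    "Ctrl+Alt+Tab",
    "Ctrl+Alt+Shift+R",
    "Meta+Alt+8",
]

print(too_long(COMBOS))  # ['Ctrl+Alt+Shift+R']
```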

Now, the F# keys are what break the delightful consistency seen throughout the course of this analysis. Let’s get the upper row of the keyboard together first to get the idea, shall we?

Alt+Esc switches windows without animation feedback, Alt+F1 opens the menu, Alt+F2 opens the command prompt, Alt+F4 closes the window, Alt+F6 switches windows of the same application without animation feedback, Alt+F7 moves the window, Alt+F8 resizes the window, Alt+F10 toggles maximization.

Now let’s remove Esc because we’re talking about the F# keys, despite it making a pair with F6 (which is an issue in itself). Let’s also remove F1, F2 and F4 since they are canon. Then we’ll get the important ones:

Alt+F6 switches windows of the same application without animation feedback, Alt+F7 moves the window, Alt+F8 resizes the window, Alt+F10 toggles maximization.

F6 has seemingly been set so as to stay distant from F4 to avoid destructive behavior while being closer to Esc than the keys to its right; F10 reproduces the same behavior as Meta+Up/Down combined, and so it simply stays in a position distant from everything else, alone. F7 and F8 do make sense, but there doesn’t seem to be any particular reason for them to be there aside from those keys being what was left.

I can’t blame GNOME for that: the F# keys, despite being a sequential set, have several issues: they’re too far from the home row, they carry no semantic association, they’re not memorable, and only 1/3 of them are standardized among Linux environments.

Another issue I see is universal access key combos.

Meta+Alt+8 toggles zoom on/off, Meta+Alt+- zooms out, Meta+Alt+= zooms in, Meta+Alt+S toggles screen reader.

I like the fact that - indicates less zoom and = indicates more zoom, since the = key is also the one that has +, so a mere look at the keyboard allows for this immediate association, which is great. However, why set Meta+Alt+8 seemingly arbitrarily when it could be Meta+Alt+0, namely the closest key to - and =? This way, no precision is required from the user to toggle zoom. This is a keyboard shortcut made for visually impaired people; they should be able to recall the position of the keys by their distance from Backspace or the upper right corner of the QWERTY pad, not by looking at the keys! In addition, Meta+Alt+S is the only key combo of its kind; this is done so as to associate Meta+Alt with accessibility, so it does make sense, but it also means it’s not discoverable, for three reasons: there is no reason whatsoever for the user to look among the letters when all other accessibility key combos are located among the numbers and mathematical symbols, Meta+letter is used far more often (5 times, in fact), and Meta+Alt+letter is not used anywhere else.

Conclusion

Well, aside from my rather opinionated points near the end of this analysis, I’d say GNOME does at least 80% of things right in terms of keyboard shortcuts. It’s not that it has a particularly powerful keyboard-driven workflow, it’s just that the main keyboard shortcuts are there and are consistent. Most make sense and are well thought out; and I might be wrong about the seemingly arbitrary key combos I’ve found, given that I haven’t yet chatted with their design team, and so I might make a new blog post in the future after receiving feedback. Rather than discussing with them one by one all points mentioned here, I decided to write this information in this blog: it’s generally cleaner to do so, and Phabricator discussion, for instance, can simply link to specific parts of my series on keyboard shortcuts.

I will file feature requests for changing the accessibility keyboard shortcuts in GNOME in the (hopefully) near future.

This has certainly not been a 100% complete analysis, but it’s one I’ve been delaying for some time and should suffice for the time being.