Entering the mosh pit

For some years now, your editor has heard glowing reviews of Mosh — the "mobile shell" — as a replacement for SSH. The Mosh developers make a number of claims about its reconnection ability, performance, and security; at least some of those are relatively easily testable. After a bit of moshing, a few clear conclusions have come to the fore.

Mosh has been around since 2011 or so; the 1.3 release came out in March. It appears to be under active development and to be reasonably widely used. Mosh runs on just about any type of operating system one can imagine, and binary packages are readily available. Setup is nearly zero-effort, with one small exception noted below. In short, if Mosh meets your needs, there is little excuse for not using it.

Calling Mosh an "SSH replacement" (as the project's web site does) is not quite correct, though, for a couple of reasons. The first of those is that you still need SSH around to set up a Mosh connection. Mosh doesn't bother with little details like listening for new connections or authentication; it relies on SSH for those. Typing "mosh remote-host" will set up a connection but, behind the scenes, SSH will be used to make the initial connection to the remote host. Thus, one still has to go through the usual authentication dance (entering passwords, distributing public keys, etc.) before using Mosh. The good news, of course, is that one's existing SSH authentication setup will work unchanged with Mosh.

Once an SSH connection is established, though, the only thing Mosh uses it for is to start a mosh-server process at the remote end and to exchange keys with that process. Once that's going, the SSH link is closed and discarded; the client and server talk directly using a UDP-based protocol thereafter. That is one place where one can encounter minor setup difficulties. Installing Mosh itself can be an entirely unprivileged operation, but that may not be true of the task of ensuring that its UDP packets make it through any firewalls between the local and remote systems.
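
The hand-off from SSH to UDP can be sketched in a few lines of Python. This is only an illustration: it assumes that mosh-server announces its UDP port and session key over the SSH session with a line of the form "MOSH CONNECT <port> <key>", and the names ConnectInfo and parse_connect_line are hypothetical helpers invented here, not part of Mosh.

```python
import re
from typing import NamedTuple, Optional


class ConnectInfo(NamedTuple):
    """What the client needs before it can drop SSH and switch to UDP."""
    port: int  # UDP port the server process is listening on
    key: str   # session key, kept as the opaque text the server printed


# Assumed announcement format: "MOSH CONNECT <udp-port> <key>".
_CONNECT_RE = re.compile(r"^MOSH CONNECT (\d+) (\S+)$")


def parse_connect_line(ssh_output: str) -> Optional[ConnectInfo]:
    """Scan the text received over SSH for the server's announcement."""
    for line in ssh_output.splitlines():
        match = _CONNECT_RE.match(line.strip())
        if match:
            return ConnectInfo(port=int(match.group(1)), key=match.group(2))
    return None  # no announcement seen; the server did not start
```

Once a client has the port and key in hand, nothing further needs to travel over the SSH link, which is why it can be closed at that point.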

Each UDP packet is separately encrypted (using the OpenSSL AES implementation) before sending and authenticated on receipt; if all works well, that should make it impossible for an attacker to eavesdrop on the connection or inject packets into it. In theory, encryption at this level is more robust than the TCP-based encryption used by SSH; only the latter is subject to TCP reset attacks, for example. The protocol has its own flow control that, among other things, is designed to avoid filling up network buffers along the path between the systems. Any longtime SSH user knows the annoyance that comes from interrupting a verbose command with ^C and having to sit through a long wait while all of the already-transmitted output makes its way across the link. Mosh claims to avoid that behavior, and your editor's tests suggest that the claim is warranted.

The actual protocol is called "state synchronization protocol" (SSP); it is built on the idea that the endpoints are maintaining copies of the same objects and communicating changes to the state of those objects. One object, maintained primarily on the client side, represents the keystrokes that the user has typed. The server end, instead, owns an object describing the state of the screen. The two ends exchange packets whenever the state of one of those objects changes, allowing the other end to update its idea of their state.
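
The idea of synchronizing object state, rather than shipping a byte stream, can be illustrated with a toy version of the keystroke object. The sketch below is not Mosh's wire format; the class names, the dictionary "packets", and the numbering scheme are all assumptions made for illustration. The sender always describes the change from the last state the receiver acknowledged to the current state, so a lost packet is simply superseded by the next one.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class KeystrokeState:
    """The client-owned object: every keystroke typed so far."""
    keys: str = ""

    def diff_from(self, older: "KeystrokeState") -> str:
        # Keystrokes only accumulate, so the diff is the new suffix.
        return self.keys[len(older.keys):]

    def apply(self, diff: str) -> None:
        self.keys += diff


@dataclass
class Sender:
    current: KeystrokeState = field(default_factory=KeystrokeState)
    current_num: int = 0
    acked: KeystrokeState = field(default_factory=KeystrokeState)
    acked_num: int = 0
    sent: Dict[int, KeystrokeState] = field(default_factory=dict)

    def type_keys(self, text: str) -> None:
        self.current.apply(text)
        self.current_num += 1
        # Remember a snapshot so a later ack can become the new diff base.
        self.sent[self.current_num] = KeystrokeState(self.current.keys)

    def make_packet(self) -> dict:
        # Always diff against the last state the peer is known to hold.
        return {"from": self.acked_num, "to": self.current_num,
                "diff": self.current.diff_from(self.acked)}

    def handle_ack(self, num: int) -> None:
        if num > self.acked_num and num in self.sent:
            self.acked, self.acked_num = self.sent[num], num
            self.sent = {n: s for n, s in self.sent.items() if n > num}


@dataclass
class Receiver:
    state: KeystrokeState = field(default_factory=KeystrokeState)
    num: int = 0

    def handle_packet(self, packet: dict) -> int:
        # Apply a diff only if it starts from the state held here.
        if packet["from"] == self.num and packet["to"] > self.num:
            self.state.apply(packet["diff"])
            self.num = packet["to"]
        return self.num  # acknowledgment: the newest state number held
```

In this sketch, if the packet carrying the first keystroke is lost, the next packet still starts from state 0 and carries both keystrokes, so nothing needs to be retransmitted as such. The screen object owned by the server would follow the same pattern with a richer notion of "diff".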

"TELNET", the project's web site proclaims, "had some good things going for it". One of those was local echo. Users typing over an SSH connection will not see their keystrokes until they are echoed by the remote end; on a flaky connection, that can make typing painful. Mosh, instead, has some tricky-seeming code in the client that can anticipate the changes in the state of the screen object and echo keystrokes ahead of the remote end. "Tricky" because the client has to figure out when not to echo — when a screen-mode application is running, say, or when a password is being typed. Your editor did not have access to a network slow enough to exercise this feature; it will be interesting to give it a try the next time it's necessary to try to get some work done over an airplane WiFi network.
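
To give a feel for why deciding when to echo is tricky, here is one conservative strategy, written as a Python sketch. It is not a description of Mosh's actual prediction code; the EchoPredictor class and its rules are assumptions invented for illustration. The client draws a keystroke locally only after the server has recently confirmed an earlier guess, and it stops guessing whenever a control key is typed or the server's output disagrees with what was predicted.

```python
from typing import List


class EchoPredictor:
    """Toy local-echo predictor: echo early only while guesses keep matching."""

    def __init__(self) -> None:
        self.pending: List[str] = []  # typed, not yet confirmed by the server
        self.trusted = False          # True once a guess has been confirmed

    def on_key(self, ch: str) -> str:
        """Return the text to draw immediately (possibly nothing)."""
        if not ch.isprintable():
            # Enter, ^C and friends may change what the remote side does
            # with input (a password prompt, say), so stop guessing.
            self.pending.clear()
            self.trusted = False
            return ""
        self.pending.append(ch)
        return ch if self.trusted else ""

    def on_server_echo(self, text: str) -> None:
        """Compare what the server actually displayed with the guesses."""
        for ch in text:
            if self.pending and self.pending[0] == ch:
                self.pending.pop(0)
                self.trusted = True
            else:
                # The server showed something unpredicted: back off.
                self.pending.clear()
                self.trusted = False
                return
```

With these rules, characters typed at a password prompt are never drawn locally: the preceding Enter resets the predictor, and since the remote end echoes nothing back, trust is never re-established until ordinary echoing resumes.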

Since UDP is connectionless, packets can come from anywhere. Mosh uses this feature to implement its reconnection mechanism. The Mosh "connection" exists for as long as the mosh-server process is running; the client, meanwhile, can disappear and reappear at will. So, for example, one's laptop can switch from one wireless network to another, getting a new IP address in the process. The Mosh server will notice that packets are now coming from a different address; if those packets authenticate properly, the server will start sending its own packets to that address. So the Mosh connection will persist, uninterrupted, possibly without the user even noticing.
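
The roaming logic can be sketched as follows. Two caveats apply. First, Python's standard library has no AES, so this sketch substitutes an HMAC-SHA256 tag for Mosh's authenticated encryption; it authenticates packets but does not encrypt them. Second, the sequence-number check is a choice made in this sketch so that a replayed old packet cannot redirect the session; the article does not describe that detail. The names seal and RoamingServer are hypothetical.

```python
import hashlib
import hmac
from typing import Optional, Tuple

Address = Tuple[str, int]
TAG_LEN = 32  # size of a SHA-256 digest in bytes


def seal(key: bytes, seq: int, payload: bytes) -> bytes:
    """Prefix the payload with a sequence number and append an HMAC tag."""
    body = seq.to_bytes(8, "big") + payload
    return body + hmac.new(key, body, hashlib.sha256).digest()


class RoamingServer:
    """Tracks where replies should go, based only on authentic packets."""

    def __init__(self, key: bytes) -> None:
        self.key = key
        self.client_addr: Optional[Address] = None
        self.highest_seq = -1

    def receive(self, datagram: bytes, source: Address) -> Optional[bytes]:
        if len(datagram) < 8 + TAG_LEN:
            return None  # too short to be one of our packets
        body, tag = datagram[:-TAG_LEN], datagram[-TAG_LEN:]
        expected = hmac.new(self.key, body, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            return None  # forged or corrupted: ignore it, do not move
        seq = int.from_bytes(body[:8], "big")
        if seq > self.highest_seq:
            # Newest authentic packet so far: send replies to its source.
            self.highest_seq = seq
            self.client_addr = source
        return body[8:]
```

Note that the source address itself is never trusted; it is adopted only because the packet that arrived from it carries a valid tag under the session key.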

Your editor ran some tests of this feature, switching the laptop between a local network and tethering via a phone handset. The Mosh client puts up a notification when connectivity goes away (it seems to notice before the laptop itself figures out that its WiFi access point has vanished); once the network returns, the Mosh connection is back up and running. If an output-producing command is running while the change happens, none of the output will be lost (modulo one detail: the lack of scrollback described below).

Mosh tries to provide the best terminal experience it can. It seems that quite a bit of effort has, for example, gone into proper handling of UTF-8 character sequences and escape sequences. But there is an implication of the state synchronization protocol that affects the terminal experience in a significant way. Since Mosh models the entire state of the screen, it wants to be in control of that state, with the result that it overrides and changes the behavior of the terminal emulator that one might be running Mosh within.

This aspect of Mosh is perhaps most visible when it comes to scrolling back through the terminal output. The Mosh model of the screen encompasses the screen itself, but not anything that has scrolled off the top. When using gnome-terminal and SSH, for example, the mouse wheel can be used to move back through the output; that does not work with Mosh. Instead, the wheel moves through the command history. The gnome-terminal scrollbar becomes inoperable while Mosh is running as well. The problem was first reported in 2011 (it is issue #2), but it remains unfixed. The workaround is to run a utility like tmux or screen on the remote end and use its scrollback instead.

Mosh's focus on the terminal experience also ties into the other reason why it is not truly an SSH replacement. It can be used to run an interactive shell on a remote system. It can also be used to run a command non-interactively, but only if the user doesn't care about seeing the result after Mosh exits and resets the screen. But it cannot be used for tasks like port forwarding. Those of us who use SSH as a sort of poor hacker's VPN will need to continue doing so.

Mosh, thus, is aimed at a fairly specific use case: users who engage in interactive terminal sessions with remote hosts over variable and possibly flaky network connections. That is a description that, for example, fits many attendees of Linux-oriented conferences. But, if the truth be told, Mosh is not a tool that will make a day-to-day difference in your editor's life. In a world where many resources are local and the rest are accessed via higher-level protocols, many of us probably do not need a highly optimized, extra-robust remote terminal emulator. For those situations where it is useful, though, it would seem that, as your editor has heard for years, there is nothing better.

