Our general strategy is for the Mosh client to make an echo prediction each time the user hits a key, but not necessarily to display this prediction immediately.

The predictions are made in groups known as “epochs,” with the intention that either all of the predictions in an epoch will be correct, or none will. An epoch begins tentatively, making predictions only in the background. If any prediction from a certain epoch is confirmed by the server, the rest of the predictions in that epoch are immediately displayed to the user, along with any future predictions in the same epoch.

Some user keystrokes are likely to alter the host’s echo state from echoing to not, or are otherwise hard to predict, including the up- and down-arrow keys and control characters. These cause Mosh to lose confidence and increment the epoch, so that future predictions are made in the background again.
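
The epoch rules above can be illustrated with a minimal sketch. It is not Mosh's implementation: the class and method names are hypothetical, each keystroke is modeled as a single character, and any non-printable character stands in for the "hard to predict" keys.

```python
from dataclasses import dataclass, field


@dataclass
class Prediction:
    char: str
    epoch: int


@dataclass
class EpochPredictor:
    """Toy model of epoch-based echo prediction (all names are illustrative)."""
    current_epoch: int = 0      # epoch assigned to new predictions
    confirmed_epoch: int = -1   # highest epoch the server has confirmed
    pending: list = field(default_factory=list)

    def on_keystroke(self, char: str) -> None:
        if not char.isprintable():
            # Assumption: non-printable keys (ENTER, control characters)
            # are the hard-to-predict ones. Lose confidence and start a
            # new epoch, so later predictions go to the background.
            self.current_epoch += 1
            return
        self.pending.append(Prediction(char, self.current_epoch))

    def on_server_confirms_oldest(self) -> None:
        # One confirmed prediction vouches for the rest of its epoch.
        confirmed = self.pending.pop(0)
        self.confirmed_epoch = max(self.confirmed_epoch, confirmed.epoch)

    def displayed(self) -> str:
        # Only predictions from a confirmed epoch are shown to the user.
        return "".join(p.char for p in self.pending
                       if p.epoch <= self.confirmed_epoch)
```

In this sketch, typing "ls" shows nothing until the server confirms the "l"; from then on the "s" and any later printable keys appear at once, until a key such as ENTER increments the epoch.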

In practice, this approach accommodates a wide variety of application behaviors, including multi-mode editors like vi (which sometimes echo conventionally and sometimes don’t), and the possibility that the user might type a command at the prompt (e.g., passwd) that stops server-side echoes after the ENTER key is typed.

Because the decision to perform local echo is made entirely based on the application’s observed behavior, applications need not be rewritten to accommodate local echo. Unlike prior work, Mosh’s local echo works even with full-screen programs (like emacs) that put the terminal driver in “raw” mode and do their own echoing.

In typical use, Mosh can display immediately the effects of almost all “typing,” which constitutes more than two-thirds of user keystrokes in our captures. The remaining keystrokes are principally “navigation” (such as “n” to move to the next e-mail message in a mail reader), which cannot be predicted locally.

Server-side assistance for prediction evaluation

For the above algorithm to work properly, the Mosh client must be able to reliably determine whether its echo predictions are correct. Early versions of Mosh attempted to do this with the client only, by simply examining whether a predicted echo was present on the screen by the time the Mosh server had acknowledged the corresponding keystroke.

Unfortunately, in trials, we found that applications sometimes take tens of milliseconds after input is presented to them before echoing to the screen. This can lead the Mosh server to acknowledge an input keystroke before the echo is present in the screen state, and causes the client to conclude that its prediction was incorrect, even though the echo is on the way. This produces annoying flicker as the echo is (mistakenly) removed from the screen, then reinstated when it eventually arrives from the server.

Our initial solution to this problem was a client-side timeout, so that a prediction is not considered incorrect until the corresponding keystroke has been acknowledged by the server and a certain amount of time has elapsed. Unfortunately, because network jitter can delay the eventual echo beyond the timeout, this too produced an annoying number of false negatives and resulting flicker. (By contrast, setting the timeout long enough to accommodate large amounts of jitter causes mistaken predictions to linger on the screen for too long.)
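
The weakness of the client-side timeout can be seen in a small sketch. The function name and the millisecond values in the usage note are hypothetical, not taken from Mosh; the point is only that the deadline is measured on the client, so network delay counts against the echo.

```python
def client_side_verdict(ack_time_ms: int, echo_arrival_ms: int,
                        timeout_ms: int) -> str:
    """Judge a prediction with a client-side timeout (illustrative only).

    ack_time_ms:     when the client saw the server acknowledge the keystroke
    echo_arrival_ms: when the real echo reached the client
    timeout_ms:      grace period the client waits after the acknowledgment
    """
    deadline_ms = ack_time_ms + timeout_ms
    # The echo's arrival time includes network jitter, so a legitimate
    # but delayed echo is misjudged as a wrong prediction.
    return "correct" if echo_arrival_ms <= deadline_ms else "incorrect"
```

With an assumed 50 ms timeout, an echo arriving 30 ms after the acknowledgment is judged correct, while the same echo delayed by jitter to 120 ms is judged incorrect. Raising the timeout to cover the jitter would instead keep genuinely wrong predictions on screen for that long.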

Our final solution was to implement a server-side timeout of 50 ms, chosen to contain the vast majority of legitimate application echoes on loaded servers, while still fast enough to rapidly detect mistaken predictions. The terminal object that is synchronized to the client contains an “echo ack” field, representing the latest keystroke that has been presented to the application for at least 50 ms and whose effects ought to be reflected in the current screen. The client has no timeouts of its own, and consequently network jitter does not adversely affect the client’s ability to evaluate whether a prediction is correct. The cost is increased network traffic, because the server often sends an extra datagram 50 ms after a keystroke to convey the echo ack.
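
A minimal sketch of the echo ack idea follows, assuming keystrokes carry increasing sequence numbers and the server records when each one was presented to the application. The function names, the dictionary layout, and the single-character screen comparison are assumptions for illustration; only the 50 ms value comes from the text.

```python
ECHO_TIMEOUT_MS = 50  # server-side timeout stated in the text


def echo_ack(presented_at_ms: dict, now_ms: int) -> int:
    """Server side: latest keystroke presented at least 50 ms ago.

    presented_at_ms maps a keystroke sequence number to the server-local
    time at which it was handed to the application. Returns -1 when no
    keystroke has aged long enough yet.
    """
    eligible = [seq for seq, t in presented_at_ms.items()
                if now_ms - t >= ECHO_TIMEOUT_MS]
    return max(eligible, default=-1)


def evaluate(prediction_seq: int, predicted_char: str,
             echo_ack_seq: int, screen_char: str) -> str:
    """Client side: judge a prediction with no timer of its own."""
    if echo_ack_seq < prediction_seq:
        # The server has not yet vouched for this keystroke's echo.
        return "pending"
    return "correct" if screen_char == predicted_char else "incorrect"
```

Because the 50 ms interval is measured entirely on the server, a delayed datagram only postpones the client's verdict; it cannot turn a correct prediction into an apparent miss.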

In practice, this has eliminated the flicker caused by false negatives.

4 Results

We evaluated Mosh using traces contributed by six users, covering about 40 hours of real-world usage and including 9,986 total keystrokes. These traces included the timing and contents of all writes from the user to a remote host and vice versa. The users were asked to contribute “typical, real-world sessions.” In practice, the traces include use of popular programs such as the bash and zsh shells, the alpine and mutt e-mail clients, the emacs and vim text editors, the irssi and barnowl chat clients, the links text-mode Web browser, and several programs unique to each user.

To evaluate typical usage of a “mobile” terminal, we replayed the traces over an otherwise unloaded Sprint commercial EV-DO (3G) cellular Internet connection in Cambridge, Mass. A client-side process played the user portion of the traces, and a server-side process waited for the expected user input and then replied (in time) with the prerecorded server output. We sped up long periods with no activity. The average round-trip time on the link was about half a second.

We replayed the traces over two different remote shell applications, SSH and Mosh, and recorded the user interface response latency to each simulated user keystroke, as seen by the user. The Mosh predictive algorithm and
