Table 2: Recovery Rate and Loop Times

Loop Times      One     Three   Five    Ten
Recovery Rate   68.3%   73.3%   78.3%   81.7%

Fixed Typing Gesture. Currently, WindTalker works only when the victim touches the screen with a relatively fixed gesture and the phone is placed in a relatively stable environment (e.g., on a table). In reality, the user may type in an ad-hoc way (e.g., the victim may hold and shake the phone, or even perform other actions while typing). We argue that this is a common problem for most side-channel-based keystroke inference schemes such as [2, 13, 16]. This problem can be partially circumvented by profiling the victim ahead of time or by performing a targeted attack that applies the relevant movement model, as pointed out by [13].

User Specific Training. Using WindTalker, the victim's input can be recognized via classifiers trained on the same user. In the real-world experiments, it is hard to adopt classifiers trained on other people to infer the victim's input. This is because different people have different finger coverage and clicking models. A large amount of training data drawn from a wide range of training samples may overcome this limitation. In practice, the attacker has several options for achieving user-specific training. For example, the attacker can simply offer the user free WiFi access and, in return, require the victim to finish an online training step by clicking designated numbers. The attacker can also mimic a text CAPTCHA to require the victim to input chosen numbers. We further analyze the impact of the amount of training data on the recovery rate of WindTalker. Table 2 shows that the recovery rate increases as the number of training loops increases. Even with only one training sample per keystroke, WindTalker still achieves an overall recovery rate of 68.3%.

7.2 Defending Strategies

One of the most straightforward defense strategies is to randomize the layout of the PIN keypad, so that the attacker cannot recover the typed PIN even if he can infer the keystroke positions on the touchscreen. As pointed out by [23], randomizing the keyboard is effective at the cost of user experience, since the user needs to find every key on a random keyboard layout for every keystroke.
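The keypad randomization idea can be illustrated with a minimal Python sketch. It is not taken from any cited system: the function names and the flat ten-position layout are assumptions made for illustration. The point it shows is that a touched position maps to a digit only through the per-session layout, so an attacker who infers positions alone cannot read off the PIN.

```python
import secrets

DIGITS = "0123456789"


def random_keypad_layout(rng=None):
    """Return the ten digits in shuffled order.

    Index i of the returned list is the (hypothetical) on-screen key
    position, and the value is the digit drawn at that position.
    """
    rng = rng or secrets.SystemRandom()  # OS-backed randomness
    keys = list(DIGITS)
    rng.shuffle(keys)
    return keys


def pin_to_positions(pin, layout):
    """Positions the user must touch to enter `pin` under `layout`."""
    return [layout.index(d) for d in pin]


def positions_to_pin(positions, layout):
    """Digits recovered from touched positions; requires the layout."""
    return "".join(layout[p] for p in positions)


if __name__ == "__main__":
    layout = random_keypad_layout()
    touched = pin_to_positions("4821", layout)
    print("layout:", layout)
    print("touched positions:", touched)
    print("decoded with layout:", positions_to_pin(touched, layout))
```

A fresh layout would be drawn for every PIN entry; reusing one layout across sessions would let an attacker learn it over time.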

A more practical defense strategy is preventing the collection of CSI data. For example, the user can refuse to connect to free public WiFi or pay attention to WiFi devices deployed nearby. Note that, for CSI-based keystroke inference to succeed, the sender WiFi device must be deployed close enough to the victim (e.g., 30 cm as shown in [2]). To prevent accurate CSI data collection, another strategy is obfuscating the CSI data by adding randomized noise to it. In particular, the user can intentionally change his typing gestures or clicking patterns, since finger coverage and click pattern are considered the two major factors that affect the CSI value for a keystroke. Further, since CSI reflects changes in the multi-path propagation of WiFi signals, users can take actions that introduce unexpected interference into the CSI data. For example, randomized human behaviors (e.g., human mobility) or wireless signals will reduce the adversary's chance of success. Lastly, for the proposed ICMP-based CSI collection approach, CSI-based typing inference requires collecting CSI data at a high frequency. Therefore, detecting and preventing high-frequency ICMP pings represents a practical and easy-to-use countermeasure.
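The last countermeasure can be sketched as a sliding-window rate check over the arrival times of ICMP echo requests. The sketch below is only illustrative: the class name, the one-second window, and the threshold of 50 requests per second are assumptions, not values from this paper, and capturing packets from the network stack is left out.

```python
from collections import deque


class PingRateMonitor:
    """Flag ICMP echo requests that arrive faster than a threshold.

    Both parameters are hypothetical defaults chosen for illustration.
    """

    def __init__(self, max_per_second=50.0, window_seconds=1.0):
        self.max_per_second = max_per_second
        self.window_seconds = window_seconds
        self._times = deque()  # arrival times inside the current window

    def observe(self, timestamp):
        """Record one arrival (in seconds); return True if the rate is too high."""
        self._times.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Drop arrivals that have fallen out of the sliding window.
        while self._times and self._times[0] <= cutoff:
            self._times.popleft()
        rate = len(self._times) / self.window_seconds
        return rate > self.max_per_second


if __name__ == "__main__":
    monitor = PingRateMonitor()
    # Simulated burst: one request every 2 ms for one second.
    alerts = [monitor.observe(i * 0.002) for i in range(500)]
    print("alert raised:", any(alerts))
```

On an alert, a device could drop or rate-limit echo replies to the offending sender, which lowers the CSI sampling rate available to the attacker.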

8. RELATED WORK

In this section, we review two domains of prior work that are closely related to WindTalker.

8.1 Public free Wi-Fi with malicious behaviors

Free Wi-Fi services provided by public hotspots are attractive to users in a mobile environment when their mobile devices have limited cellular connectivity. Existing works [5, 6, 11, 21] have demonstrated that it is feasible to deploy a malicious Wi-Fi hotspot in a public area. For example, an iPhone can turn itself into a Wi-Fi hotspot. If the iPhone user changes the SSID to “Starbucks Free Wi-Fi”, other people may connect their phones to the iPhone while wrongly believing they are using free WiFi services from a nearby Starbucks.

In our considered scenarios, attackers may exploit users' trust in public WiFi and lure users to connect their devices to a fake access point. Then, the attacker eavesdrops on the WiFi traffic to identify the sensitive windows and selectively analyzes the CSI information to infer the keystroke information.

8.2 Keystroke Inference methods

Prior keystroke inference methods have been developed based on information from various sensors and communication channels, such as motion, camera, acoustic signals, and WiFi signals.

Motion: Owusu et al. [16] presented an accelerometer-based keystroke inference method, which aims to recover six-character passwords on smartphones. Later, Liu et al. [13] applied a similar idea to the smartwatch scenario. Their objective is to track the user's hand movement over the keyboard using the accelerometer readings from the smartwatch, and the keystroke inference achieves 65% recognition accuracy.

Acoustic signals: Zhu et al. [25] presented a context-free and geometry-based keystroke inference. They use the microphones on a smartphone to record keystrokes' acoustic emanations. Liu et al. [12] further proposed a keystroke snooping system that exploits the audio hardware to distinguish mm-level position differences. Their experiments showed the system can recover 94% of keystrokes.

Camera based: Yue et al. [23] introduced a camera-based keystroke inference using Google Glass or an off-the-shelf webcam. This method can achieve a per-input success rate of over 90%. Shukla et al. [18] also presented a video-based attack that relies on the spatio-temporal dynamics of the hands during typing. The attack breaks an average of over 50% of the PINs. Sun et al. [19] use a camera to record tablet backside motion and infer the victim's typing content.

WiFi signal based: Using Wi-Fi signals to infer keystrokes has recently drawn considerable research attention because it offers device-free and non-invasive advantages. The channel state information (CSI) is obtained from commercial Wi-Fi network interface cards. Many research works have demonstrated that such fine-grained information can be very effective in detecting ambient physical movement because it captures the reflected multi-path WiFi signals well.