Though Microsoft may yet have some trick up its sleeve, there's a growing body of evidence from leaked screenshots that Windows 8's taskbar will omit one mainstay of the Windows user interface: the Start button. To get to the Start menu's replacement (the "Start screen") from the desktop, you can press the Windows key on your keyboard, hit the hardware Windows key if your tablet has one, swipe your touch-screen from the side, or, if you have a mouse, move it to the bottom-left corner or right-hand edge. What you won't be able to do is actually click or tap a permanently visible Start button on the taskbar.

The rumor may or may not turn out to be true, but if it is, we shouldn't be surprised. That's because you've entered the Metro zone, where the rules of human interaction have changed—and it's a change that will be felt not just by tablet users, but by traditional desktop denizens, too. Microsoft will need to tread carefully.

The Metro factor

From the outset, Microsoft designed Metro to be different from the Windows of old. Metro applications, downloaded and purchased through the Windows Store, will eschew the Windows, Icons, Menus, and Pointers we're used to in favor of a new text-heavy, low-chrome, windowless Metro style.

The Metro aesthetic is essentially chromeless. The trappings of the traditional window—borders, title bars, permanently visible scrollbars, toolbars—are gone or at least substantially scaled back. Instead, we get much more emphasis on the use of, for example, juxtaposition and layout to convey information. Metro doesn't do away with chrome entirely (it still has buttons, etc.), but the chrome is much reduced compared to desktop Windows.

With the new aesthetic comes a new attitude towards user interaction. Traditionally, Windows has only used a small repertoire of learned interactions: click, double click, right click, drag-select, drag-and-drop. Almost everything else is visually cued on-screen. With Metro, a whole bunch of extra learned interactions are necessary, and the number of visual cues is greatly reduced.

Windows 8's gesture vocabulary is for the most part simple, with swipes and taps being the essential operations. The most important of these are swipes from the left and right edges of the screen (to switch tasks and bring up the "charms," respectively), and swipes from the top or bottom (to bring up toolbars). Applications can be closed by dragging them off the bottom of the screen. Then there are some smaller gestures, like nudging items on the Start screen to select them for editing. But many of these interactions won't be prompted; they must be learned.

The swipe gestures to switch between applications, bring up the charms, or close tasks—none of these have any on-screen prompting. Nor do elements of the touch vernacular like "pinch zoom" (depending on form factor, at least; pinch zoom is peculiar on a phone that's mostly held and used one-handed, much more obvious on a tablet that's held and used two-handed).

It's possible that Windows 8 will have some kind of a tutorial, as Windows 3.1 once did, but this is unlikely. Such tutorials are not exactly fashionable, and the competition doesn't have to teach users how to use the platform (in spite of some non-obvious commands, such as "double tap home button" and "long hold").

Explicit versus implicit

It seems almost laughable today, but Windows 3.1 included a tutorial to explain how to use the mouse. Windows 95 didn't have the mouse tutorial; by 1995, mice were common enough to take mousing skills for granted. But Windows 95 did have to teach users all the same. To encourage users to click the Start button, a bouncing message, "Click here to begin," would slide along the taskbar if the user didn't do anything after starting the operating system. Even the "Start" name itself was a bid to lure users into clicking; early prototypes had an unlabeled button.

The bouncing prompt didn't survive past Windows 95, and even the Start label was finally dropped in Windows Vista; the Windows logo on the left-hand side of the taskbar was enough of a cue that people knew to click it. Windows 7 has all manner of hidden, unprompted user interface elements. Some are new, such as the jump lists that appear when you right click a taskbar button, and others old and long-standing, such as the alt-tab application switcher.

The trend to strip away explicit visual cues and rely more heavily on a common set of learned interactions is not a new one, but Metro is more extreme in this regard than any prior Windows version (though not substantially more extreme than other tablet platforms). It's an industry-wide trend.

The upside to this is slicker, lighter user interfaces. On some level, it's silly to waste screen space on user interface gadgets that prompt us to do things we know how to do anyway. For desktop machines with big screens, perhaps an X in the top right corner of every window is a small price to pay. But on a tablet (or a smartphone) where the constraints are much tighter (and where accidental presses are much more likely), a gestural method of closing applications saves precious pixels.

Learned versus intuitive

For actions that we do day in, day out, these prompts and on-screen cues are more than just a waste of space. They're straight up pointless. I don't need an on-screen cue to know where to resize a window, for example, because I know the border is where I resize the window. Windows 7's fat window borders, showing me where to grab hold of them to resize, are wasting space and telling me nothing I don't already know. A Mac OS X 10.7-style non-existent border would work just as well.

These basic interactions are all learned. It might seem "unintuitive" that Mac OS X 10.7 lets you grab a window border that "isn't there," but in practice it doesn't matter: we learn the interface and move on.

The learned interface no longer needs to be cluttered with affordances: visual cues that there's some interactive, functional item on the screen. There is no need for borders to "grab hold of," or for raised, pseudo-3D buttons to "depress."

There are many who'd argue that we learn every interface (except, perhaps, the very first one we use), and for all the emphasis put on "intuitive" interfaces, the use of learned interactions is downright expected. An operating system released in 2012 that tried to teach people how to double click or use a mouse would be laughed at.

There's nothing wrong with learned UI as such. A learned interface no longer has to explain every aspect of its operation to users, and can be a lot more streamlined and efficient as a result. But affordances do provide a great degree of discoverability. Users know which bits of the interface to experiment with because they are familiar with the on-screen prompts that the interface provides.

And not every interface can be learned. Tasks that are performed only rarely are poor candidates for learning. Perhaps the most extreme demonstration of this is the command line compared to a wizard interface. Wizards are designed to be unlearned interfaces; they should be used for processes that are performed only infrequently, and so have to explain terminology and guide unfamiliar users through the process. The command line provides no guidance or explanation, and it's only worth learning the esoterica of individual commands if you're going to use them regularly. But ultimately, once the learning hurdle has been overcome, the command line can be far more productive and flexible than the wizard interface.

So what does this mean for Windows 8?