
Last weekend saw a fair amount of freaking out over the privacy policy associated with Samsung’s smart TVs, which warns customers that “if your spoken words include personal or other sensitive information, that information will be among the data captured and transmitted to a third party through your use of voice recognition.”

Well, yeah: that’s how cloud-based speech recognition works. You say stuff and it goes off to a powerful computer somewhere for voice-to-text translation and interpretation. The issue here is of course the idea of Samsung’s TV listening in all the time – if chatter is being constantly monitored and parsed, that isn’t just reminiscent of 1984; it’s pretty much described in Orwell’s book.

I’ll come back to that issue in a moment — spoiler: I suspect the fears are overblown — but first it’s worth mentioning the wider picture. With quite superb timing, on Monday the EU Agency for Network and Information Security (ENISA) issued a report about the security of the smart home. In the agency’s view, smart TVs will likely act as the main interface for this nascent concept, which introduces dozens of potential threats.

Sensors everywhere

The smart home is all about sensors – cameras, temperature sensors, motion sensors, humidity sensors, and of course microphones. They will probably come from a variety of manufacturers, because the cost of connecting a device is quite low, but ENISA reckons that the smart TV will become the coordinating hub.

This is debatable — there’s also the smartphone as a potential hub — but nonetheless plausible. As the agency said, the screen size allows for the display of a lot of information, there’s a good amount of space for processing power, memory and storage, and TVs are pretty good at integrating with other devices such as consoles and external storage. TVs are already used as hub interfaces in hotel rooms, ENISA noted. Also, TV manufacturers are really keen on their products becoming home gateways of this kind.

As ENISA put it:

The physical location of the smart TV, often in the centre of a home, provides a good position for monitoring a location and the activity within it. Lifestyle data gathered from the smart home is likely to be very attractive to advertisers and data-miners… It is difficult to learn that much about individual behaviour from a single smart device, but with multiple devices and some contextual knowledge it becomes easier to make inferences about behaviour. At least sufficient to support aggressive advertising, reminders, deals etc. and this can influence the inhabitants’ way of living.

And that’s only the planned, commercially-minded spying. Hackers and intelligence agencies might also want to spy for other purposes. There’s loads more that can go wrong — even bearing in mind that ENISA’s report is intended as an exhaustive list of warnings, it still makes for unsettling reading:

Incorrect settings could cause physical damage, depending on what’s being controlled, and “multiple errors can occur through voice-controlled smart home systems.”

All kinds of outages can temporarily brick home functionality, from electricity and internet outages to remote problems in the cloud. There’s also a risk of signals being jammed, accidentally (by neighbors with the same system, for instance) or otherwise. Also, connected things and the cloud services that keep them running can be hacked or hit with denial-of-service attacks.

The wireless protocols used to connect everything could be vulnerable to things like man-in-the-middle attacks, where someone close-by can snoop on and alter communications, or replay attacks, where they can capture and replay signals so as to bypass locks and security systems.
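One standard defence against the replay attacks mentioned above is to authenticate each command with a monotonic counter, so a captured “unlock” signal cannot simply be played back later. Here is a minimal sketch in Python – the pre-shared key, message format, and class names are all invented for illustration, and this is not any particular vendor’s actual protocol:

```python
import hashlib
import hmac

SECRET = b"shared-key"  # hypothetical pre-shared key between remote and lock


def sign(counter: int, command: str) -> dict:
    """Build a message authenticated with an HMAC over counter + command."""
    payload = f"{counter}:{command}".encode()
    tag = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return {"counter": counter, "command": command, "tag": tag}


class Receiver:
    """Toy lock: accepts only authentic messages with a fresh counter."""

    def __init__(self):
        self.last_counter = -1

    def accept(self, msg: dict) -> bool:
        payload = f"{msg['counter']}:{msg['command']}".encode()
        expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, msg["tag"]):
            return False  # tampered message fails authentication
        if msg["counter"] <= self.last_counter:
            return False  # counter not fresh: a replayed capture
        self.last_counter = msg["counter"]
        return True


lock = Receiver()
unlock = sign(1, "unlock")
print(lock.accept(unlock))  # first delivery: True
print(lock.accept(unlock))  # captured-and-replayed copy: False
```

The point is that freshness, not just secrecy, has to be part of the protocol: the second, byte-identical message is rejected because its counter has already been seen.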

Hoaxers could have fun: “For instance, the system that inserts adverts into streamed content on a smart TV could be exploited to push hoax content to the viewer, or web-enabled displays in the home could display false information… Attackers with access to the smart home components could fake a system crash or error, or virus, and then offer to repair this as a method of gaining physical access to the home or further access to other components.”

Securing the smart home will probably be a complex matter due to the various players that are involved. As the report pointed out, some devices may belong to the occupant, while others (such as a set-top box) may be leased and under some company’s control. The occupant will probably want to preserve her privacy as much as possible, while vendors might be after as much saleable data as they can get their hands on. And anyway, most of them are more experienced at designing appliances than they are at managing the security implications of those devices being connected.

Still, ENISA suggested that keeping things simple might help. The more automation and data storage is handled locally and under the owner’s control, and the fewer external services that are thrown into the mix, the less “attack surface” there will be. Critical and non-critical software should run on separate systems, and manufacturers should follow good security practices around things like authentication and encryption.

Privacy by design

Now here’s where we circle back to Samsung’s smart TV – the report also noted that vendors could try to bake in privacy-by-design principles from the start. These principles were put together a few years back by former Canadian data protection regulator Ann Cavoukian, and personally I would not trust any smart home equipment from a vendor that can’t demonstrate how they comply with them.

This is basic stuff, particularly when you’re making gadgets for various kinds of monitoring. Maximum privacy protection should be in the default settings, as opposed to opt-in settings. Privacy should be a key consideration from the start, rather than an add-on feature. The trade-offs from choosing high privacy levels should be kept to a minimum. Data should be protected all the way, wherever it goes, and destroyed as soon as it’s not needed. And what happens to that data should be transparent to the people generating it.
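To make the “privacy as the default setting” principle concrete, a vendor’s settings object could ship with every data-sharing option switched off until the user opts in. This is a hypothetical sketch – the field names and values are invented, not Samsung’s actual configuration:

```python
from dataclasses import dataclass


@dataclass
class PrivacySettings:
    """Hypothetical device settings: maximum privacy is the default."""

    voice_recognition: bool = False       # off until the user opts in
    share_with_third_parties: bool = False
    retention_days: int = 0               # destroy data as soon as possible
    processing_log_visible: bool = True   # transparency to the data subject


settings = PrivacySettings()        # out-of-the-box state, no opt-ins yet
print(settings.voice_recognition)   # False: opt-in, not opt-out
```

The design choice being illustrated is simply that the zero-argument constructor – the state a device is in when it leaves the box – is the most protective one, and anything more permissive requires an explicit action.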

So how does Samsung’s smart TV system stack up in this regard? On the plus side, the company said in a statement that it encrypts the data it collects to “prevent unauthorized collection or use.” A microphone icon also appears on the TV when the speech recognition feature is active, so users can be aware that their words are going into the system.

From there on, things get shakier. The user can turn the feature on and off, which is good, though it’s not clear what the default setting is. They “can also disconnect the TV from the Wi-Fi network” – a move that would of course kill much more functionality, such as that of smart TV apps. Use of the voice recognition feature means that “voice data is provided to a third party during a requested voice command search” so that content can be returned to the TV. Fine (again, that’s how this works), but who’s the third party? Could Samsung perhaps be a bit more specific about what happens with this data?

The confidence game

To be clear, I don’t think Samsung is doing anything especially egregious here. There are many devices out there now, such as Amazon’s Echo speaker, that sit in a constant passive listening state, waiting for a wake command that puts them into an active listening state (in Samsung’s case, it’s “Hi, TV”). Until that point, they only store the voice data long enough to analyze it for that phrase, generally on the device itself – it’s only once the wake command is spoken that data starts getting sent into the cloud, to those third parties.
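The two-state behaviour described above can be sketched in a few lines of Python. Everything here is illustrative: the class name is made up, and matching whole words against a short local buffer is a stand-in for what a real acoustic wake-word detector does.

```python
from collections import deque

WAKE_PHRASE = ("hi", "tv")  # the wake command mentioned in the article


class VoiceFrontEnd:
    """Toy model of passive vs. active listening states."""

    def __init__(self):
        # Passive state keeps only a tiny rolling buffer on the device.
        self.buffer = deque(maxlen=len(WAKE_PHRASE))
        self.active = False
        self.uploaded = []  # words that would actually leave the device

    def hear(self, word: str):
        if self.active:
            self.uploaded.append(word)  # active state: sent to the cloud
            return
        self.buffer.append(word.lower().strip(",.!"))
        if tuple(self.buffer) == WAKE_PHRASE:
            self.active = True  # wake phrase detected locally
            self.buffer.clear()


tv = VoiceFrontEnd()
for w in "we should talk about dinner hi tv find comedies".split():
    tv.hear(w)
print(tv.uploaded)  # only words after the wake phrase: ['find', 'comedies']
```

Note what the rolling buffer implies: the dinner chatter before “Hi, TV” is overwritten almost immediately and never leaves the device, which is why the passive state is much less alarming than “always listening” suggests.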

The distinction between these two listening states is super-important and, unless I’m very much mistaken, Samsung’s smart TV voice recognition feature isn’t quite the privacy-munching monster some are painting it as. If you’re consciously searching via voice command, just as when you’re searching in a browser, you should be aware that the words contained in your search request will be whisked off to some distant server so you can get a result. C’est la vie.

However, important questions remain about other aspects of Samsung’s system, and I’ve put a few to the company: Once the data goes to that “third party”, is it also encrypted in their systems? How long is it stored for? What does Samsung do to ensure that hackers can’t access the microphone in the TV? Is any of this data available to law enforcement or intelligence services brandishing a warrant?

Anyone who wants customers and users to offer up potentially sensitive data should be prepared to answer questions such as these. That applies to web services and apps as well, of course, but those making products for the smart home – products whose entire purpose is to observe and record – had better be particularly sensitive to privacy worries. The smart home could turn out to be a very vulnerable thing, and vendors should do all they can to set their customers’ minds at rest.

UPDATE (10 February): Samsung has confirmed to me that the passive voice recognition system on the TV is for voice commands, so “Hi, TV” would be followed by a volume adjustment command, for example. This is not connected to the internet at all. The cloud-based voice recognition feature everyone is so upset about uses a separate microphone in the remote control. It is only activated by pressing a button on the remote control (good), and the third-party voice-to-text provider is Nuance, which has a worryingly liberal privacy policy (bad) that is referenced at initial set-up (good, but only for the person setting it up).

I have asked for details of the Nuance privacy policy that Samsung’s customers are apparently shown at set-up, but the manufacturer has frustratingly tried to palm me off on Nuance for further details. This is precisely the kind of transparency failure I’m talking about. If it’s so hard for me as a journalist to find this stuff out, what hope is there for a user of this shared device who wasn’t there during the initial set-up?