(Earlier this month, a local news report about a young child who ordered a dollhouse through Amazon’s voice assistant triggered Amazon Echo devices sitting near viewers’ TVs to place the same order during the segment.)

The primary way people interact with smartphones is by touching them. That’s why smartphone screens can be thoroughly locked down, requiring a passcode or thumbprint to access. But voice is becoming an increasingly important interface, too, turning devices into always-listening assistants ready to take on any task their owner yells their way. Put in Apple’s new wireless earphones, and Siri becomes your point of contact for interacting with your smartphone without taking it out of your pocket or bag.

The more sensors get packed into our ubiquitous pocket-computers, the more avenues someone can use to control them. (In the field of security research, this is known as an ‘increased attack surface.’) Microphones can be hijacked by inaudible ultrasonic tones of the sort already used for market research. Cameras can receive messages encoded in rapidly flickering lights, a channel that could be used for surveillance, for connectivity, or even to disable or alter a phone’s features.

Most assistants include some safeguards against overheard or malicious commands. The phrases I suggested you shout out earlier will prompt phones within earshot to ask for confirmation. Siri, for example, will read back the contents of the text or tweet a user dictates before actually sending it off. But a determined attacker could conceivably defeat the confirmation, too. All it would take is a simple “yes” before a device’s owner realizes what’s going on and says “no.”

Hidden voice commands can cause more damage than just a false text or silly tweet. An iPhone whose owner has already linked Siri to a Venmo account, for example, will send money in response to a spoken instruction. Or a voice command could tell a device to visit a website that automatically downloads malware.

The researchers developed two sets of hidden commands, each aimed at a different type of victim. One set was created to work on Google Assistant, which is challenging to hoodwink because the inner workings of how it processes human speech aren’t public. To start, the researchers used obfuscating algorithms to make computer-spoken commands less recognizable to human ears but still understandable to the digital assistants. They kept iterating until they found the sweet spot where the audio was least recognizable to people but most consistently picked up by the devices.
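The loop they describe is essentially a black-box search: distort the audio a little more each round, check whether the target still hears the command, and stop just before it fails. Here is a minimal sketch of that idea, using toy stand-ins (a noise-based “mangle” step and a fake recognizer check, neither of which reflects the researchers’ actual tools):

```python
# Hypothetical sketch of the iterative obfuscation loop described above.
# Toy stand-ins only: real attacks resynthesize speech from acoustic features
# and query a real recognizer; here, noise and a correlation check just
# illustrate the "keep distorting until it stops working" search.

import numpy as np

def mangle(audio: np.ndarray, strength: float) -> np.ndarray:
    """Toy distortion: mix in noise proportional to `strength`."""
    noise = np.random.default_rng(0).normal(0, strength, audio.shape)
    return audio + noise

def still_recognized(candidate: np.ndarray, original: np.ndarray) -> bool:
    """Toy stand-in for playing the clip to the target device and checking
    whether it still transcribes to the intended command."""
    return np.corrcoef(candidate, original)[0, 1] > 0.5

def hide_command(audio: np.ndarray, steps: int = 50) -> np.ndarray:
    """Ratchet up the distortion and keep the last version the 'recognizer'
    still accepted -- the sweet spot the researchers iterated toward."""
    best = audio
    for i in range(1, steps + 1):
        candidate = mangle(audio, strength=i * 0.1)
        if still_recognized(candidate, audio):
            best = candidate   # harder for humans, still machine-readable
        else:
            break              # distorted past the point of recognition
    return best

# Example with a synthetic "waveform" standing in for a spoken command.
command = np.sin(np.linspace(0, 100, 16000))
obfuscated = hide_command(command)
```

The key design point is that the attacker needs only black-box access: the loop never looks inside the recognizer, it just observes whether each candidate clip still triggers the command.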

The resulting hidden commands aren’t complete gibberish. They mostly just sound like they’re spoken by a fearsome demon rather than your average human.

If you know you’re about to hear a masked voice command, you’re more likely to be able to make it out. So to avoid those priming effects, the Georgetown and Berkeley researchers enlisted Americans through Mechanical Turk, Amazon’s service for hiring workers for small projects, to listen to the original and garbled commands and write down what they heard.