Twimlets and Zapier

Twilio has this website called Twimlets which gives you very basic Twilio applications. You basically create one with a form, it gives you a URL, and you can just copy-paste that URL onto a Twilio phone number. I’ve gotten an unreasonable amount of value from Twimlets over the years. I use them for all sorts of things, from running my company’s voicemail to monitoring tasks. For example, there’s a cron job that checks every five minutes whether my queue workers are down. If it detects that state of affairs, it fires a phone call against a Twimlet, which does a very simple thing: it says, “The queue workers are down. The queue workers are down. The queue workers are down.”

Theoretically I could host that myself, but one can imagine circumstances where the reason the queue workers are down is something that also brings down my web infrastructure. There’s virtually nothing that can cause my queue workers to go down at the same time as the Twilio Twimlets service goes down. Having that logic be somewhat external to my own application makes it a little more reliable. Diversifying away from single points of failure is pretty important to my reliability story.
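To make the monitoring idea concrete, here is a minimal sketch of what the cron job could build when it decides to raise the alarm. It assumes the hosted Message Twimlet at twimlets.com with its `Message[n]` parameters and Twilio's REST endpoint for creating calls; the function names, the placeholder SID, and the phone numbers are made up for illustration, and the actual HTTP request (with your account SID and auth token) is left to whatever client you already use.

```python
from urllib.parse import urlencode

# Twilio's hosted "message" Twimlet; it reads each Message[n] parameter aloud in order.
TWIMLET_MESSAGE_URL = "http://twimlets.com/message"


def twimlet_message_url(messages):
    """Build a Message Twimlet URL that speaks each string in turn."""
    params = [(f"Message[{i}]", text) for i, text in enumerate(messages)]
    return f"{TWIMLET_MESSAGE_URL}?{urlencode(params)}"


def build_alert_call(account_sid, to_number, from_number, alert_text, repeats=3):
    """Return (endpoint, form_fields) for a Twilio REST request that places the call.

    Nothing is sent here. POST the fields with HTTP basic auth (account SID and
    auth token) from the cron job when the health check fails.
    """
    endpoint = f"https://api.twilio.com/2010-04-01/Accounts/{account_sid}/Calls.json"
    fields = {
        "To": to_number,
        "From": from_number,
        "Url": twimlet_message_url([alert_text] * repeats),
    }
    return endpoint, fields


if __name__ == "__main__":
    # Placeholder SID and phone numbers, for illustration only.
    endpoint, fields = build_alert_call(
        "ACxxxxxxxx", "+15555550100", "+15555550199", "The queue workers are down."
    )
    print(endpoint)
    print(fields["Url"])
```

The point of the design is that the only thing living on your own infrastructure is the health check itself. The spoken message is served by Twilio, so it still works when your web servers do not.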

That’s something I would encourage you to do: just go look at Twimlets. They’re easy to get and they provide a surprising amount of functionality if you use them creatively. You can also chain them together, which is really fun. They have a simultaneous ring system. If nobody picks up, you can send the caller to another Twimlet and then another Twimlet, or send an SMS message to someone who called before you could get to them. That sort of thing.

Another fun thing… If you use any sort of cloud service, you really need to look at Zapier. It’s a company that helps plug APIs into each other. The notion is that you have one API which is the source of events, then another API which takes an action in response to an event happening in the first API. Typically Twilio is not the source of an event, but rather it’s the action I want to take. Typically I want Twilio to send an SMS in response to something happening. The sky is the limit in terms of what APIs you can hook up via Zapier.

For example, when a new lead is added to my CRM system, send an SMS to my sales representative saying, “Hey, a new lead was added to the CRM system. Click here on your iPhone to open that up.” Or when we get an e-mail which matches a certain regex, send an SMS. When a new ticket is added to our customer support system, send an SMS to our customer support folks. When a new ticket with priority greater than X is added to the customer support system, send an SMS to me. The sky is basically the limit.

As Zapier integrates with hundreds of APIs, if there’s anything where you ever thought “Man, it would be great if this cloud service and this cloud service played together,” Zapier makes it happen. Often the services I want to play together are “Twilio plus everything else in the world.” I really truly believe that after you have the box of capabilities that Twilio opens up to you, you see Twilio apps everywhere. My friends make fun of me for this, because my answer to everything is, “I could probably make that better with a Twilio app.” They say, “That’s your answer to everything. Your answer to everything is a Twilio app.”

Twilio makes magical experiences

It’s such a magical experience. It gets into some deep cultural anthropology, but we have a very deep relationship with our cell phones these days. I’ve never seen so much giddiness as when I’m demoing a boring, B2B software application on my iPad and I say, “Hey, give me your phone number. I’ll type it in here.”

Doo-tee-doo-tee-doo. Hit go. The phone rings. They’re like, “Whoa, that’s fun. What is it?” Then they slide it down and are like, “Hello?” Then I have a short mp3 file that a voice actress recorded for me, which just plays in their ear. They’re going, “She’s talking to me! She’s talking to me! Like the computer is talking to me!” They light up with excitement.

“You want to see something real exciting? Hit ‘1’ on that phone call.”

They hit ‘1’ and it shows up immediately on the iPad. The key press has obviously gone to Twilio, Twilio has hit a webhook on my site, and my site has pushed something via socket.io to the web page that’s open on the iPad. It shows a notification within a fraction of a second after the button press happens on their cell phone. “Wow, that’s amazing!” I’ve done the demo with my application. I’ve done it with other folks’ applications. Even with hardened, “I have seen it all, done it all” software folk, giggles happen.
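The server side of that demo is small. Here is a rough sketch of the webhook's core, assuming Twilio posts the key press as a form-encoded body containing `CallSid` and `Digits` fields. The `keypress_event` function and the event shape are hypothetical, and the actual push to the browser (socket.io in the talk) is left as a callback you would supply.

```python
from urllib.parse import parse_qs


def keypress_event(form_body):
    """Turn Twilio's form-encoded key press callback into a small event dict.

    Returns None when the request carries no digits, so the caller can skip
    the push to the browser.
    """
    fields = parse_qs(form_body)
    digits = fields.get("Digits", [""])[0]
    if not digits:
        return None
    return {"call_sid": fields.get("CallSid", [""])[0], "digits": digits}


def handle_webhook(form_body, push):
    """Parse the callback and hand any event to `push`, a caller-supplied
    function that forwards it to the open web page."""
    event = keypress_event(form_body)
    if event is not None:
        push(event)
    return event


if __name__ == "__main__":
    handle_webhook("CallSid=CA123&Digits=1", push=print)
```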

Use Voice Actors instead of the <Say> verb

When you’re doing Twilio apps, you’ll be using the <Say> verb a lot. The <Say> verb uses text to speech to read things out. It’s good enough for development, so that you can verify that the logic and flow control are working correctly.

Once you’re exposing these phone calls to customers, you typically don’t want a very robotic phone call to be representing your business or your customers’ businesses. Say for an appointment reminder, you don’t want someone’s first point of contact with their dentist to be a robotic voice that says, “This is an automated Appointment Reminder from Happy Teeth Dental. Your appointment with Dr. Benedict is at 4:47 pm on Tuesday the 12th.” It’s just not a great experience.

Instead, to the maximum extent possible, after we know what text we’re going to use, I like to have that recorded by an actual human. There’s a few ways you can do this. There’s a service called Voice Bunny, which actually has an API available. It’s a little expensive, because they’re using professionals for this. Worth the money for things which you’re going to be reusing a lot.

If you want to do things a little cheaper, for $5 you can go on Fiverr where you get basically a bored college student to record the same thing for you. The online demo of my application, which represents my company and has for the last four years, was recorded by someone in two Fiverr engagements, for a total of $10. It’s closed tens of thousands of dollars in business for us. That’s obviously an option.

One subtlety about that is that typically it’s economically non-viable to record a unique mp3 file for every call. You’ll often stitch things together via a Frankensteiny mix of different mp3 files. For the logic for my application we basically wanted to mix and match a selection of introductions, the body of the appointment details, the date and time, and a selection of instructions to use: “Press ‘1’ to confirm your appointment. Press ‘5’ to cancel your appointment.” Or if a customer doesn’t want to offer cancellation, we have a separate mp3 file that we can play which doesn’t mention the cancellation option.

Now, the part in the middle there is where there’s dynamic information getting generated: “4:47 pm on Tuesday the 12th.” As a stopgap measure, you can have the computer narrate that. But here’s a little bit of a trick: if there’s a restricted range of values the data can possibly take, you just pay your voice actress a little bit of extra money and have her record an mp3 file which says “One, two, three, four… twelve, fifteen, thirty, forty-five, o’clock.” Splice it up into a bunch of mp3 files and then you do some Frankenstein stitching on the fly. Twilio will play any number of mp3 files in a row. It won’t sound completely natural, but it will definitely sound like human talk with perhaps a few weird pauses, depending on how good you are at mp3 editing. Folks actually think that it is a real human leaving all the messages. It’s kind of awesome.
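Here is a rough sketch of what that Frankenstein stitching can look like when you generate the TwiML yourself. It builds a <Gather> that plays one <Play> per clip. The clip names, the `CLIP_BASE` URL, and both function names are hypothetical, and it assumes appointments fall on quarter hours so that the recorded numbers cover every case.

```python
import xml.etree.ElementTree as ET

# Hypothetical location of the spliced recordings.
CLIP_BASE = "https://example.com/audio"


def time_clips(hour, minute):
    """Clip names for a time, restricted to the values that were recorded."""
    if not 1 <= hour <= 12 or minute not in (0, 15, 30, 45):
        raise ValueError("no recorded clip for that time")
    return [str(hour), "oclock" if minute == 0 else str(minute)]


def reminder_twiml(intro, hour, minute, allow_cancel, action_url):
    """Stitch intro, time, and instruction clips into one TwiML document."""
    response = ET.Element("Response")
    # <Gather> collects a single key press while the clips are playing.
    gather = ET.SubElement(response, "Gather", numDigits="1", action=action_url)
    instructions = (
        "instructions_with_cancel" if allow_cancel else "instructions_confirm_only"
    )
    for clip in [intro, *time_clips(hour, minute), instructions]:
        ET.SubElement(gather, "Play").text = f"{CLIP_BASE}/{clip}.mp3"
    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    print(reminder_twiml("intro_dentist", 4, 45, True, "https://example.com/keypress"))
```

Refusing times outside the recorded set is deliberate. If you would rather not fail, that is the spot to fall back to <Say> for the part you have no clip for.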

Again, we have a very personal relationship with our phones. People feel a sense of intrusion when they feel like an automated process, something that doesn’t really care about them, is intruding on their personal space on their phone. You know, it’s an intrusion into someone’s psychological personal space: their iPhone in their pocket, in their hand. It’s also an intrusion on their time. You’re pushing a phone call to someone and demanding they pick up and listen to it right now. It is, by its nature, an intrusive action.

Now, it doesn’t mean we should never do it, obviously. People who have an appointment with their doctor tomorrow to get their cancer drugs adjusted should really hear about that. That is important for them. But we should be conscious of the fact that that interaction is a personal one and give them the true impression that we as a business really care about them, even if it’s an automated interaction.

That means doing human niceties. You would never have a conversation with someone like you would have a conversation with an API. There’s a Japanese word for it but I don’t know it in English: to just jump directly into the business part of the conversation without having some human niceties like, “Hey, how are you doing? How’s your day today?” Just like one would never have a solely goal-directed conversation with somebody, we should also have the human niceties in the phone call and, as part of dealing with the UX challenges there, use a human voice to the maximum extent possible.

Dealing with voicemail

You should think of how your phone call will be played in different environments. Here’s a funny Twilio story for you….

I was not conscious of the fact that many of the phone messages I was leaving would be played on answering machines. Let’s say you leave someone a message which says, on the assumption that someone’s live on the call, “Press ‘1’ to confirm your appointment, press ‘5’ to cancel your appointment, press ‘9’ to have your service provider contact you about your appointment.”

Someone presses ‘5’. The next thing they hear is “This message has been erased.”

Someone is pushing buttons against their voicemail system, not against my application. My application didn’t know that. The user, who has no clue what’s going on, feels very bad. They ask their dentist about it. The dentist doesn’t know what’s going on either. This was causing problems for me for months before I realized what was happening.

Twilio has a way to detect whether you’re speaking with an answering machine or not. Unfortunately it’s not 100 percent accurate. It’s been getting a little bit better over the years. Ballpark, it’s maybe an 80 percent solution. I’d investigate turning that on. Also, if you’re running a multi-tenant application with Twilio, such that you have different clients with different client populations, I would give people the ability to turn on answering machine detection on a per-account basis.

Also, give them the ability to do the following trick on a per-account basis. Different client populations use different handsets and different voicemail systems. Often, a local business has a particular carrier that’s hegemonic in that region, or a particular socio-economic group associated with that business that tends to use fairly similar handsets, and those handsets have fairly similar limitations. I found that the settings that work for some clients don’t work as well for others. It’s good for me to be able to offer choices that I can tweak per client, rather than making an assertion across my customer base that is incorrect.
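A per-account switch like that can be quite small. The sketch below assumes Twilio's current call-creation parameter `MachineDetection` and the `AnsweredBy` value it reports back to your webhook (the parameter names were different when this talk was given). The `AccountCallSettings` class and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AccountCallSettings:
    """Per-client telephony options (hypothetical shape)."""
    detect_answering_machine: bool = False


def call_params(settings, to_number, from_number, twiml_url):
    """Form fields for Twilio's create-call request for one client account."""
    params = {"To": to_number, "From": from_number, "Url": twiml_url}
    if settings.detect_answering_machine:
        # Ask Twilio to wait for the end of the greeting before reporting back.
        params["MachineDetection"] = "DetectMessageEnd"
    return params


def should_offer_keypad_menu(answered_by):
    """Offer "press 1 to confirm" only when a person is on the line.

    `answered_by` is the AnsweredBy value from Twilio's webhook, or None when
    detection is switched off for the account.
    """
    return answered_by in (None, "human")


if __name__ == "__main__":
    dentist = AccountCallSettings(detect_answering_machine=True)
    print(call_params(dentist, "+15555550100", "+15555550199", "https://example.com/twiml"))
    print(should_offer_keypad_menu("machine_end_beep"))
```

Treating anything other than a confident "human" as a machine is one possible policy. Since detection is only right most of the time, you may prefer a different default for some clients, which is the reason to keep it configurable.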

If you’re using Twilio’s detection of whether you’re talking to a voicemail or not, Twilio has to guess the point in the voicemail greeting that’s “Leave your message after the beep,” guess when the beep happens, and then start playing your pre-recorded message. Unfortunately, it doesn’t guess correctly 100 percent of the time. A symptom your user could see as a result: Twilio starts playing the message early, the machine only starts recording after the beep, and so the recorded message starts halfway through.

They’re typically not too happy about that. But, fun fact about most voicemail systems: they will immediately skip the greeting and start recording as soon as you hit something on your keypad. You can skip to the beep by pressing one. An option that I give my customers is “We will play the tone for a button press immediately on starting every call,” which skips straight to the recording part of the voicemail prompt so that they almost always get 100 percent of the message recorded.

That’s a trade-off I give my customers, too. Why wouldn’t you do that all the time? Well, on a live phone call, the person would hear something like “BEEP! This is an automated message from your dentist….” When folks say, “My customers complain about the message being cut off,” then I say, “Well, okay, we have this option for you. Here’s the trade-off. Are you willing to take that trade-off?” If it’s important to them, they are.
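In TwiML terms, that option can be a single extra verb at the top of the response. This sketch assumes the `digits` attribute on <Play>, which makes Twilio send keypad tones into the call. The function name and the per-account flag are hypothetical.

```python
import xml.etree.ElementTree as ET


def voicemail_safe_twiml(message_url, skip_greeting_with_keypress):
    """TwiML that optionally leads with a '1' key press tone.

    The tone nudges most voicemail systems straight to recording, at the cost
    of a beep in the ear of anyone who answers live.
    """
    response = ET.Element("Response")
    if skip_greeting_with_keypress:
        ET.SubElement(response, "Play", digits="1")
    ET.SubElement(response, "Play").text = message_url
    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    print(voicemail_safe_twiml("https://example.com/audio/reminder.mp3", True))
```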

Another thing you can do: just like we build reliability into our systems and assume that queue workers will fail some of the time, we might assume that the message fails some of the time. One of the things I do is play messages in a loop, at least twice. Let’s say a typical message is, “This is an automated message from your doctor, your appointment is at… Press ‘1’….”

Let’s say the first half of that message gets cut off. The message which somebody actually hears might be “… December 24th, press ‘1’ to confirm. Press ‘2’ to cancel. Press ‘9’ to have us contact you.”

That’s not a very useful message. But if you repeat the original message, then right after the cut-off part they hear, “This is an automated Appointment Reminder from the doctor. Your appointment is at 10:00 o’clock on December 24th. Press ‘1’ to do this, press ‘2’ to do this, press ‘3’ to do this,” and that’s a much more comprehensible message.

The level of complaints my customers were getting about unintelligible messages went way down after I put the message in a simple for (i = 1; i <= 2; i++) loop to repeat the message text.
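As a sketch, the same loop when the message is a list of stitched mp3 clips might look like this. The function name and clip URLs are hypothetical; for a message that is a single file, the `loop` attribute on <Play> does the same job.

```python
import xml.etree.ElementTree as ET


def repeated_message_twiml(clip_urls, repeats=2):
    """Play the whole sequence of clips `repeats` times in a row, so a listener
    who missed the start of the first pass hears it in full on the second."""
    response = ET.Element("Response")
    for _ in range(repeats):
        for url in clip_urls:
            ET.SubElement(response, "Play").text = url
    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    clips = [
        "https://example.com/audio/intro.mp3",
        "https://example.com/audio/instructions.mp3",
    ]
    print(repeated_message_twiml(clips))
```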

How Patrick got a Twilio track jacket

This is actually my favorite jacket in the world. I wear it almost every day. Twilio is very protective of their jackets. You have to either be an employee or perform a conspicuous service for them. The conspicuous service that I performed for Twilio back in the day involved a security vulnerability in the Twilio application.

They’ve got a team of crack engineers, but everybody has a problem once in a while. I ran into a bug, and as I was debugging it with the very helpful support folks, they said something which really set off my radar.

Well, if that thing they said was true, it was very, very bad news for us, because at the time (they fixed it three years ago, so don’t worry about it) it would have allowed any user of my application to control my application’s use of Twilio. That would be bad because it would allow someone to send phone calls to any number with any sort of content that they wanted, drain my Twilio account, harass people, or get information out of my application that they shouldn’t get. Very bad news.

After dealing with the Twilio folks and realizing that this bug was a symptom of an issue which would allow someone to hijack Twilio applications, I reported that to the Twilio team. No lie, after I CC’ed the issue to their security contact, three minutes later I got an e-mail from the CEO of the company. One minute after that I got an e-mail from the CTO of the company that they were on it. They fixed it within probably an hour. I was very happy with their responsiveness there.

I’m generally pretty happy with Twilio’s responsiveness to everything over the last couple years. They’ve been very, very good to me. I strongly suggest that you use them. They’re far and away my favorite company that I do business with.

I love talking Twilio and all sorts of things with people in the software industry. My e-mail address is [patrick at kalzumeus dot com] or [patrick at appointmentreminder dot org]. They both go to the same place and will reach me at about the same time. Send me an e-mail anytime, I love talking about this stuff. If there’s ever anything I can do for you, drop me a line.

Photo credit: The Business of Software Conference.