The market vision that led to Siri, the virtual personal assistant that’s now an integral part of Apple’s iPhone, can be traced back to 2003, when a mobile phone’s primary applications were still limited to ringtones and messaging. The author and his colleagues at SRI International recognized that the phone’s growing capabilities would eventually put a communicating supercomputer in everyone’s pocket. They believed that their company was well suited to be a leader in the inevitable technology and market revolution, as it had been in every previous computing revolution.

They didn’t originally plan to create a stand-alone venture. They talked to dozens of telecom carriers and handset providers, with the aim of jointly starting a project that would license the technology. But because the few resulting commercial projects implemented only small parts of the original vision, the founding team decided to drop that idea and build its own venture. Speech-to-text was the easy part: SRI had launched Nuance, a world leader in speech solutions. The hard part was analyzing words so as to understand the user’s intent and then reason about and answer the request. The runaway success of Siri demonstrates how well the team met that challenge.

Jeff Singer

In the long process of designing and perfecting a product, there’s often a single moment when a potential customer’s reaction helps overcome the doubts that surround any creative endeavor. For Siri, the virtual personal assistant that’s now an integral part of Apple’s iPhone, that moment came on an airplane in 2009. I had just taken my seat on a delayed flight when a passenger asked what time we were expected to land. Since I was one of a few dozen people testing Siri, I took out my phone and said, “Siri, what time is United Flight 98 expected to arrive?” When Siri responded with the updated arrival time, the passenger looked stunned. He said, “I have only one question: Why are you sitting in coach? You ought to be a billionaire!”

I had been so deeply immersed in the venture’s business, technological, strategic, and financial challenges that I had lost sight of how dazzling the Siri technology was. It took a stranger’s dropped jaw to remind me: We had developed a smartphone application that could understand and answer questions using natural language. We were going to put artificial intelligence into millions of consumers’ hands.

It had been a long road with a couple of surprising turns.

The Valley of Death

As president of SRI Ventures, I lead the group that creates, builds, and spins off ventures at SRI International, an organization founded in 1946 as Stanford Research Institute (and independent since 1970). I have an amazing job. Every day I watch the development of breakthrough technologies with the potential to make people safer, healthier, and more productive.

But a valley of death lies between invention and innovation. This is a common metaphor in the venture world, because most inventions perish before reaching the marketplace, for lack of a large and growing market, a strong value proposition and business plan, or sufficient resources.

It’s my job to help opportunities cross this valley of death. Sometimes we succeed beyond our wildest dreams. Siri was indeed a stunning breakthrough.

The market vision that led to Siri goes back to 2003, when a mobile phone’s primary applications were still limited to ringtones and messaging. We recognized that the phone’s growing capabilities would eventually put a communicating supercomputer in everyone’s pocket, and we believed that SRI International was well suited to be a leader in the inevitable technology and market revolution.

We formed a team, dubbed Vanguard, to develop market concepts. Some early ones involved putting intelligence into the smartphone so that users could ask it, by text or voice, to perform tasks such as scheduling a call among multiple parties, placing a call, or ordering groceries.

At about the same time the Vanguard team was formed, the U.S. Defense Advanced Research Projects Agency (DARPA) funded a $150 million program to develop a “cognitive” software assistant. (One inspiration was Radar O’Reilly, of the TV series M*A*S*H, who always knew what his colonel wanted before the colonel did.) Concepts from the DARPA program contributed to Vanguard’s thinking and ultimately helped inspire Siri.

Over the next four years creating a stand-alone venture was not our goal: We talked to dozens of telecom carriers and handset providers, with the aim of starting a joint project that would license our technology and deploy an intelligent assistant in the commercial world. This turned out to be difficult. Again and again we heard various objections: “Not possible: The technology is 20 years away.” “Too expensive” (we were seeking $5 million to $10 million in development funding, plus licensing fees). “Not part of our business model.” “Creating a product will take longer than 12 months.” “Not an early source of revenue.” “We’re already doing it ourselves.” We did a few projects with companies that implemented small parts of our vision, but ultimately we decided to spin off a venture from SRI to create a whole new product category.

The Four Ingredients

The founding team of SRI business and technology leaders met almost daily in SRI’s venture space to discuss the market and product possibilities. We knew that to succeed we needed four major ingredients: a solution to a large and important problem or pain point, with potential for rapid market growth; a differentiated technology that would trump the competition; a team capable of outstanding execution; and a value proposition and business plan that would articulate the venture’s strategy and value. Without all four, the probability of success would be nearly zero.

We also knew that we had only a short time and limited financial resources to enter and succeed in the market before we ran out of money or competitors emerged.

The pain point.

Over several months the team zeroed in on the market opportunity: People were frustrated by all the keyboard clicking needed for any task on a smartphone. (Clicking on smartphones was not yet natural in 2007.) Market research found that each time users had to click through a screen, 20% abandoned the application or purchase intent.

The breakthrough idea behind Siri was simple and powerful: In contrast to search engines, Siri would be a voice-driven “do engine.” It would understand your query, automatically access the information needed, and distill it into an answer. All the effort would be made by Siri rather than by the user—it would be a virtual personal assistant that would help people buy tickets to a ball game, make a dinner reservation, get a weather report, or find a movie with one or two clicks.

The differentiated technology.

The technology needed to address the pain point was daunting, even though decades of development were behind it. Converting speech to digital text was the easy part: SRI had launched Nuance, a world leader in speech solutions, in 1994. The hard part was analyzing words so as to understand the user’s intent and then reason about and respond to the request. The computer had to identify concepts and associate groups of words with them. Humans perform such tasks easily, but most people believed they were impossible for computers.

The broad basis for technology to understand natural language had been developed by the SRI Speech Technology and Research Lab and SRI’s Artificial Intelligence Center in programs with DARPA, and by SRI’s internal investments. Adam Cheyer and Didier Guzzoni led the specific implementation that allowed us to make Siri a product that could be deployed to millions. For almost two decades Cheyer, one of SRI’s most visionary computer scientists, had designed and implemented delegated computing and “agent-based systems” that let humans interact with networked programs and devices. With Guzzoni, his PhD student, he developed approaches for natural-language understanding and reasoning that simplified the task of responding to queries.

The team.

We were fortunate to recruit an outstanding entrepreneur, Dag Kittlaus, to be the new venture’s CEO. Cheyer chose to leave SRI and join the venture. Tom Gruber, a leading innovator in intelligent user interfaces, joined a few months later and eventually became the CTO. Bill Mark, the president of information and computing sciences at SRI, and I were the other founders. We two remained at SRI, and I became a board member of the new venture.

The value proposition.

Over the next six months the overall value proposition came into sharp focus: We would solve a major problem for millions of consumers with a powerful product that could generate billions of dollars in revenue. Specifically, Siri would relieve the pain of too many clicks; save people time and energy; provide a differentiated and breakthrough technology through speech recognition, natural-language understanding, and artificial intelligence; provide revenue-generating uses; and surprise and delight consumers. We decided that Siri’s business model would depend on collecting fees from websites for helping to execute transactions. We recognized that revenue from the leads Siri provided to hotels, restaurants, and airlines could be substantial.

In late 2007, after six months of crafting the value proposition, we decided to seek outside investment for our spin-off venture. We knew that finding backers would not be easy, because Siri depended on breakthroughs in both market and technology. Many venture capitalists had seen how far AI’s reality fell short of its hype and were skeptical. They worried about every element of the value proposition and business plan, including market, technology, and competitors. Would we be able to grow a large consumer base? Would the processing power of the smartphone be sufficient? Would the AI technology work? Would communication and processing be too slow? Would the lead generation business model produce enough revenue? Would potential competitors, such as Google and Microsoft, respond rapidly with their own products?

In the end, concerns can only be mitigated, not eliminated. Siri would be a bold but risky investment. It would clearly have an impact on the mobile industry with its disruptive technology—the result of a remarkable convergence of worldwide trends, including the emergence of smartphones; the acceleration of computing and storage capacity and communication speed; the growth of web services and interfaces; and the development of new AI systems. The time was right.

We raised $8.5 million—enough to fund the venture for 18 months. The funding process gave us far more than money, however. It gave us courageous, insightful investors who became our partners, helping us identify business models, develop strategy, build relationships with customers, and more.

Still, we faced many challenges: We were delayed for six months by issues relating to the slow server-to-user response and the speech recognition technology. In the meantime, Google and others were making progress on their own solutions. Some companies made offers to acquire us. Deal terms with providers and web services companies were complex. Wireless carriers emerged with opportunities that distracted from our initial product.

The Launch

Finally, after user testing from November 2009 to February 2010 (during which I showed off Siri on that airplane), we were ready to launch in Apple’s App Store. (That “Siri” is close in spelling to “SRI” is pure coincidence. We chose the name for several reasons, including that it was just four letters and did not have negative connotations in any language.) We had prepared with demonstrations and reviews by top bloggers from sites such as Scobleizer and TechCrunch. The demonstrations were a great success, and the press created an avalanche of consumer interest. Siri was downloaded free at an astronomical rate. It was in the top 50 of all Apple apps and was the top lifestyle app.

Two weeks after the launch, Kittlaus received an unexpected phone call: “Hi, this is Steve Jobs.” Kittlaus thought it was a joke and hung up. Then the phone rang again: “Really, it’s Steve Jobs.” It was. The two talked for a while, and Jobs congratulated Kittlaus on Siri’s capability. He invited Kittlaus, Cheyer, and Gruber to his house, where they discussed Siri’s technology. Jobs understood the value of the engine’s AI as well as the nature of the technology and the certainty that errors, such as in recognition of natural language, would always occur—but he was not discouraged. That seemed remarkable, because virtually all Apple products are designed “for perfection.”

Over the next few weeks Jobs and Kittlaus discussed a purchase price for Siri. We were not eager to sell, because we believed the value of the business would almost certainly increase with successful trials and new distribution deals. But Jobs made an offer that the investors and the executive team couldn’t refuse. (The price cannot be revealed because of contractual obligations.) The team was also deeply attracted to working with Jobs and Apple.

A year later Siri became the core platform for a highly popular service on Apple’s new iPhone 4S. On October 4, 2011, Phil Schiller, Apple’s SVP of worldwide marketing, introduced Siri as the “coolest feature of the iPhone 4S.” The next day, Steve Jobs died. I’m grateful that he got to see the presentation. In the first few weeks post-launch, analysts reported that Siri helped to accelerate billions of dollars’ worth of sales. Siri remains a core element of all Apple’s iOS devices.

Apple and many other companies, including SRI, are now in a race to develop products that both advance the technology and serve new markets. Much can be done. Speech and natural-language recognition and machine learning are still in their infancy. New virtual personal assistants will be even better at word and language understanding. They will maintain context, enable true conversations, learn from their users, and become “specialists” helping consumers access information such as health records and bank accounts. For example, SRI recently launched a new venture, Kasisto, that is redefining the mobile banking experience through speech, text, and touch interfaces and has the capacity for conversation. The future of virtual personal assistants is unquestionably secure.