Talking to Siri has become pretty interesting

    Michael Sherer writes about interacting with our mobile devices.
    Posted on Dec. 9, 2012 at 12:00 a.m.

    Michael Sherer

    Crystal Ball

    In 1987, Apple Computer released a video called “The Knowledge Navigator” in which a college professor in 2012 walks into his paneled office and has a conversation with his computer, personified by a bowtie-wearing avatar. I love showing this video when talking about the future because it’s pretty easy to spot the predictions that have come true and those that haven’t.

    The video accurately predicted a world of global networks, search engines, large databases and the integration of phones and computers. Its biggest miss is the natural-language interaction between the professor and his computer. I’ve been talking to my computer for nearly 20 years, but it’s largely been a party trick. That’s changing. Here’s why:

    Competition. Google, Apple and Microsoft, the dominant players in tech, are making voice interfaces a central strategy in their quest for long-term dominance. Apple surprised everyone with the release of Siri, the occasionally cheeky voice assistant, and Google responded with an improved Google Voice Search.

    While Microsoft has been on the mobile sidelines recently, it has had the car automation field largely to itself since Ford adopted the MyFord Touch/Microsoft Sync system in its cars. The early going was rocky, but recent Sync updates have shown improvement. Likewise, the Siri of fall 2012 can do a lot of things that the Siri of fall 2011 couldn’t, and Google’s latest iteration of Voice Search has gained speed and a voice that is almost eerily human-sounding. This kind of competition augurs well for rapid advances in voice technology.

    Location and context awareness. Thanks to GPS and to geolocation services that use cell towers and Wi-Fi, our phones, tablets and computers know where we are. That makes it possible to say things like “Remind me to call my father when I get home” or “Is there an Indian restaurant nearby?” and have the phone respond appropriately. Voice interaction isn’t absolutely necessary for location and context awareness to be useful, but it can certainly cut out a lot of steps.

    For example, on a recent trip, I told my iPhone: “When I land in Los Angeles remind me to get my jacket out of the overhead bin,” and it responded by setting a reminder titled “Get my jacket out of the overhead bin” that popped up when the plane touched down at LAX.

    Cloud processing power. Google and Apple rely on the massive processing power of cloud-based servers to amp up the accuracy of their voice recognition and the quality of their results. While this means the services lose functionality in the absence of a network connection, it also means that relatively low-powered devices can do some truly amazing things.

    Curated data. Wolfram Research got this area rolling with its “Wolfram Alpha” service, which it dubbed a “computational knowledge engine.” Over the last several years, Wolfram has been pulling together an ever-longer list of authoritative sources for scientific, demographic, geographic, financial and other types of data.

    Pairing that data with voice-driven search allowed Apple to surprise Google with Siri, a voice assistant that was smarter and more capable than most people expected. (Try asking your iPhone, “What is the atomic weight of gold?” The results are pure Wolfram Alpha.) Curated data is a major part of Apple’s voice strategy going forward. In 2012, Siri got the curated-data makeover for sports and financial info, and we can expect more niches to get filled next year.

    Deeper integration. Siri’s first iteration couldn’t open apps, and while the current version can, it doesn’t yet let you control third-party apps with voice commands. Google is ahead of Apple in this area, and that’s the direction things are headed.

    For now, you have to know the limits of your digital assistant. The future of voice interaction will bring fewer and fewer limitations and a growing capacity for conversational, complex interactions with your devices.

    GoshenCommons.org is a production of Goshen College and involves community bloggers. This post by Michael Sherer was posted on its Crystal Ball blog. You can read more at GoshenCommons.org.