Let’s talk about tech: How voice technology is changing the way we interact with devices

23 Aug 2012

It’s less than a year since Apple’s Siri made her debut on the iPhone 4S and since then she has split consumer opinion between those who find the personal assistant app useful and those who simply aren’t keen to speak to their phone unless there’s someone on the other end of the line. But as adoption grows and consumers become more comfortable using voice technology, what’s next?

The key selling point of the iPhone 4S, Siri gave the smartphone a voice and a personality, transforming it into more than just a device. As voice technology quickly advances and more of these services spring up, chances are it won’t be long before we’re all talking to our devices, and they’re talking back.

Evi-lution

Entrepreneur William Tunstall-Pedoe and his team based in Cambridge, England have been working on their own voice technology service for several years. Evi, a personal assistant app for the iPhone and Android smartphones, launched in January and quickly shot to No 1 on iTunes in the US and the UK within its first week of release. Since then, the app has been installed more than 1m times.

Developed in the UK, Evi is more useful than Siri for European users, with local search enabled in a number of countries this side of the Atlantic (Ireland included), and Tunstall-Pedoe claims that she can handle a variety of accents and dialects.

The years of development were spent building up a massive and constantly growing knowledge base for Evi. “One of the things Evi can do with that knowledge is combine facts,” Tunstall-Pedoe explains. “She’s capable of reasoning, taking multiple facts and combining them together to produce new answers.”
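To give a flavour of what combining facts can mean, here is a toy sketch in Python. The facts, the relation names and the `currency_used_in_city` helper are all invented for illustration; this is not how Evi’s knowledge base is actually built, only a minimal example of answering a question that no single stored fact answers on its own.

```python
# Hypothetical stored facts as (subject, relation, value) triples.
FACTS = {
    ("Dublin", "is_capital_of", "Ireland"),
    ("Ireland", "uses_currency", "euro"),
}


def lookup(subject, relation):
    """Return the value stored for (subject, relation), or None if unknown."""
    for s, r, v in FACTS:
        if s == subject and r == relation:
            return v
    return None


def currency_used_in_city(city):
    """Combine two stored facts to produce an answer neither holds alone."""
    country = lookup(city, "is_capital_of")
    if country is None:
        return None  # No stored fact links this city to a country.
    return lookup(country, "uses_currency")
```

Asking `currency_used_in_city("Dublin")` chains the two facts together, while a city with no stored facts simply yields no answer.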

The team created a database of 635m facts on 28m things from all manner of sources, from Wikipedia to the Yellow Pages. As well as these things that Evi ‘knows’, the service has also partnered with companies like Yelp in order to enhance local search.

Aside from building up Evi’s knowledge, Tunstall-Pedoe and his team had to license speech recognition technology that could satisfactorily convert sound into text and then make it so that Evi could understand what this text meant. “It’s not uncommon for a question to quite literally have millions of different ways it can be phrased, and all of those ways need to be understandable by the machine,” says Tunstall-Pedoe. “The technology is very challenging but we have gotten to the point where it is now practical, and it’s only going to get better from here.”

Artificial intelligence

The magic of advanced voice technology is the ability to communicate with these services using natural, everyday language. “Evi’s core technology is about understanding what the user means,” explains Tunstall-Pedoe.

Unlike Siri, who hunts around for an external service to call up the required information, Evi is fundamentally designed to understand and know things herself and respond directly to questions.

But why would you talk to a device when you can type a query into the old reliable Google box instead? “A search engine can index a web page and map it by the keywords in it, but it can’t really understand the information and it can’t answer you directly,” answers Tunstall-Pedoe.

Search engines don’t have artificial intelligence. They index pages and use statistics and rankings to deliver results, and they do this at an enormous scale. “What we’re trying to do is understand what the user genuinely needs, and understand the world’s knowledge, and […] respond to that user directly in a way that’s like talking to another human being,” says Tunstall-Pedoe.

Making service easy

“There’s lots of evidence that voice is the way that the world is going,” says Tunstall-Pedoe. “Even with imperfections in the technology that you currently see, many people find it much easier to interact with their devices by speaking to them. It’s always been true, it’s just been [a matter of] waiting until the technology became practical – and that moment has arrived.”

Sebastian Reeve from Nuance’s enterprise division agrees. The company’s new virtual assistant, Nina, is targeted at businesses and serves more as a customer service representative than a PA. She is currently available in UK, US and Australian English, and further development expected in the coming months could see Nina going on a trip around the world.

“It’s all about making service easy,” says Reeve. “From the customer perspective it’s about driving down the time to get to the resolution they’re looking for,” he explains. “For an enterprise it’s about increasing the automation.”

Split personality

Like Evi, Nina is designed so that users can speak to her in their own words. She can decipher the meaning of what is being said based on a database of different contexts and meanings and quickly navigate users in the right direction.

Nuance have opened up Nina’s SDK so that enterprises can embed the service into their own applications, and she can be customised to represent the brand she is ‘working’ for. Businesses can adjust the original Nina template, change how she sounds or even add new functionality with the help of Nuance.

“We recognise that a big part of this is persona, and I think we know that every company’s going to want their Nina. In fact, they won’t all be called Nina when they hit the market,” Reeve explains.

Security benefits

As well as making customer service easier and more efficient, Nina offers something unique in terms of security. Using voice biometrics, Nina can verify a user, just as a password does. “It’s not just about understanding what you want and executing that, it’s about understanding who are you, and if you are who you say you are,” says Reeve.

To do this, Nina matches characteristics of a user’s voice with an established ‘voice-print’ created at set-up. Each match is assigned a confidence score as a percentage, and a company can configure the score it requires to make the check as strict as it wishes. For example, bank transactions may require a confidence score of 90pc or higher.
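The configurable threshold described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the action names, the `is_verified` function and the 70pc figure for balance enquiries are hypothetical, and only the 90pc example for transactions comes from the article. Nothing here is Nuance’s actual API.

```python
# Hypothetical minimum confidence scores (in percent) per type of request.
# Only the 90 figure echoes the article; the rest is assumed for illustration.
THRESHOLDS = {
    "balance_enquiry": 70.0,
    "bank_transaction": 90.0,
}


def is_verified(confidence_pct, action, thresholds=THRESHOLDS):
    """Return True if the voice match is confident enough for this action."""
    if not 0.0 <= confidence_pct <= 100.0:
        raise ValueError("confidence must be a percentage between 0 and 100")
    required = thresholds[action]  # Raises KeyError for an unknown action.
    return confidence_pct >= required
```

With these assumed settings, a caller whose voice matches with 85pc confidence could check a balance but would be refused a transaction.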

This offers a level of security control not available through passwords and PINs and, given how easily hackers can skirt around other security protections, makes the service less vulnerable to attacks. “Banks are really starting to see that this offers some great benefits, not only for security but it’s a good trade-off between that and customer experience,” says Reeve.

The way of the future

Reeve has seen a notable consumer shift towards the adoption of voice technology in the past year. “The adoption feedback from certain studies that I’ve read support the fact people are using those services on a fairly regular basis,” he says. “It’s not for everyone, so let’s not pretend I think that these services are one-size-fits-all […] but there’s certainly a large segment of people who are really getting it.”

Specifically, Reeve has seen huge demand from customers to use mobile apps to solve problems in this way. He puts this down to the simple rule that consumers will always opt for the easiest channel to get the job done and, right now, that channel is voice.

“Voice is the way that we’ll all be interacting with all sorts of devices in the future,” says Tunstall-Pedoe, “and it’s not just mobile phones – it’s televisions, it’s in-car systems, etc. And the only thing that has prevented that up to now has been the limitations in the technology.”

Voice technology image via Shutterstock

Elaine Burke is the host of For Tech’s Sake, a co-production from Silicon Republic and The HeadStuff Podcast Network. She was previously the editor of Silicon Republic.