SpeechTEK 2007

SpeechTEK brings together the people who build the telephone voice response systems that are becoming a ubiquitous part of our lives.  While many of these systems are downright annoying, the cost savings are impressive: one estimate was that every second shaved from the human interaction in a directory assistance application saves the phone company $7 million.  The developers are keenly aware that the caller is not always enthusiastic about speaking to a robot, and in fact they measure their success by how much they reduce the percentage of calls in which the caller presses 0 to get to a human being.  Sometimes they cheat, e.g. by removing or hiding the escape, but other times the robot can actually do a better job than a human, as when GOOG411 sends a map to one's mobile phone.

A session on Simulating the Personal Touch provided some insights into how human and machine intelligence can be combined to offer the caller a more human experience.  In directory assistance applications, valuable seconds can be saved by reserving human talent for those cases where machine interaction fails, and having the computer serve as the intermediary between the caller and the human in the box.  One commonly used technique is to record the caller's request while simultaneously feeding it to a voice recognition system.  If the recognition fails, the request is passed as a WAV file to a human, who can either speak the response to the caller or make a keyboard selection that results in synthesized speech.  While these systems can be very efficient, they haven't always been popular with the employees, who may find the experience less than satisfying.  This is especially an issue in European countries with powerful labor organizations.  Indeed, such systems have led to strikes in France and outrage in the press in the Netherlands.
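For readers who think in code, here is a rough sketch of the record-then-fall-back flow described above.  It is only an illustration under my own assumptions: the names (`RecognitionResult`, `handle_request`, `ask_operator`) and the confidence threshold are hypothetical and do not come from any system presented in the session.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class RecognitionResult:
    """Hypothetical recognizer output: text is None when nothing usable was heard."""
    text: Optional[str]
    confidence: float  # assumed to range from 0.0 to 1.0


def handle_request(
    audio: bytes,
    recognize: Callable[[bytes], RecognitionResult],
    ask_operator: Callable[[bytes], str],
    threshold: float = 0.6,  # assumed cutoff, not a figure from the talk
) -> Tuple[str, str]:
    """Return (answer, source), where source is 'machine' or 'operator'."""
    # The same recording is kept for a possible hand-off and fed to the recognizer.
    result = recognize(audio)
    if result.text is not None and result.confidence >= threshold:
        return result.text, "machine"
    # Recognition failed: the operator hears the recording, not the live caller.
    return ask_operator(audio), "operator"
```

The point of the design is that the human only spends time on the calls the machine could not handle, and even then works from a short recording rather than a live conversation.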

Of course, speaking directly to a human isn't always satisfying either.  Lizanne Kaiser gave a good example of how a particularly lame system at AT&T Wireless, instead of passing the account number from the call routing system to the reps' screen pop, had all of the customer service reps reciting the same scripted excuse for having to ask the customer to provide the number again.  And this was to prevent her phone from being disconnected for an underpayment of $0.05 after AT&T mis-scanned the original check.  Sometimes the solution is not so much adding technology as it is correctly designing the entire system, including the humans in the box.

Google 411 at SpeechTEK

At the SpeechTEK conference this morning, Mike Cohen, Manager of the Speech Technology Group at Google, talked about GOOG411, the free directory assistance service they have been operating.  While he refused to say anything about what was coming in the future, he did provide a few useful tidbits about Google's approach to speech applications.  They see mobile devices and speech as an important mechanism for providing access to data and are building speech into Google's core infrastructure.

One of the more interesting parts of Mike's talk was about how they measure usability and go about refining the system.  One of the measures of "user happiness" they used was the percentage of calls that the caller allowed Google to transfer after receiving the initial information.  They did A/B comparisons on new features, such as offering to connect the first match before listing the rest of the results.  They found that this increased the transfer rate by 1.5%.  To verify that this was really the result of happiness and not just passive acceptance of the transfer, they interviewed 34 subjects.  The interviews confirmed that people really did find the feature useful, but also showed that many people using the service didn't actually want to make a call and just wanted to get the information.
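To make the A/B idea concrete, here is a small sketch of how one might compare transfer rates between two versions of a prompt.  Everything in it is my own assumption: the call counts are invented for illustration (Mike gave no raw numbers), I am reading the 1.5% as a difference in percentage points, and the two-proportion z-test is simply a standard way to check such a difference, not something Google said it used.

```python
import math


def transfer_rate(transfers: int, calls: int) -> float:
    """Fraction of calls in which the caller accepted the transfer."""
    return transfers / calls


def two_proportion_z(x_a: int, n_a: int, x_b: int, n_b: int) -> float:
    """z statistic for the difference between two proportions (B minus A)."""
    pooled = (x_a + x_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (x_b / n_b - x_a / n_a) / std_err


# Hypothetical counts, NOT figures from the talk.
control_calls, control_transfers = 10_000, 4_000   # list all results first
variant_calls, variant_transfers = 10_000, 4_150   # offer first match first

lift = (transfer_rate(variant_transfers, variant_calls)
        - transfer_rate(control_transfers, control_calls))
z = two_proportion_z(control_transfers, control_calls,
                     variant_transfers, variant_calls)
print(f"lift: {lift:.3%}, z: {z:.2f}")
```

A statistic like this only says the difference is unlikely to be noise.  It cannot say why callers accepted the transfer, which is presumably why the interviews were needed as well.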

I learned about one useful feature, which is that in addition to saying "SMS" to get the results sent to my phone, I can say "Map It" to get a map of the results.  Pretty cool!
