I want to apologize. Sometimes I feel that it's all my fault. I know I'm just a small cog in a giant industrial machine, but I have worked on interactive voice response systems for decades. Much of my career has been spent helping spread awful IVR hell and worthless chatbots around the world. You're welcome.
When you are involved in building self-service tools, it's easy to blame the companies that deploy them in God-awful ways. IVRs don't kill people; people kill people. But if IVRs didn't exist, IVR hell wouldn't exist either.
In the 1990s there was a good excuse to build awful customer experiences: You had 12 touchtone keys for input and the spoken word for output, so what could go wrong? It was limited, and people tried to push the boundaries of what it could do; the results were often comical or disastrous.
Since the 1990s, so many cool new language understanding capabilities have come along that seemed like they would transform self-service. And so many times we've been disappointed:
- Sophisticated natural language understanding-driven speech recognition, allowing us to talk to the systems and providing a much richer input mechanism.
- Machine learning and text-based language understanding for chatbots, allowing a new level of sophistication.
- Speech-to-text capabilities to extend the power of chatbots to intelligent virtual assistants and speech applications. (Alexa for your contact center, anyone?)
Recently I attended the Five9 CX Summit, and during a breakout session at the event, the presenter asked attendees what sort of self-service they offered. Old-school press-or-say applications were the most common answer.
People, check the calendar: it's 2023!
As I said, it's not the companies' fault; it's the tools we give them. The state of the art today is limiting: you must identify each question specifically and teach the system all the ways that a single question might be asked. Once you have defined the input you want, you need to parse out all the weird partial answers that people love to give and handle the cases where the person just jumps to something new.
These limitations make it very hard to get out of the weeds and emulate a real conversation.
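To make that concrete, here is a minimal, hypothetical sketch of the intent-and-utterance approach described above. The names (`Intent`, `match_intent`, the sample utterances) are illustrative stand-ins, not any vendor's API; the point is the enumeration burden and what falls through the cracks.

```python
# Hypothetical sketch of today's intent-based self-service tooling.
# Nothing here is a real vendor API; it only illustrates the workload.

from dataclasses import dataclass, field

@dataclass
class Intent:
    name: str
    # Every phrasing has to be anticipated and enumerated up front.
    training_utterances: list[str] = field(default_factory=list)

INTENTS = [
    Intent("check_balance", [
        "what's my balance",
        "how much money do i have",
        "balance please",
        # ...and dozens more variants, or the system misses the question
    ]),
    Intent("pay_bill", [
        "i want to pay my bill",
        "make a payment",
    ]),
]

def match_intent(utterance: str) -> Intent | None:
    """Toy stand-in for the NLU engine: if the caller's phrasing wasn't
    anticipated, the system simply doesn't understand."""
    text = utterance.lower()
    for intent in INTENTS:
        if any(sample in text for sample in intent.training_utterances):
            return intent
    return None

# Partial answers and topic jumps are exactly where this breaks down:
print(match_intent("uh, the bill thing"))                   # None: unanticipated phrasing
print(match_intent("balance please... actually, my bill"))  # matches check_balance,
                                                            # missing the topic jump
```

Every new question, and every new way of asking an old question, means more hand-curated entries in lists like these.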
Generative AI and large language models are different; this time we might really see a change. With a large language model, the programming approach is flipped on its head: we no longer need to identify everything we want the system to understand and write a bunch of bespoke code to recognize it. Now the job is to narrow down what we want the system to say. This is so much faster and easier: just ask the handy-dandy LLM for some information, and it will do the rest.
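A minimal sketch of that flip, under stated assumptions: `StubLLM` and its `complete()` method are hypothetical placeholders for whatever model provider you would actually use. Notice that the effort moves into the prompt, which constrains the output, rather than into lists of anticipated inputs.

```python
# Hypothetical sketch: the work shifts from enumerating what callers might
# say to constraining what the system may say.

SYSTEM_PROMPT = """You are a phone agent for a utility company.
Answer only questions about billing, outages, and service appointments.
If the caller asks about anything else, politely offer a human agent.
Keep every answer under three sentences."""

class StubLLM:
    """Stand-in for a real model client; returns a canned reply so the
    sketch runs end to end. Swap in your provider's SDK here."""
    def complete(self, system: str, user: str) -> str:
        return "Your next bill is due on the 15th. Anything else I can help with?"

def answer(client: StubLLM, caller_utterance: str) -> str:
    # No intent lists, no training utterances: the model absorbs odd phrasing,
    # partial answers, and topic jumps, while the prompt narrows the scope.
    return client.complete(SYSTEM_PROMPT, caller_utterance)

print(answer(StubLLM(), "uh, the bill thing"))
```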
This is no longer the art of programming; it's the art of asking questions (building the prompts for the LLM). It's easier to understand what you are doing, it's far less laborious, and it's sure to take self-service to the next level, just like speech recognition, machine learning, and high-quality speech-to-text were supposed to do. But this time it's real. Trust me; have I ever let you down before?
Max Ball is a principal analyst at Forrester Research.