Generative AI voice agents are finally accomplishing what IVRs and intelligent voice agents, or IVAs, set out to achieve in the contact center.
Why do businesses avoid the customer’s call? At $20 per call with a human agent, it’s not worth it for them to talk with you.
But if technology could allow voice interactions at a fraction of the cost, businesses could go back to engaging with customers in the channel they prefer. It has only taken contact centers more than 75 years to find the right solution.
The Birth of Contact Centers
From the earliest days of contact centers, the goal was to standardize and automate in order to reduce costs and simplify customer experience (CX) operations.
Following the introduction of automated call routing systems, customer service hubs staffed by “operators” sitting in centralized locations managed customer calls and processed orders.
But there were challenges. At the top of the list was the expense of hiring, training, and retaining customer service representatives.
Turnover was high, given the lack of advancement and the stressful working environment. More customers expected 24/7 service, an expectation that was often cost-prohibitive and difficult to staff, especially for overnight shifts.
Many customer inquiries involved basic, routine questions: checking order status or account balances, seeing if prescriptions were ready, or finding store locations. Paying human agents to handle these inquiries was inefficient and costly, and customers endured unnecessary wait times.
And, of course, humans are...human. They get sick, take vacations, and aren’t always in the best mood.
The Early Era of Speech Recognition IVR
In the 1970s, early IVR systems were launched, bringing a new tool to automate voice interactions and reduce their handling by humans.
These first phone trees let you “press 1 for sales, press 2 for service,” and so on. They didn’t need sick days or PTO, and mood swings weren’t an issue. Delivery of information could become standardized across the brand.
While initial systems were expensive and difficult to implement, hardware advancements brought down costs and simplified deployment.
These developments led to an explosion of IVR use in the 1990s, especially across financial services, telecommunications, utilities, and the airline and travel industries. If your business wasn’t using IVRs, you were behind the eight ball.
However, IVR’s limitations soon became apparent. Early systems required customers to listen to long lists of menu options, and there was no guarantee their specific issue would be on that list.
Later, speech-enabled systems often struggled with accents and background noise, leading to frustration and the infamous “press zero” option to reach a human operator, which defeated the purpose of the IVR entirely.
For businesses, the worst part was that instead of solving problems, these systems became emblematic of bad customer service, eroding trust in the brand.
NLP and IVAs
Advances in natural language processing (NLP) enabled major steps forward. NLP was an early method for understanding and processing human language and human intents. It allowed IVRs to ask more open-ended questions, such as “How can I help you?”
With NLP, IVRs became IVAs. The customer could now speak to the system instead of incessantly pressing buttons or shouting to be connected to a human. NLP enabled intent detection: the ability of the model to identify what the customer was asking about.
But while IVAs were heralded as a huge step forward, from a customer perspective they were still pretty lackluster.
The system still commonly misunderstood the customer’s request. The IVA would take what the customer said and match it to the closest result based on its interpretation. It didn’t seem to matter if the IVA’s response was irrelevant and had nothing to do with the request.
So, if you said, “I’m really struggling with opening the window,” the IVA might respond, “Great. I’ll help you find a new window.” It didn’t understand the nuance of what you were saying and simply matched your words to the closest thing it knew.
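To make that failure mode concrete, here is a deliberately simplified sketch of closest-match intent detection. It is an illustration only, not how any particular IVA product worked: the intent names, keyword sets, and the `closest_intent` helper are all hypothetical, and real systems used statistical models rather than raw keyword overlap.

```python
# Hypothetical intents, each described by a small set of keywords (assumed for illustration).
INTENTS = {
    "buy_new_window": {"buy", "new", "window", "purchase"},
    "check_order_status": {"order", "status", "track", "delivery"},
    "store_locations": {"store", "location", "nearest", "hours"},
}


def tokenize(text):
    """Lowercase the utterance and strip basic punctuation from each word."""
    return {word.strip(".,!?").lower() for word in text.split()}


def closest_intent(utterance):
    """Return the intent whose keywords overlap most with the utterance.

    Note that this always returns *some* intent, even when the overlap is
    tiny, which is exactly the rigidity described above.
    """
    words = tokenize(utterance)
    return max(INTENTS, key=lambda name: len(words & INTENTS[name]))


result = closest_intent("I'm really struggling with opening the window")
print(result)
```

The only word the matcher recognizes is "window," so the struggling customer is routed to the sales intent. Nothing in this approach captures that the caller has a problem rather than a purchase in mind.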
Customers realized the IVA couldn’t help with complex or nonstandard requests, and often immediately resorted to mashing “0” or saying “representative” to get out of it. These systems were also hugely time-consuming for companies to build and manage because they were so rigid.
The Rise of AI Voice Agents
Through it all, customers still seek out that voice engagement. Up to 70% of customer interactions still happen by phone despite companies’ best efforts to hide their customer service numbers.
Large language models (LLMs) enable the understanding and generation of language as well as reasoning. This massive technological advance is powering a new generation of automation: voice AI agents.
Whereas traditional NLP-based IVAs excelled at structured, rule-based customer interactions, LLMs enable more natural, nuanced, and contextually fluent engagement.
Internally, we find that even though customers clearly know they are speaking to an AI agent, voice AI agents achieve containment rates above 97%, dramatically better than the 20%–40% containment seen with IVAs. That puts voice AI agents’ containment rates on par with those of top-performing human agents.
The most exciting part is that in the last 12 to 18 months, AI agents have started to perform better than human agents.
Why? Human agents must jump from call to call, with little downtime. They aren’t always available and even when they are given customer information, they aren’t able to instantly process it all. They are humans, after all.
AI agents, by contrast, are always available, and can quickly ingest every past conversation, analyze it in real time, and deliver deeply personalized conversations.
Customers can get the experience they want and in the channel they prefer (voice, most often), with better outcomes than a human could provide.
Yes, early LLMs have known weaknesses, such as giving confident but wrong answers (i.e., hallucinations). These have been acknowledged and are being addressed, for example by having models acknowledge uncertainty rather than give a wrong answer.
Also, leading suppliers have been investing heavily in guardrails, temperature control (see BOX), continuous supervision, post-call QA, and more on their LLM-based platforms. These safeguards help brands feel comfortable using AI agents to handle customer interactions.
What Is AI “Temperature Control”?
When you come across the term “temperature control” in a discussion of AI, it doesn’t refer to the thermostat in the user’s office, the conference room, or the data center.
Instead, temperature control refers to tuning how deterministic or creative an AI model’s responses are.
Lower temperatures make the model more predictable, consistent, and on-brand, while higher temperatures allow more variation but increase the risk of going off-script.
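For readers who want to see the mechanics, the following is a minimal sketch of how temperature is commonly applied when a model chooses its next word: raw scores are divided by the temperature before being converted to probabilities. The scores below are made up for illustration, and `softmax_with_temperature` is a hypothetical helper, not any vendor's API.

```python
import math


def softmax_with_temperature(scores, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max before exponentiating, for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for three candidate next words (assumed for illustration).
scores = [2.0, 1.0, 0.5]

low = softmax_with_temperature(scores, 0.2)   # low temperature: near-deterministic
high = softmax_with_temperature(scores, 2.0)  # high temperature: more variation

print([round(p, 3) for p in low])
print([round(p, 3) for p in high])
```

At the low setting, nearly all of the probability lands on the top-scoring option, so the agent says much the same thing every time. At the high setting, the probabilities are spread more evenly, so the less likely options are chosen more often.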
We have handled more than 350 million calls with a CSAT score above 85%, indicating customers are happily taking advantage of brands’ investments in this area.
Voice AI agents are finally fulfilling the promise of a low-cost, high-quality voice interaction. That will eliminate the need for contact centers to staff humans for Tier 1 customer interactions and free up budget that companies can use to massively increase their investment in CX.
A hybrid approach enables organizations to efficiently and cost-effectively scale support and provide customers with quality experiences, while also elevating the role of human agents.
The real winner? The customer.