From AI Agents to Human Agents

Maik Hummel / Oct 2025 / Strategy, Customer Experience, Technology, Artificial Intelligence

How next-generation AI transforms voice calls.

When AI was first introduced in the contact center, managers hoped to use the new technology to streamline operations, enhance personalization, and boost call resolution rates.

The goal in those early days was to provide more effective support for their human agents. Since then, we’ve witnessed AI’s rapid evolution, thanks to continuing innovation and investment in AI and to the open API delivery model, which allows for more seamless technology integration.

Conversational AI, GenAI, and Natural Language Briefing

Two important types of AI are driving this advancement in the customer service sector: generative AI (GenAI) and conversational AI. Both are built on natural language processing (NLP) foundations, yet they represent fundamentally different approaches to handling customer interactions.

Traditional conversational AI uses NLP primarily for intent classification (determining what kind of request is being made) and entity extraction (pulling specific key information from the input provided).

In this application, response generation has been largely static, relying on pre-written responses that guarantee consistency but often lack the nuance and adaptability required for truly personalized interactions.
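To make the traditional approach concrete, here is a minimal sketch of intent classification, entity extraction, and static responses. It uses keyword patterns where a production system would use a trained NLP model. The intent names, patterns, canned responses, and six-digit account number format are all assumptions made for illustration.

```python
import re

# Hypothetical intents and keyword patterns (a real system would use a trained model).
INTENT_PATTERNS = {
    "billing": re.compile(r"\b(bill|invoice|charge|refund)\b", re.IGNORECASE),
    "outage": re.compile(r"\b(outage|down|not working|no service)\b", re.IGNORECASE),
}

# Pre-written, static responses: one per intent.
RESPONSES = {
    "billing": "I can help with billing. Please confirm your account number.",
    "outage": "We are aware of a service issue and are working on it.",
    "unknown": "Let me connect you with an agent.",
}

# Assumed account number format: exactly six digits.
ACCOUNT_RE = re.compile(r"\b\d{6}\b")


def classify_intent(utterance: str) -> str:
    """Return the first intent whose pattern matches, else 'unknown'."""
    for intent, pattern in INTENT_PATTERNS.items():
        if pattern.search(utterance):
            return intent
    return "unknown"


def extract_entities(utterance: str) -> dict:
    """Pull out the account number, if the caller mentioned one."""
    match = ACCOUNT_RE.search(utterance)
    return {"account_number": match.group()} if match else {}


def respond(utterance: str) -> str:
    """Look up the canned response for the detected intent."""
    return RESPONSES[classify_intent(utterance)]
```

Whatever the caller says, the reply is one of three fixed strings. That is the source of both the consistency and the lack of nuance.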

In contrast, GenAI has fundamentally transformed this paradigm. Leveraging advanced natural language generation (NLG), GenAI dynamically crafts responses in real time by considering the context of the entire conversation. This shift enables a previously unattainable level of personalization, addressing a vast variety of customer interactions with greater fluidity and intelligence.

However, this power also introduces a challenge: ensuring that the conversation remains on scope. To address this, companies must guide and guardrail GenAI’s output by employing natural language briefings based on prompt engineering.

A natural language briefing is a set of instructions written by humans that tells the AI agent how to interact with customers. It covers a wide range of knowledge, such as company policies and processes, as well as the preferred tone the agent should use when engaging with customers. The briefing should be written in clear, concise, non-technical language.
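One way to picture a briefing is as structured plain-language input that is assembled into the instructions a GenAI model receives. The sketch below assumes a hypothetical Briefing class with made-up field names and an invented company. It only builds the text and does not call any particular model or vendor API.

```python
from dataclasses import dataclass, field


@dataclass
class Briefing:
    """Hypothetical container for a natural language briefing."""
    company: str
    tone: str
    policies: list = field(default_factory=list)
    off_limits: list = field(default_factory=list)

    def to_system_prompt(self) -> str:
        """Join the briefing into one block of plain-language instructions."""
        lines = [
            f"You are a customer service agent for {self.company}.",
            f"Speak in a {self.tone} tone.",
        ]
        if self.policies:
            lines.append("Follow these policies:")
            lines.extend(f"- {p}" for p in self.policies)
        if self.off_limits:
            lines.append("Politely decline to discuss:")
            lines.extend(f"- {t}" for t in self.off_limits)
        return "\n".join(lines)


# Illustrative values only.
briefing = Briefing(
    company="Example Telecom",
    tone="calm, friendly",
    policies=["Verify the caller's identity before discussing account details."],
    off_limits=["legal advice"],
)
prompt = briefing.to_system_prompt()
```

The resulting text would typically be supplied to the model as its standing instructions. The off-limits list is one simple way to express the scope guardrails described above.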

Together, briefings and guardrails help keep every generated response within the company’s predefined rules, delivering dynamic yet controlled customer interactions.

These two types of AI (GenAI and conversational AI) can therefore work together to create a more human experience for customers with little human intervention.

The New AI Role

Because of these advancements, contact center managers are now looking at the role of AI agents differently.

Rather than limiting AI agents to being assistive tools, it’s now possible to train them to handle customer interactions from start to finish.

The new goal is to leverage AI agents to scale, especially in high-volume call situations. And in cases where a human needs to be involved, AI can make the handoff more seamless for a better customer experience (CX).

The timing for all of these AI advancements couldn’t be better, as the ability to quickly scale up a contact center is becoming ever more important.

The call center industry standard for service level is to answer 80% of calls within 20 seconds, but that is easier said than done when an incident causes an unexpected influx of customer calls. Every customer knows the pre-recorded message: “We’re currently experiencing higher than expected call volume.” They know that means they’re in for a long wait, even if they stay on the line.
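As a quick illustration of the 80/20 target, the sketch below computes service level as the share of calls answered within a threshold. The function name and sample wait times are invented for the example. It also assumes a call answered at exactly 20 seconds counts as on time and that abandoned calls are excluded, both of which vary between contact centers.

```python
def service_level(answer_times_s, threshold_s=20.0):
    """Fraction of calls answered within threshold_s seconds."""
    if not answer_times_s:
        raise ValueError("need at least one answered call")
    on_time = sum(1 for t in answer_times_s if t <= threshold_s)
    return on_time / len(answer_times_s)


# Invented wait times (in seconds) for ten answered calls.
waits = [5, 12, 18, 25, 40, 8, 19, 20, 61, 15]
level = service_level(waits)   # 7 of the 10 calls were answered within 20 s
meets_target = level >= 0.80   # False: 70% falls short of the 80% target
```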

Handling twice—or even 10 times—the usual volume of customer conversations was once prohibitively expensive and complex. But in the AI era, rapid scalability is not only achievable, it’s transforming the way contact centers operate.

Getting Rid of the Script

So, how do these two AI application types work? Traditional conversational AI relies on clearly defined, rule-based decision-making. This approach guarantees consistent responses and meets strict compliance requirements, making it ideal for routine, regulated interactions that support human agents.

In contrast, GenAI fundamentally transforms agent design by using natural language briefings to generate dynamic, context-driven responses in real time. Agents built this way are often described as agentic AI: they can act independently, learn faster, and solve problems with less human intervention.

By eliminating the constraints of rigid scripts, GenAI-powered agents can tackle a broader variety of customer inquiries with greater personalization. This allows contact centers to automate more complex interactions, reserving highly skilled human agents for situations that demand empathy and nuanced problem-solving, thus ensuring both efficiency and regulatory integrity in customer service.

This means that when a large-scale incident occurs, a contact center can mount a fast, scalable response by quickly deploying AI agents. These agents can be set up with the correct behavior and put into action within hours.

This was not possible before. Instead of a scenario where people in urgent need of help are left waiting for hours for their calls to be picked up, imagine that an army of trained AI agents can quickly come online and begin picking up calls within minutes.

These sophisticated AI agents, powered by GenAI and conversational AI, can listen to each caller’s needs and then respond with vital information or instructions the caller is seeking. Human agents can even manage these AI agents remotely, monitoring calls for quality. Or, if the need is more complex, the AI agent can answer, listen, collect relevant information, and then hand off the call to a human agent for completion.

The leap forward in GenAI and conversational AI presents an opportunity for contact centers to build a new conversational AI playbook. AI agents can help understand, contextualize, and anticipate customer needs, enabling more personalized and consistent interactions.

How AI Agents “Hear” the Customers

In most contact centers, audio-to-audio communication between a customer and an AI agent typically occurs over several stages (see FIGURE 1). These stages are:

A person speaks to the AI agent.
The AI agent converts the caller’s audio into text (speech-to-text or STT).
The AI agent recognizes speech activity through voice activity detection (VAD) and contextual end-of-speech detection.
The AI agent thinks and reasons about the best response, using either deterministic (rule-based), non-deterministic (GenAI/large language model [LLM]), or hybrid approaches.
The AI agent reads the text, generates its response in text form, and then converts that text into audio (text-to-speech or TTS).
The AI agent’s audio is sent back to the person.

Although these six steps occur in rapid succession, each conversion—from audio to text and back—can introduce slight latency. Streamlining these steps is critical to achieving a fluid, natural conversation between the customer and the AI agent.
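A rough sketch of one conversational turn through the three conversion stages, with each stage timed separately, shows where the delay accumulates. The stage functions here are trivial stand-ins rather than real speech or language models, and all names are hypothetical.

```python
import time
from typing import Callable, Dict, Tuple


def run_turn(
    audio_in: bytes,
    stt: Callable[[bytes], str],
    reason: Callable[[str], str],
    tts: Callable[[str], bytes],
) -> Tuple[bytes, Dict[str, float]]:
    """Run one turn through STT, reasoning, and TTS, timing each stage."""
    timings: Dict[str, float] = {}

    def timed(name, fn, arg):
        start = time.perf_counter()
        result = fn(arg)
        timings[name] = time.perf_counter() - start
        return result

    text = timed("stt", stt, audio_in)          # caller audio -> text
    reply = timed("reasoning", reason, text)    # text -> reply text
    audio_out = timed("tts", tts, reply)        # reply text -> audio
    return audio_out, timings


# Stand-in stages: real ones would call speech and language models.
audio_out, timings = run_turn(
    b"hello",
    stt=lambda audio: audio.decode(),
    reason=lambda text: "You said: " + text,
    tts=lambda text: text.encode(),
)
total_latency_s = sum(timings.values())
```

Because the stages run one after another, the caller waits for the sum of all three. Removing or shortening any stage shortens the whole turn.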

As call centers and customer support teams increasingly embrace real-time, voice-based AI capabilities, cutting this latency and reducing the number of steps in the communication chain will be crucial for customer satisfaction and broader adoption of AI communication.

Solving the Latency Problem

This is where advanced GenAI becomes a game-changer. For example, OpenAI’s Realtime API seeks to improve this chain by eliminating the need for interim text transcriptions between the customers and the AI agents, reducing response times, and creating more seamless conversations.

With capabilities like embedded VAD and end-of-speech recognition, interactions are more attuned to natural speech, further narrowing the gap between AI and human agents. Solutions built on the Realtime API typically also include barge-in, which lets the caller interrupt the AI agent if the conversation gets off track (see FIGURE 2).
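The idea behind barge-in can be sketched generically: while the agent is speaking, incoming caller audio is checked for speech, and playback stops the moment speech is detected. The example below is a simplified illustration, not how the Realtime API implements it. The energy threshold, frame format, and function names are all assumptions.

```python
def is_speech(frame, threshold=500.0):
    """Crude energy-based VAD: mean absolute amplitude above a threshold."""
    if not frame:
        return False
    return sum(abs(sample) for sample in frame) / len(frame) > threshold


def play_with_barge_in(reply_chunks, caller_frames):
    """Play the reply chunk by chunk; stop as soon as the caller speaks.

    Assumes one inbound caller frame arrives per outbound reply chunk.
    Returns the chunks actually played and whether the caller interrupted.
    """
    played = []
    for chunk, frame in zip(reply_chunks, caller_frames):
        if is_speech(frame):
            return played, True   # caller barged in: stop talking and listen
        played.append(chunk)
    return played, False


# Illustrative audio frames as lists of signed sample values.
silent = [0, 3, -2]
loud = [900, -1200, 1100]
played, interrupted = play_with_barge_in(
    ["Your", "balance", "is", "due"],
    [silent, silent, loud, silent],
)
```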

Audio-to-audio exchanges also have the potential to capture additional dimensions of a conversation that aren’t otherwise encompassed in text. Think about the missed nuances between a text message and a phone call with a friend. Tone, emotion, emphasis, speed, and other human elements of spoken conversation now have the opportunity to influence and humanize exchanges with AI agents.

The Human in the Loop

Even as this technology matures, human agents will always be needed to solve more complex problems. In those cases, AI can also make the transition process more seamless by providing a better way to define escalation criteria for AI agents through natural language briefings.

And for those customers whose questions require human intervention, it’s now possible to instruct AI agents on the specific cases or customer behaviors that should trigger a call handover. The AI agents may even be able to give human agents a “heads up” about the emotional state of the callers before they get on the line.

Instead of clunky handoffs from AI agents to human agents that may result in lengthy hold times or require the customers to repeat themselves, today’s AI agents can significantly improve this process. AI agents can quickly summarize the subject matter of the calls, recap customers’ needs, and provide the human agents with a few best-case scenarios to help resolve the situations.

Call summarization and the extraction of valuable customer information (such as a customer ID or a summary of the complaint) can be performed more rapidly and accurately using GenAI. This enriched context ensures that human agents receive comprehensive briefings, ultimately enhancing the CX.
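The handoff briefing a human agent receives could look something like the sketch below. The summarizer is passed in as a function, standing in for a GenAI model. A regular expression stands in for model-based extraction, and the "C-" plus five digits customer ID format is invented for the example.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical customer ID format: "C-" followed by five digits.
CUSTOMER_ID_RE = re.compile(r"\bC-\d{5}\b")


@dataclass
class HandoffBriefing:
    customer_id: Optional[str]
    summary: str
    caller_turns: int


def build_handoff(
    transcript: List[Tuple[str, str]],
    summarize: Callable[[str], str],
) -> HandoffBriefing:
    """Condense a (speaker, text) transcript into a briefing for a human agent."""
    caller_lines = [text for speaker, text in transcript if speaker == "caller"]
    caller_text = " ".join(caller_lines)
    match = CUSTOMER_ID_RE.search(caller_text)
    return HandoffBriefing(
        customer_id=match.group() if match else None,
        summary=summarize(caller_text),
        caller_turns=len(caller_lines),
    )


# Illustrative transcript; the lambda is a stand-in for a GenAI summarizer.
handoff = build_handoff(
    [
        ("agent", "How can I help?"),
        ("caller", "My ID is C-12345 and my router keeps rebooting."),
        ("agent", "Thanks, let me check."),
        ("caller", "It started yesterday."),
    ],
    summarize=lambda text: text.split(".")[0],
)
```

With a record like this on screen, the human agent can pick up the call without asking the customer to repeat an ID or restate the problem.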

The human-in-the-loop approach enables customer care representatives to resolve customer issues with the help of AI agents. In some cases, the calls continue using the artificial voice of the AI agents, while the human agents control the interactions behind the scenes. This is seamless to the customer.

We are only in the early days of the myriad ways agentic AI will revolutionize customer service. Leaders in the GenAI era are making it possible for customer service organizations to provide a personalized AI agent for every customer. But in order to make this a reality, they’ll need to be able to scale.

These agents, powered by GenAI and conversational AI, will have the ability to compile historical data on customers, quickly analyzing their past purchasing behaviors and preferences in real time.

Using this exciting technology, companies will finally be able to make significant improvements to the customer service experience with personalized, sophisticated contact center automation that ensures each unique customer is heard.
