Humans or AI?

Which option is best for IVR?

Artificial intelligence (AI) has been hugely transformative and disruptive across industries and use cases, whether scientific, commercial, or creative.

At the nexus of all of them is voice, both the customer contact channel and the audio recordings that shape it, greeting and guiding callers to deliver more efficient and satisfactory service.

For more than 40 years, voice technologies have been predominantly personified by human voices, whether it meant a professional voice talent in a recording studio or a company admin recording prompts from a phone handset.

Synthetic text-to-speech (TTS) voices have existed nearly as long, but their robotic sound and unnatural cadence have always been derided, leaving them as a “break-glass-in-case-of-emergency” last option for most businesses and implementation teams.

But no longer.

After more than a decade of fast-improving TTS voices like Apple’s Siri and Amazon’s Alexa, and now the limitless possibilities of generative AI, AI voice is nearly indistinguishable from fully human recordings. Most critically, the public is more accustomed to interacting with AI voices on the devices they use and in the social media they consume every day.

Organizations have faced the challenge of balancing callers’ expectations and the hallmarks of the brand with the business realities of keeping a performance-based IVR system current. But recent innovations have given telephony teams more viable options than ever before for integrating high-quality voice prompts.

Now, the choice is less a matter of quality, and more informed by the priorities of the deploying business, the processes they’re able to adopt more readily, and the emotiveness of the user interface, with minor pros and cons involved with any approach.

AI Voice Isn’t the Future: It’s Now

Companies are transitioning to AI voice for contact center technologies due to a few key factors. And let’s be real: one of the primary long-term benefits is significant cost savings.

AI voice eliminates the need for hiring professional voice actors and booking studio sessions, and in many instances, for the time and cost of post-production (but more on this piece soon). And because IVRs are dynamic, evolving applications, an AI voice license delivers compounding savings; the initial voice implementation only scratches the surface of lifetime value.

Additionally, the quality of AI-generated voices has markedly improved, offering natural-sounding speech in a wide range of languages. The growing number of off-the-shelf AI voices gives businesses viable options for finding a close match for their brand’s tone and style.

And perhaps most importantly, in an industry where the phrase “please listen closely as our menu options have changed” is everywhere, AI solutions ensure fast turnaround times, enabling companies to quickly update or expand their IVR systems in response to changing needs and customer feedback.

AI voice is not without a few key drawbacks of its own, however, especially if you’re working with a voice generator and managing the audio output yourself.

“There are lots of considerations for either human or AI voice, and it’s really about what’s better for the application,” said Helen VanScoy, Director of User Interface Design at Performance Technology Partners. “Turnaround can be really fast with AI. It’s great for things like addresses or unique customer information.

“However, AI voices require a lot more human intervention for things like prosody,” added VanScoy. “I’ve worked on some highly localized applications, and when you’re using AI voice, it does make your localization cycle take much longer because you need native speakers to review those aspects.”

Fortunately, this weakness can be addressed by bringing in a specialized voice provider to tune your audio before integration.

While still avoiding talent scheduling and recording sessions, some voice-over studios are equipped to manage your AI audio, ensuring that the pacing of your prompts is perfect. They experiment with inputs, so the AI provides the proper pronunciation for unique acronyms, names, and identifiers, and apply any necessary editing and mastering to create a suitably “human” delivery.
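As a rough illustration of what “experimenting with inputs” can look like, the sketch below marks up a prompt with SSML, the W3C markup many TTS engines accept. It swaps a spoken alias in for each tricky term and adds a short pause between sentences. The function name, the lexicon entry, and the 300 ms pause are illustrative assumptions, not any studio’s actual process, and engine support for individual SSML tags varies.

```python
import re
from xml.sax.saxutils import escape, quoteattr


def to_ssml(text, lexicon, pause_ms=300):
    """Wrap a prompt in SSML with spoken aliases and sentence pauses.

    lexicon maps a written term (hypothetical examples only) to the
    way it should be spoken, e.g. {"HSA": "H S A"}.
    """
    # Match longer terms first so overlapping entries resolve predictably.
    words = sorted(lexicon, key=len, reverse=True)
    if words:
        pattern = re.compile(r"\b(" + "|".join(map(re.escape, words)) + r")\b")
        pieces = pattern.split(text)  # odd indexes hold the matched terms
    else:
        pieces = [text]

    pause = f'<break time="{pause_ms}ms"/>'
    out = []
    for i, piece in enumerate(pieces):
        if i % 2 == 1:
            # A lexicon term: tell the engine what to say instead.
            alias = quoteattr(lexicon[piece])
            out.append(f"<sub alias={alias}>{escape(piece)}</sub>")
        else:
            # Plain text: escape XML characters, then pause between sentences.
            out.append(re.sub(r"([.!?])\s+", r"\1" + pause + " ", escape(piece)))
    return "<speak>" + "".join(out) + "</speak>"


if __name__ == "__main__":
    print(to_ssml("Welcome to HSA support. Press 1.", {"HSA": "H S A"}))
```

In practice, a native speaker or audio editor would still listen to the rendered output; markup like this only narrows down what needs review.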

One last disadvantage of AI also comes with an option for overcoming it. Many customer-obsessed organizations, and those with standout brands, still insist on using wholly custom voices.

With voice actors, these sounds are developed through comprehensive persona exercises: trial-and-error, auditions using in-context IVR script samples, and hands-on direction from brand and project stakeholders.

In contrast, AI voice is largely a domain of “pick one” default styles. While they sound professional, they’re not created with any specific business in mind.

For contact centers that insist on both speed and creative control, custom AI voices can be created with their ideal performers in their specific read styles, often with industry exclusivity negotiated with their talent and developers. It’s a bespoke solution that guarantees consistency and on-demand availability with spot-on branding.

Human Voice: Still the Standard Bearer

So, with the economics, efficiency, and rising quality of AI voice for IVR, surely that’s the slam-dunk choice for every contact center, right?

Not so fast.

Companies still choose to use IVR voice prompts from professional voice actors for several compelling reasons.

First, a professionally voiced system still sounds best. Real people will always be able to inject more personality, depth, and nuance into a script, whether for stage, screen, or call flows. The interplay between talent, director, and client is always beneficial for framing the context of the application, the callers’ expectations, and the sound required for the engagement.

Sticklers for human voices (professional talent, not internally sourced voices) hang their hats on a prestige customer experience (CX). Whether it’s their call volume, the stakes of the phone interactions, or the large (often multimarket) scale of their operation, their audio strategy doesn’t happen in a vacuum. It’s aligned with a broader commitment to omnichannel, personnel, and localization.

Second, session engineers and audio editors ensure that the voice prompts are read according to spec, cleaned up, and ready to load. The cost is a bit higher, sure, and you may need to factor in an extra business day for delivery, but this quality check needs to be done whether you opt for human voice or AI.

“When I get audio back from a recording studio, I know they’ve QA’d it,” said VanScoy. “If I’m generating AI voice, which we sometimes do in a pinch, I know my team needs to do that. Audio is an ‘invisible’ file. I can’t just look at the file and know what it contains.”

Third, while it’s tough to argue with the nearly instant availability of AI output, some voice-over studios have instituted specialized processes that make it practical and cost-effective for companies to employ human voice for IVR applications.

One such practice is order/script bundling, which consolidates multiple client scripts into a single recording session and spreads booking and setup costs across all customers. Additionally, these studios often schedule recurring recording sessions with the same voice talent on a weekly, biweekly, or even twice-weekly basis, ensuring a predictable production and delivery schedule.

Long-term contracts with voice actors further enhance this stability by guaranteeing their ongoing availability and locking in rates over extended periods. These measures collectively provide businesses with reliable, high-quality IVR recordings while managing expenses efficiently.
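The arithmetic behind script bundling can be sketched in a few lines. This is a minimal example with entirely hypothetical fees and client names, and it assumes the shared session fee is split evenly, which a given studio may or may not do.

```python
def per_client_cost(session_fee, per_prompt_fee, prompts_by_client):
    """Split one session fee evenly; bill per-prompt fees by usage.

    All figures are hypothetical and only illustrate the cost-spreading idea.
    """
    share = session_fee / len(prompts_by_client)
    return {
        client: round(share + per_prompt_fee * count, 2)
        for client, count in prompts_by_client.items()
    }


if __name__ == "__main__":
    # Three clients bundled into one session with a 300 setup fee.
    bundled = per_client_cost(300, 5, {"A": 10, "B": 20, "C": 30})
    # The same 10-prompt order booked as its own session.
    solo = per_client_cost(300, 5, {"A": 10})
    print(bundled, solo)
```

Under these made-up numbers, client A pays 150 in a bundled session versus 350 for a session of its own; the gap is the setup cost the other clients absorb.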

What Do Customers and Professionals Think?

Research and available data for the voice channel almost always focus on broader technology and service metrics. However, in parallel studies conducted in May 2024 surveying callers and contact center professionals, we were excited to analyze direct responses to our most pressing questions.

The findings were sometimes anticipated, sometimes unexpected, and always useful in painting the picture of today’s voice CX. Of the callers:

  • Nearly 79% said the quality of prerecorded voice on an IVR system is important or very important.
  • 60% believe they can accurately distinguish between a human and AI voice during a phone interaction.
  • 52% stated their preference for human voice, but a significant 36% said they had no preference or were unsure.
  • 53% claimed indifference to a caller engagement featuring multiple prerecorded voices, but 38% said they don’t like voice experiences that switch back and forth.

With recent strides in AI voice quality, we can only wonder whether general consumers are even aware of how natural and lifelike AI voices sound on phone applications and elsewhere.

In our survey of contact center pros, AI voice was received more favorably. With widespread use of AI and machine learning within call flows, it’s clear that AI voice isn’t just the future of the industry: it’s largely the present.

  • 80% affirm that the voice deployed on their technology has made a positive impact on their CX.
  • 78% have a positive opinion of AI voice recordings for IVR platforms.
  • 67% believe their callers would respond favorably to AI voice on call flows.
  • Nearly 75% have already deployed AI or machine learning with their caller automation.

Hybrid, Matching Single-Voice Caller Interactions

Specifically in the case of IVR, is the choice as binary as humans versus robots?

No. In fact, some CX-focused brands are utilizing a best-of-both-worlds approach, combining human and AI voice for caller interactions that reaffirm the brand, encourage successful self-service, and even accelerate the ROI of large voice technology investments.

AI voices are always modeled on actual humans, and career performers are licensing their voices for company- or application-specific uses. As a result, it’s possible to use matching human and AI recordings for a seamless, single-voice caller experience. In this scenario, an ideal voice user experience might reserve more evergreen or performative elements of its call flows for humans. Think of a main menu, verbiage conveying the brand’s mission or values, and creative or seasonal on-hold recordings or promos.

Dynamic prompts, such as customer data dips, changing menu options, and especially emergency messaging like product recalls and service outages, are perfectly suited to an AI voice counterpart.
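One way to picture the hybrid split is as a simple routing rule: evergreen and performative prompts go to the recorded human voice, dynamic ones to its matching AI counterpart. The sketch below is illustrative only; the category labels and function name are assumptions chosen for this example, not part of any IVR platform.

```python
from enum import Enum


class Voice(Enum):
    HUMAN = "human"
    AI = "ai"


# Hypothetical category labels based on the examples in this article.
HUMAN_CATEGORIES = {"main_menu", "brand_message", "on_hold_promo"}
AI_CATEGORIES = {"customer_data", "changing_menu", "emergency"}


def choose_voice(category):
    """Return which voice source should produce a prompt of this category."""
    if category in HUMAN_CATEGORIES:
        return Voice.HUMAN
    if category in AI_CATEGORIES:
        return Voice.AI
    # Force an explicit decision rather than guessing for unknown prompts.
    raise ValueError(f"Unclassified prompt category: {category!r}")


if __name__ == "__main__":
    for category in ("main_menu", "emergency"):
        print(category, "->", choose_voice(category).value)
```

Raising an error for unclassified prompts is a deliberate choice here: it keeps the decision about a new prompt type with the team that owns the caller experience.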

If an either-or recommendation is required, the circumstances of the individual company are the driving factors. A still-growing contact center that is increasing headcount and introducing omnichannel, but still using an “internal” staff voice, would be well served to consider AI if that represents the easiest path to a “professional” voice, given its budget and performance metrics.

A brand with a mature or multilingual customer service operation with brand recognition and all the expectations that entails needs to consider an ongoing commitment to a best-in-class “human” or hybrid caller experience. Changes and upgrades can be implemented thoughtfully with the right teams.

Other Use Cases Favoring AI or Human Voice

AI voice is especially advantageous for technologies like eLearning and navigation systems due to its flexibility, efficiency, and cost-effectiveness.

In navigation systems, the nearly countless street names and addresses can be generated quickly and accurately with the proper tuning, allowing for precise and reliable directions. The straightforward, informational delivery required in navigation plays to AI’s strengths and, with proper integration with the right platform or partner, should be nearly indistinguishable from human recordings.

For eLearning, which is often created for an exclusively internal audience, AI voice makes it economical and streamlined to maintain consistent output or update a large back catalog of videos and tutorials. This is particularly beneficial for growing organizations with a diverse and dispersed workforce, as AI can now accommodate many languages and dialects.

Human voice actors are more appropriate than AI voices for highly creative media like film, TV, and documentary productions; broadcast ads and high-visibility media such as home page explainer videos; and video games due to the unique demands and expectations of these projects.

These products’ defined scope often requires a level of artistry and personality that AI voices struggle to deliver, and the high stakes of their performance justify the investments. Audiences for these productions listen closely: they can more readily detect artificiality and are sometimes even openly hostile to it.

Human voice actors bring their experience and ability to convey complex emotions. This makes them the ideal choice for delivering the impactful and memorable performances these mediums demand.

Looking Forward Sounds Good

The voice channel and CX are at a critical and exciting juncture of evolving technologies and customer expectations.

Fortunately, the industry is well positioned to leverage these advancements to better serve callers. With the availability of lifelike AI and tried-and-true voice actors offered with IVR-focused managed services, contact centers have greater freedom to implement, scale, customize, and expand globally with high-quality voice prompts.

As contact centers adapt and brace for the next wave of advancements, one thing is for sure: we’ll all be watching (and listening) for ways to improve the CX.

Matt Strach

Matt Strach is the Enterprise Marketing Director for BLEND, a global provider of localization and voice-over solutions for all technologies and media. For more than 15 years, he has worked with IVR implementation and marketing teams to develop and select voices for customer-facing applications.

CURRENT ISSUE: January 2025

Moving Forward: What Will 2025 Bring For Contact Centers?
