As the public braces for the disruptive impact of AI on work and daily life, many professionals in the voice-over industry have already felt the change.
AI can now generate human-sounding audio recordings at assembly-line speed, and this fast-advancing technology is gradually making the world less reliant on the people who have made a living through their voices for years.
The impact on voice-over professionals is evident and significant
Many of these experts have experienced a significant decrease in revenue.
Tanya Eby has been a full-time voice actor and professional narrator for 20 years and has her own home recording studio.
Over the past six months, however, her workload has fallen by half. Her bookings now extend only to the end of June, whereas in a normal year they would stretch into August.
Many of her colleagues have also noticed a similar decline. She shared that while there are multiple factors at play, “it seems like AI is affecting all of us.”
The most obvious consequence is the loss of work for voice actors, but they also worry that recordings they have already narrated could be used, without their consent, to develop the AI tools meant to replace them.
While no titles are explicitly labeled as AI-narrated, industry professionals say that thousands of audiobooks in circulation already use voices generated from databases of recordings.
DeepZen, a platform that provides voice narration for audiobooks, uses these technologies and charges rates that can cut production costs to a quarter, or less, of those of a traditionally produced audiobook.
The London-based company built its database by recording the voices of several narrators who were asked to speak in a range of emotional registers.
Kamis Taylan, the CEO of DeepZen, stated, “Every voice we use, we have a licensing agreement, and we pay for the recordings.”
He further added, “For every production project, we pay copyright fees.”
Dima Abramov, CEO of Speechki, a Texas-based startup, said the company uses both its own recordings and voices from existing databases, but only after signing licensing agreements for their use.
However, Eby said not everyone respects those standards. “Many of the new companies that are emerging don’t follow ethical practices. Some use voices from databases without paying for them,” she said.
Taylan also acknowledged that there is a “gray area” being exploited by some platforms. “They take your voice, my voice, voices of five other people, and combine them to create a new voice… They claim that voice doesn’t belong to anyone in particular,” shared the entrepreneur.
A future where AI voices and human narrators coexist
Some industry insiders say certain traditional publishers are already using generative AI programs, which can produce text, images, video, and speech from existing content without human intervention, in their operations.
Ananth Padmanabhan, CEO of HarperCollins in India, stated that they are “sampling” AI-generated voices for audiobooks but have not yet found the desired sound. He believes that AI can help produce more audiobooks, save time, and expedite the release of translated works.
Padmanabhan said, “I can tell you that you wouldn’t notice the difference. That’s the goal the industry is working towards. Unless I provide information on whether the recording is done by AI or a human, you wouldn’t be able to tell. In non-fiction and certain other genres, the voice isn’t that critical. I think the voice matters in novels when there are breaks, pauses in the story. That’s where a narrator brings a lot more to the table.”
The spokesperson for Audible, an Amazon subsidiary and a major player in the audiobook industry in the United States, stated, “Professional narrators have always been and will continue to be at the core of the Audible listening experience. However, as text-to-speech technology improves, we envision a new future where human narrators and automatically converted text-to-speech content can coexist.”
Amazon itself is deeply involved in the booming AI field and is actively pursuing the promising business of digital audiobooks.
Earlier this year, Apple announced its shift towards developing AI-narrated audiobooks. To date, Apple Books has developed several “digital narrators”: “Madison,” with a deep, textured voice for fiction and romance; “Jackson,” with a simple, friendly voice; “Helena,” with a serious yet gentle voice; and “Mitchell,” with a dry but professional tone.
On its website, Apple Books introduces its “digital narrators” as a means to help independent authors and writers launch audiobooks and “make audiobooks more accessible to all.”
Google is also providing a similar service known as “automated narration.”
Abramov said, “AI narration opens up opportunities for old books that have never been converted and all future books that are usually not converted due to profitability concerns.” He noted that, given the cost of human narration, only about 5% of all books are ever turned into audiobooks.
The value of the human voice
“The essence of storytelling is teaching humanity how to be human. And we feel that should never be entrusted to a machine,” said Emily Ellet, an audiobook narrator and co-founder of the Professional Audiobook Narrators Association (PANA).
Ellet added that storytelling should remain entirely human because, compared with a human recording, an AI product “lacks the ability to connect emotionally.”
Indian speculative fiction and fantasy author Mimi Mondal, who received her second Nebula Award nomination for the Dungeons & Dragons anthology “Journeys Through the Radiant Citadel,” said she is excited by the idea of AI-narrated audiobooks and by a rapidly arriving future that science fiction has long imagined. Still, she said she would neither promote nor purchase such books if they ended up pushing human voices aside.
She said, “Technically, it’s not the fault of AI. It’s us putting it into an unequal world and turning it into another oppressive tool.”
Ellet, for her part, also worries that the public will become accustomed to machine-generated voices, stating, “And I think that’s happening quietly. My hope is that companies will let listeners know they’re hearing an AI-generated work… I just want people to be honest about it.”
So long as AI relies on statistical routines rather than genuine understanding of context, it will never supply the appropriate prosodic emphasis that a human performer applies. Not that every narrator does a great job; if anything, the threat of AI should push narrators to give more consideration to their readings and improve human performances.