Deluxe’s Chris Reynolds on Dubbing, Subtitling, and Localization for Hollywood

Chris shares how Deluxe plans to use AppTek’s expertise in language AI to enhance the quality and efficiency of dubbing workflows, while maintaining the artistic integrity and emotional resonance of the original content.

While there is an increasing number of AI tools for lip-syncing and dubbing, Chris emphasizes the continued importance of human voice actors for conveying emotion and nuance.

Chris also touches upon the recent agreement by SAG-AFTRA and its implications for voice actors’ rights and compensation in the era of AI dubbing.

Looking ahead, Deluxe is focused on integrating tools more securely, improving dubbing workflows, and exploring hyper-localization to cater to different languages and dialects.

Transcript

Florian: Welcome everyone and welcome to SlatorPod. Today we have on the podcast Chris Reynolds. Chris is the Executive Vice President and General Manager of Worldwide Localization and Fulfillment at Deluxe, the media and entertainment services provider. To start, for our listeners, tell us a bit more about your professional background, how you got into media and entertainment and started working in localization and this part of the business.

Chris: For me, it all started very young. I’ve always loved music, literature, film, TV, and from a young age always wanted to be involved in it somehow. I ended up going to school for journalism and then changed and ended up graduating in recording arts. So I started my career as an audio engineer, an editor, and then through that got into film and TV sound, worked my way up to mixing. And then as I continued through my career, I started doing centralized mixing of localized versions for large theatrical films and joined Deluxe, which has a long history of localization and global distribution. At this point I’m overseeing teams around the world, in all kinds of countries and regions, that are focused on localization and distribution for theatrical, streaming, broadcast, social media. We pretty much hit every one of those distribution points.

Florian: So you know everything from the inside out, from the very start, really, from the mixing up and everything.

Chris: I mean, not everything, but yes, recording, mixing, that’s core to my background from a very young age.

Florian: So tell us a bit more about Deluxe: services, client segments, kind of how large the organization is. It’s very large, of course, according to our research here. You got three units, at least that’s what’s on the website. You got cinema, localization, and fulfillment. So just tell us a bit more about size, footprint, clients.

Chris: Overall, we have close to 4,000 full-time employees around the world. Our headquarters are in Burbank, California, in the Los Angeles area, but we also have a large presence across Europe, in the UK, Spain, France, Germany. And then in APAC we’re actually going through a big expansion, so operating in Korea, Japan, Taiwan, Indonesia, Malaysia, Thailand, Australia, and then we also have a large team in India as well that’s been supporting the business for over 20 years. On top of that, we have a massive distributed workforce of linguists and translators, over 5,000 that work on our platform every day, localizing in over 100 languages. And then we work with a large network of dubbing studio partners so that we can support every language. We have over 400 onboarded into our platform that we work with regularly. The three groups at Deluxe: Cinema really is all about mastering and distribution for theatrical release, and what that means is doing quality control of the final picture and sound and subtitle elements that come in for theatrical. Every single one actually has to be viewed by a human in a theater, every version and combo, before it goes out to theaters. And then the actual distribution to the theaters, be it electronically, hard drive, satellite, IP delivery, and that’s a big part of our business. We’re operating worldwide for distribution. Localization, of course, is what we’re mostly talking about today, but that’s subtitling, dubbing, where you’re re-recording all the voices and mixing new foreign language versions. We do graphic localization for artwork, both key art that’s still images, and then motion graphics that are the actual titles or inserts within film or television content.
And then we also support accessibility services, which covers things like audio description for the visually impaired, of course the captions with sounds for the deaf and hard of hearing, and also sign language, so doing more in American Sign Language over the past year, which has been exciting, but we also support it in some other countries and see a growing interest in that. And then lastly, fulfillment is about the mastering and distribution of content for any non-theatrical distribution. So in that area, it’s the quality control. Some of that is the original, because if it’s episodic for streaming or feature for streaming, there’s no theatrical QC, so it doesn’t go through our cinema team. And then we do the quality control. We do have some original post-production in certain areas to actually do the original sound and picture work and some visual effects, but then we create the distribution masters and those then go into fulfillment along with all the localized assets after final QC, the audio dubs, subtitles, et cetera. And we end up delivering it to any endpoint: streamers, broadcasters, social media all around the world. We deliver millions of files every month. It’s a massive, high-scale, very automated platform on the distribution side for that side of the business. And our customer segments: we service all of the major Hollywood studios. We’ve been around for over 100 years, so every major Hollywood studio and streamer. We’ve been a part of all the streaming launches, from when iTunes first started offering videos for rent or download to Netflix, to Amazon, to Disney Plus, Paramount, all of the streaming platforms out there, and then all of the broadcasters. We actually, in our platform, deliver to over 18,000 endpoints, and any endpoint is a delivery destination, so it’s a massive distribution part of our business. And I oversee localization and fulfillment, so that’s actually becoming more and more one group.
Our customers often buy that as one thing, localize and distribute the content. And so we found a need to bring those together to be more efficient, better match how our customers think about it, and it’s been good. Like, our platforms have been coming together on that side, the teams, and it helps us be faster, and everyone’s under pressure to get things out faster and faster these days.

Florian: I think what people tend to not appreciate, at least I didn’t until I learned about it, is just the size of these files sometimes, right? It’s not like 100 megs or 500 megs. I mean, I think these masters are huge, right? And that also makes them very tricky to distribute because people would think distribution, well, send it or upload it to something. Just tell us a bit more about that. That’s just so technically challenging, I guess.

Chris: It really is. The masters that we receive are in very large, uncompressed formats, so we’ll get video masters that are multiple terabytes. Last year we made the biggest theatrical delivery ever with Avatar, so both for cinema and for streaming and all the downstream, it’s just massive, huge files. And then it gets bigger if it has 3D versions, 2D versions, you end up having 4K UHD versions, HD versions, SD versions, and there’s different color spaces too, but these are just enormous files. There’s a lot of technical complexity in getting all the different versions right. And then the sound really stacks up too, because it’s not like you have one French version, you might have a French Atmos version, a French 5.1 version, French 2.0, which is a left-right version, and for each one a mixer has really spent time making sure it sounds right, and then you multiply that by how many languages, and then if you have IMAX theatrical, you have another sound version. There’s other specific formats that different regions support, so it all really adds up. There’s a ton of time pressure, and you’ve got to have a lot of automation to get all the transcodes out because the file in the theater is still big. But at that point you’re talking gigabytes, not terabytes, so it’s a lot of compression still to get it there.

Florian: Actually, this week when we were recording the podcast, you announced that you started a partnership with AppTek, a speech technology company. Can you just tell us a bit more about that because that was just in the news this week as we’re recording this.

Chris: We’re very excited about that. It’s a great team at AppTek, some brilliant scientists that are really experts in neural language processing and technology. What that is about for us is Deluxe, we’ve been around for over 100 years. You can’t be around that long without leaning into technology and always being on the forefront of what’s happening. I think our customers expect it. We often get, if there’s a new format, new process, we’re one of the companies that they turn to all the time to say, how do we do this? How do we figure it out? And so, in every aspect of our business, we’re always investing in technology. We develop a lot of proprietary technology, but when it comes to language processing technology, we’ve been partnering with different companies, incorporating it into certain workflows. But we had a lot of ideas of how we could make it work better for our workflows, for the media and entertainment industry specifically, which is really pretty niche when you think about it. A lot of the providers focus on broader segments, be it just consumer use or of course you have people focused on book publishing or websites, you have different focus areas. And on our side, you’ll see a lot of things about, oh, I have speech-to-text, I have machine translation, I have text-to-speech. But is it actually specific enough to be incorporated in a practical way into the workflow? And for us, the practical implementation, the way that you can improve the quality, the places where you insert it into the workflow, we just wanted more control, more access to the scientists that work on that, so that we can tailor it to our specific industry needs and more tightly integrate it into our platforms.

Florian: When you were mentioning these file sizes and just the complexity of the process, I think it’s so different from a website or a 200-page document where the language is obviously the key challenge, but not all the things around it. And in your case, language is one component, which is just the voice, and all the other aspects are just so much more technical, right? Now, can we go away from the tech a bit and talk about the dubbing production, right, because it involves so many intricacies: voice casting, script adaptation, post-production. Just tell us a bit more. What are some of the key challenges that, still in 2024, you’re helping your customers solve? And the tech is one piece, but let’s remind our listeners a bit about the general dubbing process.

Chris: Absolutely. It’s a fascinating area, especially for those who aren’t as familiar with it. It’s one of those conversations when you talk to friends or family and they say, what do you do? You can end up in a really long conversation when you get into the dubbing side because they just don’t think about how that happens. But, yeah, timelines are always very challenging, I’ll say. I mean, it’s a creative industry and the original version creators, they’re never happy, they’re never done. There’s a famous saying in entertainment that the film or TV show isn’t done, it just kind of walks out the door on the last day. Finally everything comes together. Somebody’s announced a release date and the director has to let it go at some point and say, good enough. Because of that, they’re always asking for more time, right? It’s a creative story they’re trying to tell. There’s a lot of passionate people involved. So we have to get the localization done on a very compressed timeline, because our customers generally want the localized versions released on the exact same day as the original version, or very close to it. They call it day and date. It used to be wider windows, but I think the industry has found that the world is smaller, everybody has access to everything, so they want it out as quickly as possible. So the actual process for dubbing, to cast voice actors that are appropriate for every role, you’re talking about auditioning, you hire a director, the director starts looking at the OV material, thinking about who in that country or area is best suited for different parts. They might sometimes have to get approvals from the studios for who they cast. Other times they have more autonomy on that and they can just move. We need access to preliminary versions, so it’s not a final edit of the episode or film, and so there’s a lot of version control tracking.
We want to get a version that’s resembling what will be final to them as soon as we can, so that they can start casting, preparing schedules, have a sense of how many lines it is, start translating, but then there’s a few different versions that will go through generally, and then you track all the changes. Did certain scenes actually get removed, added? Did any dialog change? And make sure that everybody has access to that information of what’s different and what isn’t. You go through the script translation, and then you go into adaptation, which is really important because you have to adapt the translation to match as best as possible the lip movement. It can obviously take more words to say something if you do a literal translation into a different language, and then you have to take the nuance of lip-sync into account. And so you might actually rearrange and change it even more while still making sure that you’re capturing the meaning, the intent, and that the actor will be able to deliver the emotion. Then you get into the studio. Actors, directors, recording lines, they don’t have the time you have on the original version. They don’t get to do table reads and audition. It’s usually pretty fast. These dubbing actors are amazing, if you ever go into the studios, incredibly professional and talented people. And then once all the recordings come together, they edit it, and it goes to a mixer. Mixer mixes that dialog in with all the music and effects to make it sound as natural as possible. Create that suspension of disbelief that’s so important in media content to make you feel like that’s the language that was spoken in. And then it hits the distribution chain I talked about, and all those audio versions come into someone like Deluxe, and they get QC, notes go back and forth, fixes get made if needed, and then out to distribution. But yeah, it’s a pretty fascinating workflow.

Florian: And so I guess that the timeline compression was probably over the past 10 years with the streamers, right, though, because they want a kind of same release. It’s on Netflix, I need it now in 10 dubbed versions already when the original gets released. Was that a big part of that, driving the compression of timeline?

Chris: When everything started going digital, I think even when you got to DVD and Blu-ray, you started to see some compression, but then streaming certainly accelerated it. For theatrical it’s been a pretty compressed timeline for a long time. Because they want to get the box office, and it’s the first window of release, it was more common there that you would need all the languages at once. With historical broadcast, oftentimes you would broadcast it in the original version language. You might not have even made your licensing deals for distribution elsewhere. And at the point when you made that deal, the distributor, that channel in that country, might have been responsible for the localization. Once you got to DVD and Blu-ray, you got more of the content owners again, just like theatrical, being responsible for the localization. When it was broadcast, they didn’t necessarily always get those foreign language versions even back in their custody, even if legally they were allowed to, but now they wanted to release discs. They’re creating that output for every market generally. And then with streaming, now it’s just like theatrical. I mean, everyone wants everything on the same day, for sure.

Florian: What were some of the biggest technological kind of improvements over the, let’s say, the past three to five years in this process? Because it seems it’s very complicated. I mean, complicated in the world of translation, localization where everything’s remote. I mean, you still have people coming to a studio, there’s a director, there’s a ton of people involved. So has there been any kind of top three things in terms of tech that made this more efficient over the past three to five years?

Chris: There have been little things. I think a lot of the process though hasn’t changed materially. I mean, sure, you might have newer workstations, you might have some plugins that people use to better sync for the actor. The lip-sync, you might have words across the screen. That was common in France, but not as common in other countries. Now you see different countries implementing technology to improve the lip-sync when actors talk, different cloud tools for the script writing and the translation steps. And then on our side, it’s the tools for creating the video references as quickly as possible, getting them to studios as quickly as possible, getting the notes about the changes between versions. That’s where the acceleration we’ve focused on mostly has been, because you want to just give the creative process as much time as possible. So we have tools, for example, where the studio gives us the video, we have to apply both visual watermarks, where it has the name of the studios or the actors, plus invisible watermarks in case something gets pirated, that it can be traced back. And historically that would take a lot of time, but it’s all very automated now. We can churn those out quickly so that you don’t lose time. You just want to get those videos in the hands of everybody, try to speed up those non-creative aspects. You said three to five years, though I think I have to acknowledge COVID, so there was also the launch of cloud recording during that period, which some people really leaned into during the pandemic. Others, it just was too challenging or different home environments wouldn’t have been quiet enough for them to get the quality they wanted. But I think that that technology has persisted generally as another tool in the toolbox. Sometimes the actor is traveling, not around. 
We have some productions where actually, it’s like for animation, you have the same team all the time, and we actually have one where for the English version the actors are all around the United States and they all have good home studios and everything is cloud recorded every time. They love it. They don’t have to travel to studios anymore, but that’s rare. A lot of times it’s specific actors, specific lines that people use it for today.

Florian: You mentioned lip-sync. Obviously, it’s key, right? I mean, I grew up on German-dubbed content and it did suspend my disbelief. But now we have all these tools, tons of AI tools that pretend to be able to do lip-sync basically perfectly, so is that something you’re experimenting with? Do you see this getting broad adoption in the near future, or what are your thoughts about that?

Chris: I think it’s interesting. A lot of interest in it. We’ve been doing testing. Customers are thinking about it and how it would work. There’s still a lot to be worked out on workflow and pricing. I think the biggest challenge right now is it adds time to the process. You have to be done with the dub to make sure you can do that final video dub now, right? Essentially a new version of video with lips that match the dub track. So you need to account for some additional time in the workflow. There’s different versions of it and varying degrees of quality. Sometimes a lot of it can be automated, but to your point, these are large files, and it’s different when you’re watching a user-generated piece of content on YouTube versus you sit down to watch a big scripted episodic series or a movie, you expect a degree of quality. And when you look at it that way, people get really tuned in on the quality. So there are a few companies that are really focused on it for that type of content, for theatrical and for streaming and broadcast, that are starting to hit those quality levels. So now it’s really about how do you get it into the workflow? Can you do it fast enough? Does it meet the economic models of the content owners and distributors? It might force them to change some of those economic models if consumers like it, right? But I think right now it’s a lot of testing, it’s a lot of evaluation, it’s a lot of thinking through, how would we make this work? It’s actually a major fundamental shift in the distribution supply chain that a lot of people don’t think about. The streaming sites and theatrical have really perfected and spent a lot of time working around a single video master with multiple audio and subtitle tracks and a lot of metadata that makes those separate tracks play together.
And that’s what gives us all the flexibility on streaming platform to just change audio on the fly or change your subtitle language, right, that consumer experience. But it’s based on a common video and so suddenly having a different video file for every version is also an infrastructure thing that has to be thought through. Certainly not impossible, but you still want it to be that seamless. You still want the person at home to be able to switch all those things around. And if every time you switch the language the video is switching too. It’s just something that has to be accounted for. It’s a lot more transcoding. I could get into how it gets to your home and how the files get near to you, but there’s a lot of complexity people don’t think about and it certainly changes that dynamic too.

Florian: That’s definitely something people don’t think about when they see these like 30 second clips on Twitter or on X or YouTube, which are okay, right, for the consumer. But yeah, if you want to do this at scale, a global release in 50 languages or 15 languages, then it’s very different. Yeah. We had Jonathan Bronfman from MARZ on the podcast about mid last year and I think they’re focusing on the theatrical side of the AI lip-sync. But that was very interesting. I didn’t know about all these complexities, that it actually filters into all types of distribution issues and also that it actually would slow it down. That’s something that’s not very intuitive, that this addition would slow down the process materially, right?

Chris: I mean at the end of the day though, if people react to it, if it helps improve that suspension of disbelief, if you get more viewers watching this international content, that’s the promise, and I think it’s a very interesting area. And to your point, a lot of Americans might think, well you’ve been watching dubbed content in German forever so it’s fine to you, but hearing you say that takes me out of it, it’s an important note. So if it’s possible to have people be more immersed and have them feel like it’s their local content, that’s pretty exciting. And so if the consumer reaction is strong enough, the industry will work through the technical challenges, right?

Florian: I think it’s a huge function of if you’re actually able to, if you understand the original language. So for me it got harder to accept the German dubs as my English improved, right, but my kids, they don’t care. I mean they watch it and they don’t notice. I mean, to them it’s fine, right, but to me, I see the lips and that literally the English speaks to me underneath the German dubs.

Chris: It’s interesting in America because dubbing hasn’t been common. English dubbing is, yes, on animation it was happening, but on live action, it’s a very new segment, right? A lot of interest, a lot of growth, but people aren’t used to it. And so it’s been interesting the discussion here when you’re around the studios, is a lot of them are like, well, if we can get people to watch non-English original version content and this gets them more engaged and gets over that hurdle, it’s a whole new market, a huge market for the European and Asian original content. So would it allow this content to succeed more? It certainly goes both directions, but I think because the studios are here, they’re also really looking at it from what imports to the US. It’s also a very tough audience, but if you can hit that bar, we can make it work for any language.

Florian: And we had the Squid Game example, right, that really went well with the English dubs. I must have watched this dub into English as well, so not German.

Chris: A lot of people are watching. I think that’s going to be the other element that content owners and distributors look at is when they do test it and people are testing it, some of it is out there. It’s not that public what’s out there. And they’re going to be looking at the data. They’re going to do some level of A/B testing as best they can and say, do people hang in longer when we do that? Do they not? Does it actually affect viewership, right? And this is such a data-driven world now, and streaming platforms can see all of that. They can see if people watch it all the way through, they can see where they’re clicking, what are they doing? And so it’s going to be about data, but people will test it, and that’s what I mean, if the consumer reaction, if the data shows this matters, then they’ll figure it out.

Florian: Now, we talked about the lip-sync, but what about the dubbing, kind of the AI component of the dubbing? How do you think about that? Where are you experimenting with it, like just maybe synthetic voices or maybe even for the translation bit? Is that something that’s on the radar or kind of more in an experimental phase at this point?

Chris: I think it depends on the content type, and there’s a budget, time, and content type dimension to all of this, right? I mean, let’s start at live. If you’re talking about live, there’s very real applications where humans can’t do live dubs into a bunch of languages very easily. I mean, it’s just very complicated, so there it becomes an access thing. So you have sports leagues and people like that thinking it’d be pretty great to be able to use this technology to have someone speaking in one language, all of our announcers, and then just have the dubs going out so that people around the world, these are big global community events, right, like a big football soccer match. It’s like people are watching it everywhere. That’s a great, less controversial application because you just can’t do it otherwise. I think when it comes to stuff that’s more social media or very low budget, they don’t dub today. They just don’t have the money. Financially, there’s no economic model, so they’re more open to experimenting with it. When it comes to mature dubbing markets, where there’s established dubbing infrastructure, that’s where it’s more of a tool. I think of it more as, can we solve some of the common pain points in the current production workflow? It’s very common that when you’re at the mixing stage, something on version control got missed and an actor didn’t record a line, or you get a late, last-minute drop-in from the original version. There’s a new line, there’s a new scene, and you’re at the last hour, that actor is asleep, they’re on vacation, they’re sick. So are there ways that we can integrate it into the workflow to just help with that real world pain point of we just can’t get them here? So I think that there’s a lot of more practical applications for mature dubbing markets. But is it at the level where it has that human emotion that a dubbing actor has? No, is my view.
And those dubbing actors, it’s incredible. We have studios in Spain, Germany and France that we own and operate, so I’ve certainly been there quite a bit. And when the actors come, it’s just amazing how they can capture that emotion in another language quickly. They can hit that lip-sync quickly. I mean, there’s a lot of nuance, right? It’s a creative medium and I think the creative aspects of dubbing shouldn’t be underestimated. So for accessibility, just access to content, maybe opening up new languages that aren’t usually dubbed and doing voiceover dubs, or just a lower quality dub, I can see people testing that, but I don’t see it replacing human voice actors. There’s an emotional performance there. I think of it like music sampling technology. I don’t know if you play music, people have keyboards and they’re like, oh, now it’ll sound like a guitar or whatever, but the performance is what matters. And even if you have the best emulation software, this will make it sound like I’m playing a totally different guitar. Well, if you don’t play the guitar well, if the performance isn’t there, it doesn’t matter that you can make it sound like something else. You missed the nuance of the performance, right? So performance nuance is important. I mean, technology moves fast. We don’t know where it’s all going to go. But personally, I also just think with AI, people jump to the most complex things, and do we really want to automate human creativity? I mean, that’s kind of an essential part of humanity and I do think dubbing is a creative industry and creative localization is creative, right? The technology can get objective things done. Like you could say, yes, that’s an accurate translation, but will it have the interpretation that a human has? We see generative AI, you see examples of it, but they’re inconsistent and they’re not always reliable, right, it’s still a machine, it’s not a human, but we’ll see where it all goes. I don’t know.

Florian: No, but I also like what you said about, I mean, let’s say an actor is away or there’s like a last second fix and the voice actor can’t be reached. So it kind of takes it out of this binary question of like, it’s all going to be AI and fully automated, you click a button, versus like, there’s this kind of clunky old world where still people have to do it, right? There are use cases even in your highly complex world maybe for some of these technologies right now, even in those very mature markets, but it’s not like clicking a button and replacing anything there. Can you talk a bit about this? SAG-AFTRA went on strike. That wasn’t a pleasant time for anybody in the media business and now they’re back and they got a ton of concessions. One of them, and we just covered it this week, was that they got it in writing that voice actors have to be humans. That’s how they put it. But then they also said that those same voice actors would get royalties on international versions, they said, versions digitally altered into foreign languages, which to me would imply that basically they’re getting compensated if their voice is being used for AI dubbing, if that were to happen, right? So just tell me a bit about your thinking here. I mean, this seems very hard to parse from the outside. So on the one hand, I mean, obviously they’re trying to get the best for their members, but it looks like the dubbing world wouldn’t be very much in favor of this. I’m not sure if that’s the correct take.

Chris: I think it comes down to your own likeness, right? I don’t think any of us would want to wake up tomorrow and turn on the TV and say, wait, that’s my voice. I had no idea they were going to use my voice or that’s a digital video image of me. And that’s really the thing that people are fighting to protect the most and so if you think about the voice of Brad Pitt in Italy or Germany or France or Japan, right, there is a voice actor who’s been cast in that. That’s the voice that the people who watch the dubbed content are used to. It’s not Brad Pitt’s voice. What they’re trying to protect is if it literally sounds like Brad Pitt in that language then Brad Pitt should have knowledge about that and an opportunity to be compensated. I think the right to compensation is negotiable, right? I mean, it’s just acknowledging that that is their voice, that is their likeness, and you have to have permission, and it needs to be part of the remuneration conversation. I don’t think that’s surprising to the studios, honestly, and it’s the same kind of thing that we’re seeing the concern for in the dubbing community, right? Like dubbing voice actors want, like in Spain, there’s a lot of conversation right now about we don’t want people using our voice to create artificial versions without us knowing about it. That’s what they’re looking to protect. It’s just like an awareness. Like, this is my human likeness and when I went to the studio I thought it was for this movie or this show, and I want some acknowledgment for that. Those are the kinds of things people are fighting for, so it doesn’t necessarily mean it’ll never happen. It’s just with consent, with knowledge, with remuneration, if that’s what that voice actor requires, right? And I think we’ll see that in every industry, right, people want to have control of themselves.

Florian: They even said it doesn’t have to sound exactly like the original. It just has to be plausible that it could be that original, so they put a pretty big kind of circle around that core. Yeah, these are the big questions. Big questions as these models get so good, and especially in voice. I mean, maybe even two or three years ago, those kind of text-to-speech was just not there at all, but now with things like ElevenLabs, you’re like, okay, that actually starts to sound very much like a human. Not the emotion, I get it, but just kind of reading off a script, it’s getting quite good.

Chris: It is. I mean, emotive AI is definitely getting better. You can’t slow down technology, right, but we’re going to have these debates. And to your point, the fact that they put in language of just resembling, well, that’s a common part of dubbing voice casting today. So that kind of thing is where it might get tricky and there will be some debate, right? Some of these things end up sadly getting settled in court. It’s like music where people say, well, that sounds like my song, and sometimes you listen and you go, man, that sounds identical, I understand why that case is, and then sometimes they’ll lose that court case. Other cases I think that didn’t sound that similar and they’ll win. So there’s an element of human interpretation as well to these things, right? It’s tough.

Florian: How do you think about YouTube creators as a new client segment, potentially, because YouTube has been kind of teasing everybody with this multi-audio track thing for like, I think it must be two years now, but I still only see it with MrBeast and like two other 100-million-subscriber creators. Is that something you’re watching if they’re ever going to open it up?

Chris: We have some social media content creators as clients today for traditional dubbing, but generally it’s the most successful people who are making real money on it, right, because they have the confidence that they can monetize that investment. But a lot of user-generated content, they don’t make anything. They certainly don’t make enough to be paying for a professional dub, so that’s an area where I see people going with more automated dubbing solutions because it’s just not economically plausible for them right now, right? They just don’t make that return or certainly not a guaranteed return, so it’s just way too much upfront investment. But it is happening and we do have some customers who do it the traditional way, and then we have some who are interested, but traditional is too expensive and so they’re testing more automated to see is there something that makes it feasible for them to hit more people, right, more audiences with their content.

Florian: So to close, what are some of the things that are on your roadmap, some key things you’re planning this year with the company? With AppTek maybe?

Chris: Certainly with AppTek we’re working on more tightly integrating the tools. We have various customers who want to test different things, live is of a lot of interest to our customers and just improving the quality. I mean, timelines are challenging, so even for a traditional translation workflow, for subtitling, we’ve had translation aid tools for years, but the higher the quality, the more concise they are. It allows translators to work faster under tight deadlines, focus more on the lines where they have to really transcreate, if you want to use that term, right, and really capture the nuance, but worry less about the easy stuff. And then for more budget-conscious people, we think that it’ll open up more languages. We don’t have that many who want to go fully automated, just like spit out the MT, but doing a machine translation with a post-edit certainly makes it faster, has a different price point. And so people are interested in those solutions, and I think it’ll open up more language coverage. And I’m hopeful that the future has a more hyper-localized aspect, that we start getting into dialects. It’s like neutral Spanish, Latin American Spanish, nobody speaks that. Same with Arabic, neutral Arabic, nobody speaks that. So maybe there’s a world where you spend a lot of time on a neutral version and you use technology to create different dialect versions and you get closer to what people speak. So new languages, more hyper-localization to hit different dialects. I think a lot of content owners and distributors are interested in that, but it’s all about a scale and a price matrix. So there’s a place for the traditional workflows, a place for new workflows, there’s a lot of in-between applications. And so we’re excited to explore that with localization professionals, our customers, and I think there’s an exciting future, a lot of different options. It’s not so binary, like you said, it’s not all this or all that. There’s a lot of in-between.
