A few weeks ago I wrote about how AI translation technologies were likely to accelerate us toward the ‘universal translator’ of the Star Trek universe, from the original 1966 series onward. This mini demo, using technology from an AI company called HeyGen, shows how close it is to our doorstep. As tech enthusiast site Tom’s Guide puts it in their piece “This new AI video tool clones your voice in 7 languages — and it's blowing up”:
“How many languages do you speak? Thanks to AI, that number could be as many as seven. Los Angeles-based AI video platform HeyGen has launched a new tool that clones your voice from a video and translates what you’re saying into seven different languages. If that wasn't enough, it also syncs your lips to your new voice so the final clip looks (and sounds) as realistic as possible.
Called Video Translate, the tool allows you to upload a video of yourself speaking in English, Spanish, French, Chinese, German, Italian, Portuguese, Dutch, Hindi or Japanese. The requirements are pretty basic so you don’t need any fancy cameras, microphones or software. The clip has to be at least 30 seconds long and should ideally feature just one person. But other than that, you just upload your video and in a single click HeyGen can translate what you’re saying.
You can choose whether you want the output to be in Spanish, French, Hindi, Italian, German, Polish, Portuguese or English.”
“Following its September 7 launch, HeyGen’s AI tool has since gone viral and Jon Finger’s video added fuel to the raging fire.
“We wanted to try the tool for ourselves but our 30-second test clip ended up at the back of a queue that was, *deep breath*, over 141,000 videos long.”
The free tier of the service has a long waiting list for people to make their videos. The paid tiers, which run from $29 to $150 per month, get faster service. GPU capacity is the gating factor, and it will be an issue for a wide range of AI services for at least the next couple of years. Nvidia and other AI infrastructure companies are hard at work on this issue.
The obvious concern for many is, of course, the potential for misuse of this type of technology, as Axios points out in its review:
“"Hi, mom. Hi, dad." The face in the video was mine, and the voice was mine, too. But I hadn't spoken the words my parents heard.”
“What's happening: The video was the product of a company named HeyGen, which allows anyone to create a personal "deepfake" — an AI-generated video double capable of reciting virtually anything you type into a text field.”
“Zoom out: There is of course plenty to worry about with such a tool, from scams and fraud to political misinformation.”
“But HeyGen is attempting to harness the utility of "good" deepfakes as a quicker, cheaper and easier alternative to recording everything from customized marketing to instructional videos.
"We want to build a generative video engine to replace cameras, and make it work for everyone to freely create content," said Joshua Xu, HeyGen's CEO, who spent six years at Snapchat before launching his startup at the end of 2020.”
Again, the tech is early and evolving, and many other companies are rushing into this segment as well.
So it’s all likely to get much more capable and multi-faceted in its uses, both bad (deepfakes, etc.) and good. We’re still in the earliest days of this AI Tech Wave, with the next phase being multimodal LLM AI before this year is out.
In many ways, this capability is a continuation of the spectrum of rapid AI service introductions over the past few months. These range from companies like Midjourney turning Generative AI into businesses that generate hundreds of millions in revenue by creating images from user prompts, to the continued upgrades in AI-driven computational photography that increasingly make the question ‘What is a Photo’ one of existential importance.
We’re getting rapidly to the ‘What is a Video’ part of that spectrum, and in many languages no less (yes, that’s a Stormtrooper vacuuming on a beach, or is it?).
As usual, technology races ahead of the ability of companies and users to figure out what to do with it, and how to create lasting businesses and productive user habits, for both personal and business use. These types of capabilities will be embedded in AI services both big and small.
All coming to a screen near you. Stay tuned.
(NOTE: The discussions here are for information purposes only, and are not meant as investment advice at any time. Thanks for joining us here.)