Madonna among the first to embrace AI video generators in new wave of technology
During her concert tour, Madonna performs the 1980s hit “La Isla Bonita” accompanied by moving images of swirling, sunset-tinted clouds on the giant arena screens behind her.
To achieve that ethereal look, the pop legend embraced a still largely uncharted branch of generative artificial intelligence: the text-to-video tool. Type in a few words, for example “surreal cloud sunset” or “waterfall in the jungle at dawn,” and the tool produces a video clip.
Following in the footsteps of AI chatbots and still-image generators, some AI video enthusiasts say the emerging technology could one day upend entertainment, letting viewers choose their own movies with custom storylines and endings. But the technology is a long way from that, and there are plenty of ethical pitfalls along the way.
For early adopters like Madonna, who has long pushed the boundaries of art, it was more of an experiment. Her team reworked an earlier version of the “La Isla Bonita” concert visuals, which had used more conventional computer graphics to evoke a tropical mood.
“We tried CGI. It looked pretty lazy and cheesy, and she didn’t like it,” said Sasha Kasiuha, content director for Madonna’s Celebration tour, which runs through the end of April. “And then we decided to try artificial intelligence.”
ChatGPT maker OpenAI gave a glimpse of what advanced text-to-video technology might look like when the company recently introduced Sora, a new tool that is not yet publicly available. Madonna’s team tried a different product, from New York-based startup Runway, which helped pioneer the technology by releasing its first public text-to-video model last March. The company released a more advanced “Gen-2” version in June.
Runway CEO Cristóbal Valenzuela said that while some view these tools as “a magical device where you type a word and somehow it conjures up exactly what you had in your head,” the most effective uses are by creative professionals looking for an upgrade to the decades-old digital editing software they already use.
He said Runway can’t make a full-length documentary yet. But it can help fill out background video or b-roll – supporting shots and scenes that help tell the story.
“It saves maybe like a week of work,” Valenzuela said. “The common thread across many use cases is that people are using it as a way to augment or speed up something they could have done before.”
Runway’s target customers are “large streaming companies, production companies, post-production companies, visual effects, marketing teams, advertising companies. A lot of people who make content for a living,” Valenzuela said.
Dangers await. Without effective safeguards, AI video generators could threaten democracies with convincing “deepfake” videos of things that never happened, or – as with AI image generators – flood the Internet with fake pornographic scenes depicting what appear to be real people with recognizable faces. Under pressure from regulators, major tech companies have promised to watermark AI-generated outputs to help identify what is real.
Copyright disputes are also brewing over the video and image collections the AI systems are trained on (neither Runway nor OpenAI discloses its data sources) and over the extent to which they unfairly copy copyrighted works. And there are fears that, at some point, video-making machines could replace human jobs and artistry.
So far, the longest AI-generated video clips are still measured in seconds, and they can feature jerky motion and telltale glitches such as distorted hands and fingers. Fixing that is “just a matter of more data and training,” and of the computing power that training depends on, said Alexander Waibel, a computer science professor at Carnegie Mellon University who has been researching artificial intelligence since the 1970s.
“Now I can say, ‘Make me a video of a rabbit dressed as Napoleon walking through New York,’” Waibel said. “It knows what New York looks like, what a rabbit looks like, what Napoleon looks like.”
That, he said, is impressive, but still a long way from crafting a compelling storyline.
Before it released its first-generation model last year, Runway’s claim to AI fame was as a co-developer of the Stable Diffusion image generator. Another company, London-based Stability AI, has since taken over Stable Diffusion’s development.
The diffusion model technology behind most leading AI image and video generators works by mapping noise, or random data, onto images, effectively destroying the original image and then predicting what a new one should look like. It borrows an idea from physics that can be used to describe, for instance, how gas diffuses outward.
“What diffusion models do is they reverse this process,” said Phillip Isola, an associate professor of computer science at the Massachusetts Institute of Technology. “They kind of take randomness and solidify it back into volume. This is a way to go from randomness to content. And that’s how you can make random videos.”
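To make that reversal concrete, here is a minimal numerical sketch of the idea in Python. It uses a toy one-dimensional “image,” and it cheats by keeping the noise sample around rather than training the neural network a real generator would use to predict it; every name and number here is an illustrative assumption, not anything from Runway’s or OpenAI’s actual systems.

    # Toy sketch of the diffusion idea: blend a signal toward pure noise
    # (the forward process), then run the reverse step that turns
    # randomness back into content. Purely illustrative; real systems
    # train a neural network to predict the noise instead of storing it.
    import numpy as np

    rng = np.random.default_rng(0)
    x0 = np.linspace(-1.0, 1.0, 16)        # a tiny 1-D "image"

    T = 50
    alphas = np.linspace(0.999, 0.90, T)   # per-step signal retention
    alpha_bar = np.cumprod(alphas)         # cumulative retention

    # Forward process: by the final step, x_T is mostly random noise.
    noise = rng.standard_normal(x0.shape)
    xT = np.sqrt(alpha_bar[-1]) * x0 + np.sqrt(1 - alpha_bar[-1]) * noise

    # Reverse direction: with the noise known (cheated here; normally
    # predicted by a trained model), randomness congeals back into content.
    x_hat = (xT - np.sqrt(1 - alpha_bar[-1]) * noise) / np.sqrt(alpha_bar[-1])
    print("max reconstruction error:", np.abs(x_hat - x0).max())  # ~0

In a real generator, the reverse direction is run as many small steps, each stripping away a little predicted noise, until a coherent image, or video frame, emerges from randomness.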
Creating video is more complex than creating still images because it must account for temporal dynamics, or how the elements of a video change over time and across sequences of frames, said Daniela Rus, another MIT professor, who directs its Computer Science and Artificial Intelligence Laboratory.
Rus said that the computing resources required are “significantly greater than creating still images” because “it involves processing and creating multiple frames for every second of video.”
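A rough back-of-the-envelope calculation illustrates the gap Rus describes; the clip length and frame rate below are assumptions chosen for the example, not figures from any company.

    # Why video generation needs far more compute than one still image:
    # even a short clip is hundreds of images' worth of pixels.
    seconds = 10                  # illustrative clip length
    fps = 24                      # a common cinematic frame rate
    frames = seconds * fps
    print(f"a {seconds}s clip at {fps} fps means generating {frames} frames")
    # And unlike 240 independent images, those frames must also stay
    # consistent with one another over time (the temporal dynamics above).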
That isn’t stopping some very successful tech companies from trying to outdo one another in showing off higher-quality AI video at longer durations. Requiring written descriptions to make an image was just the start. Google recently demonstrated a new project called Genie that can be prompted to turn a photo or even a sketch into an endless variety of explorable video game worlds.
AI-generated videos are likely to appear in marketing and educational content in the near future, providing a cheaper alternative to producing original footage or acquiring stock videos, said Aditi Singh, a researcher at Cleveland State University who has studied the text-to-video market.
When Madonna first talked to her team about AI, “the main intent wasn’t, ‘Oh, look, it’s an AI video,’” said Kasiuha.
“She asked me, ‘Could you use one of these AI tools to make the image sharper, to make sure it looks current and high-resolution?’” Kasiuha said. “She loves when you bring in new technology and new kinds of visual elements.”
Longer films created with artificial intelligence are already being made. Runway hosts an annual AI film festival to showcase such works. Whether human audiences will want to watch them remains to be seen.
“I still believe in people,” said Waibel, the CMU professor. “I still think it’s ultimately going to be a symbiosis where some AI suggests something and a human improves or guides it. Or humans do it and the AI fixes it.”
Associated Press reporter Joseph B. Frederick contributed to this report.