AI companies are really "hungry," and they're starting to pay for your "scrap footage"
Editor
2025-01-16 13:02


Image source: generated by Unbounded AI

Anyone who has worked as a Bilibili uploader, a YouTuber, or any other video self-media creator knows that a 10-minute video uploaded to a platform may have hours of raw footage behind it. Much like a fast-charging slogan: shoot for an hour, cut down to a minute. In the film industry, shooting ratios of 10:1 to 20:1, or even higher, are common.

The discarded material is called "scrap footage." Once the finished video is exported, these clips are little more than waste that takes up hard-drive space.

But just as in real life there are people willing to pay for scrap, the big AI companies now want to start paying to "collect scraps."

On January 11, foreign media reported that companies such as OpenAI, Google, and Moonvalley are buying "scrap footage" that video creators shot but never used. High-quality 4K, drone, and 3D-animation material fetches 1-4 US dollars (roughly 7.3-30 yuan) per minute, while material from online videos on YouTube, TikTok, Instagram, and the like goes for 1-2 US dollars (roughly 7.3-15 yuan) per minute.

Seen this way, as long as the quality is good enough, an hour of scrap footage can sell for up to roughly 1,800 yuan, which may be more than the revenue share the platform pays its uploaders.
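As a rough sanity check on that figure, here is a minimal back-of-the-envelope calculation in Python. The per-minute rates are the ones quoted in the report above; the exchange rate of about 7.3 yuan per US dollar is an assumption inferred from the article's own currency conversions.

```python
# Rough estimate of per-hour revenue for scrap footage,
# using the per-minute rates quoted in the report.
# Assumption: ~7.3 CNY per USD, as implied by the article's conversions.

USD_TO_CNY = 7.3
MINUTES_PER_HOUR = 60

rates_usd_per_min = {
    "premium (4K / drone / 3D animation)": (1, 4),
    "online video (YouTube / TikTok / Instagram)": (1, 2),
}

for category, (low, high) in rates_usd_per_min.items():
    low_cny = low * MINUTES_PER_HOUR * USD_TO_CNY
    high_cny = high * MINUTES_PER_HOUR * USD_TO_CNY
    print(f"{category}: {low_cny:.0f}-{high_cny:.0f} yuan per hour")

# Premium footage works out to roughly 438-1,752 yuan per hour,
# consistent with the article's "up to ~1,800 yuan" figure.
```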

01. AI giants are really "hungry"

Why are these technology companies paying for footage that creators themselves consider useless?

The reason is simple: there isn't enough video data.

Generative video models, autonomous-driving systems, and even robot training all need large amounts of video as training data. Not only is high-quality video expensive to produce, but the division of copyright in the AI era remains very blurry.

Licensing from advertising and film companies is expensive, and online platforms usually hold only distribution rights, not usage rights. Copyright contracts signed with directors and production teams rarely include terms covering AI usage rights.

The same goes for video sites. If a video model wants to legally crawl YouTube videos, should it ask YouTube or the YouTuber? This is another copyright gray area that has yet to be resolved in the AI era.

YouTube itself does not have the right to license third-party content | Source: YouTube

The Generative AI Copyright Disclosure Act, introduced in the U.S. House of Representatives in April 2024, requires dataset producers to submit "a sufficiently detailed summary of any copyrighted works" to the Register of Copyrights or face fines.

Against this backdrop, OpenAI, Google, and other AI companies have hit on a new approach: don't buy the finished films, buy the scraps.

However, the big AI companies do not deal with creators directly. They rely on third-party specialist firms to contact platforms and creators, and they simply pay. How to negotiate, whom to buy from, and how the footage may be used afterwards are all worked out between the intermediary company and the platform.

Several intermediaries say they have bought more than 5 million US dollars' worth of footage so far and are working with as many as 17 AI companies, including OpenAI, Meta, and Microsoft.

Nor can the AI companies use the footage however they like once they have bought it. The "intermediary guarantee" provided by the third-party firms limits how scrap footage may be used: the AI company may not create a digital clone of the creator; it may not build AI scenes exclusive to the creator inside the model, such as directly generating an uploader's signature backdrop or using his or her classic memes and catchphrases; and the material may not be used in ways that damage the creator's reputation.

For celebrity YouTubers, the face itself is an "identity mark" | Source: PewDiePie

YouTube added a similar feature last month: YouTubers decide for themselves whether AI companies may scrape their video content, and they can even pick which AI companies to authorize (or simply authorize them all). YouTube, however, has yet to announce any policy on licensing fees.

There are eighteen mainstream AI companies on the authorization list | Source: YouTube

02. The arms race of video models

As bandwidth and information volume have grown, internet content has gradually shifted from text to video, and large models are following the same trajectory.

Video models have become the hottest area of large-model development over the past year, and many AI companies have gone a step further, building "world models" that can generate dynamic scenes. Whatever the model, none of them can do without a steady diet of video data. The result is an arms race among the major AI companies: whoever can get more video data may end up with the better video model.

At the recent CES 2025, NVIDIA unveiled Cosmos, a world foundation model platform reportedly trained on 20 million hours of video. Yet last year, 404 Media revealed that NVIDIA had scraped large numbers of YouTube and Netflix videos without authorization to train "a product internally named Cosmos."

NVIDIA's internal chat logs on Slack | Source: 404 Media

According to the leaked chat logs, NVIDIA's AI scientists and executives compiled large curated datasets of YouTube videos for model training, including one named HD-VG-130M, which was built by researchers at Peking University from 130 million YouTube clips and is restricted to academic research use.

When employees objected that "YouTube's terms of service prohibit downloading, and the data can only be used for research purposes," NVIDIA executives replied that "whether copyrighted data can be used for training is currently an unsettled legal question... On large language models, I believe our legal team has approved this approach, so video training may be approved as well." Before NVIDIA, OpenAI's video model Sora had already been called out by YouTube. The New York Times, currently locked in a legal battle with OpenAI, first reported that OpenAI had collected more than one million hours of YouTube videos to train GPT-4.

Asked about the source of Sora's training data, Mira Murati, OpenAI's chief technology officer at the time (she has since resigned), admitted bluntly, "Actually, I'm not sure either." YouTube CEO Neal Mohan responded, "If OpenAI used YouTube videos to train Sora, that would clearly violate YouTube's terms of service."

YouTube took the same stance toward NVIDIA, sending that interview to 404 Media as its response.

Some video models are taking a different route. Moonvalley's new video model "Marey," due for release in the next two months, is billed as the "cleanest" in the industry: the company claims all of its training data is licensed, and Marey's target users are the major Hollywood studios and the wider film industry.

Image source: Moonvalley

That is because films are not only the pinnacle of video quality but also the corner of the video world with the strictest copyright rules.

For online video creators, scrap footage usually ends up on a backup drive or in the recycle bin. Now that big companies are willing to pay to "recycle" it, and if this model keeps running, it could become a modest income stream for small creators.

For bigger-name "creators" such as film companies and studios, technology has long since penetrated and even reshaped the industry, from CGI and virtual production to AI voice synthesis and facial de-aging; AI is just one more technical tool for making film and television production more efficient.

But creators big and small may worry that AI video generation amounts to killing the goose that lays the golden eggs. Imagine a creator who keeps selling scrap footage to an AI model: once the model can make the fake look real, do we still need that particular creator at all? When AI can generate cinema-grade B-roll and visually striking special effects, will the film industry still need highly skilled cinematographers and digital effects artists?

"Learn from you, chasing you, replacing you." This is the unavoidable fear that every creator faces when facing the evolution of generative AI. I can only comfort myself by saying: Under the unstoppable wave of AI, scrap films can still be sold for money. This is better than being a "data cash machine" for free.
