Sora is finally here, but "involution king" Keling has already "filmed" an AI movie
Editor
2024-12-13 11:02


Image source: generated by Unbounded AI

This time, it is the famous directors' turn to come to terms with the new reality.

The opening of the short film is quite shocking: Zhong Kui holds a ghost-slaying sword and walks through a dark forest with twisting branches.

Accompanied by a burst of fast-paced gongs and drums, hare spirits, toad spirits, and skeleton-draped tree demons appear one after another, creating a tense, terrifying atmosphere.

But after a long swig of wine, the camera cuts, and the four characters "Do Not Disturb" pop up on a phone screen, with a stream of intercepted notifications scrolling underneath:

an automatic accounting app, a multi-person video conference, 4 calls from unknown numbers, 183 WeChat group messages, links to risky websites, and a call from the big boss...

The final frame slowly reveals the words "On vacation", with the subtitle "A hundred ghosts, do not disturb". I thought this was a Chinese fantasy film; it turned out to be a workplace-complaint piece.

What is even more surprising is that this 5-minute short film by director Yu Baimei was produced entirely with Keling AI.

As early as three months ago, Kuaishou's Keling AI joined hands with nine well-known directors, including Li Shaohong, Jia Zhangke, Tim Yip (Ye Jintian), Xue Xiaolu, Yu Baimei, Dong Runnian, Zhang Chiyu, Wang Zichuan, and Wang Maomao, in the "Keling AI Director Co-creation Project", producing nine AIGC short films.

Jia Zhangke, who jokes that he cannot write dialogue in Mandarin, used Keling AI's lip-sync function in his first AIGC short film, "Wheat Harvest", to make a robot speak the Fenyang dialect of Shanxi.

Tim Yip, the art director of "Crouching Tiger, Hidden Dragon", used Keling AI's "first and last frame" function to depict an alien courier's adventure in space. Through AI, every frame is so realistic that viewers feel as if they are in a real space world.

In his work "Daisy", director Wang Zichuan starts from the relationship between people and modern technologies such as computers and robots, making extensive use of Keling AI's image-to-video ("Tusheng Video") function; through repeated montage, rapid editing, and effects shots, the film delivers a strong audiovisual impact.

When it comes to assisting film and television creation, Keling AI has plenty to say for itself. Zhang Di, vice president of Kuaishou Technology and head of the large-model team, said, "AI large models in the field of visual generation will develop rapidly in 2024. Since its release in June, Keling AI has let many users experience its capabilities in video creation."

All nine experimental short films were generated with Keling AI. This is the first time in China that film directors have relied entirely on large video-generation models, with deep involvement, to create film-grade content.

As the outcome of China's first AIGC director co-creation project, these nine AIGC short films have been launched on the Kuaishou platform and selected for permanent collection, screening, and exhibition by the China Film Museum. This is both a tribute to the history of Chinese film and a bold exploration of future filmmaking.

Six months and more than ten iterations: how was Keling AI developed?

At the beginning of this year, Sora went viral, igniting the entire field of AI video generation.

On June 6, Kuaishou took the lead in launching its self-developed video generation model "Keling AI", the world's first photorealistic-quality video generation model open to users.

Since then, Keling AI has successively launched features such as image-to-video, video extension, and Motion Brush. Beyond significant improvements in picture quality, prompt adherence, and range of motion, the maximum length of generated videos has been extended to about 3 minutes, and the length of a single text-to-video generation has been increased to 10 seconds.

Despite these good results, Keling AI did not rest on its laurels, but kept innovating.

On September 19, the Keling 1.5 model made its debut, raising image quality, dynamic quality, aesthetic performance, motion plausibility, and semantic understanding to a new level.

Netizens around the world have gone wild with it: imaginative videos have flooded social platforms, such as the Mona Lisa wearing sunglasses, a panda playing guitar, and Zhu Bajie eating noodles.

In terms of commercialization, Keling AI is also at the forefront of the industry.

Keling AI has successively launched a web version and a standalone app, building a multi-terminal, cross-platform product matrix; it is fully open for beta testing and is gradually rolling out a paid membership system to users at home and abroad. In addition, Keling AI has opened API services for business customers, covering modules such as video generation, image generation, and virtual try-on.

To give creators new channels for commercial monetization, Keling AI launched the "Future Partner Program" on October 18, rolling out a one-stop AIGC ecosystem cooperation platform.

In the past six months, Keling AI has been soaring, with more than ten iterative upgrades giving it the confidence to stand firmly in the industry's first tier. As of December 10, Keling AI had more than 6 million users and had generated more than 65 million videos and more than 175 million images.

Reshaping the film and television industry: Keling AI strikes again

Recently, striking while the iron is hot, Keling AI has launched an AI face-customization model and an AI try-on feature, providing even more powerful tool support for film and television creators.

AI face-customization model

Cracking the "character consistency" problem

In the video generation process, current large video models still exhibit strong randomness: given the same text description, they often produce different video subjects. This randomness makes it hard to maintain narrative coherence and character consistency.

To address this, Keling AI has launched a face-customization model. Creators only need to upload ten 5-second multi-angle high-definition videos to train their own face model; for better results, up to 99 videos can be uploaded.

After training, creators can generate videos with a consistent face in Keling 1.5's text-to-video mode, meeting the need to generate multiple shots of the same person; within a single shot, the face is also more stable and clear.

We tried it ourselves, training a face model of Sam Altman and placing him in various scenes.

For example, Sam Altman eats spaghetti in a restaurant:

Sam Altman eats dumplings in a restaurant:

And Sam Altman rides a motorcycle down a busy street:

There is also a sci-fi version: Altman transformed into Iron Man, walking the streets in cyberpunk style:

Clearly, Keling AI's face-customization model takes a new step toward solving the industry problem of character IP stability, marking another important technical breakthrough in AI video generation.

AI try-on

A new exploration in film and television styling

In the movie "The Devil Wears Prada", there is a classic costume-change montage: Anne Hathaway changes into six outfits in under a minute, each one stunning.

So can AI realize clothing matching and styling design in film and television production?

Keling AI's newly launched "AI try-on" function can handle this. Built on its image generation model, it introduces technologies such as a clothing-SKU preservation network, character pose, and background restoration, generating try-on effects for any garment, any body, and any movement.

Usage is simple: upload a model photo and a clothing image, and the outfit changes in a second, which greatly improves the efficiency of costume matching and previewing in film and television production.

For example, dress Anne Hathaway in a cheongsam: the new outfit not only fits the body's curves naturally but can even be paired with a handbag to match the style.

For another example, Taylor Swift, originally in an off-shoulder top and denim shorts, switches from casual to ladylike after Keling AI's makeover; the pleats and bows of the new outfit are all generated naturally.

Put a black leather jacket on Sister Feng: while keeping her elaborate headdress, it perfectly preserves details such as the fur collar and zipper.

Most amazing is the "devil" Cate Blanchett's costume change: one second she is in a cool, sexy black tight T-shirt, the next she is wearing a white rose dress.

The silk texture of the dress is rendered in detail; even the color and placement of the roses are faithfully reproduced.

World-famous paintings and statues can also be changed instantly. Dress the Mona Lisa, originally in a black robe, in a big Northeastern-style floral-print jacket:

Put a round-neck polo shirt and gray trousers on a terracotta warrior:

In addition, by combining AI image expansion with the image-to-video model, the entire process of generating outfit-showcase material can be done with AI.

For example, the gray hooded sweatshirt on the Queen is extended into a loose robe via AI image expansion.

Then use the camera-control function to turn it into an outfit-showcase video.

Or enter the prompt "The model turns left and right to show the audience the clothes on her body" to make Jensen Huang strike poses in fur.

From text-to-video and image-to-video, to first-and-last-frame control, face models, and AI outfit changes, the continuous rollout of these innovations demonstrates Kuaishou's deep insight into the future of the film and television industry.

Open sharing leads AIGC innovation

As one of the leaders in the AI video generation race, the Kuaishou Keling large-model team continues to push the boundaries of the technology, while disclosing a series of technical advances and actively sharing R&D results with the industry.

In AI video generation, base video generation models and data are the cornerstones of a high-quality video content generation system. To crack this hard nut, the Keling team conducted systematic research and took the lead in proposing a scaling-law modeling method tailored to video generation models (Video DiT).

This method can predict the performance of large-scale models in advance at low computational costs, helping researchers optimize technology selection and adjust model parameters, thus significantly reducing experimental trial and error costs.

Precise Scaling Law Modeling under Video DiT Architecture

Paper title: "Towards Precise Scaling Laws for Video Diffusion Transformers"
Paper address: https://arxiv.org/pdf/2411.17470
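The appeal of scaling-law modeling is that a power law fitted on cheap pilot runs can extrapolate performance to much larger budgets. The sketch below illustrates the generic technique with synthetic numbers; the loss/compute pairs are made up for illustration, and the paper's actual formulation is more precise than this toy fit:

```python
import numpy as np

# Synthetic (compute, loss) pairs standing in for small pilot training runs.
# These numbers are illustrative only, not taken from the paper.
compute = np.array([1e18, 3e18, 1e19, 3e19])  # training FLOPs
loss = np.array([2.10, 1.85, 1.62, 1.43])     # validation loss

# Fit loss ≈ a * compute^(-alpha) by linear regression in log-log space.
slope, log_a = np.polyfit(np.log(compute), np.log(loss), 1)
alpha = -slope        # power-law exponent
a = np.exp(log_a)     # scale factor

# Extrapolate to a 10x larger compute budget before spending it.
predicted_loss = a * (3e20) ** (-alpha)
print(f"alpha = {alpha:.3f}, predicted loss at 3e20 FLOPs = {predicted_loss:.2f}")
```

This is exactly the "predict performance in advance at low computational cost" idea: the fit is cheap, and researchers only commit the large budget once the extrapolated loss justifies it.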

In addition, the Keling large-model team disclosed part of its core preprocessing pipeline for video training data and, based on it, released the high-quality video generation dataset Koala-36M.

This dataset is one of the world's leading large-scale, high-quality video-text datasets, containing 36 million video clips at 720p resolution with an average duration of 13.75 seconds; each clip comes with a detailed description averaging 202 words.

Data processing process

Paper title: "Koala-36M: A Large-scale Video Dataset Improving Consistency Between Fine-Grained Conditions And Video Content》Paper address: https://arxiv.org/abs/2410.08260 Code address: https://github.com/KwaiVGI/Koala-36M project homepage: https://koala36m.github.io/Dataset link: https ://huggingface.co/datasets/Koala-36M/Koala-36M-v1
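To put the quoted figures in perspective, a quick back-of-the-envelope calculation (pure arithmetic on the numbers above) shows the raw scale of the dataset:

```python
# Rough scale of Koala-36M, computed from the figures quoted above.
num_clips = 36_000_000
avg_duration_s = 13.75   # average clip length, seconds
avg_caption_words = 202  # average caption length, words

total_hours = num_clips * avg_duration_s / 3600
total_caption_words = num_clips * avg_caption_words

print(f"~{total_hours:,.0f} hours of 720p video")          # ~137,500 hours
print(f"~{total_caption_words / 1e9:.2f}B caption words")  # ~7.27B words
```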

In comparisons with other datasets, models trained on Koala-36M demonstrated excellent performance, achieving the best results in both video quality and text-video consistency.

The Keling team has also made a series of progress in the controllability and interactivity of video generation.

For example, it released 3DTrajMaster, a project for 3D-trajectory-controlled video generation:

3DTrajMaster can precisely control the movement of different subjects in a video through 3D space.

Project homepage: http://fuxiao0719.github.io/projects/3dtrajmaster

Multi-camera video generation project SynCamMaster:

SynCamMaster supports a variety of camera-perspective changes, such as adjusting azimuth, pitch, and distance.

Project homepage: https://jianhongbai.github.io/SynCamMaster/

As well as the precise video stylization project StyleMaster.

Project homepage: https://zixuan-ye.github.io/stylemaster

These projects can not only control the three-dimensional movement of subjects in a video, but also generate multi-view videos from a user's text description, and support arbitrary artistic video style transfer.

In addition, the team also developed GameFactory, a game video generator with generalization capabilities, which allows users to customize character actions and enjoy a personalized virtual world experience.

By continuing to open up core data and technical components and sharing the technical solutions in its papers, the Keling team not only injects new momentum into film and television creation, but also opens up more possibilities for future creative expression and content production.

Opening a new era of film and television creation

Looking back on a century of film history, technological innovation has always been a key driving force for the development of the film industry.

From silent to sound, from black and white to color, from film to digital... every technological leap has pushed the art of film to a higher stage.

Now, with the continuous iteration and breakthroughs of AI technology, visual models and products, represented by AI video, have gradually become the new infrastructure and new tools of the visual industry, reshaping the future of film and television with their unique advantages.

In traditional filmmaking, a director's unlimited imagination is often constrained by physical conditions and the real world. AI breaks those boundaries and can create any scene the director imagines; this creative freedom opens unlimited possibilities for film narrative.

During his collaboration with Keling AI, director Wang Zichuan came to appreciate the profound impact of text-to-video and image-to-video technology on film narrative: "Keling AI can quickly turn a creator's imagination into visual content, simulating as closely as possible every movement you want and the overall narrative rhythm, including all the conflicts and the internal blocking of the frame."

In his view, technology is not just a tool, but also a new dimension of narrative art, providing a brand new language for film narrative.

On the other hand, AI has greatly optimized the cost efficiency of the film industry.

Once upon a time, making movies was a luxury. Take "Avatar: The Way of Water", one of film history's great money-burners: its production cost exceeded $450 million. Over a 193-minute running time, that works out to roughly $2.33 million per minute. Even deep-pocketed Hollywood once strained to bear such sums.
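The per-minute figure follows directly from the budget and running time quoted above:

```python
# Verifying the per-minute cost quoted for "Avatar: The Way of Water".
budget_usd = 450_000_000  # reported production cost
runtime_min = 193         # running time in minutes

cost_per_minute = budget_usd / runtime_min
print(f"${cost_per_minute / 1e6:.2f}M per minute")  # $2.33M per minute
```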

In contrast, AI-generated movies can complete most of the work in a virtual environment, significantly reducing costs. At the same time, the high efficiency of AI has greatly shortened the film production cycle, which is undoubtedly a huge advantage for the film industry that pursues quick returns.

Of course, current AI video generation technology is still in the development stage and still has shortcomings in simulating subtle changes in human emotions, creating deep narrative structures, and capturing the unpredictable contingencies of the real world.

However, as director Yu Baimei put it, although today's AI works are not yet great works, they are precious first steps for those who follow. He believes that within a few years, AI will produce truly high-quality cinematic masterpieces.
