Zhipu has just raised another RMB 3 billion! With a valuation over 20 billion, it leads the large model startup track
2024-12-17 11:03

Image source: Generated by Unbounded AI

Just now, Qubit learned that Zhipu, one of the top six large model startups, has recently completed a new round of financing worth RMB 3 billion!

New investors include a number of strategic investors and state-owned capital, while existing investors such as Legend Capital continue to participate.

This is Zhipu’s second round of financing in the past three months.

Just this September, Zhongguancun Science City Company announced that it would lead an investment in Zhipu at a pre-money valuation of RMB 20 billion, which made Zhipu the first large model startup in China valued at over 20 billion.

According to Qubit's incomplete statistics, Zhipu's officially disclosed financing history is as follows:

In 2021, Zhipu disclosed a Series A financing of over RMB 100 million, jointly invested by Dachen Financial Intelligence, Huakong Fund, Jiangmen Venture Capital, and others.

In 2022, it announced a Series B financing of several hundred million yuan, jointly led by Legend Capital and Qiming Venture Partners.

In 2023, it disclosed cumulative financing of more than RMB 2.5 billion, with investors including Legend Capital, Meituan, Ant Group, Alibaba, Tencent, Xiaomi, Kingsoft, Shunwei Capital, BOSS Zhipin, TAL, Sequoia, Hillhouse, and others.

At the same time, Zhipu also made a rare disclosure of its commercialization results, the first time such figures have been made public.

The specific data are as follows:

So far this year, Zhipu's commercial revenue has grown by more than 100%.

Annual API revenue on bigmodel.cn, Zhipu's open platform, has grown more than 30x year-on-year. Average daily token consumption on the platform has grown 150x, with paid tokens up more than 40x.

The MaaS platform serves 700,000 enterprise and developer users, and its number of paying customers has grown more than 20x.

The consumer-facing product Zhipu Qingyan has more than 25 million users. Qingyan began rolling out paid features in the third quarter, and its ARR (annual recurring revenue) has exceeded RMB 10 million.

So here's the question: what's the next step for Zhipu, which has attracted so much capital and achieved considerable commercial success?

Further development of Zhipu's base large model

Zhipu revealed that this round of financing will be used for further research and development of its base large model:

From answering questions to solving complex reasoning and multi-modal tasks, the model will better support the development of the industry ecosystem.

At the just-concluded MEET 2025 Intelligent Future Conference, an industry summit hosted by Qubit, Zhipu COO Zhang Fan revealed some relevant information.

He said that OpenAI has published a classification of AGI capability levels, and Zhipu has its own understanding of them as well.

Zhipu divides AGI into five levels:

The first level is language, and Zhipu "has done a very good job."

The second level is solving complex problems, where we can see the emergence of capabilities like o1. The model's application shifts from the brain's "System 1" to "System 2": from simple, intuitive answers to in-depth thinking and problem decomposition.

The third level opens up tool use: when answering complex questions, the model can not only think deeply but also continuously interact with the outside world to obtain information. For example, autonomous agents can not only obtain information through APIs, but also operate the interfaces of mobile phones, PCs, and even cars the way humans do.

The fourth level is to realize self-learning.

The fifth level has not yet been clearly defined, but the direction is to surpass humans: AI will be able to explore ultimate questions such as the laws of science and the origin of the world.

Zhang Fan said that Zhipu is continuously exploring and enriching the model's capabilities: from the initial language capabilities, to L2 complex problem solving, to tool use, and now to the fourth level it is currently working on, with efforts such as GLM-Zero and GLM-OS.

Let's take a systematic look at Zhipu's exploration roadmap.

End-to-end multi-modality and the Agent layout have begun to take shape

Sora kicked off the year with explosive popularity, and multi-modal models then emerged one after another. Now, deep reasoning models have become the hottest trend, while on-device large models and Agent technology are emerging as new directions.

Looking at the overall picture, Zhipu has not missed a single step.

First came the video generation model CogVideoX, benchmarked against Sora, and the end-to-end voice models GLM-4-Voice and GLM-4-VideoCall, benchmarked against GPT-4o.

More recently, its layout in the Agent and on-device fields has gradually become clear:

The Agent products AutoGLM and GLM-PC, along with an on-device large model adapted to Qualcomm's flagship Snapdragon chips, have been released in succession.

Throughout all this, and unlike OpenAI and others, one of Zhipu's consistent principles has been to keep its work open source.

If you look at Zhipu's early GLM report, you will find the words: "We invite everyone to join our open community and promote the development of large-scale pre-training models." The company's habit of "using open source to make friends" with developers and industry users continues to this day.

According to the latest data, its more than 20 models, including ChatGLM, have accumulated 150,000 GitHub stars, and its open source models have been downloaded 30 million times worldwide.

The following is the technology release timeline of Zhipu this year:

In November, an upgraded version of AutoGLM was released. It can independently carry out long operation sequences of more than 50 steps and perform tasks across apps, enabling a "fully automatic" internet experience and supporting "driverless" operation of dozens of browser-based websites.

In November, GLM-PC entered internal testing, exploring the "unmanned driving" of PCs based on Zhipu's multi-modal model CogAgent. It can join video conferences, process documents, search and summarize web pages, and perform scheduled operations remotely on behalf of users.

In November, the video model CogVideoX was upgraded to support 10-second clips, 4K, 60-frame ultra-high-definition output, arbitrary aspect ratios, and better simulation of human motion and the physical world. CogVideoX v1.5-5B and CogVideoX v1.5-5B-I2V were open-sourced at the same time.

In October, the end-to-end emotional speech model GLM-4-Voice was released and launched in the Qingyan app. It understands emotions, expresses and resonates with them, can adjust its speaking speed on its own, supports multiple languages and dialects, offers lower latency, and can be interrupted at any time.

In October, an internal beta of AutoGLM was released. Given only simple text or voice commands, it simulates human operation of a mobile phone and is not limited to API calls.

In October, Zhipu announced cooperation with Samsung and Qualcomm to jointly build AI products and large on-device multi-modal interaction models.

In August, GLM-4-VideoCall, a large model with real-time reasoning across text, audio, and video modalities, was released, enabling real-time video calls between AI and people. Through its API, it can be seamlessly deployed on various camera-equipped end devices, including mobile phones.

In August, the new-generation base model GLM-4-Plus was released, with comprehensive improvements in language understanding, instruction following, and long text processing.

In July, the video generation model Qingying officially launched on Qingyan's PC, mobile, and mini-program clients, providing text-to-video and image-to-video services. It can generate a 6-second video in 30 seconds while faithfully reproducing motion in the physical world.

In June, the GLM-4-9B model was open-sourced, supporting 1 million tokens of context and 26 languages. The GLM-based vision model GLM-4V-9B was also open-sourced for the first time, with multi-modal capabilities comparable to GPT-4V.

In January, the new-generation base large model GLM-4 was released, with greatly improved overall performance over the previous generation: longer context, stronger multi-modal capabilities, faster inference, higher concurrency, and greatly reduced inference costs.

With the end of the year approaching, another stormy year for large model startups is about to begin.
