A first-hand test of the "Doubao" deep thinking model: Can it surpass DeepSeek?

Picture source: Generated by Unbounded AI

ByteDance's AI assistant Doubao is testing a deep thinking model on a small scale. According to a person in charge at Doubao, what is currently being tested are different experimental versions of its own deep thinking model.

In addition, the deep thinking model being tested in Doubao is reportedly built on the Doubao 1.5 base model.

In fact, when the Doubao large model team released Doubao 1.5 Pro in mid-January, it disclosed the existence of the deep reasoning model Doubao-1.5-pro-AS1-Preview and said, "Without using any data from other models, and through breakthroughs in RL algorithms and engineering optimization, we fully exploited the compute advantages of Test Time Scaling, completed RL Scaling, and developed Doubao's deep thinking model."

Geek Park's hands-on test found that Doubao's answers did begin to show the reasoning process during conversations, but not consistently. At present, the Doubao chat page has no entry point for a "Deep Thinking" function.

Since February 22, Doubao has been overtaken by Tencent's AI app "Tencent Yuanbao" and has ranked third on the free app download chart of Apple's App Store in China (DeepSeek is first). With multiple Tencent and Baidu apps now connected to DeepSeek, how ByteDance's Doubao would respond has been the focus of attention, and the answer is now emerging.

01. Is Doubao also "deep thinking"?

The earliest model with deep thinking was OpenAI's o1 series, previewed in September 2024 and fully released in December 2024, but it is closed source and available only to paying users (the top ChatGPT Pro tier costs $200 per month). DeepSeek, through open source, cost reduction, and interaction design, became the first AI company to bring deep thinking capabilities to a mass audience. DeepSeek released R1-Lite-Preview on November 20, 2024, the first reasoning model in China positioned against o1, and open-sourced the R1 model on January 20, 2025.

The R1 model's innovations are twofold. First, a transparent chain of thought: it displays the complete reasoning process, including human-like thinking paths such as self-questioning and hypothesis verification. Second, low cost and open source: the inference cost of R1 is only 1/27 that of OpenAI o1, and the code is completely open.

DeepSeek's deep thinking mode explicitly builds the reasoning process into the AI model's output. Chain of Thought (CoT) is the core technology that supports this mode.

Simply put, the deep thinking mode lets users see the model's thinking process directly by displaying its chain of thought (CoT). This chain of thought is simulated: through training, the model learns to output intermediate steps such as self-questioning and reflection. Although it is just a sequence of words, it reads like a human thinking process.

In deep thinking mode, users can see not only the AI's final answer but also the complete logical chain of its problem-solving, including self-questioning, hypothesis verification, error correction, and other steps. For example, when solving a math problem, the model shows the entire process, from breaking the problem down and verifying it with multiple methods to reaching the final conclusion.
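To make the split between the "thinking process" and the final answer concrete, here is a minimal Python sketch. It assumes the model wraps its reasoning in `<think>...</think>` tags, as some openly released reasoning models do. The article does not describe how Doubao or the DeepSeek app actually formats this output, so the tag name, the `split_reasoning` helper, and the sample text are all illustrative assumptions.

```python
import re

# Assumption: reasoning is wrapped in <think>...</think>. This tag format is
# illustrative and is not confirmed by the article for any specific product.
THINK_PATTERN = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(raw_output: str) -> tuple[str, str]:
    """Return (reasoning, answer) extracted from a raw model response."""
    match = THINK_PATTERN.search(raw_output)
    if match is None:
        # No visible chain of thought: the whole response is the answer.
        return "", raw_output.strip()
    reasoning = match.group(1).strip()
    # Everything outside the tagged span is treated as the final answer.
    answer = (raw_output[:match.start()] + raw_output[match.end():]).strip()
    return reasoning, answer


# Hypothetical sample response, written by hand for this sketch.
sample = (
    "<think>Compare the integer parts: both are 9. "
    "Compare the tenths digit: 1 < 9, so 9.11 < 9.9.</think>"
    "9.9 is larger than 9.11."
)

reasoning, answer = split_reasoning(sample)
print("Thinking:", reasoning)
print("Answer:", answer)
```

A chat interface built this way could render the first part in a collapsible "thinking" panel and the second part as the reply, which is roughly the experience described above.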

Combined with real-time web search, the model can pull in the latest information and integrate it logically. On the 25th, Anthropic released the Claude 3.7 Sonnet hybrid reasoning model, and Alibaba Cloud's Qwen reasoning model "QwQ-Max Preview Edition" was also unveiled. I asked Doubao to evaluate these two reasoning models:

You can see that Doubao found 9 reference articles and did "deep thinking" | Picture source: Geek Park

Doubao showed the thinking process | Picture source: Geek Park

After thinking, Doubao output its evaluation of these two models | Picture source: Geek Park

Displaying the thinking process lets users clearly see the model's reasoning steps, not just the final result. Users can then feel that the model's conclusions are well-founded, and they will place more trust in its output.

02. Doubao vs DeepSeek: each has its own advantages

Because it is still being tested, the "Deep Thinking" function is not yet shown on the Doubao chat page. When typing a message, there is no option to turn "Deep Thinking" on or off, unlike in other products connected to DeepSeek. However, users included in the gray-release (staged rollout) test will trigger the function when asking certain questions.

I put a few questions to Doubao and DeepSeek at the same time to see how each behaves in "deep thinking".

Classic math problem: "Which is larger, 9.11 or 9.9?"

Look at the thinking process of Doubao first:

One note first: during the test, I found that Doubao's "deep thinking" mode is not stable. The first time I entered "Which is larger, 9.11 or 9.9?", it simply gave me a direct reply:

Picture source: Geek Park

But when I typed "9.11 or 9.9, which is bigger?" again, hoping to trigger the "deep thinking" mode, it really did appear:

Doubao considered in detail why I asked it this question for the second time... | Picture source: Geek Park

It can be seen that although Doubao realized it had just answered me, it still considered the possibility that I had not understood the previous answer, then explained the method of comparison and finally output the result.

Now look at DeepSeek's thinking process:

You can see that although this is a "simple" question, DeepSeek's thinking process is also very detailed, and more comprehensive than Doubao's.

On this simple math problem, Doubao and DeepSeek both follow the basic rules of decimal comparison and verify the result with multiple methods. The difference is that Doubao focuses on teaching and guidance and anticipates possible user misunderstandings, while DeepSeek does more self-questioning and repeated verification, so its thinking process is more complex.
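The "multiple methods" idea can be sketched in a few lines of Python. This is not how either model computes its answer; it is only an illustration of two independent checks that agree. The function names are hypothetical, and the padding method assumes non-negative inputs written with a decimal point.

```python
from decimal import Decimal


def compare_by_value(a: str, b: str) -> str:
    """Method 1: compare the exact decimal values."""
    return a if Decimal(a) > Decimal(b) else b


def compare_by_padding(a: str, b: str) -> str:
    """Method 2: pad the fractional parts to equal length, then compare.

    Assumes non-negative numbers written with a decimal point.
    """
    a_int, a_frac = a.split(".")
    b_int, b_frac = b.split(".")
    width = max(len(a_frac), len(b_frac))
    # "9" becomes "90", so 9.9 is read as 9.90 and compared against 9.11.
    a_key = (int(a_int), a_frac.ljust(width, "0"))
    b_key = (int(b_int), b_frac.ljust(width, "0"))
    return a if a_key > b_key else b


print(compare_by_value("9.11", "9.9"))    # 9.9
print(compare_by_padding("9.11", "9.9"))  # 9.9
print(Decimal("9.9") - Decimal("9.11"))   # 0.79
```

The common mistake is to read the fractional parts as whole numbers (11 versus 9). Padding 9.9 to 9.90 makes it clear that the comparison is 90 hundredths against 11 hundredths.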

Philosophical Question: What is the essence of consciousness? Will AI gain self-awareness?

Let’s look at Doubao’s answer first:

It can be seen that DeepSeek's answer is divided into four parts: scientific theory, paths to AI consciousness, ethical framework, and solution path. It cites neuroscience, quantum theory, and other fields, and mentions legal cases and specific data. Doubao's answer leans more toward classifying philosophical theories, listing physicalism, dualism, and so on, and discusses views for and against AI rights, but without in-depth technical detail.

Both acknowledge that there is no consensus on the nature of consciousness, and both mention philosophical and scientific theories as well as ethical issues. The difference lies in depth and technical detail. DeepSeek is more technically oriented, touching on neuromorphic computing, quantum technology, and the like, while Doubao focuses more on philosophical schools and existing ethical guidelines.

This hands-on test gave us a first look at Doubao's performance in deep thinking mode. Although it is still in the testing stage, the function is not yet stable, and its entry point has not been fully opened, this early display of the reasoning process already gives users a more intuitive way to understand how the model arrives at its answers.