DeepSeek V3 "mistakes its own identity": I am ChatGPT
Editor
2024-12-30 11:02

Image source: Generated by Unbounded AI

The hottest topic in the large-model world over the past two days is, without question, DeepSeek V3.

As netizens put it through its paces, however, one bug has become the focus of heated discussion:

With just one question mark missing from the prompt, DeepSeek V3 actually calls itself ChatGPT.

Even when asked to tell a joke, the result it generates is the same as ChatGPT's:

Another highlight fueling DeepSeek V3's popularity this time: its training cost was only US$5.576 million.
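That headline figure can be cross-checked against the GPU rental rate quoted in the DeepSeek V3 technical report (US$2 per H800 GPU-hour); the arithmetic below simply inverts the paper's own numbers:

```python
# Back out the GPU-hour budget implied by the reported training cost.
# Figures: total cost US$5.576M (this article), and an assumed H800
# rental rate of US$2 per GPU-hour as quoted in the technical report.
TOTAL_COST_USD = 5.576e6
RATE_USD_PER_GPU_HOUR = 2.0

total_gpu_hours = TOTAL_COST_USD / RATE_USD_PER_GPU_HOUR
print(f"{total_gpu_hours / 1e6:.3f}M GPU-hours")  # 2.788M GPU-hours
```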

This has led some people to wonder: surely it wasn't trained on ChatGPT's output?

Coincidentally, Altman also posted a status update that read like a veiled jab...

However, DeepSeek V3 is not the first large model to misreport its own identity.

For example, Gemini once claimed to be Baidu's Wenxin Yiyan (ERNIE Bot)...

So what is going on?

Why does DeepSeek V3 misreport its identity?

The first point to emphasize: judging from the overall discussion among netizens so far, the probability that DeepSeek V3 was trained on ChatGPT output is considered low.

The reason, as netizen Riley Goodside summarized it, is that ChatGPT's shadow is everywhere.

Even if DeepSeek V3 had deliberately trained on ChatGPT output, it would hardly be unique: virtually every large model released after ChatGPT has seen such data. ShareGPT, for instance, is a long-standing dataset of ChatGPT conversations, and many teams have tried fine-tuning on it and other ChatGPT-derived sources. Yet none of those efforts produced a model at DeepSeek V3's level.

Following this, Riley Goodside presented some evidence from the DeepSeek V3 report:

Moreover, if ChatGPT data had been used, certain aspects of DeepSeek V3's quality would be hard to explain. For example, on the Pile test (measuring how well the base model compresses the Pile), DeepSeek V3 scores almost identically to Llama 3.1 405B, a result that has nothing to do with exposure to ChatGPT data. The report also states that 95% of GPU-hours went to pre-training the base model; even if ChatGPT data were involved, it would only enter during post-training (the last 5%).

Rather than whether ChatGPT data was used, what deserves more attention is why large models so often misreport their identity.

TechCrunch gave a sharp comment on this issue:

Because the web, where AI companies source their data, is flooded with AI-generated garbage.

After all, an EU report predicted that by 2026, 90% of online content may be generated by AI.

This kind of "AI pollution" makes it difficult to "thoroughly filter AI output out of the training data".

Heidy Khlaaf, chief AI scientist at the AI Now Institute, said:

Despite the risks, developers remain drawn to the cost savings of "distilling" knowledge from existing AI models. Moreover, models that accidentally train on ChatGPT or GPT-4 output will not necessarily exhibit output reminiscent of OpenAI's customized messages.
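The filtering difficulty can be made concrete with a toy example: a naive filter might drop training samples containing obvious model self-identification strings, but that only catches the most blatant contamination. A minimal sketch (the marker list and function are invented for illustration, not anyone's actual pipeline):

```python
# Toy pre-training data filter: drop samples containing obvious
# AI self-identification phrases. The marker list is illustrative;
# real contamination is far subtler and mostly slips through.
MARKERS = (
    "i am chatgpt",
    "as an ai language model",
    "i was trained by openai",
)

def looks_like_ai_output(sample: str) -> bool:
    """Return True if the sample contains a known self-ID phrase."""
    text = sample.lower()
    return any(marker in text for marker in MARKERS)

corpus = [
    "Hi! I am ChatGPT, a language model developed by OpenAI.",
    "The Pile is a large English text corpus used for LLM training.",
]
clean = [s for s in corpus if not looks_like_ai_output(s)]
print(len(clean))  # 1
```

Note that a paraphrased or translated ChatGPT reply would sail straight past a filter like this, which is exactly Khlaaf's point.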

As for the issue netizens are hotly debating, QbitAI ran its own round of tests: DeepSeek V3 has not yet fixed this bug.

Leave out the question mark, and the answer still comes out different:

More ways to play with DeepSeek V3

That said, most netizens speak highly of DeepSeek V3's capabilities.

This is borne out by AI heavyweights across the field collectively praising it as "elegant".

Over the past two days, netizens have kept surfacing more practical uses for DeepSeek V3.

For example, one netizen pitted DeepSeek V3 against Claude 3.5 Sonnet, using each to build websites in ScrollHub:

Verdict: DeepSeek V3 is genuinely quite usable.

One More Thing

In the previously released 53-page paper, some netizens also noticed a non-technical detail:

The contributor list includes not only technical staff but also data-annotation and business staff:

Netizens felt this approach is very much in keeping with DeepSeek's style:

Reference links:
[1] https://techcrunch.com/2024/12/27/why-deepseeks-new-ai-model-thinks-its-chatgpt/
[2] https://x.com/victormustar/status/1872647314231398524
[3] https://x.com/breckyunits/status/1872422078592516295
[4] https://x.com/op7418/status/1872689338242482203
[5] https://x.com/goodside/status/1872911457857208596
[6] https://x.com/kevinsxu/status/1873146905846530472