Musk's Grok 3 is not yet the "smartest on Earth", but it is certainly the richest
2025-02-20 18:05


Image source: Generated by Unbounded AI

Grok 3, which Musk has billed as the "smartest AI on Earth", is here.

In a livestream watched by millions, Musk unveiled Grok 3. Two Chinese researchers, xAI co-founders Tony Wu and Jimmy Ba, took part in the launch. Judging from the benchmarks, Grok 3 is indeed remarkably strong, and in terms of capital investment, the 200,000-GPU compute cluster behind it is just as striking.

The Grok 3 release includes a family of models, Grok 3 and Grok 3 mini, as well as updates such as the reasoning mode (Think), DeepSearch and Big Brain.

#01. The "smartest AI" title comes from the leaderboards. How does it do in actual tests?


On benchmark evaluations, Grok 3 outperforms other models such as GPT-4o, Gemini 2 Pro, Claude 3.5 Sonnet and DeepSeek-V3 on benchmarks for mathematical reasoning, STEM and science. Even the smaller Grok 3 mini performs at the top level.

An early version of Grok 3 also scored highly in Chatbot Arena, a crowdsourced testing platform where different AI models compete against each other and users vote for the best answer. Grok 3 is the first model to break 1400 points, and it ranks first in all categories.
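To give a rough sense of what a score gap on such a leaderboard means, here is a minimal sketch that assumes arena scores behave like conventional Elo ratings on the usual 400-point scale. The article does not say how the platform computes its scores, and the 1300-point rival below is a hypothetical value chosen only for illustration.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Probability that model A's answer is preferred over model B's
    under the standard Elo model (an assumption, see the lead-in)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


# 1400 is the threshold mentioned in the article; 1300 is a hypothetical rival.
leader_rating = 1400.0
rival_rating = 1300.0

win_probability = expected_score(leader_rating, rival_rating)
print(f"Expected preference rate for the leader: {win_probability:.1%}")
```

Under this assumption, a 100-point lead corresponds to users preferring the leader's answer in a bit under two thirds of head-to-head votes, so even a record score leaves plenty of individual matchups that go the other way.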

Grok's MMLU score has risen rapidly since its release in 2023, especially in 2024, when Grok 2 marked a significant breakthrough, showing how quickly it has been catching up with the GPT series.

"Grok 3 has a very strong reasoning ability, so in the tests we have conducted so far, Grok 3 outperforms any released product we already know, which is a good sign," Musk said via video call at the World Government Summit in Dubai last week.

Grok 3 also introduces a reasoning mode (Think). With Grok 3 Reasoning and Grok 3 mini Reasoning, it can think like reasoning models such as DeepSeek-R1. Grok 3 can solve complex problems by considering possible solutions, critiquing itself, verifying solutions, backtracking, reasoning from first principles, and more. However, to prevent distillation, part of Grok 3's reasoning process is obscured.

On multiple benchmarks, including the new math benchmark AIME 2025, Grok 3 Reasoning surpassed o3-mini-high, the strongest version of o3-mini.

The team demonstrated using Grok 3's Think mode to generate an animated 3D plot of a launch from Earth to Mars and back to Earth, showing the trajectory for the next launch window.

In the demonstration, Grok 3 produced a Python script using Matplotlib and explained the code. The code appears to solve Kepler's laws numerically. After the code ran, it animated the two planets, Earth and Mars, with small green spheres representing the spacecraft's journey between them.
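The script shown on stage is not reproduced in this article. As an illustration of what solving Kepler's laws numerically usually involves, the sketch below solves Kepler's equation, M = E - e sin E, by Newton iteration and converts the result to a position in the orbital plane. The function names are hypothetical, and the Mars values (semi-major axis of about 1.524 AU, eccentricity of about 0.0934) are approximate figures used only as an example.

```python
import math


def solve_kepler(mean_anomaly: float, eccentricity: float,
                 tol: float = 1e-12, max_iter: int = 50) -> float:
    """Solve Kepler's equation M = E - e*sin(E) for the eccentric anomaly E
    (radians) with Newton's method. Valid for elliptical orbits, 0 <= e < 1."""
    # The mean anomaly is a good starting guess for low-eccentricity orbits.
    ecc_anomaly = mean_anomaly if eccentricity < 0.8 else math.pi
    for _ in range(max_iter):
        residual = ecc_anomaly - eccentricity * math.sin(ecc_anomaly) - mean_anomaly
        derivative = 1.0 - eccentricity * math.cos(ecc_anomaly)
        step = residual / derivative
        ecc_anomaly -= step
        if abs(step) < tol:
            break
    return ecc_anomaly


def position_in_orbit(semi_major_axis: float, eccentricity: float,
                      mean_anomaly: float) -> tuple[float, float]:
    """Return (x, y) in the orbital plane, with the Sun at the origin and
    perihelion on the positive x axis. Units follow semi_major_axis."""
    ecc_anomaly = solve_kepler(mean_anomaly, eccentricity)
    x = semi_major_axis * (math.cos(ecc_anomaly) - eccentricity)
    y = semi_major_axis * math.sqrt(1.0 - eccentricity ** 2) * math.sin(ecc_anomaly)
    return x, y


# Approximate orbital elements for Mars (illustrative values, in AU).
mars_x, mars_y = position_in_orbit(1.524, 0.0934, math.radians(90.0))
print(f"Mars at mean anomaly 90 deg: x = {mars_x:.3f} AU, y = {mars_y:.3f} AU")
```

An animation like the one in the demo would call a function of this kind once per frame for each body and hand the coordinates to Matplotlib. Computing a transfer trajectory between the planets takes additional steps that are not sketched here.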

The demonstration was generated live, so there was no verification that the solution was completely correct, but Musk, who was wearing a pendant showing the Earth-Mars transfer orbit, said it was close to the actual solution.

Andrej Karpathy, who tried Grok 3 in advance, said Grok 3's Think mode completed tasks that DeepSeek-R1, Gemini 2.0 Flash Thinking and Claude failed at, but he added that top OpenAI models such as o1-pro can do them too.

Following OpenAI, Gemini and Perplexity, Grok has also launched its own deep search feature, DeepSearch. The xAI team positions DeepSearch as a "next-generation search engine" and as the first generation of the Grok Agent. It is more than a simple information retrieval tool: it is designed to help with programming, research and everyday questions.

Judging from the demonstration, Grok 3's DeepSearch does not offer much that is unique. xAI emphasizes that, unlike the keyword matching of traditional search engines, it can deeply understand semantics and intent, draw content from multiple sources and cross-verify it to ensure accuracy. It is also more adjustable than a traditional search engine, allowing users to specify sources.

The xAI team specifically noted that the DeepSearch process is transparent to users, letting them follow the AI's "thinking" process.

Andrej Karpathy believes that Grok 3's DeepSearch is roughly equivalent to Perplexity's DeepResearch, but has not yet reached the level of Deep Research recently released by OpenAI.

#02. The full-strength "Big Brain" mode

For more complex queries, the "Big Brain" mode reasons with more compute. xAI describes these reasoning models as best suited to math, science and programming problems, so this mode looks like the "full-strength version".

The xAI team demonstrated Grok 3 in Big Brain mode creating a new game that combines Tetris and Bejeweled. The team explained that because it was improvised during the livestream, Grok might make some minor coding errors that would keep the game from running exactly as expected. In the live test, the generated game ran normally, but its color display was somewhat off. It is also unclear whether Tetris's mechanic of clearing a full row was implemented.

The xAI team also confirmed during the livestream that it plans to launch an AI game studio. Musk had posted a related tweet on X the day before.

#03. Money lets you do as you please, but being the "strongest" still takes a lot more work

Grok 3 was trained on xAI's Colossus cluster. The first phase of 100,000 GPUs took only 122 days to build, and expanding to 200,000 GPUs took another 92 days. About 200,000 GPUs were used to train Grok 3, and pre-training was completed in early January. Musk previously posted on X that Grok 3's development used "10 times" more computing resources than its predecessor, Grok 2, and that the training dataset had been expanded, reportedly to include court case filings. During the livestream, he said Grok 3's computing resources were about 15 times those of Grok 2.
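The build-out figures can be turned into an average installation rate with a few lines of arithmetic. This sketch assumes the expansion added exactly 100,000 GPUs and that the two phases ran back to back, neither of which the article states precisely. The variable names are illustrative.

```python
# Figures quoted in the article.
first_phase_gpus = 100_000
first_phase_days = 122
second_phase_gpus = 100_000   # assumed: expansion from 100,000 to 200,000 GPUs
second_phase_days = 92

# Average number of GPUs brought online per day in each phase.
first_phase_rate = first_phase_gpus / first_phase_days
second_phase_rate = second_phase_gpus / second_phase_days

# Overall figures, assuming the phases were sequential with no gap.
total_days = first_phase_days + second_phase_days
overall_rate = (first_phase_gpus + second_phase_gpus) / total_days

print(f"Phase 1: about {first_phase_rate:,.0f} GPUs per day")
print(f"Phase 2: about {second_phase_rate:,.0f} GPUs per day")
print(f"Overall: about {overall_rate:,.0f} GPUs per day over {total_days} days")
```

On these assumptions the second phase ran roughly a third faster than the first, at about 1,090 GPUs per day compared with about 820.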

Musk also revealed that xAI is building a new AI cluster, which will have five times the power of the current cluster.

In addition, regarding the voice mode, the team did not give a specific release date, but Musk said that "it will be released in about a week."

Specifically, voice will be generated by a Grok-like model that understands what is said and produces audio directly. This approach lets the AI remember details and carry on the conversation more naturally. The voice mode will be available in both the app and the API.

xAI plans to launch the Grok 3 API in the next few weeks. The API will include Grok 3's reasoning models and DeepSearch functionality. The xAI team is particularly looking forward to enterprise use cases, believing that Grok 3's capabilities, combined with DeepSearch, will bring great value to enterprise users.

It is worth noting that xAI recently launched a promotion that gives users $150 in API credits as long as they agree to share data. Clearly, xAI does not mind users pocketing this small freebie; it cares more about acquiring users and data this way.

Regarding open source, Musk said he would continue the previous strategy: when Grok 3 is mature and stable (probably within a few months), xAI will open source Grok 2.

At present, users can try Grok 3 through X and through Grok's website and apps, though not all Grok 3 models and related features have launched (some are in beta). Grok 3 is rolling out first to Premium+ subscribers on X. There is also a standalone subscription called SuperGrok, which gives Grok users the most advanced features and earliest access for $30 per month or $300 per year. SuperGrok unlocks features such as more DeepSearch queries and also offers unlimited image generation.

The release of Grok 3 puts xAI into fierce competition in AI, not only with OpenAI and Google but also with emerging Chinese companies. DeepSeek, for example, has pushed AI companies worldwide to adjust their strategies and made deep-thinking models the "standard", which prompted OpenAI to recently make its reasoning model available for free and to start signaling a move toward open source.

For Musk, OpenAI may be xAI's biggest rival. Musk founded xAI in 2023 with the aim of building an alternative to OpenAI, and he has publicly criticized OpenAI for planning to restructure itself into a for-profit business.

Musk has also filed two lawsuits against OpenAI, accusing it of deviating from its founding principles, and offered to acquire OpenAI's nonprofit arm for $97.4 billion, but OpenAI's board of directors rejected the offer last week. Sam Altman said the bid was a tactic to "slow down our pace." Although Musk was involved in founding OpenAI, he has been critical of the company since leaving its board in 2018.

Both companies are also raising staggering sums, and their valuations keep climbing. According to a Bloomberg report last week, Musk's xAI is in talks to raise about $10 billion. Once the round closes, the company would be valued at $75 billion, up from $51 billion at its last valuation. Meanwhile, OpenAI is negotiating to raise up to $40 billion, and its valuation is expected to rise to $300 billion.

The "deep pockets" effect of all this capital is also obvious. SoftBank, OpenAI, Oracle and Abu Dhabi-backed MGX jointly announced in January that they would invest $100 billion in the United States, and eventually $500 billion, to build data centers and other AI infrastructure. Meanwhile, Dell Technologies is close to completing a deal worth more than $5 billion to supply xAI with AI-optimized servers.

As things stand, OpenAI is indeed xAI's main competitor. The two compete directly in technology, market positioning and financing strategy. OpenAI remains at the forefront with its mature product line and strong market share. Although Grok 3 leads on some metrics, the demonstration as a whole showed little innovation and was more about filling gaps and catching up with the industry leaders. What really underpins Grok 3 seems to be the 200,000 GPUs and continuous capital support rather than genuine technological breakthroughs. This release does not live up to Musk's claim that "maybe this is the last chance for AI to surpass Grok."

At the opening of the Grok 3 launch, Musk once again laid out the mission of xAI and Grok: to understand the nature of the universe, figure out what is going on, look for traces of aliens, explore the meaning of life, understand how the universe began and determine how it will end. xAI describes itself as driven by the pursuit of truth, with the goal of becoming the ultimate truth-seeking AI.

However, whether the goal is to realize these grand visions or to face more down-to-earth competition, relying solely on "money power" and the "strongest" title on the leaderboards is clearly not enough. Musk and xAI still have a long way to go.
