OpenAI closed out 2024 by unveiling its flagship reasoning models, o3 and o3-mini, setting off a fresh wave of excitement. As usual, Twitter (x.com) remained the main forum for outside discussion.
But you may have noticed something: the voices of OpenAI's "own people" were far louder this time. Nearly every OpenAI employee with an X account seemed to be posting to cheer on the company's new flagship models.
The more netizens scrolled, the more the timeline seemed to fill with OpenAI engineers and researchers. And this time it was no longer limited to a few familiar names; the whole team came out.
Does this sound familiar? Don't these OpenAI employees on Twitter look a bit like you, dutifully drumming up business for your employer in your WeChat Moments?
1. All of OpenAI pitches in on the o3 launch: the leadership team leads the PR charge
No need to say much about CEO Sam Altman: before the release, he couldn't resist teasing the new product and invited everyone to apply for o3 testing access;
he emphasized that o3-mini surpasses o1 in programming performance at a significantly reduced cost, thanked team members online for their hard work, and called working together with everyone "one of the greatest joys of my life";
and then came a flood of o3 tweets from colleagues, liked, retweeted, and replied to in quick succession — a lively scene.
OpenAI co-founder and president Greg Brockman, just back from "the longest vacation of his life" last month, has been tracking the company's every move since his return and throwing himself into the publicity.
After o3 launched, he offered his praise: the new model made a qualitative leap on the most challenging benchmarks, reaching a new height.
Chief Product Officer Kevin Weil and Vice President of Research Mark Chen retweeted the ARC-AGI test breakthrough and the o3-mini team's tweets respectively.
Dane Stuckey, OpenAI's new Chief Information Security Officer, also joined the like-and-retweet relay, exclaiming that this was "what an exciting day."
Interestingly, Stuckey registered his Twitter account about four years ago, but only began posting in earnest this October, when he left Palantir and officially announced he was joining OpenAI. Since then he has shed his usual low profile and become remarkably active.
The core development team shows up together
In the launch livestream, young Chinese researcher Hongyu Ren, as the team's representative, introduced the lightweight model o3-mini in detail.
He later posted on Twitter highlighting o3-mini's strengths — efficiency, cost-effectiveness, and flexibly adjustable reasoning time — and gave special shout-outs to several core members who built o3-mini.
Several of those developers replied in kind, calling o3-mini "a smart little monster," "blazingly fast," with "amazing math and coding performance" — their pride plain in every word.
In fact, these researchers already have considerable track records in the industry; look at their backgrounds and you'll find that many were key contributors to o1 and o1-mini. Still, this wave of official announcements has brought them to a much wider audience. Given OpenAI's star-making power, a few new KOLs of the large-model world may be just around the corner.
Colleagues from every team gather to praise one another
This "OpenAI praise group" has no shortage of members: click into any employee's X account and odds are you can follow the retweets like nested matryoshka dolls, tapping through tweet after tweet of colleagues praising o3.
It made people wonder whether Altman had set quotas and written o3 exposure into employee KPIs.
Sébastien Bubeck, the well-known computer scientist who spent a decade at Microsoft Research, rising to VP of AI and Distinguished Scientist, joined OpenAI this October. In a pinned tweet he admitted that o3 and o3-mini are his favorite models so far, and that o3's various evaluation results are simply amazing — especially its 25% score on the frontier-math benchmark.
Researcher Aidan Clark, who has led GPT-4o pre-training and o1 development, posted five messages in a row, gushing that "Hongyu is so awesome" and saying o3-mini is the first model that could actually solve the puzzles he poses to it.
Anshita Saini, a member of the technical staff focused on ChatGPT growth, said o3 feels fundamentally different, and that the whole o3 line makes her stop and think about "what a world where AGI is productized would look like."
Researchers weigh in with analysis
Beyond the straightforward boosting above, some OpenAI researchers took on the role of explainers, sharing their views to clear up points of confusion.
While the release of o3 and o3-mini excited the community, it also stirred controversy and doubt. Some cheered that AGI was close at hand — or even already achieved — on the strength of the ARC-AGI results; others scoffed, raising concerns about o3's enormous compute requirements and running costs, and dismissing it as another product that, for now, looks good mainly in benchmark charts.
In response, OpenAI multimodal reasoning researcher Noam Brown wrote that the outside world's reaction to the ARC-AGI result was somewhat overblown: beating the ARC-AGI benchmark does not mean the model has reached AGI. He also noted a recurring pattern in AI: people assume a certain benchmark requires "superintelligence" to beat, but when a model actually clears it, they feel disappointed that the model falls short of the "superintelligence" they had imagined.
The subtext: please judge it rationally, and don't overhype it.
OpenAI's head of API engineering, Sherwin Wu, fully agreed, reminding the community that o3's breakthroughs in programming and mathematics deserve more attention than the ARC-AGI result: o3's coding ability has surpassed his own, and it correctly answers a quarter of the frontier-math problems — problems he himself could not solve at all.
In addition, when questions arose about whether o3 used specific datasets, was optimized for particular domains, or relied on hand-crafted prompt formats to inflate its evaluation results, company researchers Brandon McKinzie and Rhythm Garg responded one after another:
the ARC-AGI public training set used in evaluation is only a tiny fraction of o3's much larger training data and does not determine model performance; o3 is a general-purpose model that has not been fine-tuned on any specific domain; and the high ARC-AGI score does not depend on prompt tweaking, but is a natural reflection of the model's general capability and training.
Regarding o3's high price, researcher Nat McAleese explained: although o3 is the most expensive model at this testing stage, it opens a new era of "trading compute for performance." By scaling up compute at test time, o3 pushes model performance to an "incredible level."
Nat believes that although it is indeed expensive today, token prices will gradually fall as the technology advances. More importantly, the team has found a way to convert compute into performance gains efficiently, which suggests AI model capabilities will improve dramatically from here.
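The "trading compute for performance" idea can be illustrated with the simplest form of test-time scaling: sampling a model several times and taking a majority vote over its answers. This is only an illustrative sketch — o3's actual mechanism is not public — and `query_model` is a hypothetical stand-in for any stochastic model call.

```python
import random
from collections import Counter

def query_model(question: str, seed: int) -> str:
    """Hypothetical stand-in for one stochastic model call.
    Here it simply answers correctly with 70% probability."""
    rng = random.Random(seed)
    return "correct" if rng.random() < 0.7 else "wrong"

def answer(question: str, samples: int = 1) -> str:
    """Spend more test-time compute (more samples), then majority-vote.

    With an individually unreliable sampler, a larger vote makes the
    final answer far more reliable -- performance bought with compute.
    """
    votes = Counter(query_model(question, seed=i) for i in range(samples))
    return votes.most_common(1)[0][0]
```

The cost scales linearly with `samples`, while the chance that the majority vote is wrong shrinks rapidly — the same trade-off, in miniature, that makes an expensive high-compute run worthwhile.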
Finally, on training speed: Jason Wei, who is hugely influential in the Chinese-speaking community, noted that the upgrade from o1 to o3 took only three months — evidence that the new paradigm of chain-of-thought reinforcement learning ships new models at a much faster cadence than traditional pre-training, which produces a new model only every one to two years.
Even Tadao Nagasaki, president of OpenAI Japan, chimed in: "Didn't we just release o1 in September? Now early evaluation of o3 has already begun!"
2. What the collective endorsement is meant to convey

This time, OpenAI employees collectively championed o3 first and foremost out of high confidence in the product. Through interpretations from different angles, they hope the outside world will gain a fuller understanding of o3's breakthroughs in mathematics, programming, and reasoning. OpenAI wants to show that it remains the leader in AI technology and still commands a presence in a market crowded with competitors.
In addition, at a delicate moment — facing external doubts and intensifying competitive pressure, compounded by the steady loss of core employees and the fallout from the "whistleblower" scandal — the all-hands promotion also carries a hint of "huddling together for warmth." Through this release, they tried to send the community several signals:
1. A new breakthrough in scaling laws
Many OpenAI researchers pointed out that o3 and o3-mini confirm that increasing compute, data volume, and model parameters can still deliver significant performance gains, breaking past the "diminishing returns" of traditional scaling laws and proving the models still have enormous headroom.
2. Technological innovation has not "hit a wall"
By retweeting test data and detailed interpretations, employees emphasized that the o3 series' design and performance have broken through many people's assumptions about the limits of AI models — not only surpassing expectations on benchmarks, but also showing broader applicability. Against the rumors of GPT-5's "difficult birth," OpenAI wants to prove it is opening up another path of innovation.
3. Training speed has not slowed down
Facing external doubts about the pace of OpenAI's model iteration — especially amid ever-fiercer global AI competition — the rapid jump from o1 to o3 amounts to a clear answer. It shows that OpenAI can break the traditional one-to-two-year pre-training development cycle, launch high-quality models faster, and shore up market confidence.
Looking back, the twelve days of live-streamed announcements from the official release of o1 to the unveiling of o3 played more like one massive OpenAI show. This time last year, the "OpenAI is nothing without its people" campaign that shook the entire internet had just wound down. A year on, OpenAI can hardly be called in bad shape, but it is no longer at the peak of its GPT-era glory. After all the ups and downs, perhaps every employee wants to close out the year by working hard to make OpenAI great again.