The full-strength version of o1 is here! OpenAI combines multimodal capabilities with its new reasoning paradigm for the first time

Image source: Generated by Unbounded AI

Every move OpenAI makes attracts attention.

Yesterday, CEO Sam Altman announced with great fanfare that he had a show in store for everyone: OpenAI would launch a 12-day marathon of live broadcasts featuring new product releases, feature updates, and some "Christmas gifts".

OpenAI’s event preview

As a result, technology media all over the world were excited, and not even the time difference could dampen their determination to follow this "Technology Spring Festival Gala".

At two o'clock in the morning on December 6, Beijing time, the live broadcast of "12 Days of OpenAI, Day 1" began. "Jiazi Guangnian" stayed up all night to unwrap this "gift", and it turned out to be like peeling an onion, only to discover at the end: Altman, you have "no heart"!

After the high-profile preview, the first day's live broadcast lasted only 14 minutes, more like a slice cut from one of the large launch events of the past. OpenAI did release updated models and products, and there were bright spots, but the event felt slightly lacking in sincerity.

Industry observers see this as more of a flashy marketing strategy. Some people joked: "OpenAI teaches you how to dominate the technology news headlines for 12 days."

To sum up, OpenAI mainly announced two things this time:

1. The official version of o1 was launched. It is the first version to combine multimodal capabilities with the new reasoning paradigm. Compared with o1-preview, it is more intelligent and responds significantly faster. The o1 model is now fully available, and API access will be launched soon.

2. A new professional subscription plan, ChatGPT Pro, was released. The subscription fee is US$200 per month. Users get unlimited access to OpenAI's models, including voice functions. Additionally, the Pro plan introduces o1 pro mode, which performs better on challenging machine learning benchmarks in areas such as math, science, and coding.

1. The official version of o1 introduces multimodality

The official version of o1 model will replace the previous o1-preview version.

Altman said that the o1 model reached 83.3% accuracy on the American Invitational Mathematics Examination (AIME 2024), significantly exceeding the 56.7% of o1-preview and the 13.4% of the earlier GPT-4o model.

In terms of programming, the o1 model scored 89.0% in the Codeforces competition, while o1-preview scored 62.0% and GPT-4o only 11.0%. This suggests that the o1 model can handle complex coding tasks like a skilled programmer.

On the GPQA Diamond test of doctoral-level science questions, which are basically "hell-level" in difficulty, o1 even surpasses human experts, with an accuracy rate of 78.3% against 69.7% for human experts. However, o1 does not perform as well as o1-preview on this test, which may reflect variation in model performance across problem types or differences in the training data used.

The new model also offers improvements in processing speed. Response times to simple questions have been reduced compared to previous versions. Altman mentioned in the demonstration that the error rate of the new version of o1 has been reduced by 34% when dealing with complex problems, and that the processing time adjusts to the difficulty of the problem.

At the same time, o1 introduces multi-modal functionality, capable of handling different types of input and output. Newly added structured output and developer message functions enhance the interactivity and practicality of the model.

During the event, the person in charge of the o1 model drew a sketch on the spot, showing a system that collects solar energy to power a data center in space. Since water cooling systems cannot be used in space, heat dissipation requires a huge heat sink. The researchers then asked the o1 model how much heat sink area would be needed to keep the GPU array functioning properly if the data center needed to provide 1 gigawatt of power.

The o1 model accurately identified and understood the sketch, conducted detailed analysis and calculations, and concluded that a huge heat sink of 2.42 million square meters is needed to meet the cooling needs.
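The report does not state which assumptions lay behind o1's figure. As a rough sanity check of the scale, the sketch below applies the Stefan-Boltzmann law, P = εσAT⁴, assuming an ideal flat heat sink that radiates from one side only and absorbs no sunlight. The function name, the temperatures, and the emissivity are illustrative assumptions, not values from the demonstration.

```python
# Rough heat sink sizing with the Stefan-Boltzmann law: P = emissivity * sigma * A * T^4.
# Temperatures and emissivity are illustrative assumptions, not values from the demo.

STEFAN_BOLTZMANN = 5.670374419e-8  # W / (m^2 * K^4)


def radiator_area_m2(power_w: float, temperature_k: float, emissivity: float = 1.0) -> float:
    """Area needed to radiate power_w into deep space from one side of a flat panel."""
    if power_w < 0 or temperature_k <= 0 or not 0 < emissivity <= 1:
        raise ValueError("need power >= 0, temperature > 0 K, emissivity in (0, 1]")
    return power_w / (emissivity * STEFAN_BOLTZMANN * temperature_k ** 4)


if __name__ == "__main__":
    one_gigawatt = 1e9  # the 1 GW load from the sketch
    for temp_k in (280.0, 300.0, 350.0):  # assumed panel temperatures
        area = radiator_area_m2(one_gigawatt, temp_k)
        print(f"T = {temp_k:.0f} K -> about {area / 1e6:.2f} million m^2")
```

Under these assumptions, a heat sink near room temperature needs roughly two million square meters, the same order of magnitude as the figure quoted above. Because the area scales with 1/T⁴, the result is very sensitive to the assumed temperature.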

2. ChatGPT Pro for $200 per month

The previously rumored "more expensive" version of the app also arrived today.

ChatGPT Pro is a premium subscription plan that costs $200 per month and provides users with unlimited access to its most advanced models and tools. In particular, it includes comprehensive access to OpenAI o1 and o1-mini, GPT-4o and Advanced Voice, focusing on the most complex computing needs.

One of the features of ChatGPT Pro is the introduction of o1 pro mode, which devotes more computing resources to each query and allows the model to think and analyze more deeply when answering difficult questions. This service is mainly aimed at researchers, engineers and other professionals who need to perform advanced data analysis and processing, helping them improve their work efficiency and stay at the forefront of artificial intelligence technology.

According to evaluations by external experts, o1 pro mode can provide more precise and comprehensive responses than previous models when handling complex data science, programming and case analysis problems. The o1 pro mode outperforms the o1 and o1-preview models on machine learning benchmarks in areas such as mathematics, science, and programming.

To highlight the main advantage of o1 pro mode (increased reliability), OpenAI uses a more stringent evaluation setting: a model is considered to have solved a question only if it answers correctly in four out of four attempts ("4/4 Reliability").
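The difference between this strict setting and ordinary per-attempt accuracy can be illustrated with a minimal sketch. The function names and the sample results below are hypothetical and are not taken from OpenAI's evaluation code or data.

```python
from typing import Dict, List


def solved_strict(attempts: List[bool], required: int = 4) -> bool:
    """A question counts as solved only if all `required` attempts are correct."""
    if len(attempts) != required:
        raise ValueError(f"expected {required} attempts, got {len(attempts)}")
    return all(attempts)


def reliability_score(results: Dict[str, List[bool]], required: int = 4) -> float:
    """Fraction of questions answered correctly on every one of the attempts."""
    if not results:
        raise ValueError("no results to score")
    solved = sum(solved_strict(a, required) for a in results.values())
    return solved / len(results)


def per_attempt_accuracy(results: Dict[str, List[bool]]) -> float:
    """Ordinary accuracy: correct attempts divided by all attempts."""
    total = sum(len(a) for a in results.values())
    if total == 0:
        raise ValueError("no attempts to score")
    return sum(sum(a) for a in results.values()) / total


# Hypothetical results: four questions, four attempts each.
sample = {
    "q1": [True, True, True, True],
    "q2": [True, True, False, True],   # one miss, so not solved under 4/4
    "q3": [False, False, False, False],
    "q4": [True, True, True, True],
}

if __name__ == "__main__":
    print("4/4 reliability:", reliability_score(sample))
    print("per-attempt accuracy:", per_attempt_accuracy(sample))
```

In this made-up sample, a single wrong attempt on q2 is enough to drop the question from the solved set, so the strict score is lower than the per-attempt accuracy. That gap is what makes the setting a test of consistency rather than of occasional success.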

As if everyone has to "rush to finish work" before Christmas, Google DeepMind also made a big move yesterday and released its latest foundation world model, Genie 2; Anthropic may also release a new model before Christmas.

A new round of AI model competition seems to be starting again.

There are still 11 days of "blind boxes" to be opened. Some netizens have speculated that the Sora model, DALL-E 4, etc. may be released. I hope OpenAI can come up with more substantial products.
