Since December 5, local time in the United States, OpenAI has been running an intensive new-feature release cycle, with new products and features unveiled across 12 livestream events over 12 days. In that time, OpenAI has released a string of updates, including the full version of the o1 model, the ChatGPT Pro plan, an enhanced fine-tuning technique, the video generation tool Sora, the interactive Canvas interface, advanced voice and vision features, the Projects smart folders, and the opening of ChatGPT search to all users.
On December 18, the ninth day of OpenAI's event series, the company officially announced that it would open its cutting-edge o1 series of large models to third-party developers through its application programming interface (API). The news is a major boon for developers eager to build new advanced AI applications, or to integrate OpenAI's state-of-the-art technology into existing applications and workflows, whether those applications serve businesses or consumers.
OpenAI's o1 series was released in September 2024. As the first entry in the company's "new family" of models, it goes beyond the GPT series of large language models (LLMs) by introducing a "reasoning" capability.
The o1 series includes o1 and o1 mini. Although these models take longer to respond to user prompts and generate answers, they check their own work while forming a response, improving accuracy and effectively avoiding "hallucinations". At launch, OpenAI claimed that o1 could handle more complex, PhD-level problems, a claim borne out by feedback from actual users.
Developers could previously access preview versions of o1 and build their own applications on top of them, such as PhD-level advisors or laboratory assistants. The release of the complete o1 model through the API, however, brings higher performance, lower latency, and new capabilities, making it easier to integrate into real-world applications.
OpenAI rolled out the o1 model to consumers through the ChatGPT Plus and ChatGPT Pro plans about two and a half weeks ago, adding the ability for the model to analyze and respond to user-uploaded images and files.
Alongside today's release, OpenAI also announced major updates to its real-time API, as well as price cuts and a new fine-tuning method designed to give developers better control over their models.
The complete o1 model opens to developers
The newly launched o1 model, internally designated o1-2024-12-17, is built to handle complex multi-step reasoning tasks. Compared with the earlier o1 preview, this version delivers significant improvements in accuracy, efficiency, and flexibility.
OpenAI published a series of benchmark results showing the new model's marked gains on coding, mathematics, and visual reasoning tasks. For example, on SWE-bench Verified, a benchmark that evaluates an AI model's ability to solve real-world software problems using a more reliable methodology, o1's coding score improved from 41.3 to 48.9. On the math-focused AIME test, o1's score jumped from 42 to 79.2. These gains make o1 a strong fit for building streamlined customer support flows, optimizing logistics, or tackling challenging analytical problems.
In addition, o1 gains several new features that further expand what developers can build. Structured outputs let model responses reliably match custom formats such as JSON schemas, ensuring consistency and accuracy when interacting with external systems. Function calling simplifies connecting o1 to APIs and databases, making integration more convenient. And o1 can now reason over visual inputs, opening up new applications in fields such as manufacturing, science, and coding.
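As a minimal sketch of what structured outputs look like from the developer's side, the request body below constrains o1 to return JSON matching a schema. The field names (`response_format`, `json_schema`, `strict`) follow OpenAI's published structured-output conventions for earlier models; treat their exact shape on o1 as an assumption and check the current API reference.

```python
import json

def build_o1_request(prompt: str) -> dict:
    """Build a chat request asking o1 for JSON that matches a fixed schema."""
    order_schema = {
        "type": "object",
        "properties": {
            "item": {"type": "string"},
            "quantity": {"type": "integer"},
        },
        "required": ["item", "quantity"],
        "additionalProperties": False,
    }
    return {
        "model": "o1-2024-12-17",
        "messages": [{"role": "user", "content": prompt}],
        # Structured outputs: constrain the reply to the schema above.
        "response_format": {
            "type": "json_schema",
            "json_schema": {"name": "order", "strict": True, "schema": order_schema},
        },
    }

body = build_o1_request("Extract the order from: 'Please send three widgets.'")
```

Because the response is guaranteed to parse against the schema, downstream code can consume it without defensive string handling.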
To give developers finer-grained control over o1's behavior, OpenAI also introduced a new reasoning_effort parameter. It lets developers adjust how long the model spends thinking about a task, striking the best balance between quality and response time for each use case.
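A hedged sketch of the new parameter in use: the accepted values below ("low", "medium", "high") match how OpenAI describes the setting, but the helper function itself is purely illustrative.

```python
def request_params(task: str, effort: str = "medium") -> dict:
    """Build request parameters with a chosen reasoning effort.

    "low" favors speed and cost; "high" lets the model think longer.
    """
    if effort not in ("low", "medium", "high"):
        raise ValueError(f"unknown reasoning_effort: {effort}")
    return {
        "model": "o1-2024-12-17",
        "reasoning_effort": effort,
        "messages": [{"role": "user", "content": task}],
    }

# A latency-sensitive task can opt for a quick answer:
fast = request_params("Classify this ticket as billing or technical.", effort="low")
```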
OpenAI's real-time API has been upgraded to support conversational voice/audio AI assistants
OpenAI also announced a major update to its real-time API, designed to support low-latency, natural conversational experiences, such as voice assistants, real-time translation tools, or virtual tutors.
In this update, the new WebRTC integration has become a highlight. It directly supports audio streaming, noise suppression, and congestion control, greatly simplifying the process of building voice-based applications. Developers can now integrate real-time functionality with minimal setup and maintain stable performance even in changing network environments.
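On the server side, minting a short-lived session for a browser client might look like the sketch below. The /v1/realtime/sessions path, the model name, and the payload fields are assumptions based on the announcement, so verify them against the current API reference before relying on them.

```python
import json
import urllib.request

def build_session_request(api_key: str, voice: str = "verse") -> urllib.request.Request:
    """Prepare (but do not send) a POST asking for an ephemeral realtime session."""
    payload = {"model": "gpt-4o-realtime-preview", "voice": voice}  # assumed fields
    return urllib.request.Request(
        "https://api.openai.com/v1/realtime/sessions",  # assumed endpoint
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_session_request("sk-example")
# The ephemeral token in the response would be handed to the browser,
# which then opens the WebRTC peer connection directly with OpenAI.
```

Keeping the long-lived API key on the backend and giving the browser only a short-lived token is the usual design for WebRTC-style integrations.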
In terms of pricing, OpenAI has also launched a new strategy, reducing the cost of GPT-4o audio by 60%. Specifically, the fee is $40 per 1 million input tokens and $80 per 1 million output tokens. At the same time, the cost of caching audio inputs has also been reduced by 87.5%, and is now priced at $2.50 per 1 million input tokens.
To further improve cost-effectiveness, OpenAI also launched GPT-4o mini, a smaller, more affordable model, priced at $10 per 1 million input tokens and $20 per 1 million output tokens. GPT-4o mini's text token rates are lower still, starting at $0.60 per 1 million input tokens and $2.40 per 1 million output tokens.
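To make the rate card concrete, here is a quick back-of-the-envelope cost calculation using the per-million-token prices quoted above:

```python
# Prices in USD per 1 million tokens, as quoted above.
GPT4O_AUDIO = {"input": 40.00, "output": 80.00}       # cached audio input: $2.50
GPT4O_MINI_AUDIO = {"input": 10.00, "output": 20.00}
GPT4O_MINI_TEXT = {"input": 0.60, "output": 2.40}

def cost(prices: dict, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one request at the given per-1M-token rates."""
    return (input_tokens * prices["input"] + output_tokens * prices["output"]) / 1_000_000

# Example: 50k audio tokens in, 20k out on GPT-4o audio:
# 50_000 * 40 / 1e6 + 20_000 * 80 / 1e6 = 2.00 + 1.60
print(round(cost(GPT4O_AUDIO, 50_000, 20_000), 2))  # 3.6
```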
In addition to pricing adjustments, OpenAI also gives developers more control over real-time API responses. For example, features such as concurrent out-of-band responses allow background tasks, such as content moderation, to run without disrupting the user experience. Developers can also customize input context based on actual needs, focus on specific parts of the conversation, and control when voice responses are triggered, resulting in a more accurate and seamless interactive experience.
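As an illustrative sketch of an out-of-band response, a client might ask for a background moderation pass like this. The response.create event type and the conversation "none" field are assumptions drawn from the feature described above, so confirm them in the real-time API reference.

```python
import json

# Assumed event shape: ask the server for a response that is NOT written
# into the main conversation, so a moderation check runs silently in the
# background without interrupting what the user hears.
moderation_event = {
    "type": "response.create",
    "response": {
        "conversation": "none",               # keep this response out-of-band
        "metadata": {"purpose": "moderation"},
        "input": [
            {
                "type": "message",
                "role": "user",
                "content": [
                    {"type": "input_text",
                     "text": "Check this transcript for policy issues."}
                ],
            }
        ],
    },
}

# Real-time API events are sent as JSON over the open connection.
payload = json.dumps(moderation_event)
```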
Preference fine-tuning provides new customization options
Another important addition is preference fine-tuning, an innovative model customization method that optimizes model behavior according to user and developer preferences.
Different from traditional supervised fine-tuning that relies on precise input and output, preference fine-tuning uses pairwise comparisons to guide the model to select a better response. This approach is particularly effective when dealing with highly subjective tasks, such as summarizing, creative writing, or applications where tone and style are more important.
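Preference fine-tuning trains on pairs of responses to the same prompt, one preferred and one not. The record shape below (input / preferred_output / non_preferred_output) reflects OpenAI's published format for this feature as best understood here; treat the field names as assumptions and check the current fine-tuning guide.

```python
import json

def make_pair(prompt: str, preferred: str, rejected: str) -> dict:
    """Build one pairwise-comparison training record (assumed field names)."""
    return {
        "input": {"messages": [{"role": "user", "content": prompt}]},
        "preferred_output": [{"role": "assistant", "content": preferred}],
        "non_preferred_output": [{"role": "assistant", "content": rejected}],
    }

records = [
    make_pair(
        "Summarize: revenue rose 12% while costs fell 3%.",
        "Revenue up 12%, costs down 3%, so margins improved.",
        "The company did some financial things this quarter.",
    ),
]

# Training files are JSON Lines: one record per line.
jsonl = "\n".join(json.dumps(r) for r in records)
```

For subjective targets like tone, gathering these pairs from real user choices tends to work better than hand-written "gold" answers.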
Early testing with partners such as Rogo AI, which is building an assistant for financial analysts, has shown the potential of preference fine-tuning. Rogo AI reported that, compared with traditional fine-tuning methods, preference fine-tuning significantly improved the model's handling of complex, out-of-distribution queries, raising task accuracy by more than 5%. The feature is currently available for the GPT-4o-2024-08-06 and GPT-4o-mini-2024-07-18 models, with plans to expand to newer models early next year.
New SDKs for Go and Java developers
To further streamline integration, OpenAI is expanding its official SDK line-up with beta SDKs for Go and Java. These join the existing Python, Node.js, and .NET libraries, broadening the range of programming environments in which developers can conveniently work with OpenAI models. The Go SDK is well suited to building scalable backend systems, while the Java SDK targets enterprise applications that rely on strong typing and a robust ecosystem.
Through this series of updates, OpenAI equips developers with a more comprehensive toolbox for building advanced, highly customizable AI applications. Whether through the enhanced reasoning of the o1 model, the upgraded real-time API, or the flexible new fine-tuning options, OpenAI's latest offerings aim to deliver better performance and higher cost-effectiveness for enterprises, continually pushing the boundaries of AI integration.