Since the emergence of ChatGPT, the AI craze has swept the world for two years. In that time, ordinary people have marveled at what large language models can do: they generate fluent, natural text from a single instruction. Scenes from science-fiction films have become reality.
The large-model race has now reached a crossroads: how to turn new technology into new products, meet real needs, and grow into a new business ecosystem.
Just as mobile payments, smartphones, and LTE together ignited the boom of the mobile-Internet era, the AI industry has spent this year anxiously searching for its own PMF (product-market fit).
The great voyage of new technology has begun. Whether a new continent is discovered will determine whether large models are another money-burning capital game, an accelerated rerun of the dot-com bubble, or, as Jensen Huang put it, the beginning of a new industrial revolution. The answer will be revealed to us faster than AGI.
The big problem of big models

Today, competition among foundation models has settled into a stable pattern. OpenAI leads, and its ChatGPT is the market leader; Anthropic, DeepMind, Llama, and Grok each have their own strengths.
So the most exciting story this year is not how many more parameters were added or how many milliseconds were shaved off response times, but how large-model technology has been turned into usable products.
How to put large-language-model technology into practice has been a head-scratching problem from the start. Harvard Business Review once surveyed the field and found as many as 100 types of generative-AI applications. Broadly, though, they fall into a handful of categories: technical problem solving, content production and editing, customer support, learning and education, artistic creation, and research.
The well-known investment firm a16z has published its team's favorite generative-AI products, many of them familiar: general-purpose tools such as Perplexity, Claude, and ChatGPT; more vertical ones such as the note-taking products Granola, Wispr Flow, Every Inc., and Cubby; NotebookLM, this year's biggest winner in the education track; and chatbots such as Character.ai and Replika.
For ordinary users, this is a happy situation: most of the products above are free and good enough, and paying for a subscription or pro version is hardly necessary. Even a product as strong as ChatGPT earns roughly US$283 million a month in subscription revenue this year, about double last year's figure. Against enormous costs, that income is a drop in the bucket.
Enjoying technological progress is a happy thing for ordinary users; the hard part falls to practitioners. However exciting the technical evolution, it cannot stay in the laboratory; it must enter commercial society and face the test of the market. The subscription model has not been widely accepted, and the era of advertising has not yet arrived. There is not much time left for large models to burn money.
In contrast, the toB business inspires much more confidence.
Mentions of AI in Fortune 500 earnings calls have nearly doubled since 2018, and generative AI, the hottest single topic, came up in 19.7% of all earnings calls.
This is also the consensus of the entire industry. According to the "Artificial Intelligence Development Report (2024)" blue book released by the China Academy of Information and Communications Technology, by 2026 more than 80% of enterprises will be using generative-AI APIs or deploying generative applications.
Applications on the enterprise side and the consumer side show different trends: consumer-side applications emphasize low barriers and creativity, while enterprise-side applications emphasize professional customization and measurable returns.
In other words, every enterprise of course pursues efficiency, but "improving efficiency" is too vague a phrase. Large models need to prove that they can actually solve problems in real usage scenarios and genuinely raise efficiency.
Find the right entry point and let the technology land

Whether in resources invested or in efforts to develop the market, competition among domestic large models was fierce throughout 2024.
According to data from the Ministry of Industry and Information Technology, China's large-language-model market grew more than 100% in 2023, reaching a size of 14.7 billion yuan. Vendors are actively experimenting with commercialization, starting with a price war: the costs of token billing, API calls, and the like keep being driven down, and the prices of many mainstream general-purpose models are no longer far from free.
Lowering prices and cutting costs is the easy part; understanding the business and dissecting the scenarios is the harder road.
And not every company is fighting the price war and counting on low prices to win.
"In this case, it is more important to find our characteristics and give full play to our advantages. There are many scenarios within Tencent itself, which give us more insights and further polish our "Capability" Zhao Xinyu, Tencent Cloud Intelligent AI product expert and Tencent Hunyuan ToB product leader, believes, "Look outside, focus on an industry, focus on some specific scenarios in this industry, and then slowly expand."
Among the many foundation models, Hunyuan may not be the most talked-about, but its technical strength cannot be ignored.
In September, Tencent Hunyuan released its flagship general-purpose model Hunyuan Turbo, which adopts a new mixture-of-experts (MoE) architecture. It performs strongly across language understanding and generation, logical reasoning, intent recognition, coding, long context, and aggregation tasks, and a dynamic update in November made it the best-performing model in the series. Tencent Hunyuan's capabilities are now being delivered in full through Tencent Cloud: by offering models of multiple sizes and types, combined with the other AI products and capabilities of Tencent Cloud Intelligence, it helps model applications land in real scenarios.
Looking at how model applications are implemented today, they divide roughly into two types: serious scenarios and entertainment scenarios. The latter includes chatbots, companionship applications, and the like.
"Serious scenarios" refer to application scenarios that require high accuracy and reliability in core business operations of enterprises. In these scenarios, large models are responsible for structured information processing and often need to follow preset business processes and quality standards. Their application effects will be directly related to the company's operational efficiency and business results.
Tencent Cloud once helped an outbound-call service provider build a customer-service system, a typical serious scenario. Outbound calls involve natural-language dialogue, content understanding, and analysis, which on the surface seem a natural fit for large language models.
In fact, the challenges lay in the details. The team faced two core problems. The first was performance: with model parameters at the 70B or 300B scale, completing a response within 500 milliseconds and passing it to the downstream TTS system became a hard technical problem. The second was the accuracy of the dialogue logic: the model sometimes gave illogical replies, hurting the overall conversation. To overcome these challenges, the project team adopted an intensive iteration strategy, shipping one version per week over a development cycle of one to two months.
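The article does not say how the team met the 500-millisecond budget before handing off to TTS; a common pattern for this kind of voice pipeline is to stream the model's output sentence by sentence, so the first audio chunk can be synthesized long before the full reply finishes. The sketch below illustrates that idea under stated assumptions; the `tts_send` callable is a hypothetical stand-in for the downstream TTS system, not Tencent Cloud's actual interface.

```python
import time

# Sentence-ending punctuation, covering both fullwidth (Chinese) and ASCII marks.
SENTENCE_ENDS = set("。!?.!?")

def stream_to_tts(token_stream, tts_send):
    """Forward LLM output to TTS one sentence at a time.

    `token_stream` yields text fragments as the model generates them;
    `tts_send` is a hypothetical downstream TTS callable. Returns the
    time to the first forwarded chunk in milliseconds, which an upstream
    caller could compare against a latency budget such as 500 ms.
    """
    start = time.monotonic()
    buf = []
    first_chunk_ms = None
    for tok in token_stream:
        buf.append(tok)
        if tok and tok[-1] in SENTENCE_ENDS:
            if first_chunk_ms is None:
                first_chunk_ms = (time.monotonic() - start) * 1000
            tts_send("".join(buf))  # hand the finished sentence to TTS
            buf = []
    if buf:  # flush any trailing text without a sentence-ending mark
        tts_send("".join(buf))
    return first_chunk_ms
```

With this shape, the perceived latency is the time to the first complete sentence rather than to the whole reply, which is what makes a sub-second handoff plausible even for very large models.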
Enterprise customers have shown interest in large-language-model technology and are willing to experiment. But there is always a cognitive gap in deeply integrating the technology with the business. This is not because companies fail to understand their own business; rather, a professional technical team is needed to understand industry pain points and business scenarios in depth, find the most suitable scenarios, tailor an AI implementation plan, and achieve the best fit between technology and business.
"The traditional approach might require operators to build corpora scene by scene," Xinyu said. "With large models, a single prompt can express the requirement." Once the needs were clear, the Hunyuan team shipped a new version almost every week, and after a month or two the accuracy reached 95%.
For this outbound-call service provider, generative technology was entirely new. Hunyuan let them see the benefits of large models directly, cutting manpower costs by three-quarters.
"The best way is to show the effect," Xinyu said. When a customer understands generative technology a little but not deeply, demonstration is most persuasive: draw on the customer's business experience to find a scenario to cut into, test and verify directly, and show the improvement that can be achieved.
A similar experience is reflected in the collaboration with Xiaomi, which has been called a "two-way" partnership.
Xiaomi hoped to introduce large models into Q&A interactions and bring AI-search capabilities to the device side. This hit two of Hunyuan's strengths: the support of Tencent's rich content ecosystem, and Hunyuan's capabilities in AI search. For question answering, accuracy is critical.
"There were still many difficulties at the beginning," Xinyu recalled. "From their perspective, the business covers multiple scenarios, including small talk, knowledge Q&A, and other types. Among them, knowledge Q&A has relatively high accuracy requirements."
Through preliminary testing, the Hunyuan team clarified its advantage in search scenarios. The two sides then gradually refined broad question answering into different topic levels. This subdivision gives the model a clearer picture of each scenario's specific needs and quality requirements, allowing more targeted optimization.
Knowledge Q&A became the landing point. In implementation, Hunyuan still had challenges to overcome: latency, obviously, since responses must be fast; and the integration of search content.
"In the entire link, we have built a self-built search engine and an intent classification model to determine whether it is a highly time-sensitive question. For example, whether it is a topic related to news or current affairs. , and then determine whether to give it to the main model or AI search."
Only the components that are truly needed get called, which greatly improves response speed. An important finding: about 70% of queries end up routed to AI search, which means there must be rich enough content behind it as the basic support for those calls.
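The routing described above can be sketched minimally: a classifier decides whether a query is time-sensitive (news, current affairs) and should go to AI search, or can be handled by the main model alone. The keyword heuristic and the two handler names below are illustrative stand-ins for Hunyuan's trained intent-classification model, not its actual implementation.

```python
# Toy stand-in for a trained intent classifier: flag queries that hint at
# time-sensitive topics. In production this would be a learned model.
TIME_SENSITIVE_HINTS = ("today", "latest", "news", "price", "score", "weather")

def is_time_sensitive(query: str) -> bool:
    q = query.lower()
    return any(hint in q for hint in TIME_SENSITIVE_HINTS)

def route(query: str) -> str:
    """Return which backend (hypothetical names) should handle the query."""
    return "ai_search" if is_time_sensitive(query) else "main_model"
```

The design point the article makes is that this gate keeps the expensive search path off the critical path for queries that do not need it, while the roughly 70% that do get routed to search lean on the breadth of the underlying content ecosystem.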
Behind Hunyuan stands Tencent's entire content ecosystem. From news, music, and finance to more specialized fields such as healthcare, a large amount of high-quality content can be found within Tencent's ecosystem. This is the data Hunyuan can access and cite when searching, and it is also a unique moat.
After more than two months of intensive iteration, the requirements were fully met in answer quality, responsiveness, and performance, and the system went live in Xiaomi's actual business.
This is the essence of toB business: to earn revenue and win trust, you must actually bring value to the customer's business.
Only by "grinding" out generalization can models reach more scenarios

The implementation of large models across different industries and products is, in turn, driving the growth of the technology itself.
For some large-model products, a core reason to choose the toC path is to use C-side feedback to optimize the model. Large models have an endless appetite for tuning, and the size and activity of consumer user groups nourish model iteration, setting the iterative flywheel spinning. In fact, the same can be achieved in toB business, where the requirements are even higher.
The K12 Chinese composition-correction feature of "Youth Get" applies Hunyuan's multimodal capabilities: Tencent Cloud Intelligence's OCR technology recognizes the text of students' compositions, and a large model scores them against preset grading standards.
Usually, a large model is considered very good when its score differs from a real teacher's by no more than five points, but this is not easy to achieve. At first, Hunyuan's score fell within five points of the teacher's in only 80% of cases.
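The agreement metric implied here is simple to state: the share of essays where the model's score lands within five points of the teacher's. A minimal sketch, with made-up illustrative scores rather than real grading data:

```python
def within_five_rate(model_scores, teacher_scores, tolerance=5):
    """Fraction of essays where |model score - teacher score| <= tolerance."""
    pairs = list(zip(model_scores, teacher_scores))
    hits = sum(1 for m, t in pairs if abs(m - t) <= tolerance)
    return hits / len(pairs)

# Made-up example: 4 of these 5 essays fall within the five-point band,
# i.e. a rate of 0.8, like the 80% starting point mentioned above.
rate = within_five_rate([88, 75, 92, 60, 81], [85, 79, 85, 58, 80])
```

Framing the target this way makes the gap concrete: moving from 80% to a business-ready level means shrinking the tail of essays where the model and the teacher disagree by more than five points.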
"The model has certain methods and capabilities and can solve problems in some scenarios. But focusing on a specific customer's business has higher requirements for this effect." Xinyu said, "Maybe 90% The accuracy can achieve business goals, but when it is only 70% and 80%, there is a certain distance."
This means continuing to "grind". As the base of enterprise customers being served keeps expanding, new demands are placed on the technology itself. First, iteration speed has risen sharply: facing C-end users, an iteration might take one to two months; now a version ships every week. This high-frequency rhythm has greatly accelerated the model's growth.
Second, by continuously serving different enterprise scenarios, the model's generalization ability has also improved markedly. Serving diverse enterprise needs in depth not only speeds up development and iteration but also improves the model's practicality and adaptability, allowing it to expand from serious scenarios into entertainment-oriented ones.
The role-playing content platform "Dream Dimension", which recently raised tens of millions in Series A financing, has adopted Hunyuan-role, the role-play model in the Hunyuan family. Positioned for young users, it combines generative AI to deliver interactive, dramatic exchanges with virtual characters.
Hunyuan-role creates a new mode of human-computer interaction: rich and diverse virtual characters, grounded in preset plot backgrounds and character settings, carry on natural, fluent dialogue with users.
At the technical level, Hunyuan-role shows leading performance in long- and short-text dialogue processing and in intent recognition and response, and it handles diverse application scenarios. Its ability to personify content not only enables warm conversational interaction but also advances the storyline, creating an immersive user experience.
These characteristics make Hunyuan-role a powerful tool for customer acquisition and user operations, improving retention and stickiness. They also show that the generalization Hunyuan built up through training and refinement in serious scenarios can cover a far wider range of scenes, even extending to the device side.
Expanding from serious scenarios to entertainment, creativity, and beyond is a journey every large-model application must take. As the technology matures and costs fall, large models are bound to spread into a wider range of applications. They initially focused on serious business scenarios, such as corporate office work, data analysis, and scientific research, because those scenarios have clear needs and a high willingness to pay.
To expand further into entertainment, creativity, content production, and other industries, one anchor must hold: always treat solving the demand of a specific scenario as the core goal, and anchor the entry point for integrating large-model capabilities there.
Beyond partnerships with application software, cooperation with hardware manufacturers is also needed, so the model can be deployed and exercised on the terminal side closest to consumers, providing a service experience that is more convenient, closer to daily life, and instant.
During this process, market awareness and acceptance of generative-AI technology keeps rising and the user base keeps expanding. In such a fast-changing environment, the ability to iterate models becomes especially important, reflected not only in technical performance but also in understanding user needs and adapting to different scenarios. Only models and teams that learn fast, keep optimizing, and keep adapting to new needs can hold an advantage in the competition.
As coverage extends to more scenarios, the technology also reaches more end consumers. As the market as a whole embraces generative technology, the pool of potential users will keep growing. Only a model that iterates quickly and improves itself can sense change keenly and go further, more steadily.