o3 is here: is artificial general intelligence really within reach?
Editor
2025-01-06 14:02


Image source: Generated by Unbounded AI

"How long will it take for machines to truly possess the cognitive capabilities of the human brain?" This fundamental question that has plagued the field of artificial intelligence for decades will once again become a global technology issue at the end of 2024 The focus of the world.

While artificial intelligence keeps making breakthroughs in specific domains such as image recognition and natural language processing, a more challenging goal always looms ahead: giving machines the insight to draw inferences from a single example, the ability to reason about abstract concepts, and the general human capacity to plan and allocate cognitive resources.

In this ongoing debate about the limits of machine intelligence, the new artificial intelligence systems recently released by OpenAI have injected fresh variables into an old proposition. The San Francisco-based AI company, best known for developing ChatGPT, released a new-generation large language model (LLM) system called o1 in September. Just this month, industry reports indicated that OpenAI is developing a more powerful system code-named o3. The project, described as a "prelude to artificial general intelligence (AGI)", has attracted a new round of attention. Compared with previous AI models, the technical route from o1 to o3 demonstrates an operating mechanism closer to human cognition. These developments are redefining our understanding of the potential of artificial intelligence.

Once realized, AGI could bring unprecedented breakthroughs for humanity: from managing climate change, to preventing and controlling epidemics, to conquering stubborn diseases such as cancer and Alzheimer's disease. However, such enormous power also brings uncertainty and poses potential risks to humans. Yoshua Bengio, a deep-learning researcher at the University of Montreal in Canada, said: "Misuse of AI by humans, or loss of control over it, could lead to serious consequences."

The revolutionary progress of LLMs in recent years has inspired all kinds of speculation that AGI is imminent. But some researchers say that, given the way LLMs are built and trained, they are not sufficient on their own to achieve AGI and "are still missing some key pieces."

There is no doubt that questions about AGI have never been more urgent or more important. "For most of my life, I thought people who talked about AGI were cranks," says Subbarao Kambhampati, a computer scientist at Arizona State University. "Now, of course, everyone is talking about it. You can't call everyone a 'crank' anymore."

Why the AGI debate took a turn

The term "artificial general intelligence" (AGI) first entered mainstream consciousness around 2007, when it was used as the title of a book of the same name by AI researcher Ben Goertzel and Cassio Pennachin launched. Although the exact meaning of this term is unclear, it usually refers to something likeAI systems with human reasoning and generalization capabilities. For much of the history of artificial intelligence development, it was generally accepted that AGI remained an unrealized goal. For example, the AlphaGo program developed by Google DeepMind is specifically designed for the game of Go. It beats top human players at Go, but its superhuman abilities are limited to Go, which is its only area of ​​expertise.

The new capabilities of LLMs [1] are changing this situation. Like the human brain, LLMs possess a wide range of abilities, leading some researchers to seriously consider that some form of AGI may be around the corner [1], or may even already exist.

The breadth of these capabilities is all the more astounding given that researchers only partially understand how LLMs achieve them. An LLM is a neural network loosely inspired by the human brain. It consists of artificial neurons (or computing units) arranged in layers, with the strength of the connections between layers represented by adjustable parameters. During training, powerful LLMs, such as o1, Claude (developed by Anthropic) and Google's Gemini, rely on a method called next-token prediction. The model is repeatedly fed samples of text that have been segmented into chunks known as tokens. These tokens can be whole words or just sets of characters. The last token in a sequence is hidden, or "masked", and the model is asked to predict it. The training algorithm then compares the prediction with the masked token and adjusts the model's parameters so that it can make a better prediction next time.
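To make the mechanism concrete, here is a minimal sketch in the spirit of next-token prediction, not any lab's actual pipeline: a tiny character-level model (the architecture, data and hyperparameters are all illustrative assumptions) learns to predict the masked final token of each training window, and its parameters are nudged after every comparison between prediction and target.

```python
# A toy next-token predictor: hide the last token of each window and learn to guess it.
import torch
import torch.nn as nn
import torch.nn.functional as F

text = "the bank overflowed and flooded the bank's atm " * 50
vocab = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(vocab)}  # character-level "tokenizer"

def encode(s):
    return torch.tensor([stoi[c] for c in s], dtype=torch.long)

class NextTokenModel(nn.Module):
    def __init__(self, vocab_size, dim=32, context=8):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.head = nn.Linear(dim * context, vocab_size)

    def forward(self, ctx):                      # ctx: (batch, context)
        h = self.embed(ctx).flatten(1)           # concatenate the context embeddings
        return self.head(h)                      # logits over the vocabulary

context = 8
data = encode(text)
model = NextTokenModel(len(vocab), context=context)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(500):
    # sample random windows; the final ("masked") token is the prediction target
    ix = torch.randint(0, len(data) - context - 1, (64,)).tolist()
    ctx = torch.stack([data[i:i + context] for i in ix])
    target = torch.stack([data[i + context] for i in ix])
    loss = F.cross_entropy(model(ctx), target)   # compare prediction with the hidden token
    opt.zero_grad()
    loss.backward()
    opt.step()                                   # nudge parameters toward better predictions
```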

This process is repeated, often over billions of conversation snippets, scientific texts and pieces of programming code, until the model can reliably predict the hidden tokens. At this stage, the model's parameters have captured the statistical structure of the training data and the knowledge it contains. The parameters are then fixed, and the model uses them to generate predictions for new queries, or "prompts", that did not necessarily appear in its training data, a process called inference.

The use of a neural-network architecture called the Transformer significantly advanced LLMs' capabilities beyond previous achievements. The Transformer enables a model to learn that certain tokens have a particularly strong influence on other tokens, even when they are far apart in a text sample. This allows LLMs to parse language in ways that appear to mimic humans, for example distinguishing the two meanings of the word "bank" in the sentence: "When the bank overflowed, the flood damaged the bank's ATM, making it impossible to withdraw money."
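At the heart of this mechanism is self-attention, in which every token weighs every other token when building its representation. The sketch below is a generic illustration with made-up dimensions, not the internals of any particular model; it shows the standard scaled dot-product attention computation.

```python
# Scaled dot-product attention: each token's query is compared against every
# other token's key, so a distant token can still exert a strong influence.
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v):
    # q, k, v: (sequence_length, head_dim)
    scores = q @ k.transpose(-2, -1) / (q.shape[-1] ** 0.5)
    weights = F.softmax(scores, dim=-1)   # each row sums to 1: how strongly a token attends to the others
    return weights @ v, weights

# Illustrative toy example: 6 tokens with 8-dimensional representations.
seq_len, head_dim = 6, 8
q, k, v = (torch.randn(seq_len, head_dim) for _ in range(3))
output, attention_weights = scaled_dot_product_attention(q, k, v)
print(attention_weights.shape)  # (6, 6): one weight for every pair of tokens
```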

This approach has achieved remarkable results in a variety of application scenarios, such as generating computer programs to solve problems described in natural language, summarizing academic articles, and answering mathematical questions.

As LLMs grow in scale, new capabilities emerge, and if an LLM is large enough, something approaching AGI might also appear. One example is chain-of-thought (CoT) prompting. This approach involves showing the LLM how to break a complex problem into smaller steps, or directly prompting it to solve the problem step by step. For smaller LLMs, however, this process yields little benefit.
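To make this concrete, here is a toy illustration of chain-of-thought prompting (the question, numbers and wording are invented for this example and are not OpenAI's): the same problem posed directly, and with a worked, step-by-step example prepended.

```python
# Direct prompt: the model is asked for the answer with no intermediate steps.
direct_prompt = "Q: A train travels 60 km in 45 minutes. What is its speed in km/h? A:"

# Chain-of-thought prompt: a worked example shows how to reason step by step
# before the new question is asked.
cot_prompt = (
    "Q: A train travels 60 km in 45 minutes. What is its speed in km/h?\n"
    "Let's think step by step:\n"
    "1. Convert 45 minutes to hours: 45 / 60 = 0.75 h.\n"
    "2. Speed = distance / time = 60 / 0.75 = 80 km/h.\n"
    "Q: A cyclist rides 36 km in 90 minutes. What is their speed in km/h?\n"
    "Let's think step by step:"
)
# Large models given the worked example tend to produce the intermediate steps
# themselves; smaller models usually gain little from the same prompt.
```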

The limits of LLMs' capabilities

According to OpenAI, CoT prompting has been integrated into o1's operating mechanism and is a core component of its power. Francois Chollet, a former Google AI researcher, pointed out that o1 appears to be equipped with a CoT generator that produces large numbers of candidate CoT prompts for a user query, together with a mechanism for selecting the best one.

During training, o1 not only learns to predict the next token but also learns to select the best CoT prompt for a given query. OpenAI says that, thanks to the introduction of CoT reasoning, o1-preview (a preview version of o1) correctly solved 83% of the problems on a qualifying exam for the International Mathematical Olympiad, a world-renowned mathematics competition for high-school students. By comparison, OpenAI's previously most powerful model, GPT-4o, scored only 13% on the same exam.

However, despite o1's impressive complexity, both Kambhampati and Chollet believe that it still has obvious limitations and does not meet the standards of AGI.

For example, in tasks requiring multi-step planning, Kambhampati's team found that although o1 performs well on planning tasks of up to 16 steps, its performance degrades rapidly when task complexity rises to 20 to 40 steps [2].

Chollet found similar limitations when challenging o1-preview. He designed an abstract reasoning and generalization test to assess progress toward AGI. The test takes the form of visual puzzles; solving them requires looking at examples to deduce abstract rules and applying those rules to similar new problems. The results show that humans find these puzzles far easier than the model does. Chollet further noted: "LLMs cannot truly adapt to novelty, because they have essentially no ability to dynamically recombine the knowledge they have acquired to fit a new situation."

Can LLM move towards AGI?

So, can LLMs ultimately take us to AGI?

It is worth noting that the underlying Transformer architecture can process not only text but also other types of information, such as images and audio, provided that appropriate tokenization methods can be designed for those data. Andrew Wilson, who studies machine learning at New York University, and his team pointed out that this may be related to a characteristic shared by different types of data: these datasets have low "Kolmogorov complexity", that is, the shortest computer program required to generate them is short [3].
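For reference, Kolmogorov complexity has a standard formal definition: the length of the shortest program that makes a fixed universal machine output the data,

$$K_U(x) = \min\{\, |p| : U(p) = x \,\}$$

where $U$ is a universal computer (for example, a universal Turing machine), $p$ ranges over programs for $U$, and $|p|$ is the program's length in bits. Data with a lot of regularity admit short generating programs and therefore have low $K_U(x)$.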

The study also found that Transformer performs particularly well in learning data patterns with low Kolmogorov complexity, and this ability will continue to increase as the model size increases. Transformer's ability to model multiple possibilities increases the probability that the training algorithm will find an appropriate solution to the problem, and this "expressiveness" will further increase as the size of the model increases. These are "some of the key elements needed for universal learning," Wilson said.

Although Wilson believes that AGI is still out of reach, he said that LLM and other AI systems using the Transformer architecture already have some key features of AGI-like behavior.

However, Transformer-based LLMs also show some inherent limitations.

First, the data needed to train these models are gradually drying up. Epoch AI, a San Francisco research institute that studies AI trends, estimates [4] that publicly available text datasets for training may be exhausted between 2026 and 2032.

In addition, although LLMs continue to grow in scale, their performance gains are smaller than before. It is unclear whether this is because the data contain less novelty (since so much of the available data has already been used) or because of other, unknown reasons. The latter would be a bad sign for LLMs.

Raia Hadsell, vice-president of research at Google DeepMind in London, raises another issue. She points out that although Transformer-based LLMs are powerful, their single objective, predicting the next token, is too limited to achieve true AGI. She suggests that building models that generate solutions all at once, or in a more holistic manner, could bring us closer to AGI. Algorithms for building such models are already used in some existing non-LLM systems, such as OpenAI's DALL-E, which generates realistic, even hyper-realistic, images from natural-language descriptions. However, these systems cannot match the broad capabilities of LLMs.

Building a world model for AI

On the question of what breakthroughs are needed to advance AGI, neuroscientists offer an intuitive and important insight. They argue that the root of human intelligence lies in the brain's ability to construct a "world model", an internal representation of its surroundings. This model supports planning and reasoning by simulating different courses of action and predicting their consequences. Moreover, by simulating multiple scenarios, such a model can generalize skills learned in one domain to entirely new tasks.

Some research reports claim there is evidence that a preliminary world model may form inside LLMs. In one study [5], Wes Gurnee and Max Tegmark of MIT found that a widely used LLM, when trained on datasets containing information about many places in the world, formed internal representations of the surrounding world. However, other researchers point out that there is currently no evidence that these LLMs use such representations to simulate the world or to learn its causal structure.

In another study [6], Harvard computer scientist Kenneth Li and colleagues found that a small LLM, trained on the moves of players playing the board game Othello, learned to internally represent the state of the board and used this representation to correctly predict the next legal move.

However, other research suggests that the world models built by today's AI systems may be unreliable. In one study [7], Keyon Vafa, a computer scientist at Harvard University, and his team trained a Transformer-based model on turn-by-turn directions from New York City taxi trips; the model could predict the next turn with nearly 100% accuracy. By analyzing the turn sequences the model generated, the researchers found that it relied on an internal map to make its predictions. However, this internal map bore little resemblance to the actual map of Manhattan.

▷ AI's impossible streets. Source: [7]

Vafa pointed out: "The map contains physically impossible street directions, as well as elevated roads that pass over other streets." When the researchers adjusted the test data to include unexpected detours that did not appear in the training data, the model failed to predict the next turn, suggesting that it adapts poorly to new situations.

The importance of feedback

Dileep George, a member of an AGI research team at Google DeepMind in Mountain View, California, points out that today's LLMs lack a key feature: internal feedback. The human brain has extensive feedback connections that allow information to flow in both directions between layers of neurons. This mechanism lets information from the sensory system flow to higher layers of the brain to create a world model that reflects the environment, while information from the world model propagates back down to guide the acquisition of further sensory information. Such bidirectional processes are crucial for perception; for example, the brain uses its world model to infer the underlying causes of sensory input. They also support planning, with the world model used to simulate different courses of action.

Current LLMs, however, can use feedback only in a bolted-on way. In o1, for example, the internal CoT prompting mechanism assists in answering a query by generating prompts and feeding them back into the LLM before the final answer is produced. But as Chollet's tests show, this mechanism does not guarantee reliable abstract reasoning.

Researchers such as Kambhampati have tried adding external modules called validators to LLMs. These modules check the answers an LLM generates in a specific context, for example verifying the feasibility of a travel plan. If the answer falls short, the validator asks the LLM to rerun the query [8]. Kambhampati's team found that LLMs aided by external validators generated travel plans significantly better than ordinary LLMs did, but the researchers had to design a specialized validator for each task. "There is no universal validator," Kambhampati noted. By contrast, an AGI system may need to build its own validators to suit different situations, just as humans apply abstract rules to ensure correct reasoning on new tasks.
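A generate-and-check loop of this kind is straightforward to express. The sketch below is a hypothetical illustration of the idea, not the cited paper's actual system: the validator here is a trivial, hand-written travel-plan checker, and its complaints are fed back into the next query.

```python
# A generic generate-and-validate loop: keep asking the LLM until a
# task-specific validator accepts the answer, or give up.
from typing import Callable, Optional

def solve_with_validator(
    generate: Callable[[str], dict],        # wraps an LLM call: query -> candidate plan
    validate: Callable[[dict], list],       # task-specific checks: plan -> list of problems
    query: str,
    max_attempts: int = 5,
) -> Optional[dict]:
    feedback = ""
    for _ in range(max_attempts):
        plan = generate(query + feedback)
        problems = validate(plan)
        if not problems:                    # the validator is satisfied
            return plan
        # feed the validator's complaints back into the next query
        feedback = "\nFix these issues: " + "; ".join(problems)
    return None

def check_travel_plan(plan: dict) -> list:
    """A toy, hand-written validator for one specific task (travel planning)."""
    problems = []
    if plan.get("budget", 0) > plan.get("budget_limit", float("inf")):
        problems.append("plan exceeds the budget limit")
    if not plan.get("days"):
        problems.append("plan has no daily itinerary")
    return problems
```

The key design point, as the article notes, is that `check_travel_plan` works only for this one task; a different task would need a different validator written by hand.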

Research on developing new AI systems based on these ideas is still in its preliminary stages. For example, Bengio is exploring how to build AI systems that are different from the current Transformer-based architecture. He proposed a method called "generative flow networks", which aims to enable a single AI system to both build world models and use these models to complete reasoning and planning.

Another major obstacle facing LLMs is their huge demand for data. Karl Friston, a theoretical neuroscientist at University College London, proposes that future AI systems could become more efficient by deciding for themselves how much data to sample from the environment, rather than simply ingesting everything available. He believes this kind of autonomy may be necessary for AGI. "This kind of true autonomy is not yet reflected in current large language models or generative AI. If some kind of AI can achieve a degree of autonomous choice, I think that will be a key step toward AGI."

AI systems that can build efficient models of the world and integrate feedback loops may significantly reduce reliance on external data. These systems can understand, reason and plan by running internal simulations and generating counterfactual hypotheses. For example, in 2018, researchers David Ha and Jürgen Schmidhuber reported [9] that they developed a neural network that can efficiently build a world model of an artificial environment and use this model to train AI to drive a virtual racing car.

If the idea of autonomous AI systems makes you uncomfortable, you are not alone. Besides researching how to build AGI, Bengio is an active advocate of building safety into the design and regulation of AI systems. He argues that research should focus on training models that can guarantee the safety of their own behavior, for example by building mechanisms that estimate the probability of the model violating specific safety constraints and refuse to act when that probability is too high. In addition, governments need to ensure that AI is used safely. "We need a democratic process to ensure that individuals, companies and even the military use and develop AI in ways that are safe for the public."

So, is AGI achievable at all? Computer scientists see no fundamental reason why not. "There's no theoretical barrier," George said. Melanie Mitchell, a computer scientist at the Santa Fe Institute, agrees: "Humans and some other animals prove that it is possible. In principle, I don't think there is any special difference between biological systems and systems made of other materials that would prevent non-biological systems from becoming intelligent."

Despite this, there is still no consensus on when AGI will be achieved: predictions range from within a few years to at least a decade away. George points out that if an AGI system is created, we will recognize it by its behavior. Chollet suspects its arrival will be rather low-key: "When AGI arrives, it may not be as obvious or as disruptive as you think. The full potential of AGI will take time to emerge. It will be invented first, and then, as it is scaled up and applied, it will ultimately truly change the world."
