Turing Award winners worry about becoming the "Oppenheimer" of the AI world


Image source: Generated by Unbounded AI

In 1947, Alan Turing said in a lecture: "What we want is a machine that can learn from experience."

78 years later, the Turing Award, which is named after him and often called the "Nobel Prize of computing", went to two scientists who have spent their careers answering Turing's question.

Andrew Barto and Richard Sutton won the 2024 Turing Award. Mentor and student, nine years apart in age, they laid the technical foundations that AlphaGo and ChatGPT stand on, and they are pioneers of machine learning.

Turing Award winners Andrew Barto and Richard Sutton|Picture source: Turing Award official website

Google chief scientist Jeff Dean wrote in the award announcement: "Reinforcement learning, pioneered by Barto and Sutton, directly answers Turing's question. Their work has been key to AI's progress over the past few decades, and the tools they developed remain core pillars of the AI boom... Google is honored to sponsor the ACM A.M. Turing Award."

Google is the sole sponsor of the Turing Award's $1 million prize.

After the win, standing in the spotlight, the two scientists pointed the finger at AI companies. In their "acceptance remarks" to the media, they said AI companies are "motivated by business incentives" rather than by a focus on technical research, "building an untested bridge across society and letting people test it by crossing".

Coincidentally, the last time the Turing Award went to artificial intelligence scientists was the 2018 award: Yoshua Bengio, Geoffrey Hinton, and Yann LeCun won for their contributions to deep learning.

2018 Turing Award winners|Picture source: eurekalert

Among them, Yoshua Bengio and Geoffrey Hinton (the latter also a 2024 Nobel laureate in Physics), the two "godfathers of AI", have spent the past two years repeatedly calling on the global public and scientific community to be wary of large companies abusing artificial intelligence.

Geoffrey Hinton even resigned from Google outright in order to "speak freely". Sutton, one of this year's winners, likewise served as a research scientist at DeepMind from 2017 to 2023.

As the highest honor in computing is awarded again and again to the founders of core AI technology, an intriguing pattern emerges:

Why do these scientists at the very top keep turning around in the spotlight to sound the alarm about AI?

01. The "bridge builders" of artificial intelligence

If Alan Turing was the pathfinder of artificial intelligence, then Andrew Barto and Richard Sutton are the "bridge builders" along the road.

Now, as artificial intelligence soars and the praise rolls in, they are reexamining the bridge they built: can it carry humanity safely across?

Perhaps the answer is hidden in academic careers spanning half a century: only by looking back at how they constructed "machine learning" can we understand why they are wary of "technology out of control".

Picture source: Carnegie Mellon University

In 1950, in his famous paper "Computing Machinery and Intelligence", Alan Turing posed a question at once philosophical and technical: "Can machines think?"

In it, Turing designed the "imitation game", known to later generations as the "Turing Test".

At the same time, Turing proposed that machine intelligence could be acquired through learning rather than relying solely on pre-programmed rules. He envisioned a "child machine": through training and experience, a machine could learn step by step, the way a child does.

In this framing, the core goal of artificial intelligence is to build an agent that can perceive its environment and take ever-better actions, and the measure of intelligence is the agent's ability to judge that "some actions are better than others".

The point of machine learning is to give the machine feedback after it acts and let it learn on its own from that feedback. In other words, Turing conceived a reward-and-punishment learning method not so different from Pavlovian dog training.

Losing over and over in a game and coming back stronger is also a kind of "reinforcement learning" | Picture source: zequance.ai

The machine-learning path Turing pointed to was paved thirty years later by a mentor and his student: reinforcement learning (RL).

In 1977, inspired by psychology and neuroscience, Andrew Barto began exploring a new theory of human intelligence: neurons behave like "hedonists". Each of the billions of neurons in the human brain tries to maximize pleasure (reward) and minimize pain (punishment). Nor do neurons mechanically receive and relay signals: if a neuron's activity pattern leads to positive feedback, it tends to repeat that pattern, and together these tendencies drive human learning.

In the 1980s, Barto took on a doctoral student, Richard Sutton, hoping to apply this neuronal principle of "keep trying, adjust connections based on feedback, and find the optimal behavior pattern" to artificial intelligence. Reinforcement learning was born.

"Reinforcement Learning: Introduction" has become a classic textbook and has been cited nearly 80,000 times | Picture source: IEEE

Building on the mathematical foundation of Markov decision processes, the pair developed many of reinforcement learning's core algorithms, systematically constructed its theoretical framework, and wrote the textbook "Reinforcement Learning: An Introduction", which has ushered tens of thousands of researchers into the field. Both are regarded as fathers of reinforcement learning.
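For readers curious what that "mathematical foundation of Markov decision processes" looks like in practice, here is a minimal value-iteration sketch in Python. The three-state world, its transition table, and its rewards are invented for illustration; this is not code from their textbook.

```python
# Toy value iteration over a 3-state Markov decision process (MDP).
# Repeatedly apply the Bellman optimality update:
#   V(s) = max_a sum_{s'} P(s'|s,a) * (R + gamma * V(s'))
GAMMA = 0.9

# P[state][action] = list of (probability, next_state, reward) outcomes.
P = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(0.8, 1, 0.0), (0.2, 0, 0.0)]},
    1: {"stay": [(1.0, 1, 0.0)], "go": [(0.8, 2, 1.0), (0.2, 1, 0.0)]},
    2: {"stay": [(1.0, 2, 0.0)]},  # absorbing end state
}

V = {s: 0.0 for s in P}
for _ in range(100):  # sweep until the values settle
    V = {
        s: max(
            sum(p * (r + GAMMA * V[s2]) for p, s2, r in outcomes)
            for outcomes in P[s].values()
        )
        for s in P
    }

print(V)  # value flows backward from the single rewarding transition
```

The point of the formalism is exactly the judgment described earlier: which of "some actions are better than others", made computable.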

The aim of their research was to find efficient, accurate machine learning methods: methods that maximize reward and produce the best possible behavior.

02. Reinforcement learning's "divine move"

If traditional machine learning is "spoon-fed" learning, then reinforcement learning is "free-range" learning.

Traditional machine learning feeds the model large amounts of labeled data to establish a fixed mapping between inputs and outputs. The classic scene: show the computer a pile of cat and dog photos while telling it which is a cat and which is a dog. Feed it enough pictures and it learns to tell cats from dogs.
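As a concrete, deliberately toy illustration of that labeled-data recipe, the sketch below "trains" a trivial cat-vs-dog classifier. The two numeric features are hypothetical stand-ins for image pixels, invented for this example:

```python
# Supervised learning in miniature: learn a fixed input -> output mapping
# from labeled examples, then reuse it on new inputs.
# Feature vector: (ear_pointiness, snout_length) -- invented stand-in features.
labeled = [((0.9, 0.2), "cat"), ((0.8, 0.3), "cat"),
           ((0.3, 0.8), "dog"), ((0.2, 0.9), "dog")]

def centroid(label):
    # Average feature vector of all training examples with this label.
    pts = [x for x, y in labeled if y == label]
    return tuple(sum(c) / len(pts) for c in zip(*pts))

centroids = {label: centroid(label) for label in ("cat", "dog")}

def classify(x):
    # Predict the label whose average training example is closest.
    return min(centroids,
               key=lambda l: sum((a - b) ** 2 for a, b in zip(x, centroids[l])))

print(classify((0.85, 0.25)))  # -> "cat": the learned mapping at work
```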

Reinforcement learning instead optimizes its results through trial and error and a reward-and-punishment mechanism, without explicit guidance. Like a robot learning to walk: humans do not need to keep telling it "this step is right, that step is wrong". As long as it tries, falls, and adjusts, it will eventually walk, and may even walk out a gait all its own.
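The trial-and-error alternative can be shown just as compactly. Below is a minimal tabular Q-learning sketch, one of the core algorithm families in Barto and Sutton's tradition; the one-dimensional "walk to the goal" world and the hyperparameters are invented for illustration:

```python
import random

# Q-learning in miniature: an agent on a 1-D walk is never told which
# step is right. It only sees a reward at the goal, and learns from it.
N_STATES = 6          # states 0..5; state 5 is the goal
ACTIONS = [-1, +1]    # step left or step right
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

for episode in range(500):
    s = 0
    while s != N_STATES - 1:
        # Epsilon-greedy: mostly exploit current knowledge, sometimes explore.
        if random.random() < EPSILON:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s_next = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s_next == N_STATES - 1 else 0.0  # reward only at the goal
        # Nudge the estimate toward reward + discounted future value.
        best_next = max(Q[(s_next, b)] for b in ACTIONS)
        Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
        s = s_next

# After training, the greedy policy is "always step right" -- learned
# purely from falling down and getting up, never from explicit labels.
print({s: max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N_STATES - 1)})
```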

Clearly, this principle is closer to human intelligence: every child learns to walk by falling, learns to grasp by exploring, and picks syllables out of babble on the way to language.

The popular "Round Kicking Robot" is also the training of reinforcement learning | Picture source: Yushu Technology

The "highlight moment" of reinforcement learning is the "God Hand" of AlphaGo in 2016. At that time, AlphaGo made a move in the 37th hand that surprised all humans. He reversed the defeat in one move and beat Lee Sedol in one move.

Top professionals and commentators in the Go world had not expected a move at that position; to human players' trained intuition it looked "inexplicable". After the game, Lee Sedol admitted he had never even considered that line of play.

AlphaGo's "divine move" was not copied from memorized game records. It was discovered independently, through trial and error, long-term planning, and strategy optimization over countless games of self-play. That is precisely the essence of reinforcement learning.

Lee Sedol, rattled by AlphaGo's "divine move" | Picture source: AP

Reinforcement learning has even turned the tables and begun to influence human intelligence. After AlphaGo revealed its "divine move", professional players began studying AI styles of play. Scientists, too, are using the algorithms and principles of reinforcement learning to probe the learning mechanisms of the human brain. One of Barto and Sutton's research results was exactly this: a computational model explaining the role of dopamine in human decision-making and learning.
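That dopamine result rests on the temporal-difference (TD) error at the heart of their work: the quantity δ = r + γV(s') − V(s) behaves like a reward-prediction error, which is what dopamine signals appear to encode. The toy three-state trial below illustrates the idea; the numbers are invented, and this is not the authors' actual model code:

```python
# TD learning as a reward-prediction-error model: once a cue reliably
# predicts reward, the "surprise" migrates from the reward to the cue.
GAMMA, ALPHA = 1.0, 0.2
V = [0.0, 0.0, 0.0]            # values of: cue -> delay -> reward moment

for trial in range(200):
    rewards = [0.0, 0.0, 1.0]  # reward arrives only at the final step
    for t in range(3):
        v_next = V[t + 1] if t < 2 else 0.0
        delta = rewards[t] + GAMMA * v_next - V[t]  # prediction error
        V[t] += ALPHA * delta

print(V)  # all states approach 1.0: the early cue now "predicts" the reward
```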

Beyond that, reinforcement learning excels in environments with complex rules and shifting states, where it can search out the best strategy: Go, autonomous driving, robot control, or holding a conversation with ambiguous, unpredictable humans.

These are precisely today's most cutting-edge and popular AI applications, above all large language models. Nearly every leading large language model is trained with RLHF (reinforcement learning from human feedback): humans score the model's answers, and the model improves based on that feedback.
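A minimal sketch of what "humans score the answers and the model improves" means mechanically: RLHF pipelines commonly first fit a reward model to pairwise human preferences with a Bradley-Terry style loss, then tune the language model against that reward. Everything below, the length-based toy "reward model", the data, and the learning rate, is invented for illustration and is not any particular lab's pipeline:

```python
import math

# Reward-model step of RLHF in miniature: humans pick the better of two
# answers; we fit a scorer so chosen answers outscore rejected ones,
# minimizing the Bradley-Terry loss  -log(sigmoid(r_chosen - r_rejected)).
def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

# Toy scorer: a single learned weight on answer length (a stand-in
# feature; real reward models score full text with a neural network).
w = 0.0
comparisons = [(120, 40), (80, 90), (200, 60)]  # (chosen_len, rejected_len)

for chosen, rejected in comparisons * 100:
    diff = w * chosen - w * rejected
    # Gradient of -log(sigmoid(diff)) with respect to w.
    grad = -(1.0 - sigmoid(diff)) * (chosen - rejected)
    w -= 0.001 * grad  # push chosen answers above rejected ones, on average

print(w)  # the policy model would then be optimized against this reward
```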

But this is exactly what worries Barto: having built the bridge, the big companies test its safety by letting people walk back and forth across it.

"It is not a responsible approach to pushing the software directly to millions of users without any safeguards," Barto said in an interview after the awards.

"The development of technology should have been accompanied by control and evasion of potential negative impacts, but I have not seen these AI companies really do this," he added.

03. What are AI's top minds worried about?

Talk of the "AI threat" never stops, because what scientists fear most is losing control of the future their own hands created.

The "winning reviews" of Barto and Thornton did not blame the current AI technology, but was filled with dissatisfaction with AI companies.

In interviews, both warned that today's AI development rests on large companies racing to launch powerful but error-prone models, raising enormous sums and pouring billions of dollars into an arms race over chips and data.

Major investment banks are revaluing the AI industry|Picture source: Goldman Sachs

This is true. According to Deutsche Bank research, tech giants' total investment in AI is about US$340 billion, already exceeding Greece's annual GDP. Industry leader OpenAI, valued at US$260 billion, is preparing a new US$40 billion funding round.

In fact, many AI experts share Barto and Sutton's view.

Earlier, former Microsoft executive Steven Sinofsky argued that the AI industry is caught in a scaling dilemma: trading money for technological progress runs against the pattern in the history of technology, where costs gradually fall rather than rise.

On March 7, former Google CEO Eric Schmidt, Scale AI founder Alexandr Wang, and Dan Hendrycks, director of the Center for AI Safety, jointly published a warning paper.

The three tech heavyweights argue that the situation at artificial intelligence's frontier resembles the nuclear arms race that produced the Manhattan Project, and that AI companies are quietly running "Manhattan Projects" of their own, with AI investment doubling every year over the past decade. If regulators do not step in, AI could become the most destabilizing technology since the nuclear bomb.

"Super Intelligence Strategy" and co-author | Image source: nationalsecurity.ai

Yoshua Bengio, who shared that 2018 Turing Award for deep learning, also published a long warning post on his blog: the AI industry now carries trillions of dollars of value for capital to chase and fight over, and wields influence enough to seriously disrupt the current world order.

Many of these technically trained figures believe that today's AI industry has drifted away from studying technology, examining intelligence, and guarding against abuse, and toward a capital-driven, profit-seeking model of spending money to pile up chips.

"Build a huge data center, collecting money from users and allowing them to use software that is not necessarily safe. This is not a motivation for me to agree with," Barto said in an interview after winning the award.

The first edition of the International Scientific Report on the Safety of Advanced AI, written by 75 AI experts from 30 countries, puts it this way: "Approaches to managing general-purpose AI risks are often based on the assumption that AI developers and policymakers can correctly assess the capabilities and potential impacts of AGI models and systems. Yet scientific understanding of AGI's inner workings, capabilities, and social impacts is in fact very limited."

Yoshua Bengio's long warning article | Image source: Yoshua Bengio

It is not hard to see that today's "AI threat theory" has shifted its finger from the technology to the large companies.

The experts are, in effect, warning the giants: you burn money, pile up chips, and race to scale parameters, but do you really understand the products you are building? That is also why Barto and Sutton reached for the bridge metaphor: the technology belongs to all of humanity, but the capital belongs to the large companies.

What's more, Barto and Sutton's lifelong field is reinforcement learning, the approach whose principles sit closest to human intelligence, and one with "black box" characteristics: in deep reinforcement learning especially, an AI's behavior patterns become complex and hard to explain.

That is the scientists' deeper worry: they nurtured and witnessed the growth of artificial intelligence, yet find it hard to read its intentions.

The Turing laureates who created deep learning and reinforcement learning are not worried about the development of AGI (artificial general intelligence) itself. They are worried that the arms race among large companies could set off an "intelligence explosion" in the AGI field and accidentally produce ASI (artificial superintelligence). The gap between the two is not merely a technical question; it concerns the future of human civilization.

An ASI surpassing human intelligence in the information it can hold, the speed of its decisions, and its capacity for self-evolution would be far beyond the scope of human understanding. If it is not designed and governed with extreme care, it may become the last, and least controllable, technological singularity in human history.

In the current AI frenzy, these scientists may be the ones most qualified to pour cold water. After all, fifty years ago, when a computer was still a room-sized behemoth, they had already begun researching artificial intelligence. Having shaped the present from the past, they have the standing to doubt the future.

Will AI leaders usher in an Oppenheimer-like ending? |Picture source: Economist

In an interview with The Economist this February, the CEOs of DeepMind and Anthropic said:

I lie awake at night worrying that I will become the next Oppenheimer.
