Over the past two years, AI has focused on user growth, and it has succeeded in reaching a mass audience. After all, attracting new users is the obvious move in business.
However, everyday applications of AI are approaching a ceiling. Many LLMs already give good answers to the common queries most people ask.
Their speed and fluency are enough for most users, and further optimization offers limited returns; after all, these problems are not technically hard.
Perhaps what is really worthy of attention in the future is the field of science and engineering.
OpenAI scientist Jason Wei recently posted a prediction: within the next year, the focus of AI may shift from everyday use to science.
He believes that over the next five years, AI's focus will shift to hard-core fields, using AI to accelerate science and engineering, because that is the engine that truly drives technological progress.
For ordinary users’ simple problems, there is not much room for improvement.
But there is huge room for improvement at every scientific frontier, where AI can be used to crack the top 1% of problems that trigger technological leaps.
AI has the potential to not only answer these questions, but also inspire people to think about larger challenges.
Moreover, progress in AI also accelerates AI research itself, helping AI become stronger. AI progress compounds like interest; it may be the most powerful positive-feedback loop there is.
To put it bluntly, the next five years will be the era of "AI scientists" and "AI engineers".
A recent paper published by DeepMind also hints at this trend: in laboratories around the world, scientists’ use of AI is increasing exponentially.
Report address: https://storage.googleapis.com/deepmind-media/DeepMind.com/Assets/Docs/a-new-golden-age-of-discovery_nov-2024.pdf
AI accelerates a golden age of scientific discovery
Today, one in three postdoctoral researchers uses large language models to assist with literature reviews, programming, article writing, and other tasks.
This year's Nobel Prize in Chemistry also exceeded expectations: it was awarded to Demis Hassabis and John Jumper, the creators of AlphaFold 2. That recognition has inspired many scientists to apply AI in their own fields in pursuit of more innovative discoveries.
The number of scientists has increased dramatically in the past half century, more than sevenfold in the United States alone, but social progress brought about by science and technology has slowed down.
One reason is that the problems modern scientists face are growing ever larger in scale and complexity.
However, deep learning is good at handling this complex situation and can significantly reduce the time cost of scientific discovery.
For example, traditional X-ray crystallography can take years and roughly $100,000 to determine a single protein structure, whereas AlphaFold provides 200 million predicted structures for free, instantly outpacing traditional methods.
Five major opportunities
For scientists facing bottlenecks at different stages of research, seizing key opportunities to use AI may lead to powerful new discoveries.
Five opportunities to use AI to advance scientific research
1. Knowledge - changing how scientists acquire and communicate knowledge
To drive new discoveries, scientists must master an increasingly specialized, diverse, and exponentially growing body of knowledge.
This "knowledge burden" has made disruptive discoveries increasingly the province of older scientists and interdisciplinary teams at top universities, and it has driven a continued decline in the share of papers written by small, independent teams.
Moreover, most scientific results are still shared as dense, English-language papers, limiting the attention and interest of policymakers, businesses, and the public.
Now, scientists and the public alike can use LLMs to ease this burden.
For example, one team used Google Gemini to extract relevant insights from 200,000 papers in a single day; non-specialists can likewise use LLMs to summarize papers and answer questions, gaining access to specialized academic knowledge and a direct line to cutting-edge science.
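The mechanics behind this kind of large-scale literature mining amount to a map-reduce over documents: summarize each paper against a question, then synthesize the per-paper notes. Below is a minimal Python sketch of the pattern, assuming a hypothetical `complete()` wrapper around whatever LLM API is available; it illustrates the idea, not the actual Gemini pipeline.

```python
def complete(prompt: str) -> str:
    """Hypothetical stub; wire this to your LLM provider of choice."""
    raise NotImplementedError

def summarize_paper(abstract: str, question: str) -> str:
    # Map step: extract only the insight relevant to the question.
    return complete(
        f"Question: {question}\n\nAbstract: {abstract}\n\n"
        "In two sentences, state any finding relevant to the question, "
        "or reply 'not relevant'."
    )

def synthesize(summaries: list[str], question: str) -> str:
    # Reduce step: merge the per-paper notes into one overview.
    relevant = [s for s in summaries if "not relevant" not in s.lower()]
    return complete(
        f"Question: {question}\n\nNotes from {len(relevant)} papers:\n"
        + "\n".join(f"- {s}" for s in relevant)
        + "\n\nSynthesize these notes into a short literature review."
    )
```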
2. Data - generating, extracting, and annotating large scientific datasets
Although we live in an era of data explosion, many natural and social domains suffer a serious shortage of scientific data, such as soil, the deep sea, the atmosphere, and the informal economy.
AI is helping to change this. It can reduce the noise and errors that arise when sequencing DNA, detecting specific cell types in a sample, or capturing animal sounds.
Scientists can also exploit LLMs' growing multimodal capabilities to extract unstructured scientific data from sources such as publications, archival documents, and video, converting it into structured datasets for subsequent research.
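A common recipe for this kind of extraction is to ask the model to fill a fixed schema and return machine-readable output. A sketch, with an illustrative schema and the same hypothetical `complete()` stub as above:

```python
import json
from dataclasses import dataclass

def complete(prompt: str) -> str:
    """Hypothetical stub; wire this to your LLM provider of choice."""
    raise NotImplementedError

@dataclass
class Measurement:
    # Illustrative schema; adapt the fields to your domain.
    species: str
    quantity: str
    value: float
    unit: str

def extract_measurements(passage: str) -> list[Measurement]:
    raw = complete(
        "Extract every quantitative measurement in the passage as a JSON "
        'list of {"species", "quantity", "value", "unit"} objects. '
        "Return only JSON.\n\nPassage:\n" + passage
    )
    # Validate by parsing into the typed schema.
    return [Measurement(**item) for item in json.loads(raw)]
```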
AI can also help enrich scientific data with the auxiliary information scientists need. For example, at least one-third of microbial proteins lack reliable annotations describing how they perform their functions.
AI models validated through reliability assessments can also serve as a source of new synthetic scientific data. For example, the AlphaProteo protein design model was trained on more than 100 million AI-generated structures from AlphaFold 2 as well as experimental structures from the Protein Data Bank.
3. Experiments - simulating, accelerating, and guiding complex experiments
Scientific experiments are often hard to carry out because they are expensive, complex, and time-consuming. Some experiments cannot be performed at all because researchers lack access to the required facilities, staff, or materials.
Nuclear fusion is a typical example. It promises a virtually limitless, emission-free source of energy and could support innovative, energy-intensive applications such as desalination. But the tokamak reactors needed to confine the plasma are complex and expensive: the ITER prototype has been under construction since 2013, and experiments are not expected to begin until the mid-2030s.
AI can accelerate the experimental process through simulation.
One approach is to use reinforcement-learning agents to control simulated physical systems. For example, researchers collaborating with the Ecole Polytechnique Fédérale de Lausanne used reinforcement learning to control the shape of a tokamak plasma. The same approach can be applied to particle accelerators, telescopes, and other facilities.
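To make the framing concrete, here is a toy REINFORCE sketch that learns to drive a one-dimensional "shape error" to zero. The dynamics and reward are stand-ins invented for illustration, not tokamak physics, and the real work used a far more sophisticated actor-critic setup.

```python
import numpy as np

rng = np.random.default_rng(0)
theta = np.zeros(2)        # linear policy: mean action = theta @ [error, 1]
alpha, sigma = 0.002, 0.3  # learning rate, exploration noise

def rollout(theta, steps=30):
    """One episode on toy dynamics that damp and shift the shape error."""
    error, traj, ret = float(rng.normal()), [], 0.0
    for _ in range(steps):
        feats = np.array([error, 1.0])
        noise = sigma * float(rng.normal())
        action = float(theta @ feats) + noise                # Gaussian exploration
        error = float(np.clip(0.9 * error + action, -5, 5))  # toy dynamics
        ret += -error ** 2                                   # reward small errors
        traj.append((feats, noise))
    return traj, ret

baseline = 0.0
for _ in range(3000):
    traj, ret = rollout(theta)
    baseline += 0.05 * (ret - baseline)   # running-average baseline
    for feats, noise in traj:             # REINFORCE policy-gradient step
        theta += alpha * (ret - baseline) * (noise / sigma**2) * feats / len(traj)

print(theta)  # should drift toward roughly [-0.9, 0], cancelling the error
```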
How AI simulation is used varies across disciplines, but one thing is common: the simulations usually guide and optimize real-world experiments rather than replacing them entirely.
Take genetics. The average person carries more than 9,000 missense variants; most are harmless, but a few cause disease. Experimentally, their effects can only be tested protein by protein. AlphaMissense quickly classified 89% of the 71 million possible variants, helping scientists focus on high-risk variants and accelerating disease research.
AlphaMissense predictions of pathogenicity for all possible 71 million missense variants
4. Models - modeling the interactions between complex systems and their components
In 1960, Nobel Prize-winning physicist Eugene Wigner marveled at the "unreasonable effectiveness" of mathematical equations in modeling natural phenomena such as planetary motion.
But for complex systems such as biology, economics, and weather, traditional equation-based models increasingly fall short: these systems are dynamic and stochastic, often exhibit emergence and chaos, and are hard to predict and control. The equations give useful but imperfect approximations, and solving them is computationally expensive.
AI can mine patterns from complex data. For example, Google's deep-learning weather system can predict the weather up to 10 days ahead, faster than traditional numerical models and at least as accurate.
AI can also help mitigate climate harms. For example, it can predict when and where humid regions will form, helping pilots avoid creating contrails that exacerbate global warming.
Yet however powerful AI is, it enriches rather than replaces traditional modeling of complex systems.
For example, agent-based modeling simulates interactions between individual actors (such as businesses and consumers) to understand how those interactions affect larger, more complex systems (such as an economy).
In traditional methods, scientists need to specify in advance how these agents will behave.
Scientists can now use large language models to create more flexible generative agents that communicate and act, for example by searching for information or making purchases, while also reasoning about and remembering those actions.
Scientists can also use reinforcement learning to study how these agents learn and adjust their behavior in more dynamic simulations, for example in response to new energy prices or epidemic-response policies.
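A bare-bones generative agent might look like the sketch below. The `complete()` stub is again a hypothetical LLM wrapper, and real systems add planning, reflection, and much richer memory.

```python
def complete(prompt: str) -> str:
    """Hypothetical stub; wire this to your LLM provider of choice."""
    raise NotImplementedError

class GenerativeAgent:
    """Toy LLM-backed agent that remembers its observations and actions."""

    def __init__(self, persona: str):
        self.persona = persona
        self.memory: list[str] = []

    def act(self, observation: str) -> str:
        self.memory.append(f"Observed: {observation}")
        action = complete(
            f"You are {self.persona}.\n"
            "Recent memory:\n" + "\n".join(self.memory[-10:]) + "\n"
            "Choose one action, e.g. 'search <query>' or 'buy <item>', "
            "and briefly justify it."
        )
        self.memory.append(f"Acted: {action}")
        return action

# Example: simulate households reacting to an energy price shock.
# agents = [GenerativeAgent("a budget-conscious household") for _ in range(100)]
# actions = [a.act("Electricity prices rose 20% this month.") for a in agents]
```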
5. Solutions - proposing solutions to problems with vast search spaces
Many important scientific problems come with an almost incomprehensibly large number of potential solutions.
For example, biologists and chemists need to determine the structure, properties, and functions of molecules such as proteins in order to design new molecules for use as antibody drugs, enzymes that degrade plastics, or new materials.
However, to design a small-molecule drug, scientists face more than 10^60 potential candidates; to design a protein of 400 standard amino acids, they face 20^400 possible sequences.
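For a sense of scale, these counts are easiest to compare in log space; a quick back-of-the-envelope calculation:

```python
import math

drug_space = 60                       # log10 of ~10^60 small-molecule candidates
protein_space = 400 * math.log10(20)  # log10 of 20^400 sequences, about 520
atoms_in_universe = 80                # log10 of ~10^80, a common estimate

print(f"protein design space: about 10^{protein_space:.0f}")
print(f"that is 10^{protein_space - atoms_in_universe:.0f} times "
      "the number of atoms in the observable universe")
```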
Such vast search spaces are not limited to molecules; they appear widely across scientific problems, from finding the best proof of a mathematical statement to the best design of a computer chip.
Traditionally, scientists have relied on some mix of intuition, trial and error, iteration, and brute-force computation to find the best molecule, proof, or algorithm. But these methods struggle to traverse such enormous search spaces, and better solutions go undiscovered.
Now, AI is better able to explore these vast search spaces while focusing more quickly on solutions that are most likely to be feasible and effective.
In July this year, AlphaProof and AlphaGeometry 2 solved four of the six problems at the International Mathematical Olympiad. They use the Gemini language-model architecture to generate large numbers of candidate solutions to a given problem, combined with systems grounded in formal mathematical logic that iteratively home in on the candidates most likely to be correct.
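The underlying pattern is generate-and-verify: sample many candidates from a language model, keep only those a trusted checker accepts, and iterate. A schematic sketch; `complete()` and `formal_check()` are hypothetical stand-ins (AlphaProof actually verifies candidates as Lean proofs):

```python
def complete(prompt: str) -> str:
    """Hypothetical stub; wire this to your LLM provider of choice."""
    raise NotImplementedError

def formal_check(candidate: str) -> bool:
    """Stand-in for a formal verifier, e.g. a proof checker like Lean."""
    raise NotImplementedError

def solve(problem: str, rounds: int = 5, samples: int = 100) -> str | None:
    hint = ""
    for _ in range(rounds):
        candidates = [
            complete(f"Problem: {problem}\n{hint}\nPropose a complete solution.")
            for _ in range(samples)
        ]
        verified = [c for c in candidates if formal_check(c)]
        if verified:
            return verified[0]  # any verified candidate is correct by construction
        # Steer the next round away from the failed attempts.
        hint = "Previous attempts failed; try a different approach."
    return None
```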
AI scientists or AI-empowered scientists?
Even as AI systems' capabilities continue to improve, their greatest marginal benefit will come from applying them in settings that play to their relative strengths.
For example, rapidly extracting information from massive datasets and relieving real bottlenecks in scientific progress, rather than trying to automate tasks human scientists are already good at.
As AI drives science to become more cost-effective, society’s demand for science and scientists will also increase.
Unlike other industries, the demand for science is almost unlimited, and technology will not reduce the demand for scientists. New developments will always open up new and unpredictable areas in the scientific landscape, and the same is true for AI.
As Herbert Simon envisioned, AI systems are themselves objects of scientific study, and scientists will play a leading role in evaluating and interpreting their scientific capabilities and in developing new human-AI systems for science.
Key elements
In this part, the paper digs into several key factors in realizing "AI for Science" and distills them into an "AI for Science production function".
The model shows how AI can advance the different stages of scientific research and what each stage demands.
It starts from problem selection and model evaluations, runs on infrastructure such as compute and data, attends to organizational design and interdisciplinarity during the research itself, and finally turns results into real-world impact through adoption. Partnerships and safety & responsibility underpin the entire process, keeping it efficient and ethical.
While many of the elements seem intuitive, DeepMind’s paper reveals some important lessons learned in practice.
1. Problem selection
The key to scientific progress is finding problems that are truly worth solving.
At DeepMind, the scientific team usually first evaluates whether a research question is important enough and worthy of investing a lot of time and resources.
DeepMind CEO Demis Hassabis has proposed a thinking model: treat all of science as a tree of knowledge.
The most important thing, then, is to find the roots of the tree: fundamental "root problems" such as protein structure prediction and quantum chemistry. Once solved, they branch out and unlock whole new lines of research and application.
Among such problems, to judge whether AI can help, look for specific characteristics: a huge combinatorial search space, large amounts of data, and a clear objective function for measuring performance.
Many recent breakthroughs result from the collision of important scientific issues and mature AI methods.
For example, DeepMind's progress in nuclear fusion research benefited from a newly released reinforcement-learning algorithm, maximum a posteriori policy optimization.
Choosing the right questions is important, but the difficulty of the questions also needs to be just right. A problem suitable for AI is usually one that can produce intermediate results.
If the problem is too difficult, it won’t generate enough feedback to drive progress. To do this requires a combination of intuition and experimentation.
2. Model evaluation
In scientific AI research, how models are evaluated also matters greatly.
Scientists typically evaluate AI models' scientific capabilities through benchmarks, metrics, and competitions.
If well designed, these evaluations not only track progress but also stimulate methodological innovation and spark researchers' interest in the underlying scientific questions.
Different situations require different assessment methods.
For example, DeepMind’s weather prediction team initially used “progress indicators” based on several key variables (such as surface temperature) to improve model performance.
Once the model reached a certain level of performance, they adopted a more comprehensive evaluation comprising more than 1,300 metrics, inspired by the evaluation scorecard of the European Centre for Medium-Range Weather Forecasts (ECMWF).
The team also found that AI models sometimes "cheat" on certain metrics. One example is the "double penalty" problem: a "blurry" forecast (say, spreading predicted rainfall over a large area) can be penalized less than a "sharp" forecast that places a storm slightly off its actual location.
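A tiny numeric example shows why such metrics reward blur. Suppose rain actually falls in one cell of a 10-cell grid: a sharp forecast that is one cell off is penalized twice (a miss plus a false alarm), while smearing the same rain over three cells scores a lower RMSE.

```python
import numpy as np

truth = np.zeros(10); truth[5] = 1.0        # rain falls in cell 5

sharp = np.zeros(10); sharp[6] = 1.0        # right amount, one cell off
blurry = np.zeros(10); blurry[4:7] = 1 / 3  # same rain smeared over 3 cells

def rmse(f): return float(np.sqrt(np.mean((f - truth) ** 2)))

print(f"sharp:  {rmse(sharp):.3f}")   # 0.447: penalized for miss AND false alarm
print(f"blurry: {rmse(blurry):.3f}")  # 0.258: the vaguer forecast "wins"
```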
For further validation, the team also evaluated the model's usefulness on downstream tasks, such as predicting cyclone tracks and characterizing the strength of "atmospheric rivers" (narrow bands of concentrated moisture that can cause flooding).
The most influential scientific AI evaluations are often community-led, such as the Critical Assessment of protein Structure Prediction (CASP) competition.
The competition was initiated by Professors John Moult and Krzysztof Fidelis in 1994 and is held every two years. The goal of CASP is to promote technological innovation in related fields and deepen the understanding of protein folding and structure by testing the accuracy of the protein structure prediction methods of each participating team.
However, benchmarks also carry a risk: they may "leak" into an AI model's training data, letting the model "cheat" and reducing the benchmark's value for tracking progress.
There is no perfect solution to this "cheating" problem, but at a minimum benchmarks should be updated regularly, and more open third-party evaluations and competitions should be encouraged.
3. Computing resources
Compute is the core engine of AI and of scientific progress, but it is also a focal point for energy saving and emission reduction.
AI laboratories and policymakers need to balance model needs and efficiency improvements from a long-term perspective.
For example, protein design models are small and efficient, while large language models are computationally intensive to train but comparatively cheap to fine-tune and run; optimizing the data or "distilling" a large model into a small one can further reduce computational costs.
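The standard distillation recipe combines a soft loss against the teacher's softened output distribution with an ordinary hard loss against the labels. A minimal PyTorch sketch; the temperature `T` and mixing weight `alpha` are illustrative hyperparameters:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      T: float = 2.0, alpha: float = 0.5):
    # Soft loss: match the teacher's temperature-softened distribution.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradients are comparable across temperatures
    # Hard loss: ordinary cross-entropy against the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```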
At the same time, it is also necessary to compare the resource consumption of AI and other scientific methods.
For example, although the training of AI-driven weather prediction models consumes resources, the overall efficiency may be better than traditional methods. Continuous tracking of empirical data can help clarify these trends and inform planning for future computing needs.
In addition, computing strategies should not only focus on the adequacy of chip supply, but also need to prioritize building critical infrastructure and improving engineering skills to ensure resource access and system reliability. However, academia and public research institutions are often under-resourced in these areas and need more support.
4. Data
Like computing resources, data is infrastructure for scientific AI, and it requires continuous development, maintenance, and updating.
People often focus on the creation of new data sets driven by policymakers.
For example, the Materials Project, launched under the Obama administration in 2012, mapped inorganic crystals and provided the data foundation for DeepMind's recent GNoME project, which predicted 2.2 million new materials.
But many scientific AI breakthroughs often emerge from more organic data that benefit from the efforts of visionary individuals or small teams.
For example, the gnomAD genetic-variation dataset, developed by Daniel MacArthur, then at the Broad Institute, provided the basis for DeepMind's AlphaMissense project.
Similarly, the mathematical proof assistant Lean was originally developed by Leonardo de Moura and has become an important training resource for AI mathematics models such as AlphaProof.
These cases illustrate that in addition to top-down strategic planning, researchers also need to be motivated to play a more active role in data collection, organization, and sharing.
Today, much wet-lab experimental data is discarded for lack of funding, while the Protein Data Bank (PDB) owes its high-quality data to unified standards enforced by journal requirements and professional data curators. Genomic data, by contrast, often needs extra integration and cleaning because standards differ.
In addition, there are many high-quality data sets that are completely untapped, such as biodiversity data that cannot be made public due to licensing restrictions, or historical data from decades of nuclear fusion experiments. Whether due to lack of resources, time, or data embargo periods, these bottlenecks will hinder the release of AI's potential in science.
5. Organizational design
Academia tends to be bottom-up and industry top-down, but top laboratories often strike a balance between the two.
In their golden years, Bell Labs and Xerox PARC were famous for their free-exploration research model, which also inspired the founding of DeepMind.
Recently, a number of emerging scientific institutions have tried to learn from these examples and replicate this research model. They hope to promote more high-risk, high-reward research, cut bureaucracy and provide better incentives for scientists.
These institutions aim to tackle problems that are too big for academia yet not profitable enough for industry, such as extending the Lean proof assistant, a tool crucial to AI research in mathematics.
The core aim of these institutions is to combine top-down coordination with bottom-up empowerment of scientists: you can neither rely entirely on scientists' free exploration (which risks inefficiency and scattered research directions) nor dictate every step (which stifles creativity).
Ideally, institutions provide clear goals, resources, and support to scientists, but specific research methods and processes are led by the scientists themselves.
Striking this balance not only attracts top research leaders; it is also key to success. Demis Hassabis calls it the core secret of coordinating cutting-edge research.
This balance also applies within individual projects. At DeepMind, research often alternates between two modes: an "exploration" state, in which the team hunts for new ideas, and an "exploitation" state, in which it focuses on engineering and scaling performance.
Mastering the timing of mode switching and adjusting the team's rhythm is an art.
6. Interdisciplinarity
Interdisciplinary collaboration is key to solving scientific problems, but it is often blocked by disciplinary barriers.
Scientific AI research often requires a multi-disciplinary start, but real breakthroughs come from deep integration across disciplines. It’s not just about bringing people together, it’s about getting teams to work together to develop shared methods and ideas.
For example, DeepMind's Ithaca project uses AI to restore damaged ancient Greek inscriptions. For it to succeed, the AI research leads had to immerse themselves in epigraphy, and the epigraphists had to understand the AI models, because intuition on both sides is crucial to the work.
Fostering this kind of team dynamic requires the right incentives. One way is to focus on solving the problem rather than on authorship credit; this was also key to AlphaFold 2's success.
Such focus is easier to achieve in industrial laboratories, which underscores the importance of long-term public research funding that moves away from over-reliance on publication pressure.
To achieve true interdisciplinary collaboration, organizations also need to create roles and career paths for people who can help bring disciplines together.
At DeepMind, research engineers drive a virtuous cycle of research and engineering, and project managers strengthen team collaboration and connect different projects. DeepMind also prioritizes recruiting people who are good at discovering intersections between disciplines, and encourages scientists and engineers to switch projects regularly.
The key is to create a culture that is driven by curiosity, respects differences, and dares to debate. Economic historian Joel Mokyr calls this culture "contestability": researchers from different backgrounds can discuss openly, criticize each other and make progress together.
The practice of this culture can be achieved through regular interdisciplinary workshops, open discussion platforms, and encouraging interaction within and outside the team.
This restored inscription (IG I3 4B) records a decree related to the Acropolis, dating to 485-484 BC
7. Adoption
Scientific AI tools such as AlphaFold are both specialized and versatile: they focus on a small number of tasks yet serve a wide range of scientific communities, from studying diseases to improving fisheries.
However, turning scientific progress into practical application is not simple. Germ theory, for example, took a long time to gain wide acceptance, and the downstream products of scientific breakthroughs (such as new antibiotics) often go underdeveloped for lack of suitable market incentives.
To promote the deployment of its models, DeepMind balances factors such as scientist adoption, business goals, and safety risks, and has established a dedicated Impact Accelerator to drive the application of its research and encourage collaborations aimed at social good.
To make new tools more accessible to scientists, the integration process must be simple.
In developing AlphaFold 2, DeepMind not only open-sourced the code but also partnered with EMBL-EBI to build a database that lets scientists with limited computing resources easily query 200 million protein structures.
AlphaFold 3 expands the functionality further, and a surge in demand was anticipated; in response, DeepMind launched AlphaFold Server, which lets scientists generate structures on demand.
Meanwhile, the scientific community has spontaneously built tools such as ColabFold, demonstrating the value of attending to diverse needs and of cultivating computational skills within the community.
To date, more than 2 million users from more than 190 countries around the world have accessed the AlphaFold protein structure database and browsed more than 7 million structures
Scientists will only use AI models they trust. The key to adoption is to make the model's purpose and limitations clear.
For example, during AlphaFold's development, DeepMind designed an uncertainty metric that conveys the model's confidence in its predictions through intuitive visualization, and worked with EMBL-EBI on a training module that teaches users how to interpret those confidence scores, using real cases to build trust.
Similarly, the Med-Gemini system performs well on health question answering. It estimates uncertainty by generating multiple chains of reasoning and measuring how much their answers disagree; when uncertainty is high, it automatically invokes web search to integrate the latest information.
This approach not only improves reliability but also lets scientists see the decision process at a glance, greatly strengthening their trust.
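The self-consistency idea behind this can be sketched simply: sample several reasoning chains, treat disagreement among their final answers as uncertainty, and fall back to retrieval when agreement is low. The `complete()` and `web_search()` functions below are hypothetical stand-ins, not Med-Gemini's actual interface:

```python
from collections import Counter

def complete(prompt: str) -> str:
    """Hypothetical stub; wire this to your LLM provider of choice."""
    raise NotImplementedError

def web_search(query: str) -> str:
    """Hypothetical stub for a retrieval call returning fresh context."""
    raise NotImplementedError

def answer_with_uncertainty(question: str, n: int = 7,
                            threshold: float = 0.6) -> str:
    # Sample several independent chains of reasoning.
    answers = [
        complete(f"{question}\nThink step by step; end with 'ANSWER: <answer>'.")
        for _ in range(n)
    ]
    # Crude agreement measure: frequency of the most common final answer.
    finals = [a.rsplit("ANSWER:", 1)[-1].strip() for a in answers]
    top, count = Counter(finals).most_common(1)[0]
    if count / n >= threshold:
        return top  # chains agree: answer directly
    # Chains disagree: retrieve fresh information and try again.
    context = web_search(question)
    return complete(f"Context: {context}\n\nQuestion: {question}")
```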
Med-Gemini-3D is able to generate reports for CT scans, which are much more complex than standard X-ray imaging. In this example, Med-Gemini-3D's report correctly includes a lesion (outlined in green) that was missed in the original radiologist's report
8. Collaboration
Scientific AI depends on collaboration across many fields, and collaboration between the public and private sectors is especially critical.
This collaboration spans the entire project, from data set creation to result sharing.
For example, whether new materials designed by AI models are actually usable must be assessed by senior materials scientists; whether the anti-SARS-CoV-2 proteins DeepMind designed bind their targets as intended had to be verified through wet-lab experiments with the Crick Institute. Even in mathematics, FunSearch's solution to the cap set problem benefited from the expert guidance of mathematician Jordan Ellenberg.
Given the central role of industrial laboratories in promoting the development of AI and the need for rich domain knowledge, public-private collaboration will become increasingly important in advancing the development of scientific AI frontiers. To do this, there must be greater support for public-private partnerships, such as more funding for joint teams between universities and research institutions and businesses.
But cooperation is not simple. All parties need to agree on goals and key issues as early as possible: ownership of research results, whether to publish papers, whether data and models are open source, applicable license agreements, etc., all of which may lead to disputes. These disagreements often reflect different incentives on both sides, but successful collaborations are often built on a clear exchange of value.
For example, the AlphaFold protein database reaches 2 million users precisely because it pairs DeepMind's AI model with EMBL-EBI's expertise in biological data curation. Complementary collaboration of this kind is not only efficient; it maximizes AI's potential.