David Luan, an early OpenAI employee, in his latest interview: DeepSeek has not changed the narrative of AI technology



Recently, on Redpoint Ventures' podcast "Unsupervised Learning", Redpoint partner Jacob Effron interviewed David Luan. From a technical perspective, they discussed what DeepSeek means for research and practice in the large-model field, and shared their thinking on the current bottlenecks of AI models and potential directions for breakthroughs.

David Luan was an early employee of OpenAI. He graduated from Yale University in 2009 and first joined iRobot to work in robotics. He then worked at several companies (including Microsoft) before joining OpenAI in 2017, when the research and engineering team numbered only about 35 people. In this interview he also mentions that he entered artificial intelligence because of his interest in robots, believing that "the biggest limitation of robots lies in the intelligence of the underlying algorithms."

In 2020, David Luan left OpenAI and joined Google, but not long after, he co-founded Adept with two colleagues he met at Google and served as CEO. Last August, he joined Amazon as head of its AGI lab in San Francisco.

The following is the main text of the interview, compiled by "Bright Company" (lightly edited):

The limitations of large models and the value of reinforcement learning

Jacob: David Luan is the head of Amazon AGI Labs. He was previously the co-founder and CEO of Adept, which raised more than $400 million to develop AI Agents, and he was involved in many key breakthroughs during his tenure as Vice President of Engineering at OpenAI. I'm Jacob Effron.

In today's show, David and I discuss many interesting topics, including his views on DeepSeek, his predictions for future model progress, the current state of Agents, how to make them reliable, and when they will be everywhere. He also shared some interesting stories about the early days of OpenAI and its unique culture. It was a very fun conversation, because David and I have known each other for over a decade. I think the audience will like it. David, thank you for coming on our podcast.

David: Thank you for inviting me. It will be very interesting because we have known each other for over a decade.  

Jacob: I remember when you first joined OpenAI, I thought it seemed interesting, but I wasn't sure it was a wise career choice. And then, obviously, you always see opportunities earlier than others.

David: I was really lucky, because I was always interested in robots, and the biggest limitation of robots (at the time) was the intelligence of the underlying algorithms. So I started working on AI, and seeing these technologies make progress within our lifetimes is really cool.

Jacob: Today I want to discuss a lot of topics with you, starting with the recent hot topic. Obviously, the past few weeks saw a big reaction to DeepSeek: everyone was talking about it, stocks plummeted, and some said it was bad news for OpenAI and Anthropic. I think people's emotions have now eased from the initial panic. But I'm curious: in the broader discussion of this event's impact, what did people get wrong?

David: I still remember that morning when everyone was following the news about DeepSeek. When I woke up, I looked at my phone and found five missed calls. I thought to myself: what on earth happened? The last time this happened was when SVB (Silicon Valley Bank) collapsed, when all the investors were calling me to pull funds out of SVB and First Republic Bank. So I assumed something bad must have happened. I checked the news and found that stocks had plunged because of the release of DeepSeek R1. I immediately realized that people had gotten this completely wrong. DeepSeek did a great job, but it is part of a broader narrative: first we learn how to make new models smarter, and then we learn how to make them more efficient.

So this is actually a turning point. What everyone misunderstood is that being able to achieve more intelligence at a lower cost does not mean you stop pursuing intelligence; on the contrary, you use more of it. Now that the market has realized this, rationality is returning.

Jacob: Given that the base model at least appears to have been trained on OpenAI outputs, and you can make the base DeepSeek model behave like ChatGPT in various ways, will OpenAI and Anthropic stop releasing these models so publicly going forward, given the concerns about knowledge distillation?

David: I think what will happen is that people always want to build the smartest models, but those models are not always efficient at inference. So I think we're going to see more and more of this, even if people don't discuss it explicitly: they will train these huge "teacher models" in their internal labs, using all the computing resources they can get, and then try to compress them into efficient models that are suitable for customers.
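
To make the "compress a teacher into a student" step concrete, here is a minimal sketch of the classic knowledge-distillation loss. This is a generic illustration, not a description of any particular lab's pipeline:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float = 2.0) -> torch.Tensor:
    # Soften both distributions, then push the student toward the teacher's behavior.
    t = temperature
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)
    # KL divergence between teacher and student, rescaled by t^2 as in the original recipe.
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * (t * t)

# Toy usage: a batch of 4 examples over a 10-token vocabulary.
loss = distillation_loss(torch.randn(4, 10), torch.randn(4, 10))
```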

The biggest problem I see right now is this: I picture the use cases of artificial intelligence as concentric circles of complexity. The innermost circle might be a simple chat conversation with a basic language model, which we could already do well with GPT-2. Every added level of intelligence, such as being able to do mental arithmetic, programming, or later agents, or even drug discovery, requires a smarter model. But each previous level of intelligence has become so cheap that it can be quantized (note: quantization reduces the numerical precision of a model to cut resource consumption).

Jacob: This reminds me of the trend of test-time compute. This seems like a very exciting path forward, especially in areas that are easy to verify, such as programming and mathematics. How far can this paradigm take us?

David: There is a series of papers and podcasts documenting my years of discussing how to build AGI (artificial general intelligence).

Jacob: Let's add something new to those discussions.

David: So now we can prove that we had this conversation at this moment. Back in 2020, when we started to see the emergence of GPT-2, GPT-3 may have been in development or already finished, and we were starting to think about GPT-4. We lived in a world where people were not sure whether next-token prediction alone could solve all of AGI.

My view, and the view of some people around me, was actually "no." The reason is that a model trained on next-token prediction is essentially punished for discovering new knowledge, because new knowledge is not in the training set. So what we needed to do was look at other known machine-learning paradigms that can truly discover new knowledge. We know reinforcement learning (RL) can do this, for example in search, or like AlphaGo, which may have been the first time the public realized we can discover new knowledge with RL. The question has always been when we would combine large language models (LLMs) with RL to build systems that both contain human knowledge and can build on top of it.

Jacob: So, for areas that are not easy to verify, such as healthcare or law, can this test-time compute paradigm get us to models that can handle those problems? Or will we become very good at programming and math, but still unable to tell a joke?

David: This is a topic worth debating, and I have a very clear point of view.  

Jacob: What is your answer?  

David: The generalization ability of these models is stronger than you think. Everyone says things like, "I used o1 and it seems better at math, but when I wait for it to think, it can be a bit worse at other things than ChatGPT or other models." I think these are just small twists and turns on the way to something stronger. Today we have already seen signs that explicitly verifying whether the model solves a problem correctly (as we saw with DeepSeek) does lead to transfer on somewhat fuzzier problems in similar domains. And I think everyone, my team and other teams, is working hard to address human preferences on these more complex tasks, so the models can satisfy those preferences.

Jacob: Yes. And you always need to be able to build a model that can verify things like "this output is a good legal opinion" or "this output is a good medical diagnosis," which is obviously much harder than verifying whether a mathematical proof or a piece of code works.

David: I think we are exploiting the gap between these models' abilities: the same set of neural network weights is better at judging whether something was done well than at generating the correct answer itself. We consistently see these models are stronger at judging "was this job done well" than at "generate a good answer." In a way we exploit this, using various RL tools to pull the model toward doing the thing well.
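
One simple way to exploit that judging-versus-generating gap is best-of-n sampling, where candidate answers are scored and only the best is kept. This is a generic illustration; David does not spell out his team's exact method, and both hooks below are placeholders:

```python
from typing import Callable, List

def best_of_n(generate: Callable[[str], str],
              score: Callable[[str, str], float],
              prompt: str,
              n: int = 8) -> str:
    """Sample n candidate answers and keep the one the scorer rates highest.

    `generate` and `score` are hypothetical hooks; in practice both could be
    backed by the same model weights, which is exactly the gap being exploited.
    """
    candidates: List[str] = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda answer: score(prompt, answer))
```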

Jacob: In order to truly launch a model like this, what research problems need to be solved?

David: There are so many problems that I'll just mention three we need to solve. The first is that you need to really know how to build an organization and process that can make models reliably.

I have always told the people I work with that today, if you run a modern AI lab, your job is not to build models but to build a factory that reliably produces models. When you think about it this way, it completely changes where you invest. Until you reach reproducibility, I don't think there is much progress, to some extent. We have just gone through the transition from alchemy to industrialization, and the way these models are built has changed. Without this foundation, these models cannot work.

I think the next part is that you have to go slow to go fast. But that first part matters. I always believe people are attracted to algorithms because they look cool and sexy, but if you look at what really drives this, it's actually an engineering problem. For example, how do you run large-scale compute clusters so they can run reliably for long enough, and so that if a node crashes you don't waste too much of your job's work? To push the frontier of scale, this is a real problem.
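
As a small illustration of the point about node crashes, long training runs typically checkpoint periodically so that a failure costs minutes of work rather than days. A generic sketch, not any lab's actual infrastructure:

```python
import os
import torch

def save_checkpoint(model, optimizer, step, path="checkpoint.pt"):
    # Write to a temp file and rename atomically, so a crash mid-save
    # never corrupts the last good checkpoint.
    tmp = path + ".tmp"
    torch.save({"model": model.state_dict(),
                "optimizer": optimizer.state_dict(),
                "step": step}, tmp)
    os.replace(tmp, path)

def load_checkpoint(model, optimizer, path="checkpoint.pt"):
    # Resume from the last saved step, or start fresh if no checkpoint exists.
    if not os.path.exists(path):
        return 0
    state = torch.load(path)
    model.load_state_dict(state["model"])
    optimizer.load_state_dict(state["optimizer"])
    return state["step"]
```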

Now, for the entire field of reinforcement learning (RL), we will soon enter a world with many data centers, each doing a lot of inference on the base model, perhaps testing in new environments brought by customers, learning how to improve the model, and feeding that new knowledge back to a central place to make the model smarter.

Jacob: There are people like Yann LeCun who have been criticizing the limitations of large language models (LLMs). I'd like you to summarize this criticism for our audience and then give your view on the people who say these models can never produce truly original thinking.

David: I think we already have counterexamples; AlphaGo produced original thinking. If you look back at the early days of OpenAI, we used RL to play Flash games. If you are of that age, you probably remember MiniClip and things like that. These used to be a middle-school pastime, and it was really fun to see them become a cornerstone of AI. We were working on using our algorithms to beat many of these games at once, and you would quickly find they had learned things like how to clip through walls, which humans had never done before.

Jacob: In terms of verification, the focus is mainly on finding clever ways to build verifiers for these different fields.

David: Just use the model.

How to build reliable Agents

Jacob: I want to turn to the world of Agents. How would you describe the current state of these models?

David: I am still extremely excited about agents. This reminds me of 2020 and 2021, when the first wave of truly powerful models like GPT-3 came out. When you tried these models, you could feel the enormous potential: they could write excellent rap lyrics, produce wonderful complaint letters, and were basically state of the art at three-digit addition. But when you asked one to "order a pizza for me," it would only imitate the conversational style of Domino's customer service and could not complete the actual task at all. That obviously exposed a major flaw in these systems, right?

Since then, I have firmly believed that the problem of Agents must be solved. When I worked at Google, we started looking at what became known as "tool use": how to show an operational interface to a large language model (LLM) and let it decide on its own when to take action. Although academia had long called these "agents," the public had not yet formed a shared understanding at the time. So we tried to coin a new term, "large action model," to set it apart from "large language model," a concept that sparked some discussion. But in the end the industry settled on the name "agent." Now the term has been abused and has lost its true meaning, which is regrettable, but it is still cool to have explored this field as the first modern agent company.

When we founded Adept, the best open-source LLMs at the time were underperforming. Since there were no multimodal LLMs then (LLMs with image input, like the later GPT-4V), we had to train our own models from scratch. We had to do everything ourselves, which was a bit like an internet company founded in 2000 having to call TSMC to make its own chips. It was crazy.

So along the way, what we learned is that without today's RL techniques, large language models are essentially behavior cloners: they do whatever they saw done in the training data. That means that once they enter a situation they have never seen before, their generalization is poor and their behavior becomes unpredictable. So Adept always focused on useful intelligence. What does useful mean? It's not launching a cool demo that goes viral on Twitter. It's putting these technologies into people's hands so they no longer have to do the tedious work most knowledge workers have to do, such as dragging files around on a computer. And those knowledge workers care about reliability. So one of our early use cases was: can we process invoices for people?

Jacob: Everyone likes to process invoices (laughs). This seems like a natural beginning for these general models.  

David: This is a great "Hello World." Nobody was actually doing these things at the time, so we chose an obvious "Hello World" use case. We did some other projects, like Excel. If a system deletes a third of your QuickBooks entries one time out of seven, you will never use it again. Reliability remains a problem. Even today, systems like Operator are very impressive, and Operator seems better than other cloud computer Agents. But if you look at these systems, they all focus on end-to-end task execution: if you type "I want you to find me a place for a weekend getaway," it will try to complete the whole task. But end-to-end reliability is very low and requires a lot of manual intervention. We still haven't reached the point where companies can truly trust these systems once and for all.

Jacob: We have to solve this problem. Maybe explain to our audience: if you start from an existing multimodal base model, what work actually needs to be done to turn it into a large action model?

David: I can discuss this at a high level; basically there are two things to do. The first is an engineering problem: how do you expose what can be done in a way the model understands? For example, here is the API you can call, here are the UI elements you can interact with; let's teach it a bit about how Expedia.com (note: a travel services website) or SAP works. There are some research problems in there too. That's the first step: giving it a sense of its own capabilities and a basic ability to act. The second part is the interesting part: how do you teach it to plan, reason, re-plan, follow user instructions, and even infer what the user really wants and complete those tasks for them? That is a hard R&D problem, very different from conventional language-model work, because conventional language-model work is "let's generate a piece of text," and even today's reasoning work, such as math problems, has a single final answer.

So that is more like a single-step process; even if it involves multi-step thinking, it just gives you the answer. Agent work is a genuinely multi-step decision-making process that involves backtracking, involves trying to predict the consequences of your actions, and realizing that clicking the delete button can be dangerous, and you have to get all of this right in the base setup.

Then you put it in a sandbox environment and let it learn on its own terms. The best analogy is probably the one from Andrej Karpathy (note: a member of OpenAI's founding team, who founded the AI education company Eureka Labs in 2024): modern AI training is organized a bit like a textbook. First comes the full explanation of a physical process, then some worked example problems, and finally open-ended exercises, with the answers perhaps at the back of the book. The explanation is pre-training, the worked examples are supervised fine-tuning, and the last step is the open-ended questions. We are just following that process.

Andrej Karpathy's description of the big model (Source: X.com, Bright Company)
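
A minimal sketch of the first half of the recipe David describes, exposing a registry of actions to the model in a form it can read and then executing whatever the model chooses. All names and the invoice example are hypothetical:

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class Action:
    """One operation the agent is allowed to take, described so a model can read it."""
    name: str
    description: str
    run: Callable[[dict], str]

# Hypothetical action registry for an invoice-processing agent.
ACTIONS: Dict[str, Action] = {
    "open_invoice": Action("open_invoice", "Open an invoice by id.",
                           lambda args: f"opened {args['id']}"),
    "record_entry": Action("record_entry", "Record an accounting entry.",
                           lambda args: "entry recorded"),
}

def actions_prompt() -> str:
    # Serialize the available actions so they can be placed in the model's context.
    return json.dumps([{"name": a.name, "description": a.description}
                       for a in ACTIONS.values()], indent=2)

def execute(model_decision: dict) -> str:
    # model_decision is assumed to be the model's JSON output,
    # e.g. {"action": "open_invoice", "args": {"id": "INV-7"}}.
    action = ACTIONS[model_decision["action"]]
    return action.run(model_decision.get("args", {}))
```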

Jacob: I think you have clearly thought a lot about how these intelligent agents can truly enter the world. I want to ask a few questions. First, you mentioned that part of the problem is letting the model know what it can access. So, over time, how will the model interact with browsers and programs? Will it be similar to how humans interact with them? Or just through code? Or some other way?

David: If I were to critique this field, I think the biggest problem at the moment is that people lack creativity in how we interact with these increasingly smart large models and agents. Remember when the iPhone first came out and the App Store launched, people made all kinds of apps, like one that burps when you press a button, or one that "pours beer into your mouth" when you tilt the phone. Our interfaces are like that now, and it feels bad, because chat is a super-constrained, low-bandwidth interaction, at least in some ways. For example, I don't want to decide the toppings on my pizza through seven rounds of conversation.

This lack of creativity frustrates me. I think part of the reason is that the excellent product designers who could help us solve these problems do not yet really understand the limitations of these models. That is changing rapidly. Conversely, the people who are able to drive the technology forward have tended to see it as "I deliver a black box here" rather than "I deliver an experience here."

When that changes, I look forward to seeing a system where, when you interact with the agent, it actually synthesizes a multimodal user interface for you on the fly, lays out what it needs from you, and creates a shared context between the human and the AI, rather than just chatting with it as in the current paradigm. It's more like doing something on a computer together, looking at the same screen, working in parallel rather than taking turns.

Jacob: I think you mentioned that Operator is impressive now but sometimes imperfect. So when do you think we will have reliable intelligent agents?

David: I think Operator is amazing, but the whole field is still missing the last piece of the puzzle.

Jacob: Considering the history of autonomous driving: as early as 1995 there were demonstrations of self-driving cars crossing the country and completing 99% of the journey.

David: Yes.  

Jacob: Do we need to wait another 30 years?

David: I don't think so, because I think we actually have the right tools.

Jacob: You mentioned before that AGI (General Artificial Intelligence) is actually not far away.

David: The main milestone I'm looking for in the Agents field is that I can give the agent any task during training, come back a few days later, and it has learned to do it 100% reliably. Yes, rather than a human stepping in to squeeze out another 5% of reliability, the agent itself has learned how to solve the problem.

Jacob: As you mentioned, when you founded Adept there was no truly open-source model, let alone a multimodal open-source model. Do you think that if someone started a company like Adept today, a startup could succeed here? Or will it ultimately be the foundation-model companies and hyperscale cloud providers that move the ball forward?

David: I have great uncertainty about this. But my current view is that, personally, I think AGI is not far away.

Jacob: When you mention AGI, how do you define it?

David: One definition is a model that can do any useful task a human does on a computer. Another definition I like is a model that can learn to do these things as quickly as a human. I don't think those are far away, but I don't think they will spread through society quickly either. As Amdahl's Law tells us, once you really speed up one thing, other things become the bottleneck, and the overall acceleration you get is not as big as you think. So I think what will happen is that we will have this technology, but humanity's ability to really use it efficiently will lag for quite a long time. Many of my colleagues call this the "capability overhang," a huge overhang.
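
Amdahl's Law quantifies that bottleneck effect: if only a fraction p of the work is accelerated by a factor s, the overall speedup is 1 / ((1 - p) + p / s). A quick worked illustration:

```python
def amdahl_speedup(p: float, s: float) -> float:
    # Overall speedup when a fraction p of the work is accelerated by factor s.
    return 1.0 / ((1.0 - p) + p / s)

# Even a near-infinite speedup on 80% of the work caps the overall gain at 5x,
# because the remaining 20% becomes the bottleneck.
print(round(amdahl_speedup(0.8, 1e9), 2))  # ≈ 5.0
```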

Jacob: Do you have any initial thoughts on what the accelerating factors might be, once we have these capabilities?

David: I think it depends on people: on how we co-design the interactions with the models and how we use them. It will be a matter of social acceptance. For example, imagine a model comes out tomorrow and says, "I have invented a completely new way of doing things, and everyone should use it." Humans need to make peace with that and decide whether it really is a better solution, and that will not happen as fast as we think.

Jacob: As you said, even if the labs are the first to develop these models, there may be an opportunity for startups to bridge the gap between these model capabilities and what end users actually want from them.

David: I'm basically sure that's what will happen. Because at the end of the day, I still firmly believe that in a world with AGI, relationships between people really matter. Ultimately, understanding and owning the customer and getting close enough to them to understand their needs will matter more than merely controlling a tool that many other labs also have.

Jacob: How do you think humans will use computers over the next decade, once all of these models meet your definition of AGI? Will I still sit in front of a computer? What is your vision for how humans will interact with these technologies?

David: I think we will get a new toolbox for interacting with computers. Today, there are still people using the command line, right? Just like people still use graphical user interfaces (GUIs). In the future, people will still use voice interfaces, but I think they will also use more ambient computing. And I think one metric we should focus on is the leverage a human gets per unit of energy spent interacting with a computer. I think that as these systems develop, this metric will keep growing.

Jacob: Maybe we can talk a little about this future world of models, and whether we will end up with models for particular domains.

David: Let's take a hypothetical legal-expert model. You would want this hypothetical legal expert to know some basic facts about the world.

Jacob: Many people get a general undergraduate degree before going to law school.

David: That's right. So I think there will be some domain-specific models, but I don't want to overstate the point; I'm just saying there will be some. I think there will be domain-specific models for technical reasons, but there will also be policy reasons.

Jacob: That's very interesting. What do you mean?

David: Some companies really don't want their data mixed together. For example, imagine you are a big bank with a sales and trading division and an investment banking division, and AI employees or LLMs supporting each of them. Just as those human employees cannot share information today, the model should not be able to share information through its weights either.

Jacob: What else do you think needs to be solved on the model side? It sounds like you are confident that if we just scale current compute we can get very close to solving what we need to solve. But are there other major technical challenges to overcome in order to keep scaling the models' intelligence?

David: Actually, I don't agree with the view that you can migrate today's techniques directly onto the compute clusters of two years from now and everything will just work. Scale will remain a key factor, but my confidence comes from analyzing the core open problems: we need to assess how hard they are to solve. For example, are there super-hard problems that can only be overcome by disruptive innovation, such as completely replacing gradient descent (note: the core optimization algorithm of deep learning, which iteratively updates parameters by stepping along the negative gradient of the loss function), or relying on quantum computers to reach artificial general intelligence (AGI)? I don't think those are inevitable technical paths.
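
For reference, the gradient descent update described in the note is just repeated steps against the gradient of the loss; a toy example on f(x) = (x - 3)^2:

```python
# Minimize f(x) = (x - 3)^2 by gradient descent: step along the negative gradient.
x, lr = 0.0, 0.1
for _ in range(100):
    grad = 2 * (x - 3)   # derivative of (x - 3)^2
    x -= lr * grad       # parameter update: x <- x - lr * grad
print(round(x, 4))       # converges toward the minimum at x = 3
```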

Jacob: When new models come out, how do you evaluate them? Do you have some fixed questions to test, or how do you judge the quality of these new models?

David: My evaluation approach rests on two observations. The first is methodological simplicity. This is the most fascinating thing about deep learning: when a piece of research comes with a write-up of its method, you often only need to look at how it's implemented to find that it's simpler and more effective than the traditional approach. Such breakthroughs often end up in the deep learning canon, and they bring that moment of epiphany: "this really shows the beauty of the algorithm."

The second is benchmark misalignment. The current hype has produced a large number of benchmarks that are disconnected from what we actually need from models, yet are over-emphasized during R&D. These tests are essentially a game. The difficulty of evaluation and measurement is severely underestimated; it deserves more academic credit and more resources than many current research directions get.

Technical differentiation actually accumulates very little

Jacob: It seems that everyone has their own internal benchmarks that they don't publish, things they trust more. For example, you can see OpenAI's models perform better on many programming benchmarks, but everyone uses Anthropic's models because they know those models are better. It's interesting to see how this field evolves. I'd like to hear about your current role at Amazon: how do you view Amazon's place in the broader ecosystem?

David: Yes, Amazon is a very interesting place, and I've actually learned a lot there. Amazon is very serious about building general intelligent systems, especially general-purpose agents. What I think is really cool is that everyone at Amazon understands that computing itself is shifting from the primitives we know toward calls to a large model or a large agent, which is probably the most important computing primitive of the future. So people care a lot about this, which is great.

I think it's interesting that I'm in charge of Amazon's agent efforts, and it's cool to see how wide the range of agent applications is at a company as big as Amazon. Peter and I opened a new research lab for Amazon in San Francisco, in large part because many of Amazon's top leaders really believe we need new research breakthroughs to solve the problems we discussed earlier, the main problems on the way to AGI.

Jacob: Are you focusing on any of these alternative architectures, or more cutting-edge research areas?

David: Let me think about it. I always focus on things that might help us better map model learning onto compute: can we use more compute, more efficiently? That provides a huge multiplier on what we can do. But I actually spend more of my time on data centers and chips, because I find that very interesting; there are some interesting moves happening there right now.

Jacob: It seems that one of the main factors driving model progress is data annotation, and obviously all the labs are spending a lot of money on it. Is that still relevant in the test-time compute paradigm? How do you view this issue?

David: The first thing that comes to mind is that there are two jobs data annotation needs to do. The first is to teach the model the basics of how to complete a task by cloning human behavior: if you have high-quality data, you can use it to better elicit what the model already saw during pre-training. The second is to teach the model what is good and what is bad, for those fuzzier tasks. I think both are still very important. …
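
The two annotation jobs David describes correspond to two familiar data shapes; the records below are hypothetical examples, not any lab's actual schema:

```python
# 1. Behavior cloning: a demonstration the model learns to imitate.
demonstration = {
    "instruction": "File this expense report",
    "actions": ["open_form", "fill_amount", "attach_receipt", "submit"],
}

# 2. Preference labeling: which of two outputs is better, used for fuzzier tasks
#    where "good" and "bad" cannot be checked automatically.
preference = {
    "prompt": "Summarize this contract clause",
    "chosen": "A faithful one-paragraph summary of the clause.",
    "rejected": "A summary that drops the liability cap.",
}
```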

Jacob: You have obviously been at the forefront of this field for the past decade. Is there one thing you've changed your mind about over the past year?

David: What I keep thinking about is building team culture. I think we always knew this, but what I've become more convinced of is that hiring people who are really smart, energetic, and intrinsically motivated, especially early in their careers, is actually a big engine of our success. In this field the best strategy changes every few years, so people who are too adapted to the previous best strategy will actually slow you down. So I now think betting on newcomers is even better than I previously thought.

The other thing I've changed my view on: I used to think that building AI would create real long-term technical differentiation you could keep compounding on. I used to think that if you do a good job at text modeling, it should naturally help you win in the multimodal field; if you do a good job at multimodality, you should win at reasoning and agents; these advantages should keep accumulating. But in practice I have seen very little accumulation. I think everyone ends up trying similar ideas.

Jacob: The implication is that just because you break through on A first doesn't mean you'll have an advantage on B. For example, OpenAI made breakthroughs in language models, but that doesn't necessarily mean they will make the breakthroughs in reasoning.

David: They are related, but it doesn't mean you will definitely win the next chance.

When will robots enter the home

Jacob: I wanted to ask: you first entered artificial intelligence through robotics. So what do you think of the current state of AI robotics today?

David: Similar to how I think about digital agents, I think we already have a lot of the raw materials. And I think it's interesting that digital agents give us an opportunity to solve some of the hard problems before physical agents do.

Jacob: Let's talk about how the reliability of digital agents carries over to physical agents.

David: Take a simple example: suppose you have a warehouse that needs to be rearranged, you have a physical agent, and you ask it to figure out the best plan for rearranging the warehouse. That is hard if you have to learn it in the physical world, or even in a robot simulation environment. But if you have already done this in the digital space, and you have all the knowledge of training recipes and tuned algorithms for learning from simulated data, it's as if you've already done the task with training wheels on.

Jacob: This is very interesting. I think when people think about robots there are two extremes. Some think the scaling laws we found for language models will also hold in robotics and that we are on the verge of huge change; you often hear Jensen (NVIDIA founder Jensen Huang) talk about this. Others think it's like the 1995 self-driving car: a great demonstration, but still a long way from actually working. Where are you on this spectrum?

David: I'll go back to what I said before: what gives me the most confidence is our ability to build training recipes that let us complete tasks 100% reliably. We can do that in the digital space, and although there are challenges, it will eventually transfer to the physical space.

Jacob: When will we have robots at home?  

David: I think this actually goes back to the question I mentioned earlier. I think the bottleneck on many problems is not the modeling, but the diffusion of the models into the world.

Jacob: What about video models? Obviously a lot of people are entering this field now; it seems like a new frontier, involving world models and an understanding of physics, and more open-ended exploration. Maybe you can talk about what you're seeing there and how you think about it.

David: I'm very excited about it. I think it addresses one of the main problems we discussed before: today we can make reinforcement learning work on problems that have verifiers, such as theorem proving.

Then we talked about how to generalize that to the digital-agents realm, where you don't have a verifier but you may have a reliable simulator, because I can spin up a staging environment for an application and teach agents how to use it. But I think the remaining question is: what happens when there is no clear verifier and no explicit simulator? I think world modeling is how we answer that question.
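
One way to read that progression is as three different sources of reward signal for the same RL loop; the sketch below is purely illustrative, and the objects passed in are hypothetical:

```python
def verifier_reward(answer, check) -> float:
    # Domains with a hard verifier (theorem provers, unit tests): exact reward.
    return 1.0 if check(answer) else 0.0

def simulator_reward(actions, simulator) -> float:
    # Digital agents: replay the actions in a staging environment and score the end state.
    return simulator.run(actions).score

def world_model_reward(actions, world_model) -> float:
    # No verifier or simulator: a learned world model predicts the outcome and scores it.
    return world_model.predict_outcome(actions).score
```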

OpenAI's path to organizational growth

Jacob: That's great. I want to change the topic a bit and talk about OpenAI and your time there. Obviously you were there during a very special period for the company and played a role in many advances. I think we'll see a lot of analyses of OpenAI's culture in the future, about what was special about the era when GPT-1 through GPT-4 were developed. What do you think those analyses will say? What made the organization so successful?

David: When I joined OpenAI, the research community was still very small. It was 2017, and OpenAI was just over a year old. I knew the founding team and some early employees, and they were looking for someone who could blur the boundary between research and engineering, and I happened to fit that need.

So joining OpenAI was a very lucky thing. At that time the team had only 35 people, but they were all extremely outstanding. Some did a lot of the work on supercomputing, and there were many others I could list; everyone on the team then was exceptional.

Interestingly, my job at the beginning was to help OpenAI build scalable infrastructure, going from a small team to a larger scale. But soon my work began to shift toward defining a differentiated research strategy that would let us make the right machine-learning bets for that period. I think we realized earlier than others that the previous model of research, where you and your three best friends write a paper that changes the world, was over. What we really needed to think about was the new era, in which we use larger teams combining researchers and engineers to attack major scientific goals, regardless of whether academia would call the solution "novel." We were willing to own that. When GPT-2 was first released, people said it looked like just a Transformer. "Yes, it's a Transformer." And we were proud of it.

Jacob: So, what was your consideration for joining OpenAI at that time?  

David: I was very excited at the time because I wanted to be at the forefront of research, and the choices then were OpenAI, DeepMind, or Google Brain. … As I mentioned before, betting on people who are truly intrinsically motivated, especially those early in their careers, is a very successful strategy; many of the people who defined the field at that time did not in fact have a PhD or 10 years of work experience.

Jacob: Have you found any common traits of these outstanding researchers? What makes them so good? What did you learn from it, about how to combine them into a team to achieve your goals?  

David: It is largely intrinsic motivation and intellectual flexibility. There was one guy on our team who was extremely excited about and devoted to his research; I won't mention his name. About a month and a half in, I had a one-on-one with him, and he mentioned in passing that he had moved to the Bay Area to join us, but before he had even set up Wi-Fi or electricity in his apartment, he was spending all his time in the office running experiments; none of that mattered to him.

Jacob: That enthusiasm is really impressive. I've heard you mention before that although the Transformer was invented at Google, Google did not make the GPT breakthrough. It was obvious at the time how much potential the technology had, but it was hard for Google as a whole to rally around it. What do you think about that?

David: Credit goes to Ilya, who was our scientific lead on basic research and later drove the work that led to GPT, CLIP, and DALL·E. I remember he would often walk around the office like a missionary, telling people, "Man, I think this paper is important," and encouraging them to experiment with the Transformer.

Jacob: These foundation-model companies are doing a lot of things now. Do you think another "recipe" will appear at some point in the future?

David: I think losing focus is very dangerous.

Jacob: You may be one of Nvidia's and Jensen Huang's biggest fans. Apart from the achievements everyone knows about, what else do you think Nvidia has done that isn't widely discussed but is actually very important to the company?

David: I like Jensen very much; he is a true legend. I think he has made a lot of right decisions over a long period, and the past few years have been a huge turning point for Nvidia: they internalized interconnects and chose to build the business around the whole system, a very wise move.

Jacob: We usually have a quick Q&A session at the end of the interview. Do you think models will progress more, less, or about the same this year compared with last year?

David: On the surface it may look like about the same amount of progress, but it is actually more.

Jacob: What do you think is overhyped and what is undervalued in the AI field at present?

David: Overhyped: "scaling is dead, we are completely done, don't buy chips." Underestimated: how we can really solve hyperscale simulation so these models can learn from it.

Jacob: David, this has been a wonderful conversation. I'm sure people will want to learn more about your work at Amazon and the exciting things you're doing; where can they find out more?

David: For Amazon, you can follow the Amazon SF AI Lab. I don't actually use Twitter very often, but I plan to start again, so you can follow my Twitter account @jluan.
