Deconstructing the AI Framework: Exploration from Intelligent Agents to Decentralization

Author: Zeke, YBB Capital Researcher

Foreword

In previous articles, we have repeatedly shared our views on the current state of AI memes and the future development of AI Agents. Even so, the speed at which the narrative of the AI Agent track has developed and evolved is still somewhat overwhelming. In just two months since “Terminal of Truth” kicked off Agent Summer, the narrative around combining AI and Crypto has changed almost every week. Recently, the market's attention has turned to "framework" projects driven by technical narratives. In the past few weeks alone, this segment has produced several dark horses with market capitalizations above 100 million or even 1 billion. These projects have also spawned a new asset issuance paradigm: a project issues a token on the strength of its GitHub code repository, and Agents built on the framework can issue tokens of their own. With the framework as the base layer and Agents on top, it looks like an asset issuance platform, but it is in fact an emerging infrastructure model unique to the AI era. How should we examine this new trend? This article starts with an introduction to frameworks and, combined with the author's own thinking, interprets what AI frameworks mean for Crypto.

1. What is a framework?

By definition, an AI framework is an underlying development tool or platform that integrates a set of pre-built modules, libraries, and tools to simplify the process of building complex AI models. These frameworks typically also include functionality for processing data, training models, and making predictions. Put simply, a framework can be understood as the operating system of the AI era, comparable to Windows and Linux on the desktop or iOS and Android on mobile. Each framework has its own advantages and disadvantages, and developers can choose freely according to their specific needs.

Although "AI framework" is still an emerging concept in the Crypto field, AI frameworks have in fact been developing for nearly 14 years, counting from Theano, which was born in 2010. In traditional AI circles, both academia and industry already have very mature frameworks to choose from, such as Google's TensorFlow, Meta's PyTorch, Baidu's PaddlePaddle, and ByteDance's MagicAnimate. Each of these frameworks has its own strengths in different scenarios.

The framework projects now emerging in Crypto were born from the heavy demand for Agents at the start of this AI craze. They then spread into other Crypto tracks, eventually forming AI frameworks for different niches. Let's take several mainstream frameworks in the space as examples to expand on this point.

1.1 Eliza

First, take ai16z's Eliza as an example. Eliza is a multi-Agent simulation framework designed specifically for creating, deploying, and managing autonomous AI Agents. It is developed in TypeScript, which brings better compatibility and easier API integration.

According to the official documentation, Eliza mainly targets social media, for example through multi-platform integration support. The framework provides full-featured Discord integration with voice channel support, automated accounts for X/Twitter, Telegram integration, and direct API access. For media content processing, it supports PDF reading and analysis, link content extraction and summarization, audio transcription, video processing, image analysis and description, and conversation summarization.

Eliza currently supports four main use cases:

AI assistant applications: customer support agents, community managers, personal assistants;

Social media roles: automated content creators, interactive bots, brand representatives;

Knowledge workers: research assistants, content analysts, document processors;

Interactive roles: role-playing characters, educational tutors, entertainment bots.

Eliza currently supports the following models:

Local inference with open-source models, such as Llama 3, Qwen1.5, and BERT;

Cloud-based inference through OpenAI's API;

A default configuration of Nous Hermes Llama 3.1B;

Integration with Claude for complex queries.

1.2 G.A.M.E

G.A.M.E (Generative Autonomous Multimodal Entities Framework) is a multimodal AI framework for autonomous generation and management, launched by Virtual. Its main target scenario is the design of intelligent NPCs in games. Another distinctive feature of this framework is that it can also be used by people with low-code or even no-code backgrounds. Judging from its trial interface, users only need to modify parameters to take part in Agent design.

In terms of project architecture, the core of G.A.M.E is a modular design in which multiple subsystems work together. The detailed architecture is as follows.

Agent Prompting Interface: The interface for developers to interact with the AI framework. Through this interface, developers can initialize a session and specify parameters such as session ID, agent ID, user ID;

Perception Subsystem: The perception subsystem receives input information, synthesizes it, and sends it to the strategic planning engine. It also handles responses from the dialogue processing module;

Strategic Planning Engine: The strategic planning engine is the core of the entire framework and is divided into a High Level Planner and a Low Level Policy. The high-level planner formulates long-term goals and plans, while the low-level policy converts these plans into specific action steps;

World Context: The world context contains environmental information, world state, game state, and other data that help the agent understand the current situation;

Dialogue Processing Module: The dialogue processing module handles messages and responses, and can generate dialogue or reactions as output;

On Chain Wallet Operator: The on-chain wallet operator presumably relates to blockchain application scenarios; its specific functions are unclear;

Learning Module: The learning module learns from feedback and updates the agent's knowledge base;

Working Memory: Working memory stores short-term information such as the agent's recent actions, results, and current plans;

Long Term Memory Processor: The long-term memory processor extracts important information about the agent and its working memory and ranks it by factors such as importance score, recency, and relevance;

Agent Repository: The agent repository saves the agent's goals, reflections, experience, personality and other attributes;

Action Planner: The action planner generates a specific action plan based on the low-level policy;

Plan Executor: The plan executor is responsible for executing the action plan generated by the action planner.

Workflow: The developer starts the Agent through the Agent prompting interface, and the perception subsystem receives the input and passes it to the strategic planning engine. The strategic planning engine draws on the memory system, the world context, and the Agent repository to formulate and execute an action plan. The learning module continuously monitors the results of the Agent's actions and adjusts its behavior accordingly.
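To make the workflow above more concrete, here is a minimal Python sketch of such a loop: perception, a high-level planner, a low-level policy, a plan executor, and a learning step that writes back to long-term memory. It is an illustration only and does not use the G.A.M.E SDK. Every class, method, and field name here (GameStyleAgent, Memory, rank_memories, and so on) is hypothetical, and the equal weighting of importance, recency, and relevance is an assumption, since the source does not say how these factors are combined.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    """One long-term memory entry carrying the three ranking factors."""
    text: str
    importance: float  # hypothetical score in [0, 1]
    recency: float     # hypothetical score in [0, 1], 1 = most recent
    relevance: float   # hypothetical score in [0, 1]


def rank_memories(memories, top_k=2):
    """Rank memories by importance + recency + relevance (equal weights assumed)."""
    ranked = sorted(
        memories,
        key=lambda m: m.importance + m.recency + m.relevance,
        reverse=True,
    )
    return ranked[:top_k]


@dataclass
class GameStyleAgent:
    goal: str
    world_context: dict
    long_term: list = field(default_factory=list)
    working_memory: list = field(default_factory=list)

    def perceive(self, raw_input: str) -> dict:
        # Perception subsystem: clean the input and attach the world context.
        return {"input": raw_input.strip(), "world": self.world_context}

    def high_level_plan(self, observation: dict) -> str:
        # High-level planner: combine the goal, the input, and recalled memories.
        recalled = "; ".join(m.text for m in rank_memories(self.long_term))
        return f"{self.goal} | input: {observation['input']} | recalled: {recalled}"

    def low_level_policy(self, plan: str) -> list:
        # Low-level policy: turn the plan into concrete, ordered action steps.
        parts = plan.split("|")
        return [f"step {i}: {p.strip()}" for i, p in enumerate(parts, start=1)]

    def execute(self, actions: list) -> list:
        # Plan executor: "run" each action and record results in working memory.
        results = [f"done -> {a}" for a in actions]
        self.working_memory.extend(results)
        return results

    def learn(self, results: list, feedback: float) -> None:
        # Learning module: store the outcome as a new long-term memory.
        summary = f"{len(results)} actions, feedback={feedback}"
        self.long_term.append(Memory(summary, feedback, 1.0, 0.5))

    def step(self, raw_input: str, feedback: float = 0.5) -> list:
        observation = self.perceive(raw_input)
        plan = self.high_level_plan(observation)
        actions = self.low_level_policy(plan)
        results = self.execute(actions)
        self.learn(results, feedback)
        return results
```

Calling `GameStyleAgent(goal="guard the gate", world_context={"location": "castle"}).step("a stranger approaches")` runs one full pass through the loop. In a real system each stage would call a model or a game engine rather than format strings.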

Application scenarios: Judging from the overall technical architecture, this framework focuses mainly on the Agent's decision-making, feedback, perception, and personality in virtual environments. In terms of use cases, it applies to the Metaverse as well as to games. Virtual's project list shows that a large number of projects have already been built with this framework.

1.3 Rig

Rig is an open-source tool written in Rust, designed to simplify the development of large language model (LLM) applications. By providing a unified interface, it lets developers easily interact with multiple LLM service providers (such as OpenAI and Anthropic) and various vector databases (such as MongoDB and Neo4j).

Core features:

Unified interface: whichever LLM provider or vector store is used, Rig provides a consistent way to access it, greatly reducing the complexity of integration work;

Modular architecture: The framework adopts a modular design internally, including key parts such as the "provider abstraction layer", the "vector storage interface", and the "intelligent agent system", ensuring the flexibility and scalability of the system;

Type safety: Using the features of Rust to implement type-safe embedding operations, ensuring code quality and runtime security;

Efficient performance: supports asynchronous programming mode and optimizes concurrent processing capabilities; built-in logging and monitoring functions help maintenance and troubleshooting.

Workflow: When a user request enters the Rig system, it first passes through the "provider abstraction layer", which standardizes the differences between providers and ensures consistent error handling. Next, in the core layer, the intelligent agent can call various tools or query the vector store to obtain the information it needs. Finally, through mechanisms such as Retrieval Augmented Generation (RAG), the system combines document retrieval with context understanding to generate an accurate and meaningful response, which is then returned to the user.
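The pattern described in this workflow can be sketched in a few lines. The Python example below is a language-neutral illustration of a provider abstraction layer, a vector store interface, and a RAG step. It is not Rig's API (Rig is a Rust library), and all names in it (CompletionProvider, QuotingProvider, InMemoryVectorStore, RagAgent, ProviderError) are hypothetical. The "embedding" is a toy bag-of-words count, assumed here only so the example runs without any external service.

```python
import math
from typing import Protocol


class ProviderError(Exception):
    """Single error type surfaced to callers, whatever the provider raised."""


class CompletionProvider(Protocol):
    """Provider abstraction: every LLM backend exposes the same method."""

    def complete(self, prompt: str) -> str: ...


class QuotingProvider:
    """Stand-in for a real LLM: 'answers' by quoting the first context line."""

    def __init__(self, name: str):
        self.name = name

    def complete(self, prompt: str) -> str:
        return f"{self.name}: {prompt.splitlines()[1]}"


def embed(text: str) -> dict:
    """Toy bag-of-words embedding (an assumption, not a real embedding model)."""
    counts = {}
    for word in text.lower().split():
        word = word.strip(".,?!")
        if word:
            counts[word] = counts.get(word, 0) + 1
    return counts


def cosine(a: dict, b: dict) -> float:
    dot = sum(a[w] * b.get(w, 0) for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryVectorStore:
    """Vector store interface backed by a plain list."""

    def __init__(self):
        self._docs = []

    def add(self, text: str) -> None:
        self._docs.append((text, embed(text)))

    def top_k(self, query: str, k: int = 1) -> list:
        q = embed(query)
        ranked = sorted(self._docs, key=lambda d: cosine(q, d[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


class RagAgent:
    """Retrieves context from the store, then asks whichever provider it was given."""

    def __init__(self, provider: CompletionProvider, store: InMemoryVectorStore):
        self.provider = provider
        self.store = store

    def build_prompt(self, question: str) -> str:
        context = "\n".join(self.store.top_k(question, k=1))
        return f"Context:\n{context}\nQuestion: {question}"

    def ask(self, question: str) -> str:
        try:
            return self.provider.complete(self.build_prompt(question))
        except Exception as exc:
            # Consistent error handling: wrap provider-specific failures.
            raise ProviderError(str(exc)) from exc
```

Because RagAgent depends only on the `complete` method, swapping one provider stub for another requires no change to the agent code, which is the point of a unified interface.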

Application scenarios: Rig is suitable for building question answering systems that need fast and accurate answers, and can also be used to create efficient document search tools and context-aware chatbots or virtual assistants. It even supports content creation, automatically generating text or other forms of content based on existing data patterns.

1.4 ZerePy

ZerePy is a Python-based open-source framework designed to simplify the process of deploying and managing AI Agents on X (formerly Twitter). It grew out of the Zerebro project, inheriting its core functionality but with a more modular and easily extensible design. Its goal is to let developers easily create personalized AI Agents and carry out a variety of automated tasks and content creation on X.

ZerePy provides a command-line interface (CLI) that lets users manage and control the AI Agents they deploy [1]. Its core architecture is modular, allowing developers to flexibly integrate different functional modules, such as:

LLM integration: ZerePy supports large language models (LLMs) from OpenAI and Anthropic, and developers can choose the model that best suits their application scenario. This enables the Agent to generate high-quality text content;

X platform integration: The framework integrates directly with the X platform's API, allowing the Agent to post, reply, like, repost, and perform other operations;

Modular connection system: This system allows developers to easily add support for other social platforms or services and expand the functionality of the framework;

Memory system (planned): Although it may not be fully implemented in the current version, ZerePy's design goals include integrating a memory system that enables the Agent to remember previous interactions and contextual information in order to generate more coherent and personalized content.
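The modular connection idea in the list above can be illustrated with a small registry pattern. The Python sketch below is an assumption-based illustration and not ZerePy's actual code or API. The Connection base class, the two fake connections, and SocialAgent are hypothetical names, and nothing here talks to X or to a real LLM.

```python
from abc import ABC, abstractmethod


class Connection(ABC):
    """Common interface that every platform or service module implements."""

    name: str

    @abstractmethod
    def actions(self) -> list:
        """Return the action names this connection supports."""

    @abstractmethod
    def perform(self, action: str, **params) -> str:
        """Run one action and return a short result string."""


class FakeXConnection(Connection):
    """Stand-in for a social platform connection; it records calls locally."""

    name = "x"

    def __init__(self):
        self.timeline = []

    def actions(self) -> list:
        return ["post", "reply", "like", "repost"]

    def perform(self, action: str, **params) -> str:
        if action not in self.actions():
            raise ValueError(f"unsupported action: {action}")
        entry = f"{action}: {params.get('text', '')}"
        self.timeline.append(entry)
        return entry


class FakeLLMConnection(Connection):
    """Stand-in for an LLM connection; it returns a canned sentence."""

    name = "llm"

    def actions(self) -> list:
        return ["generate"]

    def perform(self, action: str, **params) -> str:
        if action != "generate":
            raise ValueError(f"unsupported action: {action}")
        return f"Thoughts on {params['topic']}"


class SocialAgent:
    """Agent that only knows the Connection interface, not any platform details."""

    def __init__(self, name: str):
        self.name = name
        self.connections = {}

    def register(self, connection: Connection) -> None:
        # Adding support for a new platform is just registering another module.
        self.connections[connection.name] = connection

    def run(self, connection_name: str, action: str, **params) -> str:
        if connection_name not in self.connections:
            raise KeyError(f"no such connection: {connection_name}")
        return self.connections[connection_name].perform(action, **params)
```

A typical flow would be to register both connections, call `run("llm", "generate", topic=...)` to draft text, and pass the result to `run("x", "post", text=...)`. Supporting another platform then means writing one more Connection subclass and leaving the agent untouched.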

While both ZerePy and ai16z's Eliza project are dedicated to building and managing AI Agents, they differ slightly in architecture and goals. Eliza focuses more on multi-agent simulation and broader AI research, while ZerePy focuses on simplifying the deployment of AI Agents on one specific social platform (X) and leans more toward ease of use in practical applications.

2. A replica of the BTC ecosystem

In fact, in terms of development path, AI Agents have much in common with the BTC ecosystem of late 2023 and early 2024. The BTC ecosystem's development path can be summarized simply as: BRC20 → multi-protocol competition among Atomicals, Runes, and others → BTC L2 → BTCFi centered on Babylon. AI Agents, building on a mature traditional AI technology stack, have developed faster, but their overall path does resemble that of the BTC ecosystem. I would summarize it briefly as: GOAT/ACT → Social Agents and analytical AI Agents → framework competition. Judging by the trend, infrastructure projects focused on Agent decentralization and security will most likely take over from this framework craze and become the main theme of the next stage.

Will this track become homogenized and turn into a bubble like the BTC ecosystem? I do not think so. First, the AI Agent narrative is not about replaying the history of smart contract chains. Second, whether existing AI framework projects are technically strong, stuck at the slide-deck stage, or simply copy-pasted, at least they offer a new way of thinking about infrastructure development. Many articles compare AI frameworks to asset issuance platforms and Agents to assets. Compared with Memecoin launchpads and inscription protocols, however, I personally feel that an AI framework is more like the public chain of the future, and an Agent more like the Dapp of the future.

In today’s Crypto we have thousands of public chains and tens of thousands of Dapps. Among general chains, we have BTC, Ethereum, and various heterogeneous chains, while application chains are more diverse, such as game chains, storage chains, and Dex chains. The public chain corresponds to the AI framework. In fact, the two are very similar in appearance, and Dapp can also correspond to Agent very well.

Crypto in the AI era is very likely to move in this direction, and future debates will shift from EVM versus heterogeneous chains to framework versus framework. The more pressing question right now is how to decentralize, or how to go on-chain. I think subsequent AI infrastructure projects will build on this basis. The other question is: what is the point of doing this on the blockchain?

3. What is the point of going on-chain?

Whatever the blockchain is combined with, it eventually faces one question: does it make sense? In last year's article, I criticized GameFi for putting the cart before the horse and infrastructure for being built too far ahead of demand. In previous articles on AI, I also said that I was not optimistic about AI x Crypto in practical applications at this stage. After all, narrative has become a weaker and weaker driver for traditional projects, and the few traditional projects whose token prices performed well last year basically had real strength that matched or exceeded their prices. What use can AI have for Crypto? What I came up with before were relatively ordinary but in-demand ideas such as Agents acting on a user's behalf to carry out intents, the Metaverse, and Agents as employees. However, none of these needs has to be fully on-chain, and from a business-logic perspective they cannot form a closed loop. The intent-executing Agent browser mentioned in the previous article can give rise to demand for data labeling, inference compute, and so on, but the link between the two is still not tight enough, and the compute side is still dominated in many respects by centralized providers.

Let us revisit why DeFi succeeded. DeFi was able to take a share of the pie from traditional finance because it offers higher accessibility, better efficiency, lower cost, and security that does not require trusting a centralized party. Following this line of thinking, I think there may be several reasons that support putting Agents on-chain.

Can putting Agents on-chain lower usage costs and thereby bring higher accessibility and more choice? Ultimately, ordinary users would be able to share in the AI "rental rights" that currently belong exclusively to the big Web2 companies;

Security. By the simplest definition, an AI that can be called an Agent should be able to interact with the virtual or real world. If an Agent can intervene in reality or in my virtual wallet, then a blockchain-based security solution becomes a necessity;

Can Agents give rise to a set of financial mechanics unique to the blockchain? For example, LP positions in an AMM let ordinary people take part in automated market making. Similarly, an Agent needs compute, data labeling, and so on, and users who are optimistic about it could invest in the protocol with stablecoins (U). New financial mechanics could also form around Agents in different application scenarios;

DeFi currently lacks perfect interoperability. If Agents combined with the blockchain can achieve transparent and traceable reasoning, they may be more attractive than the agent browsers offered by the traditional Internet giants mentioned in the previous article.

4. Creativity?

Framework projects will also offer entrepreneurial opportunities similar to the GPT Store in the future. Although publishing an Agent through a framework is still very complicated for ordinary users today, I think frameworks that simplify the Agent-building process and provide combinations of more complex functions will prevail, forming a Web3 creative economy more interesting than the GPT Store.

The current GPT Store still leans toward practical uses in traditional fields, and most of the popular apps are created by traditional Web2 companies, with the revenue belonging solely to the creators. According to OpenAI's official explanation, this strategy only provides financial support and a certain amount of subsidies to some outstanding developers in the United States.

Web3 still has many gaps to fill on the demand side, and in terms of the economic system it can also redress the unfairness of the Web2 giants. In addition, we can naturally introduce a community economy to make Agents more complete. The Agent creative economy will be an opportunity for ordinary people to participate, and future AI memes will be far smarter and more interesting than the Agents released on GOAT and Clanker.