Reddio Technology Overview: A Narrative Review from Parallel EVM to AI
2024-12-03


Author: Wuyue, Geek web3

As blockchain technology iterates ever faster, performance optimization has become a key issue. Ethereum's roadmap is clear: Rollups are its center, and the EVM's serial transaction processing is now a constraint that cannot meet future high-concurrency computing scenarios.

In a previous article, "The Optimization Road of Parallel EVM from Reddio", we gave a brief overview of Reddio's parallel EVM design. In today's article, we take a deeper look at its technical solution and at the scenarios where it combines with AI.

Since Reddio's technical solution builds on CuEVM, a project that uses GPUs to improve EVM execution efficiency, we will start with CuEVM.

CUDA Overview

CuEVM is a project that accelerates the EVM with GPUs: it translates Ethereum EVM opcodes into CUDA kernels that execute in parallel on NVIDIA GPUs, using the GPU's parallel computing power to improve the execution efficiency of EVM instructions. NVIDIA users have likely heard of CUDA (Compute Unified Device Architecture), the parallel computing platform and programming model developed by NVIDIA. It lets developers use the GPU's parallel computing capabilities for general-purpose computing (such as crypto mining and ZK proving), not just graphics processing.

As an open parallel computing framework, CUDA is essentially an extension of C/C++, and any low-level programmer familiar with C/C++ can pick it up quickly. One of its most important concepts is the kernel (kernel function), which is itself a C++ function.

But unlike a regular C++ function, which runs once per call, a kernel launched with the <<<...>>> syntax is executed N times in parallel by N different CUDA threads.

Each CUDA thread is assigned its own thread ID, and a thread hierarchy organizes threads into blocks and grids so that huge numbers of parallel threads can be managed. With NVIDIA's nvcc compiler, CUDA code is compiled into a program that runs on the GPU.
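To make these concepts concrete, here is a minimal, self-contained CUDA example of our own (not taken from CuEVM): a vector-addition kernel launched with the <<<...>>> syntax, where each of the N threads computes its own global index from the block and thread hierarchy. It compiles with nvcc as-is.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Kernel: each of the N launched threads runs this body once,
// distinguished only by its computed global thread index.
__global__ void add_vectors(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // unique thread ID
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;
    float *a, *b, *c;
    cudaMallocManaged(&a, n * sizeof(float));
    cudaMallocManaged(&b, n * sizeof(float));
    cudaMallocManaged(&c, n * sizeof(float));
    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    // <<<grid, block>>> launch: 4096 blocks of 256 threads each, so the
    // kernel body executes n times in parallel.
    int block = 256, grid = (n + block - 1) / block;
    add_vectors<<<grid, block>>>(a, b, c, n);
    cudaDeviceSynchronize();

    printf("c[0] = %f\n", c[0]);  // prints 3.0
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```

This same launch mechanism is what CuEVM uses to run many EVM instances at once, as we will see below.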

Basic workflow of CuEVM

With these CUDA basics in hand, we can look at CuEVM's workflow.

The main entry point of CuEVM is run_interpreter(), where the transactions to be processed in parallel are fed in as JSON files. As the project's own test cases show, the input is entirely standard EVM content, with no need for developers to process or translate anything separately.

Inside run_interpreter(), the kernel_evm() kernel function is invoked with CUDA's <<<...>>> launch syntax; as mentioned above, kernel functions are executed in parallel on the GPU.

The kernel_evm() method calls evm->run(), in which a large number of branch judgments convert EVM opcodes into CUDA operations.

Taking the EVM's addition opcode OP_ADD as an example, we can see that ADD is converted into cgbn_add. CGBN (Cooperative Groups Big Numbers) is a high-performance multi-precision integer arithmetic library for CUDA.
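To illustrate the pattern, here is a simplified sketch of such an opcode dispatcher, written by us rather than excerpted from CuEVM. The type names follow CGBN's public API; step() and its signature are hypothetical stand-ins for the real interpreter loop and stack handling.

```cuda
#include <gmp.h>
#include "cgbn/cgbn.h"   // NVlabs CGBN: multi-precision integers on the GPU

#define TPI  8            // threads cooperating on each big integer
#define BITS 256          // EVM words are 256 bits wide

typedef cgbn_context_t<TPI>         context_t;
typedef cgbn_env_t<context_t, BITS> env_t;

// Hypothetical dispatcher, loosely modeled on the pattern described
// above: each EVM opcode maps onto a CGBN/CUDA operation.
__device__ void step(env_t &env, uint8_t opcode,
                     env_t::cgbn_t &a, env_t::cgbn_t &b, env_t::cgbn_t &r) {
    switch (opcode) {
    case 0x01:                   // OP_ADD
        cgbn_add(env, r, a, b);  // 256-bit add; result wraps mod 2^256
        break;
    case 0x02:                   // OP_MUL
        cgbn_mul(env, r, a, b);  // low 256 bits of the product
        break;
    // ... the remaining opcodes follow the same switch pattern
    }
}
```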

These steps convert EVM opcodes into CUDA operations; CuEVM can be seen as an implementation of the full set of EVM operations on CUDA. Finally, run_interpreter() returns the execution results, namely the world state and other information.

This covers CuEVM's basic operating logic.

CuEVM can process transactions in parallel, but the stated purpose of the CuEVM project (or at least its main demonstrated use case) is fuzzing: an automated software testing technique that feeds a program large amounts of invalid, unexpected, or random data and observes its responses in order to identify potential bugs and security issues.
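A toy sketch shows why fuzzing maps so naturally onto the GPU: every test case is independent, so each CUDA thread can derive its own random input and execute it without any coordination. run_case() below is a hypothetical stand-in for an EVM execution; in CuEVM the equivalent work happens inside kernel_evm().

```cuda
#include <cstdint>
#include <curand_kernel.h>

// Stand-in for executing one test case (e.g., one EVM run);
// returns a status code for later inspection.
__device__ int run_case(const uint8_t *input, int len) {
    int acc = 0;
    for (int i = 0; i < len; ++i) acc ^= input[i];
    return acc;
}

// Each thread fuzzes one case: derive a private random input from a
// unique per-thread RNG stream, run it, and record the outcome.
__global__ void fuzz(uint64_t seed, int *status, int n_cases) {
    int id = blockIdx.x * blockDim.x + threadIdx.x;
    if (id >= n_cases) return;

    curandState rng;
    curand_init(seed, id, 0, &rng);   // independent stream per thread

    uint8_t input[32];                // one random test case
    for (int i = 0; i < 32; ++i)
        input[i] = (uint8_t)(curand(&rng) & 0xff);

    status[id] = run_case(input, 32);
}
```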

As the sketch suggests, fuzzing is a natural fit for parallel processing. CuEVM, however, does not deal with issues such as transaction conflicts; that is simply not its concern. Anyone integrating CuEVM must handle conflicting transactions themselves.

We already introduced the conflict handling mechanism Reddio uses in the previous article, "The Optimization Road of Parallel EVM from Reddio", so we will not repeat it here. After Reddio sorts transactions with its conflict handling mechanism, it can hand them to CuEVM. In other words, the transaction processing pipeline of Reddio L2 splits into two stages: conflict handling followed by CuEVM parallel execution.
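As a rough illustration of that two-stage split, here is a hypothetical host-side sketch of the first stage; the transaction type, its declared read/write sets, and the greedy grouping are our assumptions, not Reddio's actual mechanism. Each resulting batch is internally conflict-free and could then be handed to CuEVM-style parallel execution.

```cuda
#include <cstdint>
#include <set>
#include <vector>

// Illustrative transaction with declared read/write sets of state slots.
struct Tx { std::set<uint64_t> reads, writes; };

// A batch of mutually non-conflicting transactions, plus the union of
// the slots its members touch.
struct Batch {
    std::vector<Tx> txs;
    std::set<uint64_t> reads, writes;
};

// A tx conflicts with a batch if it writes a slot the batch touches,
// or reads a slot the batch writes.
static bool conflicts(const Batch &b, const Tx &tx) {
    for (uint64_t w : tx.writes)
        if (b.reads.count(w) || b.writes.count(w)) return true;
    for (uint64_t r : tx.reads)
        if (b.writes.count(r)) return true;
    return false;
}

// Stage 1: greedily pack transactions into conflict-free batches.
// Stage 2 (not shown) would run each batch in parallel on the GPU.
std::vector<Batch> make_batches(const std::vector<Tx> &txs) {
    std::vector<Batch> batches;
    for (const Tx &tx : txs) {
        Batch *target = nullptr;
        for (Batch &b : batches)
            if (!conflicts(b, tx)) { target = &b; break; }
        if (!target) { batches.emplace_back(); target = &batches.back(); }
        target->txs.push_back(tx);
        target->reads.insert(tx.reads.begin(), tx.reads.end());
        target->writes.insert(tx.writes.begin(), tx.writes.end());
    }
    return batches;
}
```

A real sequencer must also preserve a deterministic ordering across batches and cope with transactions whose read/write sets are only known after execution; the sketch ignores both.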

Layer 2, parallel EVM, and AI: a three-way intersection

As mentioned above, parallel EVM and L2 are only the starting point for Reddio; its roadmap explicitly ties it into the AI narrative. A system that already uses GPUs for high-speed parallel transaction processing is naturally suited to AI workloads in several respects:

GPUs have strong parallel processing capabilities and are well suited to the convolution operations in deep learning, which are at bottom large-scale matrix multiplications, precisely the kind of task GPUs are optimized for (a toy kernel after this list illustrates the point).

The GPU's thread hierarchy maps naturally onto the data structures involved in AI computing, improving efficiency and hiding memory latency through thread over-provisioning and warp-level execution.

Arithmetic intensity is a key indicator of AI computing performance. GPUs optimize for it, for example by introducing Tensor Cores, improving matrix-multiplication performance in AI workloads and striking an efficient balance between computation and data transfer.
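The matrix-multiplication point is easy to see in code. Below is the classic naive CUDA kernel: one thread per output element, with a 2-D grid matching the 2-D output matrix. Production AI stacks use cuBLAS and Tensor Cores rather than anything this simple, so treat it purely as an illustration of how well GPU threads fit matrix math.

```cuda
// Naive matrix multiplication C = A * B for n x n row-major matrices.
// One thread computes one element of C.
__global__ void matmul(const float *A, const float *B, float *C, int n) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < n && col < n) {
        float acc = 0.0f;
        for (int k = 0; k < n; ++k)
            acc += A[row * n + k] * B[k * n + col];
        C[row * n + col] = acc;
    }
}

// Launch with a 2-D grid of 16x16 thread blocks covering the output:
//   dim3 block(16, 16);
//   dim3 grid((n + 15) / 16, (n + 15) / 16);
//   matmul<<<grid, block>>>(dA, dB, dC, n);
```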

So how do AI and L2 combine?

We know that in a Rollup architecture, the network contains more than just a sequencer: there are also roles such as watchers and forwarders that verify or collect transactions. They essentially run the same client as the sequencer, just with different functions enabled. In traditional Rollups, the functions and permissions of these secondary roles are very limited. The watcher role in Arbitrum, for example, is basically passive, defensive, and altruistic, and its profit model is questionable.

Reddio will adopt a decentralized sequencer architecture in which miners contribute GPUs as nodes. The Reddio network can then evolve from a pure L2 into a comprehensive L2+AI network, which lends itself to several AI+blockchain use cases:

A base network for AI Agent interaction

As blockchain technology continues to evolve, AI Agents show great application potential on blockchain networks. Take AI Agents that execute financial trades as an example: these agents can autonomously make complex decisions and execute trades, even responding quickly under high-frequency conditions. But it is essentially impossible for an L1 to carry the enormous transaction load such intensive activity generates.

As an L2 project, Reddio can greatly improve parallel transaction processing through GPU acceleration. Compared with an L1, an L2 that supports parallel transaction execution offers higher throughput and can efficiently handle high-frequency transaction requests from large numbers of AI Agents, keeping the network running smoothly.

In high-frequency trading, AI Agents place extremely strict demands on transaction speed and response time. L2 shortens transaction verification and execution, sharply reducing latency, which is crucial for agents that must respond at the millisecond level. Migrating the bulk of transactions to L2 also relieves congestion on the mainnet, making AI Agent operation far more cost-effective.

As L2 projects such as Reddio mature, AI Agents will play a more important role on-chain, driving innovation in DeFi and other blockchain applications that combine with AI.

Decentralized computing power market

Reddio will in the future adopt a decentralized sequencer architecture in which miners compete for sequencing rights with GPU computing power. Driven by that competition, the performance of participants' GPUs across the network will gradually improve, potentially reaching the level needed for AI training.

This builds a decentralized GPU computing power market that offers lower-cost compute for AI training and inference. From personal computers to data-center clusters, GPU computing power at every scale can join the market, contribute idle capacity, and earn income. This model lowers AI computing costs and lets more people participate in AI model development and application.

In this decentralized compute-market use case, the sequencer is not primarily responsible for AI computation itself; its main job is to process transactions and coordinate AI computing power across the network. There are two models for allocating computing power and tasks:

Top-down centralized allocation: the sequencer allocates incoming compute requests to nodes that meet the requirements and have good reputations. Although this allocation method raises theoretical concerns about centralization and unfairness, in practice its efficiency advantages far outweigh those drawbacks, and in the long run the sequencer must serve the positive-sum interests of the whole network to keep developing; that implicit but direct constraint keeps the sequencer from becoming too biased.

Bottom-up spontaneous task selection: users can also submit AI compute requests directly to third-party nodes. In specific AI application domains this is clearly more efficient than routing everything through the sequencer, and it also resists censorship and bias by the sequencer. After the computation finishes, the node synchronizes the results to the sequencer and posts them on-chain.

In the L2 + AI architecture, the computing power market is thus extremely flexible, gathering compute from both directions to maximize resource utilization.

On-chain AI inference

Open-source models are now mature enough to meet diverse needs. As AI inference services become standardized, it becomes worth exploring how to bring compute on-chain and price it automatically. Doing so, however, means overcoming several technical challenges:

Efficient request distribution and recording: large-model inference is latency-sensitive, so an efficient request-distribution mechanism is critical. And while request and response data are large and private and should not be published on the blockchain, a balance between recording and verification must still be found, for example by storing hashes on-chain.

Verification of compute-node output: did the node actually complete the specified computing task? A node might, for instance, falsely report results computed with a small model in place of the large model requested.

Smart contract inference: many scenarios require combining AI models with smart contract computation. Since AI inference is non-deterministic, it cannot run entirely on-chain; the logic of future AI dApps will likely sit partly off-chain and partly in on-chain contracts, with the contracts validating off-chain inputs for plausibility and numerical legality. In the Ethereum ecosystem, combining with smart contracts also means confronting the EVM's inefficient serial execution.

But in Reddio's architecture, these are relatively easy to solve:

The sequencer distributes requests far more efficiently than an L1, roughly on par with Web2. As for where and how data is recorded and retained, various inexpensive DA solutions can handle that.

The correctness and honesty of AI computation results can ultimately be verified with ZKPs. ZKPs are very fast to verify but slow to generate, and proof generation can likewise be accelerated with GPUs or TEEs.

The Solidity → CUDA → GPU pipeline, the parallel EVM mainline, is Reddio's foundation, so on the surface this looks like the easiest of these problems for Reddio to solve. Reddio is currently working with ai16z's Eliza to bring its modules into Reddio, a direction well worth exploring.

Summary

On the whole, Layer 2, parallel EVM, and AI may look unrelated, but by making full use of the GPU's computing characteristics, Reddio has cleverly joined these major fields of innovation together.

By exploiting the GPU's parallel computing characteristics, Reddio raises transaction speed and efficiency on Layer 2, strengthening the performance of Ethereum's second layer. Integrating AI into the blockchain is a novel and promising attempt: AI can bring intelligent analysis and decision support to on-chain operations, enabling smarter and more dynamic blockchain applications. This cross-field integration undoubtedly opens new paths and opportunities for the whole industry.

That said, this field is still in its early stages and needs a great deal of research and exploration. Continuous iteration and optimization of the technology, together with the imagination and initiative of market pioneers, will be the key forces driving this innovation to maturity. Reddio has taken an important and bold step at this intersection, and we look forward to more breakthroughs and surprises from this converging field.
