Original title: An Incomplete Guide to Rollups
Original author: Vitalik Buterin
Original translation: 0x33, BlockBeats

Rollups are all the rage in the Ethereum community and are poised to be the key scalability solution for Ethereum for the foreseeable future. But what exactly is this technology, what can you expect from it, and how will you be able to use it? This post attempts to answer some of those key questions.

Background: What are Layer 1 and Layer 2 scaling?

There are two ways to scale a blockchain ecosystem. First, you can give the blockchain itself higher transaction capacity. The main challenge with this approach is that blockchains with "bigger blocks" are inherently harder to verify and likely to become more centralized. To avoid such risks, developers can either improve the efficiency of client software or, more sustainably, use techniques such as sharding to split the work of building and verifying the chain across many nodes; this is what the "Eth2" upgrade currently in progress proposes to do for Ethereum.

Second, you can change the way you use the blockchain. Instead of putting all activity directly on-chain, users perform the bulk of their activity off-chain in a "Layer 2" protocol. There is an on-chain smart contract with only two tasks: processing deposits and withdrawals, and verifying proofs that everything happening off-chain is following the rules. There are multiple ways to implement these proofs, but they all share the property that verifying the proof on-chain is much cheaper than doing the original computation off-chain.

State channels vs. Plasma vs. Rollups

There are three main types of Layer 2 scaling: state channels, Plasma, and rollups. They are three different paradigms, each with its own strengths and weaknesses, and at this point we are fairly confident that all Layer 2 scaling falls roughly into these three categories (though naming controversies exist at the edges, e.g. "validium").

How do state channels work?

See: https://www.jeffcoleman.ca/state-channels and statechannels.org

Suppose Alice is offering an internet connection to Bob, in exchange for Bob paying her $0.001 per megabyte. Instead of making a transaction for each payment, Alice and Bob use the following Layer 2 scheme.

First, Bob puts $1 (or some ETH or stablecoin equivalent) into a smart contract. To make his first payment to Alice, Bob signs a "ticket" (an off-chain message) that simply says "$0.001", and sends it to Alice. To make his second payment, Bob signs another ticket that says "$0.002" and sends it to Alice. And so on, for as many payments as needed. When Alice and Bob are done transacting, Alice can publish the highest-value ticket on-chain, wrapped in a signature from herself. The smart contract verifies Alice's and Bob's signatures, pays Alice the amount on Bob's ticket, and returns the rest to Bob. If Alice is unwilling to close the channel (out of malice or because of a technical failure), Bob can initiate an exit period (e.g. 7 days); if Alice does not provide a ticket within that period, Bob gets all his money back.

This technique is powerful: it can be adjusted to handle bidirectional payments, smart contract relationships (e.g. Alice and Bob making a financial contract inside the channel), and composition (if Alice and Bob have an open channel and so do Bob and Charlie, then Alice can interact with Charlie trustlessly).
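To make the ticket-and-settlement mechanics concrete, here is a toy Python sketch of the Alice/Bob channel described above. It is an illustration only, not a real channel implementation: an HMAC keyed by each party's secret stands in for a real digital signature, and the exit-period logic is reduced to a comment.

```python
# A toy model of the Alice/Bob payment channel above. Real channels use on-chain
# contracts and public-key signatures; here an HMAC keyed by each party's secret
# stands in for a signature, purely to keep the sketch self-contained and runnable.
import hmac, hashlib

def sign(secret: bytes, message: bytes) -> bytes:
    """Stand-in for a real digital signature (illustrative assumption only)."""
    return hmac.new(secret, message, hashlib.sha256).digest()

def verify(secret: bytes, message: bytes, signature: bytes) -> bool:
    return hmac.compare_digest(sign(secret, message), signature)

class ChannelContract:
    """Toy 'on-chain' contract: holds Bob's deposit and settles on a countersigned ticket."""
    def __init__(self, bob_secret: bytes, alice_secret: bytes, deposit_mills: int):
        self.bob_secret = bob_secret        # a real contract would store public keys, not secrets
        self.alice_secret = alice_secret
        self.deposit = deposit_mills        # 1 mill = $0.001

    def close(self, amount_mills: int, bob_sig: bytes, alice_sig: bytes) -> dict:
        msg = str(amount_mills).encode()
        assert verify(self.bob_secret, msg, bob_sig), "Bob's ticket signature is invalid"
        assert verify(self.alice_secret, msg, alice_sig), "Alice's closing signature is invalid"
        assert 0 <= amount_mills <= self.deposit, "ticket exceeds the deposit"
        # A real contract would also handle the exit period Bob can trigger if Alice stalls.
        return {"alice": amount_mills, "bob": self.deposit - amount_mills}

# Bob locks $1; after three $0.001 payments, Alice closes with the highest-value ticket.
bob_key, alice_key = b"bob-secret", b"alice-secret"
contract = ChannelContract(bob_key, alice_key, deposit_mills=1000)
ticket = str(3).encode()                       # the off-chain message "owe Alice $0.003"
bob_sig = sign(bob_key, ticket)
alice_sig = sign(alice_key, ticket)            # Alice wraps the ticket in her own signature
print(contract.close(3, bob_sig, alice_sig))   # {'alice': 3, 'bob': 997}
```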
But channels are limited in what they can do. Channels cannot be used to send funds off-chain to people who are not yet participants. Channels cannot be used to represent objects that do not have a clear logical owner (e.g. Uniswap). And channels require locking up a large amount of capital to do anything more complex than simple recurring payments.

How does Plasma work?

See the original Plasma paper and Plasma Cash.

To deposit an asset, a user sends it to the smart contract that manages the Plasma chain. The Plasma chain assigns that asset a unique ID (e.g. 537). Each Plasma chain has an operator (this could be a centralized actor, a multisig, or something more complex such as PoS or DPoS). Every interval (this could be 15 seconds, or an hour, or anything in between), the operator gathers all the Plasma transactions they received off-chain into a "batch". They generate a Merkle tree where, at each index X in the tree, there is either the transaction transferring asset ID X (if such a transaction exists) or nothing. They publish the Merkle root of this tree to chain, and send the Merkle branch at each index X to the current owner of that asset. To withdraw an asset, a user publishes the Merkle branch of the most recent transaction that sent the asset to them.

Translator's note: a Merkle tree (also called a hash tree) is a tree data structure in which every leaf node is labeled with the hash of a data block, and every non-leaf node is labeled with the cryptographic hash of its children's labels. One clear advantage of Merkle trees is that a single branch can be taken out on its own (as a small sub-tree) to verify part of the data, which brings a convenience and efficiency that plain hash lists cannot match in many use cases.

The contract starts a challenge period, during which anyone can attempt to invalidate the exit by providing a different Merkle branch proving that (1) the sender did not own the asset at the time they sent it, or (2) the sender sent the asset to someone else at some later point in time. If no one proves within (say) 7 days that the exit is fraudulent, the user can withdraw the asset. A sketch of the Merkle-branch mechanics this relies on appears below.
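Below is a minimal Python sketch of the Merkle-branch mechanics underlying Plasma commitments and exits: the operator publishes only the root on-chain, each asset owner keeps the branch for their asset's index, and anyone can check a branch against the root. The hashing and tree layout are simplified assumptions for illustration, not the actual Plasma Cash format.

```python
# Minimal Merkle-tree sketch: commit to a batch with one root, prove one index with a branch.
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_tree(leaves):
    """Build a full binary Merkle tree (leaf count padded up to a power of two)."""
    level = [h(leaf) for leaf in leaves]
    while len(level) & (len(level) - 1):
        level.append(h(b""))                     # pad with empty leaves
    layers = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        layers.append(level)
    return layers                                # layers[-1][0] is the root

def branch(layers, index):
    """Collect the sibling hashes needed to prove inclusion of the leaf at `index`."""
    proof = []
    for layer in layers[:-1]:
        proof.append(layer[index ^ 1])           # sibling at this level
        index //= 2
    return proof

def verify_branch(leaf, index, proof, root):
    node = h(leaf)
    for sibling in proof:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root

# Index X holds the transaction spending asset ID X (or nothing). Here, asset 2 moves.
txs = [b"", b"", b"transfer asset 2 to Alice", b""]
layers = build_tree(txs)
root = layers[-1][0]                             # what the operator publishes on-chain
proof = branch(layers, 2)                        # what the owner of asset 2 keeps
assert verify_branch(b"transfer asset 2 to Alice", 2, proof, root)
```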
Plasma offers stronger properties than state channels: you can send assets to participants who were never part of the system, and the capital requirements are much lower. But it comes at a cost: channels require no data whatsoever to go on-chain during "normal operation", while Plasma requires each chain to publish one hash at regular intervals. Additionally, Plasma transfers are not instant: you have to wait for the interval to end and for the block to be published.

Furthermore, Plasma and state channels share a common weakness in the reasoning behind why they are secure: the game theory of both relies on the idea that every object controlled by either system has some logical "owner". If that owner does not care about their asset, an "invalid" outcome involving that asset may result. This is acceptable for many applications, but it is a deal-breaker for many others (e.g. Uniswap). Even systems where the state of an object can change without the owner's consent (e.g. account-based systems, where you can increase someone's balance without their consent) do not work well with Plasma. All of this means that a large amount of "application-specific reasoning" is required in any realistic Plasma or channel deployment, and it is not possible to make a Plasma or channel system that just simulates the full Ethereum environment (or "the EVM"). To work around this problem, we turn to... rollups.

Rollups

For more details, see EthHub on optimistic rollups and ZK rollups.

Plasma and state channels are "full" Layer 2 schemes, in that they try to move both data and computation off-chain. However, fundamental game-theoretic problems around data availability mean that it is impossible to do this safely for all applications. Plasma and channels get around this by relying on an explicit notion of owners, but this prevents them from being fully general.

Rollups, on the other hand, are a "hybrid" Layer 2 approach. Rollups move computation (and state storage) off-chain, but keep some data about each transaction on-chain. To be efficient, they use a whole host of fancy compression tricks to replace data with computation wherever possible. The result is a system whose scalability is still limited by the data bandwidth of the underlying blockchain, but at a very favorable ratio: whereas an Ethereum base-layer ERC20 token transfer costs about 45,000 gas, an ERC20 token transfer in a rollup takes up 16 bytes of on-chain space and costs under 300 gas.

The fact that the data is on-chain is key (note: putting data "on IPFS" does not work, because IPFS does not provide consensus on whether or not any given piece of data is available; the data must go on a blockchain). Putting data on-chain, and having consensus on that fact, allows anyone to locally process all the operations in the rollup if they so choose, letting them detect fraud, initiate withdrawals, or start producing transaction batches themselves. The lack of data availability issues means that a malicious or offline operator can do far less harm (e.g. they cannot cause a 1-week delay), opening up a much larger design space for who has the right to publish batches and making rollups far easier to reason about. Most importantly, the lack of data availability issues means there is no longer any need to map assets to owners. This is the key reason why the Ethereum community is so much more excited about rollups than about previous forms of Layer 2 scaling: rollups are fully general-purpose, and one can even run an EVM inside a rollup, allowing existing Ethereum applications to migrate to rollups with almost no need to write any new code.

OK, so how does a rollup actually work?

There is a smart contract on-chain which maintains a state root: the Merkle root of the state of the rollup (meaning the account balances, contract code, and so forth that are "inside" the rollup). Anyone can publish a batch, a collection of transactions in a highly compressed form together with the previous state root and the new state root (the Merkle root after processing the transactions). The contract checks that the previous state root in the batch matches its current state root; if it does, it switches the state root to the new state root.

To support deposits and withdrawals, we add the ability to have transactions whose inputs or outputs are "outside" the rollup state. If a batch has inputs from the outside, the transaction submitting the batch also needs to transfer those assets to the rollup contract. If a batch has outputs to the outside, then upon processing the batch the smart contract initiates those withdrawals.

So there is just one major detail left: how do you know that the post-state root in a batch is correct? If someone could submit a batch with any post-state root with no consequences, they could simply transfer all the coins inside the rollup to themselves. This question is critical, because there are two very different families of solutions to it, and those two families lead to the two flavors of rollups. A toy version of the contract logic above, with that question deliberately left open, is sketched below.
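Here is a toy Python sketch of the on-chain rollup contract logic just described. The class name and structure are illustrative assumptions; the point is that the contract only checks the claimed pre-state root against its current root and records the batch data, while verifying the post-state root is exactly the question that optimistic and ZK rollups answer differently.

```python
# Toy sketch of the rollup contract's batch-acceptance logic. Note that the
# post-state root is accepted without verification; that gap is the whole point.
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

class RollupContract:
    def __init__(self, genesis_state_root: bytes):
        self.state_root = genesis_state_root
        self.batches = []                       # history of (batch_hash, post_state_root)

    def submit_batch(self, compressed_txs: bytes, pre_state_root: bytes,
                     post_state_root: bytes) -> None:
        # The contract never executes the transactions; it only keeps the data
        # available and records the claimed result.
        assert pre_state_root == self.state_root, "pre-state root does not match"
        self.batches.append((h(compressed_txs), post_state_root))
        self.state_root = post_state_root       # NOTE: post_state_root is not checked here!

# Usage: anyone can publish a batch that builds on the current state root.
contract = RollupContract(genesis_state_root=h(b"genesis state"))
contract.submit_batch(b"...compressed transactions...", contract.state_root,
                      post_state_root=h(b"state after batch 1"))
```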
Optimistic rollups vs. ZK rollups

The two flavors of rollups are optimistic rollups and ZK rollups.

Optimistic rollups use fraud proofs: the rollup contract keeps track of the entire history of its state roots and the hash of each batch. If anyone discovers that one batch had an incorrect post-state root, they can publish a proof to chain showing that the batch was computed incorrectly. The contract verifies the proof and reverts that batch and all batches after it.

ZK rollups use validity proofs: every batch includes a cryptographic proof called a ZK-SNARK (e.g. using the PLONK protocol) which proves that the post-state root is the correct result of executing the batch. No matter how large the computation, the proof can be verified very quickly on-chain.

There are complex trade-offs between the two flavors of rollups. In general, my view is that in the short term, optimistic rollups are likely to win out for general-purpose EVM computation, while ZK rollups are likely to win out for simple payments, exchanges, and other application-specific use cases. In the medium to long term, however, ZK rollups will win in all use cases as ZK-SNARK technology improves.

Anatomy of a fraud proof

The security of an optimistic rollup depends on the following property: if someone publishes an invalid batch into the rollup, anyone who was keeping up with the chain and detected the fraud can publish a fraud proof, proving to the contract that the batch was invalid and should be reverted.

A fraud proof claiming that a batch was invalid contains the data marked in green in the original article's diagram: the batch itself (which can be checked against the hash stored on-chain), plus only those parts of the Merkle tree needed to prove the specific accounts that were read and/or modified by the batch. The nodes marked in yellow can be reconstructed from the green nodes and so do not need to be provided. This data is sufficient to execute the batch and compute the post-state root (note that this is exactly the same way stateless clients verify individual blocks). If the computed post-state root does not match the post-state root provided in the batch, then the batch is fraudulent. A minimal sketch of this check appears below.

It is guaranteed that if a batch was constructed incorrectly, and all previous batches were constructed correctly, then it is possible to create a fraud proof showing that the batch was constructed incorrectly. Note the claim about previous batches: if more than one invalid batch was published to the rollup, then it is best to try to prove the earliest one invalid. And, of course, if a batch was constructed correctly, then it is never possible to create a fraud proof showing that the batch is invalid.
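Here is a minimal Python sketch of the fraud-proof check described above: given the batch and the touched part of the pre-state, re-execute the batch and compare the resulting root with the one the batch claimed. For brevity, plain account balances and a hash over the whole state stand in for Merkle branches and a real state root; those are simplifying assumptions, not the actual fraud-proof format.

```python
# Re-execute the batch from the provided pre-state and compare against the claimed root.
import hashlib, json

def state_root(accounts: dict) -> bytes:
    """Stand-in for a Merkle root over the rollup state (illustrative assumption)."""
    return hashlib.sha256(json.dumps(accounts, sort_keys=True).encode()).digest()

def execute(accounts: dict, txs) -> dict:
    post = dict(accounts)
    for sender, recipient, amount in txs:
        assert post.get(sender, 0) >= amount, "insufficient balance"
        post[sender] -= amount
        post[recipient] = post.get(recipient, 0) + amount
    return post

def is_fraudulent(touched_pre_state: dict, txs, claimed_post_root: bytes) -> bool:
    """True if re-executing the batch does NOT reproduce the claimed post-state root."""
    return state_root(execute(touched_pre_state, txs)) != claimed_post_root

# The honest post-state root for this batch...
pre = {"alice": 100, "bob": 50}
batch = [("alice", "bob", 30)]
honest_root = state_root(execute(pre, batch))

# ...so a batch claiming any other root can be proven fraudulent and reverted.
assert not is_fraudulent(pre, batch, honest_root)
assert is_fraudulent(pre, batch, hashlib.sha256(b"bogus root").digest())
```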
How does compression work?

A simple Ethereum transaction (sending ETH) takes about 110 bytes. An ETH transfer on a rollup, however, takes only about 12 bytes. Part of this is simply better encoding: Ethereum's RLP wastes 1 byte per value on the length of that value. But there are also some very clever compression tricks:

Nonce: the purpose of the nonce is to prevent replays. If the current nonce of an account is 5, the next transaction from that account must have nonce 5, and once the transaction is processed the nonce in the account is incremented to 6 so the transaction cannot be processed again. In a rollup, we can omit the nonce entirely, because we just recover it from the pre-state; if someone tries to replay a transaction with an earlier nonce, the signature will fail to verify, as the signature is checked against data containing the new, higher nonce.

Gas price: we can allow users to pay with a fixed range of gas prices, e.g. a choice of 16 consecutive powers of two. Alternatively, we can set a fixed fee level per batch, or even move gas payment outside the rollup protocol entirely and have transaction senders pay batch creators for inclusion through a channel.

Gas: similarly, we can restrict the total gas to a choice of powers of two. Alternatively, we can set a gas limit only at the batch level.

To: we can replace the 20-byte address with an index. If an address is the 4527th address added to the tree, we just use the index 4527 to refer to it. (We would add a subtree to the state that stores the mapping of indices to addresses.)

Value: we can store values in scientific notation. In most cases, a transfer only needs 1-3 significant digits.

Signature: we can use BLS aggregate signatures, which combine many signatures into a single ~32-96 byte (depending on the protocol) signature. That aggregate signature can then be checked against the entire set of messages and senders in the batch at once. The ~0.5 bytes per signature in the original article's comparison table reflects the fact that there is a limit on how many signatures can go into an aggregate, since they need to be verified within a single fraud proof.

One important compression trick unique to ZK rollups: if part of a transaction is only used for verification and is not relevant to computing the state update, then that part can be left off-chain entirely. This cannot be done in an optimistic rollup, because that data would still need to be included on-chain in case it has to be checked later in a fraud proof, whereas in a ZK rollup the SNARK proving correctness of the batch already proves that any data needed for verification was provided. An important example of this is privacy-preserving rollups: in an optimistic rollup, the ~500-byte ZK-SNARK used for privacy in each transaction has to go on-chain, whereas in a ZK rollup the ZK-SNARK covering the whole batch already leaves no doubt that the "inner" ZK-SNARKs are valid.

These compression tricks are key to the scalability of rollups; without them, rollups might only be a ~10x improvement in scalability over the base chain (though there are some specific computation-heavy applications where even simple rollups are powerful), whereas with compression tricks the scaling factor can exceed 100x for almost all applications. A sketch of what such a compressed transfer encoding might look like appears below.
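The following Python sketch shows what a compressed rollup transfer encoding in the spirit of the tricks above might look like. The exact field widths (3-byte account indices, a 2-byte scientific-notation value, a 1-byte fee level) are illustrative assumptions, not any production rollup's actual format.

```python
# Toy encoding of a rollup ETH transfer using index-based addressing and
# scientific-notation values; the per-transfer signature is omitted because a
# single BLS aggregate would cover the whole batch.

def encode_value(amount_wei: int) -> bytes:
    """Scientific notation: value = mantissa * 10**exponent, packed into 2 bytes."""
    exponent = 0
    while amount_wei > 0 and amount_wei % 10 == 0:
        amount_wei //= 10
        exponent += 1
    if amount_wei > 255 or exponent > 255:
        raise ValueError("value needs more significant digits than this toy format allows")
    return bytes([amount_wei, exponent])

def encode_transfer(from_index: int, to_index: int, amount_wei: int, fee_level: int) -> bytes:
    # 3-byte sender index + 3-byte recipient index + 2-byte value + 1-byte fee level = 9 bytes.
    assert from_index < 2**24 and to_index < 2**24 and fee_level < 256
    return (from_index.to_bytes(3, "big") + to_index.to_bytes(3, "big")
            + encode_value(amount_wei) + bytes([fee_level]))

tx = encode_transfer(from_index=4527, to_index=1001, amount_wei=3 * 10**18, fee_level=12)
print(len(tx), tx.hex())   # 9 bytes, versus ~110 bytes for a base-layer ETH transfer
```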
Who can submit a batch?

There are a number of schools of thought about who can submit batches in an optimistic or ZK rollup. Generally, everyone agrees that in order to be able to submit batches, a user must put down a large deposit; if that user ever submits a fraudulent batch (e.g. with an invalid state root), the deposit is partly burned and partly given as a reward to the fraud prover. Beyond that, there are many possibilities:

Total anarchy: anyone can submit a batch at any time. This is the simplest approach, but it has some serious drawbacks. In particular, there is a risk that multiple participants will generate and attempt to submit batches in parallel, and only one of those batches can be successfully included. This leads to a large amount of wasted effort generating proofs and/or wasted gas publishing batches to chain.

Centralized sequencer: there is a single actor, the sequencer, who can submit batches (with the exception of withdrawals: the usual technique is that a user can first submit a withdrawal request, and then if the sequencer does not process that withdrawal in the next batch, the user can submit a single-operation batch themselves). This is the most "efficient" option, but it relies on a central actor for liveness.

Sequencer auction: an auction is held (e.g. every day) to determine who has the right to be the sequencer for the next day. A nice feature of this approach is that the funds raised can be distributed, e.g. by a DAO controlled by the rollup (see: MEV auctions).

Random selection from a PoS set: anyone can deposit ETH (or perhaps the rollup's own protocol token) into the rollup contract, and the sequencer of each batch is randomly selected from the depositors, with the probability of being selected proportional to the amount deposited. The main drawback of this technique is that it leads to a large amount of needless capital lockup.

DPoS voting: there is a single sequencer selected by an auction, but if they perform poorly, token holders can vote to kick them out and hold a new auction to replace them.

Split batching and state root provision

Some of the rollups being developed today use a "split batch" paradigm, in which the action of submitting a batch of Layer 2 transactions and the action of submitting a state root are done separately. This has some key advantages: you can allow many sequencers to publish batches in parallel to improve censorship resistance, without worrying that some batches will become invalid because some other batch got included first. And if a state root is fraudulent, you do not need to revert the entire batch; you can revert just the state root and wait for someone to provide a new state root for the same batch. This gives transaction senders a better guarantee that their transactions will not be reverted.

So, all in all, there is a fairly complex zoo of techniques trying to balance complicated trade-offs between efficiency, simplicity, censorship resistance, and other goals. It is still too early to say which combination works best; time will tell.

How much scaling do rollups give you?

On the existing Ethereum chain, the gas limit is 12.5 million, and each byte of data in a transaction costs 16 gas. This means that if a block contains nothing but a single batch (we will say a ZK rollup is used, spending 500,000 gas on proof verification), that batch can have (12 million / 16) = 750,000 bytes of data. As shown above, an ETH transfer inside a rollup requires only about 12 bytes per user operation, meaning the batch can contain up to 62,500 transactions. At an average block time of 13 seconds, this works out to about 4,807 TPS (for comparison, ETH transfers directly on Ethereum itself give 12.5 million / 21,000 / 13 ~= 45 TPS).

The original article includes a table of further example use cases; in each case, the maximum scalability gain is computed as (L1 gas cost) / (number of bytes in the rollup * 16) * 12 million / 12.5 million.

Now, it is worth keeping in mind that these figures are overly optimistic for several reasons. Most importantly, a block would almost never contain just one batch; at the very least, there will be multiple rollups. Second, deposits and withdrawals will continue to happen all the time. Third, in the short term usage will be low, so fixed costs will dominate. But even taking these factors into account, scalability gains of over 100x are expected to be the norm.

Now, what if we want to go above 1,000-4,000 TPS (depending on the specific use case)? This is where eth2 data sharding comes in. The sharding proposal opens up a space of 16 MB every 12 seconds that can be filled with any data, and the system guarantees consensus on the availability of that data. This data space can be used by rollups. This ~1,398 kB per second is a 23x improvement on the ~60 kB/sec of the existing Ethereum chain, and in the longer term the data capacity is expected to grow even further. Hence, rollups that use eth2 sharded data can collectively process as many as 100,000+ TPS, and potentially even more in the future. The arithmetic behind these figures is worked through in the sketch below.
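Here is the throughput arithmetic above worked through explicitly in Python, using the figures given in the text (12.5M gas limit, 16 gas per byte, 13-second blocks, ~12-byte transfers, 500k gas assumed for proof verification). The ERC20 example at the end simply plugs the text's own numbers into the stated formula.

```python
# Worked version of the rollup throughput arithmetic from the section above.
GAS_LIMIT = 12_500_000
GAS_PER_BYTE = 16
BLOCK_TIME = 13                        # seconds
PROOF_VERIFICATION_GAS = 500_000       # assumed cost of verifying one ZK-SNARK on-chain

# Base layer: a plain ETH transfer costs 21,000 gas.
base_tps = GAS_LIMIT / 21_000 / BLOCK_TIME                            # ~45 TPS

# Rollup: the whole block is one batch of ~12-byte compressed ETH transfers.
batch_bytes = (GAS_LIMIT - PROOF_VERIFICATION_GAS) / GAS_PER_BYTE     # 750,000 bytes
rollup_tps = batch_bytes / 12 / BLOCK_TIME                            # ~4,807 TPS

# General formula from the text: (L1 gas cost) / (rollup bytes * 16) * 12M / 12.5M
def max_scalability_gain(l1_gas_cost: int, rollup_bytes: float) -> float:
    return l1_gas_cost / (rollup_bytes * GAS_PER_BYTE) * 12_000_000 / GAS_LIMIT

# e.g. a 45,000-gas ERC20 transfer compressed to 16 rollup bytes (figures from the text):
erc20_gain = max_scalability_gain(45_000, 16)                         # ~169x under these assumptions

# With eth2 data sharding: 16 MB of data space every 12 seconds.
shard_bytes_per_sec = 16 * 1024 * 1024 / 12        # ~1,398 kB/s, vs ~60 kB/s today (~23x)
sharded_rollup_tps = shard_bytes_per_sec / 12      # 12-byte transfers -> ~116,000 TPS

print(int(base_tps), int(rollup_tps), int(erc20_gain), int(sharded_rollup_tps))
# -> 45 4807 168 116508
```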
What challenges in rollups have not yet been fully solved?

While the basic concept of a rollup is now well understood, and we are quite confident that rollups are fundamentally feasible and secure, with multiple rollups already deployed to mainnet, there are still many areas of rollup design that have not been well explored, and significant challenges remain in fully moving large parts of the Ethereum ecosystem onto rollups to take advantage of their scalability. Some key challenges include:

User and ecosystem onboarding: not many applications use rollups, rollups are unfamiliar to users, and few wallets have started integrating them. Merchants and charities do not yet accept payments through them.

Cross-rollup transactions: efficiently moving assets and data (e.g. oracle outputs) from one rollup into another without going through the base layer.

Auditing incentives: how do we maximize the chance that at least one honest node actually fully verifies an optimistic rollup, so that a fraud proof can be published if something goes wrong? For small rollups (up to a few hundred TPS) this is not a significant issue and one can simply rely on altruism, but for larger rollups more explicit reasoning about this is needed.

Exploring the design space between Plasma and rollups: are there techniques that put some, but not all, state-update-relevant data on-chain, and is there anything useful that could come out of that?

Maximizing the security of pre-confirmations: many rollups offer a notion of "pre-confirmation" for a faster user experience, where the sequencer immediately provides a promise that a transaction will be included in the next batch, and the sequencer's deposit is destroyed if they break their promise. But the economic security of this scheme is limited, because it is possible to make many such promises to many actors at the same time. Can this mechanism be improved? (A minimal sketch of such a promise appears after this list.)

Improving speed of response to absent sequencers: if the sequencer of a rollup suddenly goes offline, it would be valuable to recover from that situation as quickly and cheaply as possible, either by quickly and cheaply mass-exiting to a different rollup or by replacing the sequencer.

Efficient ZK-VMs: generating a ZK-SNARK proof that general-purpose EVM code (or some different VM that existing smart contracts can be compiled to) was executed correctly and has a given result.
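To illustrate the pre-confirmation item above, here is a minimal Python sketch of a sequencer's inclusion promise and the check that it was broken. The message format, HMAC-based "signature", and slashing condition are illustrative assumptions; as the text notes, the real difficulty is that nothing stops a sequencer from issuing more promises than it can honor.

```python
# Toy pre-confirmation: the sequencer signs "tx X will be in batch N"; if the published
# batch lacks X, the signed promise is evidence against the sequencer's deposit.
import hmac, hashlib
from dataclasses import dataclass

def sign(secret: bytes, message: bytes) -> bytes:
    """Stand-in for a real sequencer signature (for illustration only)."""
    return hmac.new(secret, message, hashlib.sha256).digest()

@dataclass
class PreConfirmation:
    tx_hash: bytes
    batch_number: int
    signature: bytes

def issue_preconfirmation(sequencer_secret: bytes, tx_hash: bytes, batch_number: int) -> PreConfirmation:
    msg = tx_hash + batch_number.to_bytes(8, "big")
    return PreConfirmation(tx_hash, batch_number, sign(sequencer_secret, msg))

def is_broken_promise(p: PreConfirmation, sequencer_secret: bytes, published_batch: list) -> bool:
    """True if the promise is validly signed but the transaction is missing from the batch."""
    msg = p.tx_hash + p.batch_number.to_bytes(8, "big")
    validly_signed = hmac.compare_digest(sign(sequencer_secret, msg), p.signature)
    return validly_signed and p.tx_hash not in published_batch   # grounds for slashing the deposit

# Usage: the sequencer promises inclusion in batch 42, then publishes a batch without the tx.
secret = b"sequencer-secret"
tx = hashlib.sha256(b"alice pays bob").digest()
promise = issue_preconfirmation(secret, tx, batch_number=42)
print(is_broken_promise(promise, secret, published_batch=[]))   # True -> deposit can be slashed
```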
Conclusion

Rollups are a powerful new Layer 2 scaling paradigm, and are expected to be a cornerstone of Ethereum scaling in the short and medium term (and possibly the long term as well). They have generated a great deal of excitement in the Ethereum community because, unlike previous attempts at Layer 2 scaling, they can support general-purpose EVM code, allowing existing applications to migrate over easily. They do this by making a key compromise: instead of trying to move everything off-chain, they leave a small amount of data about each transaction on-chain.

There are many kinds of rollups, and many design choices: one can have an optimistic rollup using fraud proofs, or a ZK rollup using validity proofs (a.k.a. ZK-SNARKs). The sequencer (the user who can publish transaction batches to chain) can be a centralized actor, a free-for-all, or many other options in between. Rollups are still an early-stage technology, and development is continuing rapidly, but they work, and some of them (notably Loopring, ZKSync, and DeversiFi) have already been running for months. Expect much more exciting work to come out of the rollup space in the years to come.