Vitalik: How do centralized exchanges provide proof of funds?

Vitalik: How do centralized exchanges provide proof of funds?
In the future, we may also see cryptographically “constrained” CEXs, where user funds are held in validium-like smart contracts.

By: Vitalik Buterin

Compiled by: Dong Yiming, ChainCatcher

Whenever a major centralized exchange blows up, a common question that comes up is whether we can use cryptography to fix the problem. Instead of relying solely on “fiat” methods such as government licensing, auditors, and scrutiny of corporate governance and the backgrounds of the individuals running the exchange, exchanges could create cryptographic proof that the funds they hold on-chain are sufficient to cover their liabilities to users.

Exchanges could build a system where it is simply impossible to withdraw depositors’ funds without their consent. Potentially, we could explore the entire spectrum between aspiring good-guy CEXs that “don’t be evil” and on-chain DEXs that “can’t be evil” but are currently inefficient and privacy-compromising. This post will dive into historical attempts to get trading one or two steps closer to trustlessness, the limitations of those techniques, and some newer and more powerful ideas that rely on ZK-SNARKs and other advanced techniques.

Balance Sheets and Merkle Trees: Old-school Proof of Solvency

Exchanges have long tried to prove cryptographically that they didn’t defraud their users. In 2011, MtGox, the largest Bitcoin exchange at the time, proved that they had funds by sending a transaction to transfer 424,242 BTC to a pre-announced address. In 2013, people began discussing how to solve the other side of the problem: proving the total size of customer deposits. If you prove that a customer’s deposit is equal to X (“proof of liability”) and prove ownership of the private key for X coins (“proof of assets”), then you have proof of solvency: you have proven that the exchange has the funds to repay all depositors.

The simplest way to prove a deposit is to simply publish a list of (username, balance) pairs. Each user can check whether their balance is included in the list, and anyone can check the complete list to see that (i) each balance is non-negative and (ii) the total amount is the claim amount. Of course, this breaks privacy, so we can change the scheme slightly: publish a list of (hash(username, salt), balance) pairs and privately send each user their salt value. But even this will reveal the balances, and it will reveal the changing pattern of the balances. The desire to protect privacy brings us to the next invention: Merkle tree technology.

Green: Charlie's node. Blue: David's node, which Charlie will receive as part of his proof. Yellow: Root node, which is publicly displayed to everyone.

Merkle tree technology is to put the customer's balance table into a Merkle sum tree. In the Merkle sum tree, each node is a (balance, hash) pair. The bottom leaf nodes represent the balance of individual customers and the salted username hash value. In each high-level node, the balance is the sum of the two balances below, and the hash value is the hash value of the two nodes below. The Merkle sum proof, like the Merkle proof, is a "branch" of the tree, consisting of sister nodes on the path from the leaf to the root.

The exchange will send each user a Merkle sum proof of their balance to prove their balance. The user will then get a guarantee that their balance is correctly included as part of the total. A simple code example can be found here.

The privacy leakage in this design is much lower than a fully public list, and can be further reduced by shuffling the branches every time the root directory is released, but some privacy leakage still exists. Charlie can learn that someone has a balance of 164ETH, there are two users with a balance of 70ETH, etc. An attacker who controls many accounts can still learn a lot of information about the users of the exchange.

An important subtlety of the scheme is the possibility of negative balances: what if an exchange has 1390ETH in customer balances, but only 890ETH in reserves, and tries to make up the difference by adding a -500ETH balance under some fake account in the tree? It turns out that this possibility does not break the scheme, although this is why we specifically need a Merkle sum tree instead of a normal Merkle tree. Suppose Henry is a fake account controlled by the exchange, and the exchange puts the -500ETH there.

Greta’s proof verification will fail: the exchange will have to give her Henry’s -500 ETH node, which she will reject as invalid. Eve and Fred’s verification will also fail, because Henry’s intermediate node above has a total of -230 ETH, so it’s also invalid! To get away with theft, the exchange will have to hope that no one in the right half of the tree checks their balance proofs.

If the exchange can identify users with 500 ETH worth of ETH that they believe either won’t bother checking the proof or won’t be believed when they complain about never receiving a proof, then they can be confident they can get away with stealing. However, the exchange can also exclude these users from the tree to achieve the same effect.

So, Merkle tree technology is basically as good as proof-of-liabilities if it's just for the purpose of achieving proof-of-liabilities. But its privacy properties are still not ideal. You can get away with using Merkle trees in cleverer ways, like having each satoshi or wei be a separate leaf, but ultimately with more modern technology there are better ways to do this.

Improving Privacy and Robustness with ZK-SNARKs

ZK-SNARKs are a powerful technology. ZK-SNARKs may do for cryptography what the transformer did for AI: it is such a powerful general purpose technology that it will completely crush a whole bunch of problems in application-specific technologies that were developed decades ago. So, of course, we can use ZK-SNARKs to greatly simplify and improve privacy in proof-of-liability protocols.

The simplest thing we can do is put all users’ deposits into a Merkle tree (or, more simply, a KZG commitment) and use ZK-SNARKs to prove that all balances in that tree are non-negative and add up to some claimed value. If we add a layer of hashing for privacy, the Merkle branch (or KZG proof) given to each user will not reveal any other user’s balance.

Using KZG commitments is one way to avoid privacy leaks because there is no need to provide a "sister node" as proof, a simple ZK-SNARK can be used to prove the sum of the balances and that each balance is non-negative.

We can prove the sum and non-negativity of the balances in the above KZG using a special-purpose ZK-SNARK. Here is a simple example that does this. We introduce an auxiliary polynomial that "builds up the bits for each balance" (for the sake of example, we assume the balances are in ), and every 16 positions keeps track of a running total with an offset so that it only sums to zero when the actual total agrees with the declared total. If z is a root of unity to the power of -128, we can prove the following equality.

The first value for the valid settings is 0 0 0 0 0 0 0 0 0 0 1 2 5 10 20 -165 0 0 0 0 0 0 0 0 1 3 6 12 25 50 -300 ...

See my ZK-SNARK article for further explanation on how to turn equations like this into polynomial checks and then into ZK-SNARKs. This isn’t an optimal protocol, but it does show how these types of cryptographic proofs aren’t that weird these days!

With just a few additional formulas, constraint systems like this can be adapted to more complex settings. For example, in a leveraged trading system, it is acceptable for individual users to have negative balances, but only if they have enough other assets to cover the funds with some collateralized margin. A SNARK can be used to prove this more complex constraint, giving users peace of mind that the exchange will not put their funds at risk by implicitly exempting other users from the rules.

In the longer term, this kind of ZK proof of debt might be used not just for customer deposits on exchanges, but for lending more broadly. Anyone taking out a loan would put a record into a polynomial or tree that encompasses that loan, and the root of that structure would be published on-chain. This would let anyone seeking a loan provide a lender with a ZK proof that they haven’t taken out too many other loans. Eventually, legal innovations could even allow loans that have been committed in this way to have higher priority than those that haven’t. This leads us in exactly the same direction as an idea discussed in “Decentralized Society: Finding the Soul of Web3”: the concept of building a negative reputation or collateral on-chain through some form of “soulbound token.”

Proof of assets

The simplest version of proof of assets is the protocol we saw above: to prove that you hold X coins, you simply move X coins at some pre-agreed time or in a transaction that contains "these funds belong to Binance" in the data field. To avoid paying transaction fees, you can sign an off-chain message instead; both Bitcoin and Ethereum have standards for signing off-chain messages.

There are two practical problems with this simple proof-of-asset technique.

  • Cold storage treatment

  • Dual use of collateral

For security reasons, most exchanges keep the vast majority of their customers’ funds in “cold storage”: on offline computers where transactions need to be manually signed and transferred to the internet. The cold storage I once set up for personal use involved a permanently offline computer that would generate a QR code containing the signed transaction that I could scan with my phone. Modern exchange protocols are even crazier, often involving multi-party computations between several devices. Given this setup, even one additional piece of information to prove control of an address is an expensive operation!

A transaction can take the following paths:

Keep a few public addresses for long-term use . The exchange will generate several addresses, issue a proof once for each address to prove ownership, and then reuse the addresses. This is by far the simplest scheme, although it does add some limitations on how security and privacy can be protected.

Set up many addresses, and prove a few at random . An exchange will have many addresses, perhaps even each used only once and retired after a transaction. In this case, the exchange might have a protocol that from time to time randomly selects a few addresses that must be "opened" to prove ownership. Some exchanges already do something similar with auditors, but in principle this technique could be turned into a fully automated process.

More complex ZKP options . For example, an exchange could set up all of its addresses as a 1/2 multisig, where the key for each address is different, and the other is a blinded version of some "big" emergency backup key, stored in some complex but very high-security way, like a 12/16 multisig. To protect privacy and avoid exposing its entire set of addresses, the exchange could even run a zero-knowledge proof on the blockchain that proves the total balance of all addresses on the chain that have this format.

Another major issue is preventing double use of collateral. Exchanges could easily ship collateral back and forth between each other for proof of reserves, which would allow them to pretend to be solvent when they are not. Ideally, proof of solvency would be done in real time, with the proof updated after every block. If this is not realistic, the next best option would be to coordinate a fixed schedule between different exchanges, for example, proving reserves every Tuesday at 14:00 UTC.

The final question is: can asset proofs be done in fiat currency? Exchanges don’t just hold cryptocurrencies, they also hold fiat currency within the banking system. Here, the answer is: yes, but such a program will inevitably rely on the “fiat currency” trust model: banks themselves can prove balances, auditors can prove balance sheets, and so on. Given that fiat currency is not cryptographically verifiable, this is the best that can be done within this framework, but it is still worth doing.

Another approach is to cleanly separate an entity that runs an exchange and handles asset-backed stablecoins like USDC from another entity that handles the cash in and out process of moving between cryptocurrencies and the traditional banking system (USDC itself). Because USDC’s “liabilities” are just ERC20 tokens on-chain, proof of liabilities is “free” and only requires proof of assets.

Plasma and validiums: Can we make CEXs unfettered?

Let’s say we want to go a step further: we don’t want to just prove that the exchange has the funds to pay back the user. Instead, we want to prevent the exchange from stealing the user’s funds altogether.

The first major attempt at this was Plasma, a scaling solution that became popular in Ethereum research circles in 2017 and 2018. Plasma works by splitting balances into a set of separate "coins", each of which is assigned an index and located at a specific position in the Plasma block's Merkle tree. Valid transfers of a coin require placing a transaction in the correct position in the tree, the root of which is published on-chain.

An oversimplified diagram of a version of Plasma. Coins are stored in a smart contract, and the rules of the Plasma protocol are enforced when withdrawing.

OmiseGo attempted to make a decentralized exchange on top of this protocol, but they have since moved on to other ideas — as has the Plasma Group itself, for that matter, which is now the optimistic EVM rollup project Optimism.

The technical limitations of Plasma as envisioned in 2018 (e.g., proof-of-coin fragmentation) are not worth looking at. Since the peak of the Plasma discourse in 2018, ZK-SNARKs have become more viable for use cases related to scaling, and as we said above, ZK-SNARKs change everything.

A more modern version of the Plasma idea is what Starkware calls a validium: essentially the same as a ZK-rollup, except the data is kept off-chain. This structure can be used for a lot of use cases, imagine any situation where a centralized server needs to run some code and prove that it executed it correctly. Within a validium, there is no way for the operator to steal funds, although depending on the details of the implementation, some amount of user funds could be stuck if the operator disappeared.

This is all really good: far from being a binary opposition between CEX and DEX, it turns out there’s a whole range of options, including various forms of hybrid centralization where you get some benefits like efficiency but still have many crypto guardrails that prevent centralized operators from engaging in most forms of abuse.

However, the right half of this design space still requires us to discuss the most basic problem: handling user errors. By far the most important type of error is: what if a user forgets their password, loses their device, gets hacked, or otherwise loses access to their account?

Exchanges can solve this problem: first with email recovery, and if even that fails, more complex forms of recovery via KYC. However, to be able to solve such a problem, the exchange needs to have real control over the coins. In order to have the ability to restore funds to a user's account for good reasons, the exchange needs to have power that could also be used to steal funds from a user's account for bad reasons. This is an unavoidable trade-off.

The ideal long-term solution would rely on self-custody, supplemented by technologies like multi-sig and social recovery wallets to help users in emergencies. But in the short term, there are two obvious alternatives with significantly different costs and benefits.

Conclusion: Better Exchanges in the Future

In the short term, there are two distinct "categories" of exchanges: custodial exchanges and non-custodial exchanges. Today, the latter category is just DEXes, like Uniswap, and in the future, we may also see cryptographically "constrained" CEXes, where user funds are held in something like validium smart contracts. We may also see semi-custodial exchanges, where we trust them with fiat currency instead of crypto.

Both types of exchanges will continue to exist, and the simplest backwards-compatible way to improve the security of custodial exchanges is to add proof of reserves. This involves a combination of proof of assets and proof of liabilities. There are technical challenges in developing good protocols for both, but we should also make progress on both as much as possible and open source the software and processes as much as possible so that all exchanges can benefit.

In the longer term future, I hope we get closer to all exchanges being non-custodial, at least on the crypto side. Wallet recovery will exist, and there may be a need for highly centralized recovery options for new users handling small amounts, and for institutions that require such arrangements for legal reasons, but this can be done at the wallet layer rather than within the exchange itself. The way magic.link interacts with platforms like Polymarket is an example of this approach. On the fiat side, flows between the traditional banking system and the crypto ecosystem can be accomplished through native cash in/cash out flows for asset-backed stablecoins like USDC. However, it will be some time before we fully reach this goal.

Special thanks to Balaji Srinivasan, and the folks at Coinbase, Kraken, and Binance for discussions.

<<:  Security company Zellic Lianchuang discovered and reported a minor and harmless bug in WETH

>>:  GBTC negative premium rises to a record high, is another crash coming?

Recommend

What is the difference between peach blossom eyes and phoenix eyes?

The eyes are the windows to the soul because they...

Is it good to have a man with great virtue?

Everyone knows very well that everyone’s blessing...

Mole location and destiny - what is the meaning of having moles around eyebrows

The emotional fortune of people with moles at the...

How to tell your life fortune from your forehead

How to tell your life fortune from your forehead ...

Coin Zone Trends: Bitcoin Price Trends Based on Big Data This Week (2017-04-25)

The medium- and long-term upward trend remains un...

A mole under a woman's right breast is related to money.

Mole physiognomy is a relatively special way of f...

Hackers "kidnapped" the printer and demanded 0.05 Bitcoins

Recently, the Hull University’s HullChina Weibo a...

What is the fate of a woman with a mole on her forehead?

As we all know, moles are something that everyone...

What are the characteristics of late bloomer palmistry

Whether you achieve success early in your life or...