How to build a decentralized storage solution? Which solutions are better or worse?

Decentralized storage landscape

Unlike Layer 1 blockchains (such as Bitcoin and Ethereum) whose primary purpose is to trustlessly transfer value, decentralized storage networks not only need to record transactions (for storage requests), but must also ensure that data is stored within a specific time and overcome other challenges related to storage. Therefore, it is common to see decentralized storage blockchains apply multiple consensus mechanisms that work together to ensure that different aspects of storage and retrieval can function.

In the following non-exhaustive list of decentralized storage projects, we can get a glimpse into the decentralized storage landscape as well as niche data storage use cases like P2P file sharing and data marketplaces.

This research focuses on storage networks (both IPFS-based and non-IPFS-based).

Figure 1: Overview of decentralized storage protocols (non-exhaustive)

Decentralized storage design challenges

As demonstrated in the first section of this article, blockchains are not suitable for storing large amounts of data on-chain due to the associated costs and impact on block space. Therefore, decentralized storage networks must apply other techniques to ensure decentralization. However, not using the blockchain as the primary storage space leads to a long list of other challenges if the network wants to remain decentralized.

Essentially, a decentralized storage network must be able to store, retrieve, and maintain data while ensuring that all actors within the network are incentivized for the work they do, all while upholding the trustless nature of decentralized systems.

Therefore, from a design point of view, we can summarize the main challenges in the following illustrative paragraphs.

Data Storage Format – First, the network must decide how to store the data: whether the data should be encrypted, and whether the data should be saved as a whole set or in small chunks.

Replication of Data — The network then needs to decide where to store the data: on how many nodes the data should be stored, and whether all the data should be replicated to all nodes, or if each node should get a different fragment to further protect data privacy. The data storage format and the network propagation of the data will determine the probability that the data is available on the network despite device failures over time (persistence).

Storage Tracking – From here, the network needs a mechanism to keep track of where the data is stored. This is important because the network needs to know which network locations to ask to retrieve specific data.

Proof of Data Stored — Not only does the network need to know where the data is stored, but storage nodes also need to be able to prove that they are indeed storing the data they intend to store.

Data Availability over Time — The network also needs to ensure that data is where it is supposed to be, when it is supposed to be. This means that mechanisms must be designed to ensure that nodes do not delete old data over time.

Storage Price Discovery — Nodes expect to be paid for the ongoing storage of files, so the network needs a mechanism for discovering the price of storage.

Persistent Data Redundancy - While the network needs to know where the data is, due to the nature of public open networks, nodes are constantly leaving the network and new nodes are constantly joining the network. Therefore, in addition to ensuring that individual nodes store what they are supposed to store when they are supposed to store it, the network also needs to ensure that when a node leaves and its data disappears, enough copies of the data or data fragments are maintained across the entire network.

Data Transmission — Then, when the network connects to a node to retrieve data requested (by a user or data maintenance workload), the node storing the data must be willing to transmit the data, as bandwidth also has a cost.

Network Tokenomics - Finally, in addition to ensuring that data resides within the network, the network must ensure that the network itself will persist in the long term. If the network disappears, it will take all the data with it - therefore, strong tokenomics are necessary to ensure the permanence of the network and thus the long-term availability of the data.

Overcoming the challenges of data decentralization

In this section, I will compare and contrast various aspects of the decentralized storage network designs of IPFS, Filecoin, Crust Network, Arweave, Sia, Storj, and Swarm, and how they overcome the above challenges. This selection covers both mature and emerging decentralized storage networks that use a wide range of techniques to achieve decentralization.

The table below summarizes the technical aspects and token economics of each network, which will be covered in more detail in this section, as well as what the author believes are strong use cases for these chains following their various design elements.

Figure 2: Summary of technical design decisions for storage networks reviewed

Figure 3: Summary of token design decisions for reviewed storage networks

Figure 4: Summary of strong use cases for reviewed storage networks

Since many concepts are closely related within each protocol design, the challenges cannot be cleanly separated, so there will be some overlap between the subsections.

Data storage format and data replication

Data format and replication of data refers to how data is stored on a single node instance and how data is propagated across multiple nodes when a user or application requests a file to be stored (hereafter users and applications are collectively referred to as clients). This is an important distinction because data can also be stored on a node as a result of data maintenance processes initiated by the network or other network participants.

In the table below, we can see a brief overview of how the protocol stores data:

Figure 5: Review of storage network data storage methods and data replication

From the above projects, Filecoin and Crust use the InterPlanetary File System (IPFS) as the network coordination and communication layer for transferring files between peers and storing files on nodes. Both IPFS and Filecoin are developed by Protocol Labs.

When new data is to be stored on the Filecoin network, a storage user must connect to a storage provider through the Filecoin storage market and negotiate storage terms before placing a storage order. The user must then decide which type of erasure coding (EC) to use and the replication factor. With erasure coding, data is broken into fixed-size fragments that are expanded with encoded redundant data, so that only a subset of the fragments is required to reconstruct the original file. The replication factor determines how many times the data is copied to additional storage sectors of the storage miner. Once the storage miner and the user agree on the terms, the data is transferred to the storage miner and stored in the miner's storage sector.

Figure 6: Data replication and erasure coding of data
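
To make the difference between full replication and erasure coding concrete, here is a toy Python sketch of the simplest possible erasure code: k data fragments plus one XOR parity fragment, so that any k of the k + 1 fragments are enough to rebuild the file. Real networks use Reed-Solomon-style codes with many parity fragments; the fragment count (k = 4) and function names below are illustrative assumptions, not Filecoin's actual parameters.

    def split_with_parity(data: bytes, k: int = 4) -> list:
        """Split data into k data fragments plus one XOR parity fragment (n = k + 1)."""
        if len(data) % k:
            data += b"\x00" * (k - len(data) % k)            # pad to a multiple of k
        size = len(data) // k
        frags = [data[i * size:(i + 1) * size] for i in range(k)]
        parity = bytes(size)
        for f in frags:
            parity = bytes(a ^ b for a, b in zip(parity, f))
        return frags + [parity]

    def recover_missing(frags: list) -> list:
        """Recover a single missing fragment: it equals the XOR of all remaining fragments."""
        missing = frags.index(None)
        size = len(next(f for f in frags if f is not None))
        restored = bytes(size)
        for f in frags:
            if f is not None:
                restored = bytes(a ^ b for a, b in zip(restored, f))
        frags[missing] = restored
        return frags

    original = b"hello decentralized storage!"                # 28 bytes, splits evenly into 4
    pieces = split_with_parity(original, k=4)
    pieces[2] = None                                          # one node goes offline
    repaired = recover_missing(pieces)
    assert b"".join(repaired[:4]) == original

With this (k = 4, n = 5) toy scheme the storage and bandwidth overhead is 1.25x the original file size, compared with 20x for twenty full replicas; production erasure codes generalize the same idea with more parity fragments to tolerate more simultaneous failures.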

If users want to further increase redundancy, they need to make additional storage deals with additional storage providers, as there is still a risk that one storage miner will go offline and all committed storage sectors will go offline with it. Filecoin applications such as NFT.Storage and Web3.Storage built on the Filecoin protocol solve this problem by using multiple storage miners to store files, but at the protocol level, users must manually interact with multiple storage miners.

In contrast, Crust replicates data to a fixed number of nodes: when a storage order is submitted, the data is encrypted and sent to at least 20 Crust IPFS nodes (the number of nodes can be adjusted). On each node, the data is divided into many smaller fragments, which are hashed into a Merkle tree. Each node retains all the fragments that make up the complete file. While Arweave also replicates full files, it takes a somewhat different approach. After a transaction is submitted to the Arweave network, a first node stores the data as a block on the blockweave (Arweave's blockchain-like structure). From there, an aggressive propagation algorithm called Wildfire ensures that the data is quickly replicated across the network, because in order for any node to mine the next block, it must prove that it has access to the previous block.

Figure 7: Data storage format affects retrieval and reconstruction
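
As a reference for how a Merkle tree ties a set of fragments to one verifiable root (a structure that Crust, Filecoin, Sia, and Swarm all rely on in some form), here is a minimal Python sketch; the SHA-256 hash and the 256-byte fragment size are illustrative choices, not any network's actual parameters.

    import hashlib

    def sha256(b: bytes) -> bytes:
        return hashlib.sha256(b).digest()

    def merkle_root(fragments: list) -> bytes:
        """Hash every fragment, then pair and re-hash upwards until a single root remains."""
        level = [sha256(f) for f in fragments]
        while len(level) > 1:
            if len(level) % 2:                    # duplicate the last node on odd-sized levels
                level.append(level[-1])
            level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        return level[0]

    file = b"some file stored on the network " * 100
    fragments = [file[i:i + 256] for i in range(0, len(file), 256)]
    print(merkle_root(fragments).hex())           # one 32-byte root commits to every fragment

Changing any single fragment changes the root, which is why recording only the root (on-chain or in a work report) is enough to later check that a node still holds exactly the data it promised to store.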

Sia, Storj, and Swarm use erasure coding (EC) to store files. With Crust's replication approach, 20 complete copies of the data are stored on 20 nodes; while this is very redundant and makes the data very durable, it is inefficient from a bandwidth perspective. Erasure coding provides a more efficient way to achieve redundancy, increasing the durability of data without a large bandwidth overhead.

Sia and Storj directly propagate EC shards to a specific number of nodes to meet certain durability requirements. Swarm, on the other hand, organizes nodes so that nodes close to one another form neighborhoods, and these nodes actively share data blocks (the specific fragment format used in Swarm) with each other. If data is frequently requested from the network, other nodes are incentivized to also store the popular blocks; this is called opportunistic caching. Therefore, in Swarm, the number of copies of a data fragment in the network may be far higher than the minimum considered "healthy". While this does cost bandwidth, it can be thought of as preloading future retrieval requests by reducing the distance to the requesting node.

Storage Tracking

After storing data on one or more nodes, the network needs to know where the data is stored. This way, when a user requests to retrieve their data, the network knows where to look.

Figure 8: Reviewing storage network storage tracking

Filecoin, Crust, Sia, and Arweave all use blockchains or blockchain-like structures to manage storage orders and record every storage request placed on the network. Filecoin, Crust, and Sia store storage proofs (i.e., proofs that a file has been stored by a miner) on-chain. This allows these networks to know what data is where at any point in time. With Arweave, the network incentivizes all nodes to store as much data as possible; however, nodes do not need to store every piece of data. Since Arweave stores data as blocks on its chain and nodes do not need to store all data, nodes may be missing some data, which they can retrieve later from peers. This is why Arweave's blockweave is a "blockchain-like" structure.

On Filecoin, Crust, and Sia, storage nodes each maintain a local table with details of which storage nodes store which data. This table is regularly updated through gossip between nodes. With Arweave, by contrast, nodes request content opportunistically from their peers rather than contacting specific nodes known to hold the content.

Neither Storj nor Swarm have their own layer 1 blockchain, and therefore track storage in different ways. In Storj, storage order management and file storage are divided into two different types of nodes, Satellite nodes and Storage nodes. Satellites, which can be single servers or redundant collections of servers, only track the data that users submit to them for storage, and only store it on storage nodes with which they have an agreement. Storage nodes can work with multiple Satellites and store data from multiple Satellites. This architecture means that storing files in Storj does not require consensus from the entire network, which means greater efficiency and less computing resources required to store data. However, it also means that if a Satellite goes offline, the data managed by that Satellite will be inaccessible.

In Swarm, the address of each block is derived directly from its hash during the data-to-block conversion process. Since blocks are stored in the part of the address space (i.e., the neighborhood) that matches their address, the neighborhood responsible for a file can be identified from the block hash itself. This means that there is no need for a separate mechanism to track where files are stored, as the storage location is implied by the block address.
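
A minimal sketch of this idea, assuming SHA-256 chunk addresses and treating the first few bits of the address as the "neighborhood"; this is a simplification of Swarm's Kademlia-style proximity ordering, and the prefix depth is an invented parameter.

    import hashlib

    PREFIX_BITS = 8     # illustrative neighborhood depth, not Swarm's real parameter

    def chunk_address(chunk: bytes) -> bytes:
        """A chunk's address is simply the hash of its content."""
        return hashlib.sha256(chunk).digest()

    def neighborhood(address: bytes, bits: int = PREFIX_BITS) -> int:
        """The leading bits of the address identify which neighborhood is responsible for it."""
        return int.from_bytes(address, "big") >> (len(address) * 8 - bits)

    chunk = b"a small swarm chunk would go here"
    addr = chunk_address(chunk)
    print(addr.hex(), "-> neighborhood", neighborhood(addr))

Because every node can recompute the address from the chunk itself, no global table mapping chunks to nodes is needed: a retrieval request is simply routed toward the nodes whose own addresses are closest to the chunk address.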

Proof of Storage Data, Availability over Time, and Storage Price Discovery

In addition to the network knowing where the data is stored, the network must also have a way to verify that data assigned to a specific node is indeed stored on that node. Only after this verification can the network use other mechanisms to ensure that the data remains stored over time (i.e., that the storage node does not delete the data after the initial storage operation). Such mechanisms include algorithms that prove data has been stored for a specific period, financial rewards for successfully completed storage requests, and penalties for unfulfilled ones. It should be noted here that data availability over time is not the same as permanence, although permanent storage is one form of long-term data availability. Finally, to illustrate proof of stored data and how data availability is ensured over time, this section walks through the full storage process for each protocol.

Figure 9: Proof of stored data, availability over time, and pricing mechanisms for vetted storage networks

Filecoin

On Filecoin, storage miners must deposit collateral into the network as a commitment to provide storage to the network before receiving any storage requests. Once completed, miners can offer storage and price their services on the storage market. Users who want to store data on Filecoin can set their storage requirements (e.g., required storage space, storage duration, redundancy, and replication factor) and make inquiries.

The storage market then matches clients and storage miners. Clients then send their data to miners, who store the data in a sector. The sector is then sealed, a process that converts the data into a unique copy of the data, called a replica, associated with the miner's public key. This sealing process ensures that each replica is a physically unique copy and forms the basis of the Filecoin Proof of Replication algorithm. The algorithm uses the Merkle tree root of the replica and the hash of the original data to verify the validity of the provided storage proof.

Over time, storage miners need to continuously prove that they still hold the stored data by running the algorithm periodically. However, consistency checks like this normally require a lot of bandwidth. Filecoin's novelty is that, to prove that data is stored over time while reducing bandwidth usage, miners generate proofs of replication in sequence, using the output of the previous proof as the input of the current proof. This is performed over multiple iterations that together represent the duration for which the data is to be stored (the basis of Filecoin's Proof of Spacetime).
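
The chaining idea can be sketched in a few lines: each proving period's output commits to the previous proof and a fresh challenge, so a miner cannot fabricate the whole sequence retroactively. This is only a conceptual illustration using plain hashes; Filecoin's real proofs are built from SNARKs over sealed sectors.

    import hashlib

    def prove_period(prev_proof: bytes, replica_commitment: bytes, challenge: bytes) -> bytes:
        """Stand-in for one proving period: the output depends on the previous proof."""
        return hashlib.sha256(prev_proof + replica_commitment + challenge).digest()

    replica_commitment = hashlib.sha256(b"sealed replica").digest()
    proof = hashlib.sha256(b"genesis").digest()
    transcript = []
    for period in range(10):                       # ten consecutive proving periods
        challenge = hashlib.sha256(period.to_bytes(8, "big")).digest()
        proof = prove_period(proof, replica_commitment, challenge)
        transcript.append(proof)

    # A verifier replays the chain; a missing period breaks every later proof.
    check = hashlib.sha256(b"genesis").digest()
    for period, expected in enumerate(transcript):
        challenge = hashlib.sha256(period.to_bytes(8, "big")).digest()
        check = prove_period(check, replica_commitment, challenge)
        assert check == expected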

Crust Network

In the Crust Network, nodes must also deposit collateral before they can accept storage orders on the network. The amount of storage space a node provides to the network determines the maximum amount of collateral it can stake, and this stake allows the node to participate in creating blocks on the network. This mechanism is called Guaranteed Proof of Stake (GPoS): a node's staking quota is guaranteed by the storage space it provides to the network.

Nodes and users are automatically connected through the Decentralized Storage Market (DSM), which automatically selects the nodes on which the user's data is stored. Storage prices are determined based on user requirements (such as storage duration, storage space, and replication factor) and network factors (such as congestion). When a user submits a storage order, the data is sent to multiple nodes on the network, which use the machine's Trusted Execution Environment (TEE) to split the data and hash the fragments. Since the TEE is a sealed hardware component that is not accessible even to the hardware owner, the node owner cannot reconstruct the file on their own.

Once the file is stored on the node, a work report containing the file hash is published to the Crust blockchain along with the node's remaining storage. From here, to ensure that the data is stored over time, the network periodically requests random data checks: in the TEE, the random Merkle tree hash is retrieved along with the relevant file fragment, which is decrypted and rehashed. The new hash is then compared to the expected hash. This implementation of storage proof is called Meaningful Proof of Work (MPoW).
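
A simplified spot-check loop in the spirit of the random checks described above: pick a random fragment, re-hash it, and compare the result with the hash recorded in the work report. The dictionary-based "storage" and the function names are illustrative; on Crust the check runs inside the TEE.

    import hashlib, random

    def store(file: bytes, fragment_size: int = 64):
        """Split a file into fragments and record the expected hash of each fragment."""
        fragments = {i: file[o:o + fragment_size]
                     for i, o in enumerate(range(0, len(file), fragment_size))}
        expected = {i: hashlib.sha256(f).digest() for i, f in fragments.items()}
        return fragments, expected

    def spot_check(fragments, expected) -> bool:
        """Re-hash one randomly chosen fragment and compare it with the recorded hash."""
        i = random.choice(list(expected))
        return hashlib.sha256(fragments.get(i, b"")).digest() == expected[i]

    fragments, expected = store(b"data held by a Crust storage node " * 20)
    print(all(spot_check(fragments, expected) for _ in range(100)))   # honest node: True
    del fragments[3]                                                  # node silently drops a fragment
    print(all(spot_check(fragments, expected) for _ in range(100)))   # almost certainly False now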

Sia

As is the case with Filecoin and Crust, storage nodes must deposit collateral to provide storage services. On Sia, nodes must decide how much collateral to post: collateral directly affects the storage price for users, but at the same time posting low collateral means that nodes have nothing to lose if they disappear from the network. These forces push nodes toward balanced collateral.

Users connect to storage nodes through an automated storage market, which functions similarly to Filecoin: nodes set storage prices, and users set expected prices based on target prices and expected storage durations. Users and nodes then automatically connect to each other.

After a user and a node agree on a storage contract, funds are locked into the contract and erasure coding is used to split the data into fragments, each of which is individually encrypted with its own key, and each fragment is then replicated to several different nodes. The storage contract, recorded on the Sia blockchain, records the terms of the agreement along with the Merkle tree hash of the data. From there, storage proofs are periodically submitted to the network to ensure that the data remains stored for the expected storage time. These storage proofs are built from a randomly selected portion of the original file together with hashes from the file's Merkle tree recorded on the blockchain. Nodes are rewarded for each storage proof submitted over the contract period, and receive a final payment when the contract completes.
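
A minimal sketch of this proof pattern, assuming SHA-256 and a power-of-two number of segments for brevity: the contract keeps only the Merkle root, the network later names a random segment, and the host answers with that segment plus its authentication path. This is illustrative and not Sia's exact proof format.

    import hashlib

    H = lambda b: hashlib.sha256(b).digest()

    def build_tree(leaves):
        levels = [[H(l) for l in leaves]]
        while len(levels[-1]) > 1:
            prev = levels[-1]
            levels.append([H(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
        return levels                                  # levels[-1][0] is the root

    def prove(levels, index):
        path = []
        for level in levels[:-1]:
            sibling = index ^ 1                        # the neighboring node at this level
            path.append((level[sibling], sibling < index))
            index //= 2
        return path

    def verify(root, leaf, path):
        node = H(leaf)
        for sibling, sibling_is_left in path:
            node = H(sibling + node) if sibling_is_left else H(node + sibling)
        return node == root

    segments = [f"segment-{i}".encode() for i in range(8)]   # 8 fixed-size file segments
    levels = build_tree(segments)
    root = levels[-1][0]                                     # this is what the contract records
    challenge = 5                                            # randomly chosen by the network
    assert verify(root, segments[challenge], prove(levels, challenge))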

On Sia, storage contracts can last up to 90 days. To store files for longer than 90 days, users must manually connect to the network using the Sia client software to extend the contract for another 90 days. Skynet, another layer on top of Sia, similar to Filecoin's Web3.Storage or NFT.Storage platforms, automates this process for users by having Skynet's own client software instance perform contract renewals for them. While this is a workaround, it is not a Sia protocol-level solution.

Arweave

Arweave uses a very different pricing model compared to the previous solutions, as Arweave does not allow temporary storage: on Arweave, all stored data is permanent. On Arweave, the price of storage is determined by the cost of storing data on the network for 200 years, assuming that these costs decrease by 0.5% per year. If storage costs decrease by more than 0.5% in a year, the savings are used to add additional years of storage at the end of the storage period. In Arweave's own estimates, a 0.5% annual decline in storage costs is very conservative. If the actual decline in storage costs is always greater than Arweave's assumption, then the storage duration will keep growing indefinitely, making storage effectively permanent.
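
The economics can be checked with a few lines of arithmetic. The sketch below assumes an invented starting cost per gigabyte-year and asks how long an upfront payment sized for 200 years at a 0.5% annual decline actually lasts if real costs fall faster; the numbers are made up, and only the structure mirrors Arweave's argument.

    def endowment_needed(cost_per_gb_year: float, years: int = 200, decline: float = 0.005) -> float:
        """Upfront payment covering `years` of storage if costs fall by `decline` per year."""
        return sum(cost_per_gb_year * (1 - decline) ** t for t in range(years))

    def years_covered(endowment: float, cost_per_gb_year: float, decline: float) -> int:
        """How many years the same endowment lasts under a different (actual) decline rate."""
        years, cost = 0, cost_per_gb_year
        while endowment >= cost and years < 10_000:
            endowment -= cost
            cost *= 1 - decline
            years += 1
        return years

    upfront = endowment_needed(cost_per_gb_year=0.01)         # priced for 200 years at -0.5%/yr
    print(round(upfront, 2))                                  # ~1.27 in illustrative units
    print(years_covered(upfront, 0.01, decline=0.005))        # ~200 years by construction
    print(years_covered(upfront, 0.01, decline=0.10))         # hits the 10,000-year cap: effectively permanent

If real storage costs keep falling faster than 0.5% per year, the endowment is never exhausted, which is the basis of Arweave's permanence claim.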

The price of storing files on Arweave is determined dynamically by the network, based on the previously mentioned 200-year storage cost estimate and the network's mining difficulty. Arweave is a Proof-of-Work (PoW) blockchain, which means that nodes must solve a cryptographic hash puzzle to mine the next block. If more nodes join the network, solving the hash puzzle becomes more difficult, so more computing resources are required to solve it. The difficulty-linked price adjustment reflects the cost of this additional computing power and ensures that nodes remain motivated to stay on the network and mine new blocks.

If a user accepts the price for storing a file on the network, the node accepts the data and writes it into a block. This is where Arweave’s Proof of Access algorithm comes into play. The Proof of Access algorithm works in two phases: first, the node must prove that they have access to the previous block in the blockchain, and then they must prove access to another randomly selected block, called a recall block. If the node can prove access to both blocks, they enter the PoW phase. During the PoW phase, only miners who can prove access to both blocks begin trying to solve the cryptographic hash puzzle. When a miner successfully solves the puzzle, they write the block along with the data into the blockchain. From here, for nodes on the network to be able to mine the next block, they must include the newly mined block.
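
A sketch of the recall-block selection, with a plain hash standing in for Arweave's actual derivation: the index of the recall block is a deterministic function of public chain data, so a miner cannot predict which historical block it will need and is pushed to keep as much of the history as possible.

    import hashlib

    def recall_block_index(prev_block_hash: bytes, current_height: int) -> int:
        """Deterministically pick one historical block the miner must prove access to."""
        digest = hashlib.sha256(prev_block_hash + current_height.to_bytes(8, "big")).digest()
        return int.from_bytes(digest, "big") % current_height

    stored_blocks = {0, 1, 2, 17, 99}                      # blocks this miner happens to hold
    prev_hash = hashlib.sha256(b"block 99").digest()
    recall = recall_block_index(prev_hash, current_height=100)
    can_mine = recall in stored_blocks                     # only then may the miner enter the PoW race
    print(recall, can_mine)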

Miners are rewarded with a combination of transaction fees and block rewards from the network's token emission. Beyond the transaction fee, the rest of the price paid by users is deposited into an endowment fund that is paid out over time to the miners holding the data. This endowment is only paid out when the network deems that transaction fees and block rewards are not enough to make mining operations profitable. In practice this leaves a float of tokens in the endowment fund, further extending the 200-year minimum storage period.

In Arweave’s model, there is no tracking of storage locations. Therefore, if a node does not have access to the requested data, it will ask the nodes in the peer list it maintains locally for the block data.

Storj

In the Storj decentralized storage network, there is no blockchain or blockchain-like structure. The absence of a blockchain also means that the network has no network-wide consensus on its state. Instead, tracking where data is stored is handled by Satellite nodes, and data storage is handled by Storage nodes. Satellite nodes can decide which Storage nodes to use to store data, and Storage nodes can decide which Satellite nodes to accept storage requests from.

In addition to handling data location tracking across storage nodes, Satellites are responsible for billing and payment for storage and bandwidth usage by storage nodes. In this arrangement, storage nodes set their own prices, and Satellites connect users to each other as long as they are willing to pay those prices.

When a user wants to store data on Storj, the user must select a Satellite node to connect to and share their specific storage requirements. The Satellite node then picks out storage nodes that meet the storage requirements and connects the storage node with the user. The user then transfers the file directly to the storage node while paying the Satellite. The Satellite then pays the storage node monthly for the files stored and the bandwidth used.

To ensure that storage nodes continue to store the data fragments they are supposed to store, Satellites perform regular audits of storage nodes. Satellites, which do not store any data themselves, randomly select file fragments (from before erasure coding was applied) and ask all nodes storing the corresponding erasure-coded fragments to return data for verification. When enough nodes respond, the Satellite can identify nodes that report incorrect data.
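
A toy version of such an audit: the Satellite keeps only the expected hash of each erasure-coded piece, asks nodes to return their pieces, and flags any node whose answer is missing or wrong. Names are illustrative, and real Storj audits sample small stripes of pieces rather than whole pieces.

    import hashlib

    def audit(expected_hashes: dict, responses: dict) -> list:
        """Return the ids of nodes whose returned piece does not match the expected hash."""
        failed = []
        for node_id, expected in expected_hashes.items():
            piece = responses.get(node_id)
            if piece is None or hashlib.sha256(piece).digest() != expected:
                failed.append(node_id)
        return failed

    pieces = {"node-a": b"piece 0", "node-b": b"piece 1", "node-c": b"piece 2"}
    expected = {n: hashlib.sha256(p).digest() for n, p in pieces.items()}

    responses = dict(pieces)
    responses["node-b"] = b"corrupted"        # node-b altered or lost its piece
    del responses["node-c"]                   # node-c is offline
    print(audit(expected, responses))         # ['node-b', 'node-c'] -> trigger repair and penalties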

To prevent nodes from disappearing and taking data offline, and to ensure they keep passing file-fragment audits, Storj Satellites withhold a large portion of a storage node's revenue, making it financially unattractive to leave the network early or to fail an audit. The percentage of revenue withheld decreases the longer a node stays in the network. The remaining withheld funds are only returned when a storage node that has run for at least 15 months decides it wants to leave and signals this to the network, allowing the network to move all of its data first.

Swarm

While Swarm does not have a Layer 1 blockchain for tracking storage requests, storing files on Swarm is handled through smart contracts on Ethereum. Therefore, storage orders containing some details about the files can be tracked. And since the address of each block in Swarm is derived from the block itself, the neighborhood responsible for the block can also be identified. Therefore, when data is requested, the nodes within the neighborhood communicate with each other to return the blocks requested by the user.

Through client software, Swarm lets users determine the amount of data to be stored and the duration for which it will be stored, with the corresponding payments handled through smart contracts. When data is stored on Swarm, it is split into blocks that are mapped into a Merkle tree, and the root hash of the tree is the address used to retrieve the file; this root hash therefore serves as proof that the file has been correctly chunked and stored. Blocks are stored on one node and then replicated to the other nodes in the same neighborhood. Each block in the tree is further embedded with proofs of inclusion. Nodes that want to sell long-term storage (also known as committed storage) must verify these proofs and lock up a stake, essentially a security deposit, through an Ethereum-based smart contract when making the commitment. If, during the commitment period, a node fails to prove possession of the data it has committed to store, it loses its entire security deposit.

Finally, to further ensure that data is not deleted over time, Swarm employs random selection, where nodes are rewarded for holding random data picked via Swarm's RACE selection system.

Persistent data redundancy

If data is stored on a fixed set of nodes, it can be assumed that over the long term, as nodes leave and join the network, this data will eventually disappear. To address this, the network must ensure that stored data, in whatever form, is regularly re-replicated so that a minimum level of redundancy is maintained at all times throughout the user-defined storage period.

Figure 10: Data persistence mechanisms of the reviewed storage network

In every block mined on the Filecoin network, the network checks whether the proofs required to store the data exist and whether they are valid. If a certain failure threshold is exceeded, the network considers the storage miner to be faulty, marks the storage order as failed, and reintroduces a new order for the same data on the storage market. If the data is deemed unrecoverable, it will be lost and the user will receive a refund.

Crust Network, the youngest of the networks with a mainnet launch in September 2021, does not yet have a mechanism to replenish file redundancy over time, but such a mechanism is currently under development.

On Sia, the number of erasure-coded fragments available on the network is converted into a health indicator. As nodes and erasure-coded fragments disappear over time, the health of a piece of data degrades. To keep health high, users must manually open the Sia client, which checks the health status and, if it is not 100%, replicates the missing data fragments to other nodes on the network. Sia recommends opening the Sia client at least once a month to run this data repair process, to prevent the number of available fragments from falling below the recovery threshold, after which the data would be unrecoverable and eventually disappear from the network.
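
The health calculation can be thought of as a ratio of surviving erasure-coded fragments to the redundancy originally uploaded; the formula and the 10-of-30 parameters below are illustrative assumptions rather than Sia's real values.

    def file_health(available: int, data_shards: int = 10, total_shards: int = 30) -> float:
        """1.0 = full redundancy, 0.0 = only just recoverable, negative = data is lost."""
        return (available - data_shards) / (total_shards - data_shards)

    def needs_repair(available: int, threshold: float = 0.75) -> bool:
        return file_health(available) < threshold

    for available in (30, 26, 24, 12, 9):
        print(available, round(file_health(available), 2), needs_repair(available))

The monthly repair run re-uploads missing fragments to push the health back toward 1.0 before it can drift to 0, at which point the file would no longer be recoverable.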

Storj takes a similar approach to Sia, but instead of having users take steps to ensure that there are enough erasure-coded file fragments on the network, the work is taken over by satellite nodes. Satellite nodes periodically perform data audits on the shards stored on storage nodes. If the audit returns a defective fragment, the network will rebuild the file, regenerate the missing fragments and store them back to the network.

For Arweave, consistent data redundancy is achieved through a proof-of-access algorithm that requires nodes to store old data in order to mine new data. This requirement means that nodes are incentivized to search for and retain older and "rare" blocks to increase their chances of being allowed to mine the next block and receive mining rewards.

Finally, Swarm ensures persistent redundancy through neighborhood replication as the key measure against data disappearing over time. Swarm requires each set of a node's nearest neighbors to hold a copy of that node's data chunks. As nodes leave or join the network, these neighborhoods are reorganized over time, and each node's nearest-neighbor set is updated, requiring them to resynchronize the data on their nodes. This leads to eventual data consistency. It is an ongoing process that is performed entirely off-chain.

Incentivizing data transmission

Figure 11: Mechanisms that incentivize data transmission in the reviewed storage networks

Once a user stores data on the network, the data must also be retrievable when a user, another node, or a network process requests access to the data. Once a node receives and stores data, it must be willing to send the data when requested.

Filecoin does this through a separate type of miner called retrieval miners. Retrieval miners are miners who specialize in providing data fragments and are rewarded with FIL tokens for doing so. Any user in the network can become a retrieval miner (including storage nodes), and retrieval orders are processed through the retrieval market. When a user wants to retrieve data, they place an order in the retrieval market and the retrieval node provides the service for them. Although Filecoin is built on the same underlying stack as IPFS, Filecoin does not use IPFS's Bitswap exchange protocol to transfer user data. Instead, the Bitswap protocol is used to request and receive blocks from the Filecoin blockchain.

Crust uses IPFS's Bitswap mechanism directly to retrieve data and to incentivize nodes to be willing to transfer data. In Bitswap, each node maintains a credit and debt score for the nodes it communicates with. A node that only requests data (for example, when a user submits a data retrieval request) will eventually accumulate enough debt that other nodes stop responding to its retrieval requests until it starts satisfying enough retrieval requests itself. In addition, in the Crust Network, the first four nodes that can provide storage proofs for a storage request are awarded a percentage of the storage fee by the user who initiated the order, which means that nodes benefit from being able to receive data quickly and from actively serving it.
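
A stripped-down version of this tit-for-tat accounting: each node keeps a per-peer ledger of bytes sent and received and stops serving peers whose debt has grown too large. The debt threshold and class layout are illustrative, not Bitswap's actual scoring function.

    from collections import defaultdict

    class Ledger:
        def __init__(self, max_debt_ratio: float = 2.0):
            self.sent = defaultdict(int)        # bytes we served to each peer
            self.received = defaultdict(int)    # bytes each peer served to us
            self.max_debt_ratio = max_debt_ratio

        def record_sent(self, peer: str, n: int) -> None:
            self.sent[peer] += n

        def record_received(self, peer: str, n: int) -> None:
            self.received[peer] += n

        def should_serve(self, peer: str) -> bool:
            """Serve a peer only while its debt (sent minus received) stays within bounds."""
            return self.sent[peer] <= self.max_debt_ratio * (self.received[peer] + 1)

    ledger = Ledger()
    ledger.record_sent("peer-1", 4096)
    print(ledger.should_serve("peer-1"))        # False: peer-1 has only ever taken data
    ledger.record_received("peer-1", 4096)
    print(ledger.should_serve("peer-1"))        # True again once it reciprocates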

Swarm's SWAP protocol (Swarm Accounting Protocol) works in much the same way as IPFS's Bitswap mechanism, with additional features built on top. Here too, nodes maintain a local database of other nodes' bandwidth credits and debts, creating a service-for-service relationship between nodes. However, SWAP recognizes that a node may have no data to request in the short term with which to rebalance its credit-to-debt ratio. To solve this problem, nodes can pay off their debts with cheques. A cheque is an off-chain certificate in which one node promises payment to another and which can be redeemed for BZZ tokens through a smart contract on the Ethereum blockchain.

Figure 12: Swarm Accounting Protocol (SWAP). Source: Swarm whitepaper
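
The cheque mechanism can be sketched as cumulative IOUs: a node keeps raising the cumulative amount on a single cheque per peer, and the receiving node can cash the latest cheque on-chain whenever it chooses. The placeholder "signature" below is a plain hash; Swarm's real chequebook is an Ethereum smart contract.

    import hashlib
    from dataclasses import dataclass

    @dataclass
    class Cheque:
        issuer: str
        beneficiary: str
        cumulative_amount: int      # total ever owed, so only the latest cheque matters
        signature: bytes

    def sign(issuer: str, beneficiary: str, amount: int, secret: bytes) -> bytes:
        return hashlib.sha256(secret + f"{issuer}->{beneficiary}:{amount}".encode()).digest()

    def issue_cheque(issuer: str, beneficiary: str, new_total: int, secret: bytes) -> Cheque:
        return Cheque(issuer, beneficiary, new_total, sign(issuer, beneficiary, new_total, secret))

    secret = b"issuer private key stand-in"
    c1 = issue_cheque("node-a", "node-b", 100, secret)    # owes 100 units of bandwidth
    c2 = issue_cheque("node-a", "node-b", 250, secret)    # debt grew; same cheque, higher total
    # node-b only ever needs to settle the cheque with the highest cumulative amount on-chain
    print(max((c1, c2), key=lambda c: c.cumulative_amount).cumulative_amount)   # 250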

In Sia and Storj, users pay for the bandwidth they use. In Sia, upload, download, and repair bandwidth are paid by the user, while in Storj, the bandwidth required for uploads is borne by the storage nodes. In Storj, this is to prevent nodes from deleting data immediately after receiving it. Due to this setup, nodes have no reason to avoid using bandwidth, as bandwidth is paid at the price they stipulate before accepting the storage order.

Finally, in Arweave, nodes allocate bandwidth based on how reliably peers share transactions and blocks, and how reliably they respond to requests. The node then keeps track of these factors for all the peers it interacts with, and prefers to communicate with peers that have higher scores. This increases the willingness of nodes to transmit data and share information, because receiving blocks in a slower manner means that they have less time to solve the cryptographic hash puzzles of Arweave's PoA consensus algorithm than other nodes.

Token Economics

Finally, the network must decide on a token design. While the above ensures that data is available when it should be available, token economics design ensures that the network will exist in the future. Without the network, users and hosts have no underlying data to interact with. Here we will take a closer look at the purpose of a token and the factors that influence token supply.

Note: While all of the above components impact token economics design, here we focus primarily on token utility and token emission design.

Figure 13: Token economics design decisions of reviewed storage networks

In the Filecoin network, FIL tokens are used to pay for storage orders and retrieval bandwidth. The Filecoin network has an inflationary token emission model that uses two types of minting: simple minting, which generates token emissions on a 6-year halving schedule (compared to Bitcoin's 4 years), and baseline minting, which releases tokens as the network reaches total storage capacity milestones (see Figure 14). This means that storage miners on the network are incentivized to provide as much storage as possible to the network.

There are two mechanisms that reduce the circulating supply of FIL. First, if miners fail to meet their commitments, their collateral is burned and permanently removed from the network (30.5 million FIL at the time of writing). Second, time-locked storage orders temporarily remove FIL from circulation and pay miners out over time. This means that the more storage is used, the fewer tokens are in circulation in the short term, creating deflationary pressure on the value of the token.

Figure 14: Max and Min Minting for Storage Mining and Max Baseline Minting. Source: https://filecoin.io/blog/filecoin-circulating-supply/

In Crust Network, CRU tokens are used to pay for storage orders and for staking as part of Crust Network's Guaranteed Proof of Stake (GPoS) consensus mechanism. In this model, token emission is also inflationary and is used for block rewards. However, Crust Network has no token cap: the inflation rate decreases year-on-year for 12 years, after which it remains constant at 2.8%.

In Crust, the stake locked by validators and their guarantors also serves as collateral for staking. If a validator is found to have acted maliciously or is unable to provide the required proof, their stake will be slashed and burned. Finally, collateral and time-locked storage orders temporarily remove tokens from circulation. Since miners’ network storage capacity determines their staking limit, miners are incentivized to provide more storage capacity to maximize their staking income in proportion to other miners. Staked tokens and tokens locked in time-limited storage orders create deflationary price pressure on the value of tokens.

Figure 15: Crust Network token emission. Source: Crust Economic White Paper

Sia has two tokens used in the network; one is the utility token Siacoin, and the other is a revenue-generating token called Siafunds. Siafunds were sold to the public when the network first went live and are primarily held by the Sia Foundation. Siafunds entitle the holder to a percentage of revenue for every storage order placed on the network. Siafunds have no substantial impact on Sia's token economics, so they will not be discussed here.

Siacoin has an inflationary token emission model with block rewards and no supply cap. Block rewards decreased linearly per block until block height 270,000 (approximately 5 years of operation, reached in 2020). Since then, each block has carried a fixed block reward of 30,000 SC. In 2021, the Sia Foundation hard-forked the Sia network to add a further 30,000 SC subsidy per block to fund the Sia Foundation, a non-profit entity created to support, develop, and promote the Sia network.

Figure 16: Annual growth in Siacoin supply and Foundation coin minting. Source: https://siastats.info/macroeconomics

Sia also uses a proof-of-burn mechanism, requiring storage hosts to burn 0.5-2.5% of revenue to prove that they are legitimate nodes on the network. This puts downward pressure on the token supply, although the annual burn amounts to only about 500,000 SC against annual emissions of roughly 3.14 billion SC. Finally, pledged collateral and long-term storage orders also temporarily remove tokens from Sia's circulating supply.

The native token of the Arweave network is the AR token, which pays for (theoretically) permanent storage on the Arweave network. Arweave also uses an inflationary token model, with a maximum supply cap of 66 million AR. In Arweave, the main deflationary effect is driven by Arweave's endowment, Arweave's implementation of long-term storage contracts. When a user stores files on Arweave, only a small portion of the storage fee is handed to the miner immediately; the rest is deposited into an endowment fund sized for a storage period of at least 200 years under Arweave's highly conservative assumptions. This means that every storage order locks tokens for at least 200 years and pays them out slowly over that term.

Figure 17: AR token inflation and team allocation. Source: https://medium.com/amber-group/arweave-enabling-the-permaweb-870ade28998b


Figure 18: Batch relocking timeline. Source: https://www.storj.io/blog/using-timelocked-tokens-to-support-long-term-sustainability

Finally, Swarm uses the BZZ token as a utility token to pay for storage on the network. Swarm's token economics model is built on a bonding curve, which determines the token price based on the token supply. Users can sell their tokens back to the bonding curve at any time at the current curve price. In Swarm, long-term storage orders require a stake in the form of a "commitment." Similar to the previous networks, more storage usage means fewer tokens available on the market, which creates deflationary pressure on the token price, because users who want to buy tokens must buy from the bonding curve, which raises the price as new tokens are minted.

Figure 19: The shape of the BZZ bonding curve. Source: https://medium.com/ethereum-swarm/swarm-and-its-bzzaar-bonding-curve-ac2fa9889914
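
A sketch of how a bonding curve prices such a token, using a simple linear price-versus-supply rule; the slope, base price, and function names are invented for illustration and do not reflect BZZ's actual curve.

    BASE, SLOPE = 0.1, 1e-6        # invented parameters for illustration

    def price(supply: float) -> float:
        """Spot price of the next token when `supply` tokens are already in circulation."""
        return BASE + SLOPE * supply

    def buy_cost(supply: float, amount: float) -> float:
        """Cost of minting `amount` new tokens from the curve (area under the linear price line)."""
        return price(supply) * amount + 0.5 * SLOPE * amount ** 2

    s = 1_000_000.0
    print(round(price(s), 4))                  # current spot price
    print(round(buy_cost(s, 50_000), 2))       # buying mints tokens and pushes the price up the curve
    print(round(price(s + 50_000), 4))         # new, higher spot price after the purchase

Selling back to the curve works in reverse: tokens are burned and the price moves back down, which is what lets the protocol quote a price at any time without an order book.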

Discussion

It is impossible to say that one network is objectively better than another; there are countless trade-offs to consider when designing a decentralized storage network. While Arweave is great for permanent storage, it is not necessarily suitable for migrating Web 2.0 industry players to Web 3.0, because not all data needs to be stored permanently. However, there is a strong data sub-field that does require permanence: NFTs and dApps.

If we look at the other networks, we see similar trade-offs. Filecoin incentivizes Web 2.0 storage providers to migrate their storage to Web 3.0 and is therefore a driving force behind decentralization. Filecoin's Proof of Spacetime algorithm is computationally expensive and slow to write, which means it is better suited to high-value data that does not change frequently (as reflected in their slogan about storing the most important data for humanity). However, many applications need to change their data constantly. Crust Network fills this gap by providing storage that is less computationally intensive.


If we compare the other projects to Filecoin, we see that they may support a higher level of storage decentralization in one respect but be more centralized in another. Storj, for example, relies on single Satellite nodes that each control a large cluster of storage nodes; if a Satellite goes offline, all access to the files it manages is lost. However, letting Satellites control the repair process is a big improvement over the manual repair process required by Sia. By allowing any form of payment between the user and the Satellite, Storj also offers Web 2.0 users an easier first step into decentralized storage.

If we further compare Storj's approach to decentralization with the other projects, we find that Storj's lack of system-wide consensus is a deliberate design decision to improve network performance, because the network does not need to wait for consensus before satisfying storage requests.

Swarm and Storj are the only protocols reviewed that do not have their own Layer 1 blockchain; instead they rely on ERC-20 tokens deployed on the Ethereum network. Swarm is directly integrated with Ethereum, and storage orders are managed directly through Ethereum smart contracts. This proximity and shared environment make Swarm a strong choice for Ethereum-native dApps and for storing the metadata of Ethereum-based NFTs. Storj, although also Ethereum-based, is not deeply integrated into the Ethereum ecosystem, but it can still benefit from smart contracts.

Sia and Filecoin use storage market mechanisms in which storage providers set prices and are matched with storage users willing to pay those prices for specific requirements, while in the other networks storage pricing is determined by the protocol based on network-specific factors. Storage markets give users more choice in how their files are stored and protected, while protocol-set pricing reduces complexity and provides a simpler user experience.

Conclusion

There is no single best approach to the various challenges facing decentralized storage networks. Depending on the purpose of a network and the problems it is trying to solve, trade-offs must be made in both its technical design and its token economics.

Figure 20: Summary of strong use cases for reviewed storage networks

Finally, the purpose of the network and the specific use cases it attempts to optimize will determine various design decisions.

Comparative network analysis

Below is a summary overview of the storage networks, compared with each other along a set of dimensions defined below. The dimensions reflect how the networks can be compared, but it should be noted that the approaches taken to overcome the challenges of decentralized storage are in many cases neither good nor bad; they simply reflect design decisions.

  • Storage parameter flexibility: The degree to which the user controls the file's storage parameters

  • Storage persistence: To what extent can file storage achieve theoretical persistence through the network (i.e., no intervention required)

  • Redundancy persistence: The ability of the network to maintain data redundancy through supplementation or repair

  • Data transmission incentives: The extent to which the network ensures that nodes can transmit data generously

  • The universality of storage tracking: the degree of consensus among nodes on the location of data storage

  • Guaranteed data accessibility: The network's ability to ensure that no individual participant in the storage process can cut off access to files on the network

Filecoin’s token economics supports increasing storage space across the network for storing large amounts of data in an immutable way. In addition, their storage algorithms are more suitable for data that are unlikely to change significantly over time (cold storage).


Figure 21: Filecoin summary overview

Crust’s token economics ensures ultra-redundant and fast retrieval, making it suitable for high-traffic dApps and for fast retrieval of data from popular NFTs.

Crust scores low on storage persistence because, without a redundancy-replenishment mechanism, its ability to provide permanent storage is severely limited. Still, persistence can be approximated by manually setting an extremely high replication factor.

Figure 22: Crust summary overview

Sia is about privacy. The reason users must perform repairs manually is that nodes do not know which data fragments they store or which files those fragments belong to; only the data owner can reconstruct the original data from the shards on the network.

Figure 23: Sia summary overview

Arweave, by contrast, is about permanence. This is also reflected in its endowment design, which makes storage more expensive but also makes it an attractive option for NFT storage.

Figure 24: Arweave summary overview

Storj's business model appears to have had a large influence on its billing and payment methods: monthly billing is what Amazon AWS S3 users are familiar with. By removing the complex payment and incentive systems common in blockchain-based systems, Storj Labs sacrifices some decentralization but significantly lowers the barrier to entry for its key target group of AWS users.

Figure 25: Storj summary overview

Swarm's bonding curve model ensures that storage costs remain relatively low over time as more data is stored on the network, and its proximity to the Ethereum blockchain makes it a strong contender for the primary storage of more complex Ethereum-based dApps.

Figure 26: Swarm summary overview

The next frontier

Going back to the Web3 infrastructure pillars (consensus, storage, compute), we see that decentralized storage has a few strong players who have positioned themselves in the market for specific use cases. This does not rule out new networks optimizing existing solutions or capturing new niches, but it does raise the question: what's next?

The answer is computing. The next frontier in realizing a truly decentralized internet is decentralized computing. Currently, only a few solutions bring trustless, decentralized computing to market that can power complex dApps, performing heavier computation at a much lower cost than executing smart contracts on a blockchain.

Internet Computer (ICP) and Holochain (HOLO) are networks with a strong position in the decentralized computing market at the time of writing. Nevertheless, the computing space is not as crowded as the consensus and storage spaces. Sooner or later, strong competitors will enter the market and position themselves accordingly. One such competitor is Stratos (STOS), which offers a distinctive network design through its decentralized data grid technology, combining blockchain with decentralized storage, decentralized computing, and decentralized databases.

We view decentralized computing, and in particular the network design of the Stratos network, as a field for future research.
