What does an IPFS file look like? How do we build it?

At its core, IPFS is a distributed system for storing and accessing files, websites, applications, and data. It is transport-layer agnostic, which means it can communicate over a variety of transports, including Transmission Control Protocol (TCP), uTP, UDT, QUIC, Tor, and even Bluetooth.

Compared with HTTP, IPFS can transfer content faster because it locates files by their hash. Once you have a hash, you ask the network "who has this content (hash)?", then connect to the corresponding nodes and download it. This forms a peer-to-peer overlay network, which makes routing fast, wide-reaching, and ready to use.

IPFS Node

IPFS is essentially a P2P system for retrieving and sharing IPFS objects. An IPFS object (called an IPFS node in this article) is a data structure with two fields:

Data: a blob of unstructured binary data, less than 256 kB in size
Links: an array of link structures, each pointing to another IPFS node

The link structure has three data fields:

Name: The name of the link
Hash: The hash of the IPFS object being linked
Size: The cumulative size of the linked IPFS node, including everything reached by following its links
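The two structures above can be sketched in a few lines of Python. This is only an illustration of the fields just listed. The class names `Link` and `IPFSObject`, the `cumulative_size` helper, and the placeholder hash string are assumptions made for this sketch, not part of any IPFS library.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Link:
    name: str  # the name of the link
    hash: str  # the hash of the IPFS object being linked
    size: int  # cumulative size of the linked object, following its links


@dataclass
class IPFSObject:
    data: bytes = b""                                # unstructured binary data (< 256 kB)
    links: List[Link] = field(default_factory=list)  # links to other IPFS objects

    def cumulative_size(self) -> int:
        # Simplification: own data plus the sizes recorded on the links.
        # A real implementation would also count serialization overhead.
        return len(self.data) + sum(link.size for link in self.links)


# A leaf object with data and no links, and a parent that links to it.
leaf = IPFSObject(data=b"Hello World!\n")
parent = IPFSObject(
    links=[Link(name="hello.txt", hash="<hash of leaf>", size=leaf.cumulative_size())]
)
```

Because each link records the cumulative size of its target, the total size of a whole graph can be read from the root object without fetching every object below it.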

IPFS nodes are usually referenced by their Base58-encoded hash. For example, let's use the IPFS command-line tool to view an IPFS object with the hash QmarHSr9aSNaPSR6G9KFPbuLV9aEqJfTk1y9B8pdwqK4Rq:

You may notice that all hashes start with "Qm". This is because the hash is actually a multihash: the first two bytes of the multihash specify the hash function and the length of the hash. In the example above, the first two bytes in hexadecimal are 1220, where 12 indicates the SHA256 hash function and 20 indicates the length of the hash in bytes (hexadecimal 20, that is, 32 bytes).
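To make the multihash layout concrete, here is a minimal sketch that decodes a Base58 string by hand and splits off the two prefix bytes. The helper names (`b58decode`, `b58encode`, `parse_multihash`, `make_multihash`) are hypothetical and written only for this example. Note that `make_multihash` hashes raw bytes directly, so it does not reproduce the hash that `ipfs add` would give a file, since IPFS hashes the serialized object rather than the bare contents.

```python
import hashlib

# The Base58 alphabet used by Bitcoin and IPFS (no 0, O, I or l).
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58decode(text: str) -> bytes:
    num = 0
    for ch in text:
        num = num * 58 + ALPHABET.index(ch)
    body = num.to_bytes((num.bit_length() + 7) // 8, "big")
    pad = len(text) - len(text.lstrip("1"))  # each leading '1' is a zero byte
    return b"\x00" * pad + body


def b58encode(raw: bytes) -> str:
    num = int.from_bytes(raw, "big")
    out = ""
    while num > 0:
        num, rem = divmod(num, 58)
        out = ALPHABET[rem] + out
    pad = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * pad + out


def parse_multihash(b58: str):
    """Split a Base58 multihash into (function code, digest length, digest)."""
    raw = b58decode(b58)
    return raw[0], raw[1], raw[2:]


def make_multihash(data: bytes) -> str:
    """Prefix a SHA-256 digest with 0x12 (SHA-256) and 0x20 (32 bytes)."""
    return b58encode(bytes([0x12, 0x20]) + hashlib.sha256(data).digest())


code, length, digest = parse_multihash("QmarHSr9aSNaPSR6G9KFPbuLV9aEqJfTk1y9B8pdwqK4Rq")
print(hex(code), hex(length), len(digest))
print(make_multihash(b"Hello World!\n")[:2])
```

Any 34-byte value that begins with the bytes 12 20 encodes to a 46-character Base58 string starting with "Qm", which is why these hashes all share that prefix.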

The data and named links give the collection of IPFS objects the structure of a Merkle DAG. DAG means directed acyclic graph, and Merkle signifies that this is a cryptographically authenticated data structure that uses cryptographic hashes to address content.

To visualize the graph structure, we draw each IPFS object as a graph node containing its Data, and each of its Links as a graph edge pointing to another IPFS object, with the name of the Link as the label on the edge.

We will now give examples of the various data structures that can be represented by IPFS objects.

File System

IPFS can easily represent a file system consisting of files and directories. We can break down the representation of files through the following examples.

Small Files

A small file (<256 kB) is represented by an IPFS object with the data being the file contents (plus a small header and footer) and no links, i.e. the links array is empty. Note that the file name is not part of the IPFS object, so two files with different names and the same contents will have the same IPFS object representation and therefore the same hash. We can add a small file to IPFS using the ipfs add command:

We can use ipfs cat to view the file contents of the above IPFS object:

Using the ipfs object command to view the underlying structure yields:

We visualize the file as follows:

Large Files

Large files (> 256 kB) are represented by a list of links pointing to file chunks of < 256 kB each, with only minimal Data specifying that this object represents a large file. The links to the file chunks have the empty string as their name.
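The chunking scheme can be sketched as follows. This is a simplified model, not how IPFS is actually implemented: the `put` and `add_file` helpers, the in-memory `store` dictionary, the JSON encoding, and the `b"large-file"` marker are all assumptions made for illustration, and the hash is a plain hex SHA-256 rather than a multihash.

```python
import hashlib
import json

CHUNK_SIZE = 256 * 1024  # chunk size in bytes, following the 256 kB limit above


def put(store: dict, data: bytes, links: list) -> str:
    """Store an object under the SHA-256 of a JSON encoding (simplified)."""
    encoded = json.dumps({"data": data.hex(), "links": links}, sort_keys=True).encode()
    key = hashlib.sha256(encoded).hexdigest()
    store[key] = {"data": data, "links": links}
    return key


def add_file(store: dict, content: bytes) -> str:
    """Add a file; return the hash of the object that represents it."""
    if len(content) <= CHUNK_SIZE:
        # Small file: a single object holding the contents, with no links.
        return put(store, content, [])
    links = []
    for offset in range(0, len(content), CHUNK_SIZE):
        chunk = content[offset:offset + CHUNK_SIZE]
        # Links to file chunks use the empty string as their name.
        links.append({"name": "", "hash": put(store, chunk, []), "size": len(chunk)})
    # Large file: minimal data marking the object, plus the list of chunk links.
    return put(store, b"large-file", links)


store = {}
root = add_file(store, b"x" * (2 * CHUNK_SIZE + 10))
print(len(store[root]["links"]), "chunks")
```

In this example the first two chunks are byte-for-byte identical, so they hash to the same key and are stored only once.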

Directory Structure

A directory is represented by a list of links pointing to IPFS objects that represent files or other directories. The names of the links are the names of the files and directories. For example, consider the following directory structure for the directory test_dir:

The files hello.txt and my_file.txt both contain the string Hello World!\n. The file testing.txt contains the string Testing 123\n.

When this directory structure is represented as an IPFS object, it looks like this:

Note that the file containing Hello World!\n is automatically deduplicated: the data in that file is stored in only one logical location in IPFS (addressed by its hash).
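A small sketch shows why deduplication falls out of content addressing, and how a path can be resolved by following link names. The helpers `put`, `add_dir`, and `resolve`, the JSON-based hashing, and the `b"directory"` marker are hypothetical simplifications. The layout used below (hello.txt at the top level, the other two files inside my_dir) is an assumed arrangement chosen for the example.

```python
import hashlib
import json


def put(store: dict, data: bytes, links: list) -> str:
    """Store an object under the SHA-256 of a JSON encoding (simplified)."""
    encoded = json.dumps({"data": data.hex(), "links": links}, sort_keys=True).encode()
    key = hashlib.sha256(encoded).hexdigest()
    store[key] = {"data": data, "links": links}
    return key


def add_dir(store: dict, entries: dict) -> str:
    """Add a directory given as {name: bytes (file) or dict (subdirectory)}."""
    links = []
    for name, child in sorted(entries.items()):
        if isinstance(child, dict):
            child_hash = add_dir(store, child)
        else:
            child_hash = put(store, child, [])
        # The link name is the file or directory name.
        links.append({"name": name, "hash": child_hash})
    return put(store, b"directory", links)


def resolve(store: dict, root: str, path: str) -> dict:
    """Follow link names from the root object, one path segment at a time."""
    current = root
    for part in path.split("/"):
        current = next(l["hash"] for l in store[current]["links"] if l["name"] == part)
    return store[current]


store = {}
test_dir = {
    "hello.txt": b"Hello World!\n",
    "my_dir": {"my_file.txt": b"Hello World!\n", "testing.txt": b"Testing 123\n"},
}
root = add_dir(store, test_dir)
print(len(store), "objects stored")
print(resolve(store, root, "my_dir/my_file.txt")["data"])
```

Three files and two directories produce only four stored objects here, because hello.txt and my_file.txt have identical contents and therefore share a single object.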

The IPFS command-line tool can seamlessly follow directory link names to traverse the file system:

Versioned file system

IPFS can represent the data structures that Git uses for versioned file systems. The Git commit object is described in the Git Book. The main properties of a commit object are that it has one or more links, named parent0, parent1, etc., pointing to previous commits, and one link named object (called a tree in Git) that points to the file system structure that the commit refers to.

As an example, take the file system directory structure from before together with two commits: the first commit is the original structure, and in the second commit we have updated the file my_file.txt to say Another World! instead of the original Hello World!.

Note that deduplication is automatic here as well, so the only new objects in the second commit are the main directory, the new directory my_dir, and the updated file my_file.txt.
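The commit structure can be sketched on top of a simple content-addressed store. As before, this is an illustrative model under stated assumptions: `put`, `add_dir`, and `commit` are hypothetical helpers, the hashing is a plain SHA-256 over JSON, the markers `b"directory"` and `b"commit"` are invented, and the directory layout is an assumed arrangement.

```python
import hashlib
import json


def put(store: dict, data: bytes, links: list) -> str:
    """Store an object under the SHA-256 of a JSON encoding (simplified)."""
    encoded = json.dumps({"data": data.hex(), "links": links}, sort_keys=True).encode()
    key = hashlib.sha256(encoded).hexdigest()
    store[key] = {"data": data, "links": links}
    return key


def add_dir(store: dict, entries: dict) -> str:
    """Add a directory given as {name: bytes (file) or dict (subdirectory)}."""
    links = []
    for name, child in sorted(entries.items()):
        is_dir = isinstance(child, dict)
        child_hash = add_dir(store, child) if is_dir else put(store, child, [])
        links.append({"name": name, "hash": child_hash})
    return put(store, b"directory", links)


def commit(store: dict, tree_hash: str, parents: list) -> str:
    """Create a commit object linking to its parent commits and its tree."""
    links = [{"name": f"parent{i}", "hash": p} for i, p in enumerate(parents)]
    links.append({"name": "object", "hash": tree_hash})
    return put(store, b"commit", links)


store = {}
v1 = {
    "hello.txt": b"Hello World!\n",
    "my_dir": {"my_file.txt": b"Hello World!\n", "testing.txt": b"Testing 123\n"},
}
first = commit(store, add_dir(store, v1), parents=[])

# Second version: only my_file.txt changes.
v2 = {
    "hello.txt": b"Hello World!\n",
    "my_dir": {"my_file.txt": b"Another World!\n", "testing.txt": b"Testing 123\n"},
}
before = set(store)
tree2 = add_dir(store, v2)
new_tree_objects = set(store) - before
second = commit(store, tree2, parents=[first])
print(len(new_tree_objects), "new tree objects in the second commit")
```

Only the changed file and the directories on the path to it get new hashes. hello.txt and testing.txt are reused from the first commit unchanged.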

Blockchain

Blockchains have a natural DAG structure because past blocks are always linked to, by their hash, from later blocks. More advanced blockchains such as the Ethereum blockchain also have an associated state database with a Merkle-Patricia tree structure, which can also be emulated using IPFS objects.

We assume a simple blockchain model where each block contains the following data:

List of transaction objects
Link to previous block
Hash of the state trie/database

This blockchain can then be modeled in IPFS as follows:

Here we see the deduplication we gain by putting the state database on IPFS: between two blocks, only the state entries that have changed need to be explicitly stored, rather than the entire state (which would significantly increase the data burden).

An interesting point here is the difference between storing data on the blockchain and storing a hash of the data on the blockchain. On the Ethereum platform, storing data in the associated state database carries a high fee, which is intended to minimize bloat of the state database. A common design pattern for larger data is therefore to store the IPFS hash of the data in the state database rather than the data itself.
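The simple block model can be sketched in the same style. This is a deliberately reduced illustration: the state is a flat list of named entries rather than a Merkle-Patricia tree, transactions are kept in the block's data as JSON instead of as separate objects, and the names `put`, `put_state`, `put_block`, the link names `state` and `previous`, and the account values are all assumptions made for this example.

```python
import hashlib
import json
from typing import Optional


def put(store: dict, data: bytes, links: list) -> str:
    """Store an object under the SHA-256 of a JSON encoding (simplified)."""
    encoded = json.dumps({"data": data.hex(), "links": links}, sort_keys=True).encode()
    key = hashlib.sha256(encoded).hexdigest()
    store[key] = {"data": data, "links": links}
    return key


def put_state(store: dict, state: dict) -> str:
    """Store each state entry as its own object, plus a root that links to them."""
    links = [{"name": k, "hash": put(store, v, [])} for k, v in sorted(state.items())]
    return put(store, b"state", links)


def put_block(store: dict, transactions: list, prev_hash: Optional[str], state_hash: str) -> str:
    """Store a block holding its transactions, with links to state and previous block."""
    links = [{"name": "state", "hash": state_hash}]
    if prev_hash is not None:
        links.append({"name": "previous", "hash": prev_hash})
    return put(store, json.dumps(transactions).encode(), links)


store = {}
state1 = put_state(store, {"alice": b"10", "bob": b"5", "carol": b"0"})
block1 = put_block(store, ["genesis"], None, state1)

# A transfer from alice to carol leaves bob's entry untouched.
before = set(store)
state2 = put_state(store, {"alice": b"7", "bob": b"5", "carol": b"3"})
new_state_objects = set(store) - before
block2 = put_block(store, ["alice->carol:3"], block1, state2)
print(len(new_state_objects), "new state objects for the second block")
```

The second state adds only the two changed entries and a new state root. The unchanged entry is shared between both blocks.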

Typically, blockchains distinguish between what’s in the global ledger that every miner replicates (that is, the data stored in the chain itself) and data that may be referenced in the chain but is not replicated among all nodes.

If a blockchain with an associated state database is already represented in IPFS, then the distinction between storing a hash on the blockchain and storing data on the blockchain becomes blurred, since everything is stored in IPFS anyway and the only hash a block needs is the hash of the state database. In this case, if someone stores an IPFS link in the blockchain, we can seamlessly follow that link to access the data as if the data were stored in the blockchain itself.

However, we can still distinguish between on-chain and off-chain data storage by looking at what miners need to process when creating a new block. In the current Ethereum network, miners need to process transactions that will update the state database. To do this, they need access to the full state database to be able to update it wherever it is changed.

Therefore, in a blockchain state database represented in IPFS, we still need to mark data as "on-chain" or "off-chain". "On-chain" data is the data miners must hold locally in order to mine, and it is directly affected by transactions. "Off-chain" data has to be updated by users and never needs to be touched by miners.
