A Deeper Understanding of IPFS (Part 1): A Complete Guide from Beginner to Advanced

This article is the first in a series titled “In-depth Understanding of IPFS”. The series will help you understand the basic concepts of IPFS, and we will try to keep it as engaging as possible.

The series is divided into six parts:

● In-depth understanding of IPFS (1/6): Beginner to advanced guide: In this part, we will explain what IPFS is, why we need it, and what we can do with it. We will briefly introduce all the underlying components of IPFS (later parts analyze them in depth) and see how they work together. If you want a brief summary without digging into what is "under the hood", this part is for you.

● In-depth understanding of IPFS (2/6): What is InterPlanetary Linked Data (IPLD)?: In this part, we will take a deeper look at the data model of the content-addressable web. We will explore the details and specifications of IPLD, gradually become familiar with it, and use it.

● In-depth understanding of IPFS (3/6): What is the InterPlanetary Naming System (IPNS)?: In this part, we will take a deep look at the naming system of the distributed web. We will study its specifications and how it works. We will also compare it with today's naming system (i.e. DNS) and list the advantages and disadvantages of IPNS vs. DNS.

● In-depth understanding of IPFS (4/6): What is Multiformats?: In this part, we will discuss why we need Multiformats, how it works, and what you as a user or developer can do with it.

● In-depth understanding of IPFS (5/6): What is Libp2p?: In this part, we will study the network layer of IPFS and what it contributes to IPFS. We will explain how it works, its specifications, and how to use it, so that it is easier to understand.

● In-depth understanding of IPFS (6/6): What is Filecoin?: In this part, we will discuss the incentive layer of IPFS, Filecoin. We will study the white paper and implementation specifications of Filecoin, including the DSN (Distributed Storage Network), Proof of Replication, Proof of Storage, the Data Storage Market and Retrieval Market, and the implementation of smart contracts based on the Filecoin protocol. We will also discuss some flaws in the Filecoin protocol that are not mentioned in the white paper and propose some improvements to the protocol.

Hopefully, you will learn a lot about IPFS from this series. Let’s get started!

When you ask someone about the latest “Avengers” movie, they probably won’t say something like “on this server, at this subdomain, then at this file path, slash ‘Marvel’ slash ‘Avengers’ dot mp4.” Instead, they’ll describe the content of the video: “Half of the universe is destroyed by Thanos…” This is the most intuitive way for humans to think about content, but it’s not how we access content on the web today. Distributed protocols such as IPFS, by contrast, use content-based addressing (using the content of a file to label and find it) to locate content stored on a distributed network. In this article, we’ll explore how IPFS works as a whole, which components are involved, and how they work together. To do this, we’ll add a file to IPFS and then look at what happens under the hood.

Let’s start by adding a photo to IPFS. We will use this one:

https://unsplash.com/photos/rW-I87aPY5Y

By the way, you must have IPFS installed on your system to follow along. You can install it from here. After installing IPFS, you must start the IPFS daemon (the software that communicates with the IPFS network in order to add and retrieve data). You can start the daemon with ipfs daemon.

When you add a photo to IPFS, the following happens:

In the terminal I got this:

You can see the final hash here:

But we don't see anything related to the 2 steps in between (Raw and Digest). It all happens "under the hood".

When we add an image, it is first read as raw data that the computer can understand. Because IPFS is content-addressed (as discussed above), we need a method that converts this image data into a label that uniquely identifies its content.

This is where hash functions come into play.

A hash function takes data (a text, a photo, the entire Bible, and so on) as input and produces a fixed-size output (the digest) that is, for all practical purposes, unique to that input. If we change a single pixel in this image, the output will be different. This makes tampering evident, which is what makes IPFS a self-certifying file system: if you send this photo to someone else, he or she can easily check whether the received photo has been tampered with.

Furthermore, you cannot work out the input (in this case, a picture of a cat) from its output (the digest). The digest on its own therefore does not reveal the content.
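The tamper-evidence described above can be tried out with a few lines of Python. This is only a sketch: the bytes below are a made-up stand-in for the photo, and the helper name digest is ours, not something IPFS provides.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical stand-in for the raw bytes of a photo (not the real image).
original = bytes(range(256)) * 4

# Flip a single bit in the first byte to simulate changing one pixel.
tampered = bytes([original[0] ^ 0x01]) + original[1:]

print(digest(original))
print(digest(tampered))

# The same input always gives the same digest...
print(digest(original) == digest(original))
# ...while a one-bit change gives a different one.
print(digest(original) == digest(tampered))
```

Anyone who receives the bytes together with the expected digest can rerun the hash and compare, which is the check the recipient of the photo would perform.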

Now, we pass the raw data into the SHA-256 hash function and get a digest. Next, we need to convert this digest into a CID (Content Identifier), which is what IPFS searches for when we try to retrieve the image. For this, IPFS uses a format called Multihash.

To understand the importance of Multihash, consider this situation.

Suppose you store an image on the internet and you have its CID, which you can give to anyone who wants the image. But what if, in the future, SHA-256 is found to be broken (meaning the process is no longer tamper-evident and secure) and you want to switch to SHA-3? That would mean changing the entire process of converting photos to CIDs, and the previous CIDs would be useless...

This may seem like a minor issue, but billions of dollars depend on these hash functions. Banks, national security agencies, and many others use them to operate securely. Without them, even the green lock you see next to a site address in your browser wouldn't work.

To solve this problem, IPFS uses Multihash. A multihash records which hash function produced the digest, so we can have multiple versions of a CID depending on the hash function used. We will discuss Multihash in detail in Part 4 of this series, where we delve into Multiformats.
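To make the idea concrete, here is a minimal sketch of the multihash layout: one byte naming the hash function, one byte giving the digest length, then the digest itself. The function names are ours. One loud assumption: we hash the input bytes directly, whereas ipfs add hashes an encoded block, so the strings printed here will not match the CIDs the CLI reports for the same file.

```python
import hashlib

BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

# Multihash function codes: 0x12 is sha2-256, 0x16 is sha3-256.
HASH_FUNCTIONS = {
    "sha2-256": (0x12, hashlib.sha256),
    "sha3-256": (0x16, hashlib.sha3_256),
}


def multihash(data: bytes, function: str = "sha2-256") -> bytes:
    """Prefix the digest with the function code and the digest length."""
    code, hasher = HASH_FUNCTIONS[function]
    raw_digest = hasher(data).digest()
    return bytes([code, len(raw_digest)]) + raw_digest


def base58btc_encode(data: bytes) -> str:
    """Encode bytes with the Bitcoin base58 alphabet."""
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = BASE58_ALPHABET[remainder] + encoded
    # Each leading zero byte is written as the first alphabet character.
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


def cid_v0_like(data: bytes) -> str:
    """Base58-encode a sha2-256 multihash, the shape of a 'Qm...' identifier."""
    return base58btc_encode(multihash(data, "sha2-256"))


sample = b"hello ipfs"  # hypothetical input
print(cid_v0_like(sample))
print(multihash(sample, "sha2-256").hex())
print(multihash(sample, "sha3-256").hex())
```

Because the first two bytes of a sha2-256 multihash are always 0x12 0x20, the base58 form always begins with "Qm", which is why every hash in this article starts that way. Switching to another hash function changes the prefix, so a reader of the identifier can tell which function to use when verifying.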

Now we have added the photo to IPFS, but that’s not all. Here’s what’s actually happening:

Large files are chunked and hashed into IPLD (Merkle DAG)

If a file is larger than 256 KiB (262,144 bytes), IPFS by default breaks it into smaller pieces so that every piece holds at most 256 KiB. We can list the blocks that make up the photo with this command:

ipfs object get Qmd286K6pohQcTKYqnS1YhWrCiS4gz7Xi34sdwMe9USZ7u

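This gives us 15 blocks, each holding at most 256 KiB of file data. (The reported Size of a full block, 262158 bytes, is 262144 bytes of data plus 14 bytes that IPFS wraps around it.) Each of these blocks is first converted to a digest and then to a CID.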

{
  "Links": [
    {
      "Name": "",
      "Hash": "QmZ5RgT3jJhRNMEgLSEsez9uz1oDnNeAysLLxRco8jz5Be",
      "Size": 262158
    },
    {
      "Name": "",
      "Hash": "QmUZvm5TertyZagJfoaw5E5DRvH6Ssu4Wsdfw69NHaNRTc",
      "Size": 262158
    },
    {
      "Name": "",
      "Hash": "QmTA3tDxTZn5DGaDshGTu9XHon3kcRt17dgyoomwbJkxvJ",
      "Size": 262158
    },
    {
      "Name": "",
      "Hash": "QmXRkS2AtimY2gujGJDXjSSkpt2Xmgog6FjtmEzt2PwcsA",
      "Size": 262158
    },
    {
      "Name": "",
      "Hash": "QmVuqvYLEo76hJVE9c5h9KP2MbQuTxSFyntV22qdz6F1Dr",
      "Size": 262158
    },
    {
      "Name": "",
      "Hash": "QmbsEhRqFwKAUoc6ivZyPa1vGUxFKBT4ciH79gVszPcFEG",
      "Size": 262158
    },
    {
      "Name": "",
      "Hash": "QmegS44oDgNU2hnD3j8r1WH8xZ2RWfe3Z5eb6aJRHXwJsw",
      "Size": 262158
    },
    {
      "Name": "",
      "Hash": "QmbC1ZyGUoxZrmTTjgmiB3KSRRXJFkhpnyKYkiVC6PUMzf",
      "Size": 262158
    },
    {
      "Name": "",
      "Hash": "QmZvpEyzP7C8BABesRvpYWPec2HGuzgnTg4VSPiTpQWGpy",
      "Size": 262158
    },
    {
      "Name": "",
      "Hash": "QmZhzU2QJF4rUpRSWZxjutWz22CpFELmcNXkGAB1GVb26H",
      "Size": 262158
    },
    {
      "Name": "",
      "Hash": "QmZeXvgS1NTxtVv9AeHMpA9oGCRrnVTa9bSCSDgAt52iyT",
      "Size": 262158
    },
    {
      "Name": "",
      "Hash": "QmPy1wpe1mACVrXRBtyxriT2T5AffZ1SUkE7xxnAHo4Dvs",
      "Size": 262158
    },
    {
      "Name": "",
      "Hash": "QmcHbhgwAVddCyFVigt2DLSg8FGaQ1GLqkyy5M3U5DvTc6",
      "Size": 262158
    },
    {
      "Name": "",
      "Hash": "QmNsx32qEiEcHRL1TFcy2bPvwqjHZGp62mbcVa9FUpY9Z5",
      "Size": 262158
    },
    {
      "Name": "",
      "Hash": "QmVx2NfXEvHaS8uaRTYaF4ExeLaCSGpTSDhhYBEAembdbk",
      "Size": 69716
    }
  ],
  "Data": "\b\u0002\u0018Ơ�\u0001 ��\u0010 ��\u0010 ��\u0010 ��\u0010 ��\u0010 ��\u0010 ��\u0010 ��\u0010 ��\u0010 ��\u0010 ��\u0010 ��\u0010 ��\u0010 ��\u0010 Ơ\u0004"
}

IPFS uses IPLD (which is built on a Merkle DAG, a directed acyclic graph whose links are hashes) to manage all the blocks and link them to the root CID.

An IPLD object consists of 2 components:

● Data - A blob (binary large object) of unstructured binary data less than 256 kB in size.

● Links - An array of link structures that point to other IPFS objects.

Each IPLD link (in our case, the 15 links shown above) has 3 parts:

● Name - The name of the link

● Hash - The hash value of the linked IPFS object

● Size - The cumulative size of the linked IPFS object, including everything it links to in turn
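The chunk-and-link structure described above can be sketched as follows. This is a simplified model under stated assumptions, not how IPFS actually encodes blocks: the identifiers are plain hex SHA-256 digests, the Link and Node classes and the build_dag helper are names we made up, and the content is synthetic.

```python
import hashlib
from dataclasses import dataclass, field

CHUNK_SIZE = 256 * 1024  # 262144 bytes, the chunk size mentioned in the text


@dataclass
class Link:
    name: str
    hash: str
    size: int


@dataclass
class Node:
    data: bytes = b""
    links: list = field(default_factory=list)


def toy_id(payload: bytes) -> str:
    """Toy identifier: the hex SHA-256 digest of the payload."""
    return hashlib.sha256(payload).hexdigest()


def build_dag(content: bytes):
    """Split content into chunks and link them all from one root node."""
    blocks = {}
    root = Node()
    for offset in range(0, len(content), CHUNK_SIZE):
        chunk = content[offset:offset + CHUNK_SIZE]
        leaf_id = toy_id(chunk)
        blocks[leaf_id] = Node(data=chunk)
        root.links.append(Link(name="", hash=leaf_id, size=len(chunk)))
    # The root's identifier depends on every child hash, in order.
    root_id = toy_id("".join(link.hash for link in root.links).encode())
    blocks[root_id] = root
    return root_id, blocks


# Synthetic content: 14 full chunks plus a shorter final one.
content = b"".join(bytes([i]) * CHUNK_SIZE for i in range(14)) + b"x" * 1000
root_id, blocks = build_dag(content)
root = blocks[root_id]

print(len(root.links))
print(root.links[-1].size)

# Following the links in order reassembles the original content.
rebuilt = b"".join(blocks[link.hash].data for link in root.links)
print(rebuilt == content)
```

Because the root identifier is derived from the child hashes, changing any chunk changes its hash and therefore the root as well, which is the property that makes the structure a Merkle DAG.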

IPLD is built on top of linked data, something that people in the distributed web community have been talking about for a long time. It’s also something that Tim Berners-Lee has been working on for years, and his Solid project, together with Inrupt, the company he co-founded around it, is building on the idea.

There are other benefits to using IPLD. To explain this, let's create a folder called photos and add 2 photos to it (a photo of a cat and a copy of the same photo).

As you can see, both photos have the same hash (which proves that I did not change anything in my copy of the image). This gives IPFS a deduplication property: even if your friend adds the same cat photo to IPFS, it maps to the same CID, so it is not treated as a new, separate file. This saves a lot of storage space.

Imagine I store this article on IPFS and chunk it so finely that each character becomes its own block with its own CID. The article is just a combination of letters (uppercase and lowercase), numbers, and some special characters, so we would store each distinct letter, number, and character only once and reassemble the text by following the links in the data structure.
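Deduplication falls out of content addressing almost for free, as this small sketch suggests. ContentStore is a hypothetical in-memory class of our own, keyed by hex SHA-256 digests rather than real CIDs, and the photo bytes are made up.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: the key is derived from the data itself."""

    def __init__(self):
        self._blocks = {}

    def add(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        # Identical data maps to the same key, so it is kept only once.
        self._blocks.setdefault(key, data)
        return key

    def __len__(self) -> int:
        return len(self._blocks)


store = ContentStore()
cat_photo = b"pretend these are the bytes of the cat photo"  # made-up data
copy_of_cat_photo = bytes(cat_photo)

first_key = store.add(cat_photo)
second_key = store.add(copy_of_cat_photo)

print(first_key == second_key)
print(len(store))
```

Adding the copy returns the same key and leaves the store with a single entry, which mirrors the two photos in the folder sharing one hash.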

IPFS also has a naming system called the InterPlanetary Naming System (IPNS). To understand its importance, let’s assume that you create a website and host it on a domain name. For this example, we will use my website: https://vaibhavsaini.com/

If I want to host it on IPFS, I simply add the website folder to IPFS. For this, I have downloaded the website using wget. If you are using a Linux-based operating system such as Ubuntu, or macOS, you can follow along with me.

Download this website (or any website):

wget --mirror --convert-links --adjust-extension --page-requisites --no-parent https://vaibhavsaini.com

Now add the folder named vaibhavsaini.com to IPFS:

ipfs add -r vaibhavsaini.com

You will get something like this:

We can see that our website is now hosted at the last CID (i.e. the folder’s CID):

QmYVd8qstdXtTd1quwv4nJen6XprykxQRLo67Jy7WyiLMB

We can access the site over HTTP through a gateway:

https://gateway.pinata.cloud/ipfs/QmYVd8qstdXtTd1quwv4nJen6XprykxQRLo67Jy7WyiLMB/

Let's say I want to change my profile picture on the website. As we've learned above, if we change the input, we'll get a different digest, which means my final "CID" will be different.

This means that every time I update my site I have to update the hash, and people who have a link to my previous site (the URL above) won't be able to see my new site.

This will cause serious problems.

To solve this kind of problem, IPFS uses the InterPlanetary Naming System (IPNS). An IPNS link is a fixed name that points to a CID. If I want to update my website, I just point the same IPNS link at the new CID (this is similar to today's DNS). We will explore IPNS in depth in Part 3 of this series.

But for now, let's generate an IPNS link for my website.

ipfs name publish QmYVd8qstdXtTd1quwv4nJen6XprykxQRLo67Jy7WyiLMB

This may take several minutes, and at the end you will get output like this:

Published to Qmb1VVr5xjpXHCTcVm3KF3i88GLFXSetjcxL7PQJRviXSy: /ipfs/QmYVd8qstdXtTd1quwv4nJen6XprykxQRLo67Jy7WyiLMB

Now, if I want to publish a new CID, I will use the same command:

ipfs name publish <my_new_CID>

Using this you can access an updated version of my website using the following link: https://gateway.pinata.cloud/ipns/Qmb1VVr5xjpXHCTcVm3KF3i88GLFXSetjcxL7PQJRviXSy

But the link address above is still not human-readable. We are used to names like https://vaibhavsaini.com. In Part 3 of this series, we will see how to link IPNS to a domain name so that you can see my IPFS-hosted website at https://vaibhavsaini.com.

IPFS is also a potential replacement for the HTTP protocol. But why would we want to replace HTTP? It seems to work fine, right? I mean, you can still read this article and watch a movie on Netflix, all using the HTTP protocol.
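The indirection that IPNS adds can be pictured as a small lookup table from a fixed name to the current CID. The sketch below is a toy model only: ToyNameSystem is a hypothetical class of ours, and it leaves out the parts that make real IPNS trustworthy, such as records being signed with the publisher's key. The second CID is a made-up placeholder.

```python
class ToyNameSystem:
    """Toy model of a mutable name that points to an immutable CID."""

    def __init__(self):
        self._records = {}

    def publish(self, name: str, cid: str) -> int:
        """Point name at cid and return the record's sequence number."""
        sequence = self._records.get(name, (0, None))[0] + 1
        self._records[name] = (sequence, cid)
        return sequence

    def resolve(self, name: str) -> str:
        """Return the path that name currently points to."""
        return "/ipfs/" + self._records[name][1]


names = ToyNameSystem()
site_name = "Qmb1VVr5xjpXHCTcVm3KF3i88GLFXSetjcxL7PQJRviXSy"  # name from the text

names.publish(site_name, "QmYVd8qstdXtTd1quwv4nJen6XprykxQRLo67Jy7WyiLMB")
print(names.resolve(site_name))

# After editing the site we publish again under the same name.
names.publish(site_name, "QmPlaceholderForTheUpdatedSite")  # not a real CID
print(names.resolve(site_name))
```

The name that visitors hold never changes, while the CID behind it does, so old links keep leading to the latest version of the site.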

Even though it seems to work well for us, it has some big problems.

Imagine you are sitting in a large lecture hall and your professor asks everyone to visit a specific website. Each student in the room makes a request to that website and receives a response. This means that the same data is sent individually to each student in the room. If there are 100 students, there are 100 requests and 100 responses. This is obviously not the most efficient way to do things. Ideally, students would be able to fetch the data from each other and retrieve what they need more efficiently.

HTTP also runs into big problems when something goes wrong on the network path and the client can't connect to the server. This can happen if there is an ISP outage, if a country blocks certain content, or if the content is simply deleted or moved. Broken links of this kind exist almost everywhere on the HTTP web.

The location-based addressing model of HTTP encourages centralization. It’s convenient to trust a handful of applications with all of our data, but it’s also what leaves a lot of data on the web locked in silos. It gives these providers enormous responsibility and power over our information.

This is where Libp2p comes into play. Libp2p is used to move data across the IPFS network and to discover other nodes (computers or smartphones). If every computer and smartphone runs the IPFS software, we all become part of one large BitTorrent-like network in which every system can act as both client and server. So if 100 students need the same website, they can request its data from each other. Such a system, if deployed on a large scale, could significantly speed up how we get content from the Internet.
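The lecture-hall scenario can be simulated in a few lines. This is a toy model, not Libp2p: the Peer class, the origin dictionary standing in for the web server, and the page bytes are all assumptions of ours, and identifiers are hex SHA-256 digests rather than real CIDs. The point it illustrates is that data obtained from an untrusted neighbour can still be checked against the identifier that was asked for.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Toy identifier: the hex SHA-256 digest of the data."""
    return hashlib.sha256(data).hexdigest()


class Peer:
    def __init__(self, name: str):
        self.name = name
        self.blocks = {}

    def fetch(self, cid: str, peers: list, origin: dict) -> str:
        """Get cid from local storage, a peer, or the origin; return the source."""
        if cid in self.blocks:
            return "local"
        for peer in peers:
            data = peer.blocks.get(cid)
            # Accept a peer's data only if it hashes to the requested id.
            if data is not None and content_id(data) == cid:
                self.blocks[cid] = data
                return peer.name
        self.blocks[cid] = origin[cid]
        return "origin"


page = b"<html>lecture notes</html>"  # made-up page content
page_cid = content_id(page)
origin = {page_cid: page}

students = [Peer(f"student-{i}") for i in range(100)]
sources = [student.fetch(page_cid, students, origin) for student in students]

print(sources.count("origin"))
print(len(sources) - sources.count("origin"))
```

In this model only the first student reaches the origin; the other 99 are served by classmates, which is the saving the paragraph above describes.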

Well, that's it for now. If you made it this far, you deserve a pat on the back. Well done!

We’ve learned a lot about IPFS so far. Let’s recap:

● IPFS is based on content addressing. Data on IPFS is identified by its CID.

● A CID is unique to the data it references.

● IPFS uses hash functions to make tampering evident, which makes IPFS a self-certifying file system.

● IPFS uses Multihash, which allows different versions of a CID for the same data, depending on the hash function used (this does not mean that a CID is not unique: with the same hash function, the same data always gives the same CID). We will discuss this more in Part 4 of this series.

● IPFS uses IPLD to manage and link all data blocks.

● IPLD uses the Merkle DAG data structure (a directed acyclic graph whose links are hashes) to link data blocks.

● IPLD also gives IPFS deduplication.

● IPFS uses IPNS to point a fixed IPNS link at a CID. This is similar to DNS on today's centralized Internet.

● IPFS uses Libp2p to move data across the IPFS network and to discover other nodes (computers and smartphones), which could significantly speed up how you get content from the Internet.

Below is a diagrammatic representation of the IPFS stack:

Original author: vasa

Original link: https://hackernoon.com/understanding-ipfs-in-depth-1-5-a-beginner-to-advanced-guide-e937675a8c8a

Translation: Star Continent Overseas Team
