1
Why do we need IPFS?
Fundamentally, IPFS has a simple yet bold goal: to re-architect the entire Internet by replacing HTTP.
Web addresses usually begin with http:// or https://. The goal of IPFS is to completely replace the HTTP protocol as the foundation of the web.
Our current version of the web is sometimes called Web 2.0, and IPFS is often presented as part of its natural progression, Web 3.0.
Why re-architect the entire network?
In fact, you may already be affected by an HTTP flaw and not even know it.
The current Internet cannot support the coming wave of innovation and users, as future development of the Internet will require more bandwidth.
For example, people increasingly expect higher-quality video at 8K, 16K, or even higher resolutions, and streaming it already requires a fast connection.
As demands on the network grow, costs get higher.
Companies like Facebook, Google, and others spend billions of dollars to support their web architecture and make their content available to you. Of course, the cost is also passed on to users in various ways, and many applications that would otherwise be useful to humanity cannot be realized due to this barrier.
However, large files are not the only problem affecting networks. The proliferation of devices connected to the Internet is another important cause.
It is no longer just computers and mobile phones that connect to the Internet. Almost any machine powered by electricity, such as a home appliance, can now be connected. As more and more devices come online, the infrastructure needed to serve them grows as well.
The number of devices connected to the Internet will continue to increase in the future, and the new concept of the Internet of Things is beginning to enter people's lives. The Internet of Things aims to connect everything around you, whether it is a car, a door in your house, a light, an electric meter, or something else.
The rapid development of the Internet of Things will be a challenge to the existing Internet.
We need an alternative that is far more scalable, efficient, and fast. That's where IPFS comes in.
Have you ever asked yourself how you know that the content you receive is the content you asked for? What if someone tampered with an image along the way?
It may sound trivial at first glance, but as technology advances, forgery becomes more and more common.
However, there are more immediate problems than this. How do I know that the website I’m connecting to is the right one and not a phishing attack by a malicious hacker? Currently, we have a system of certificate authorities in place to prevent this from happening, but they require trusting a third party.
Take Facebook as an example. When a bug brings down the central server, the page fails to load and shows an error instead. The fundamental problem is that the current Internet is based on the client-server model.
That is, the client (browser) requests data from a central server (such as Facebook's server), and the server then provides the data.
This works great when nothing goes wrong. However, when there is a sudden influx of users, a natural disaster, a hacker attack, or even a simple bug in the code, it is very easy for the server to crash. As a result, no one can access their content.
The recent pandemic showed that Internet-based tracking can reveal anyone's movements at any time. If that capability is abused, it leads to serious privacy leaks.
Many such leaks have been exposed in recent years.
People now socialize, bank, play games, and work online. As long as you are connected to the Internet, you have very little privacy.
Internet censorship can mean two things:
1. Content may be censored by large companies or even the government.
This is equivalent to letting others decide what content is allowed to be published.
2. An individual's ability to use the Internet may be restricted.
Imagine a content creator whose views differ radically from those of YouTube, Medium, or Twitter. That creator could lose their livelihood.
Imagine being deleted from LinkedIn and no longer being able to network, or not being able to use email. As the Internet becomes more ubiquitous, the consequences of being disconnected from it become more significant.
IPFS is a solution that brings together various innovations and will solve many of humanity’s technological problems.
In that sense, it’s similar to Bitcoin, except instead of revolutionizing finance, it’s revolutionizing the internet, and therefore our lives.
2
The security of IPFS: No need to trust anyone
IPFS eliminates the need to trust a third party because all IPFS data is self-authenticating. Without a trusted third party, how can users be sure that the data they receive is trustworthy?
The secret behind this self-authentication is the hash function.
A hash function takes input data and outputs a short, fixed-size fingerprint called a hash value. The same input always produces the same hash.
These functions are engineered so that it is extremely difficult to find two different inputs that produce the same hash, which makes it practically impossible to falsify the original data without being noticed.
A hash is to data what a fingerprint is to a person: it identifies the data precisely and cannot be forged. The original data also cannot be reconstructed from the hash, so the hash serves as a unique identifier without revealing the data itself.
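The fingerprint idea can be tried out in a few lines of Python. This is a minimal sketch that assumes SHA-256 as the hash function; the helper name fingerprint and the sample strings are made up for illustration and are not part of IPFS.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


original = b"The quick brown fox"
same_again = b"The quick brown fox"
tampered = b"The quick brown fax"  # a single letter changed

# The same input always gives the same fingerprint.
print(fingerprint(original) == fingerprint(same_again))

# Changing one letter gives a completely different fingerprint.
print(fingerprint(original) == fingerprint(tampered))

# The fingerprint has a fixed size no matter how large the input is.
print(len(fingerprint(original)), len(fingerprint(original * 1000)))
```

The first comparison is true and the second is false, and both fingerprints are 64 hex characters long even though one input is a thousand times larger than the other.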
How does it relate to IPFS?
In HTTP, when a user opens a web page, the browser fetches data from the location of the web server. That location can be spoofed by an attacker. Someone could intercept the request and, instead of returning the blog the user asked for, return a phishing site to steal the user's password.
But with IPFS, instead of typing in an HTTP URL, users will request a hash that looks like this: QmTkzDwWqPbnAh5YiV5VwcTLnGdwSNsNTn2aDxdXBFca7D.
Assuming a hacker intercepts a request for QmTkzDwWqPbnAh5YiV5VwcTLnGdwSNsNTn2aDxdXBFca7D and attempts to deliver a malicious phishing site, the user could run the received data through a hash function, compare the hash of the received data to the hash of the request, and then reject the received data if the hashes don't match.
This check effectively defeats that kind of attack.
The approach has another benefit: data integrity is always maintained.
If a user requests a legal document, not a single letter of that document will be different. If a user downloads a program, not a single 1 or 0 will be out of place. If a user requests a picture, every pixel will be in exactly the same place, which is a valuable property in the age of deepfakes, when it is very difficult to determine the authenticity of a picture.
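The check described above (hash the received data, compare it with the hash that was requested, reject on mismatch) might look like the sketch below. It makes one loud simplification: the identifier is a plain SHA-256 hex digest, whereas a real IPFS identifier such as the Qm... string above is encoded differently and is computed over IPFS's own block format. The names content_id and verify are hypothetical.

```python
import hashlib


def content_id(data: bytes) -> str:
    # Simplified stand-in for an IPFS identifier (assumption: plain SHA-256 hex).
    return hashlib.sha256(data).hexdigest()


def verify(requested_id: str, received: bytes) -> bool:
    """Accept the received data only if it hashes to the identifier we asked for."""
    return content_id(received) == requested_id


blog_post = b"<html>my blog post</html>"
phishing_page = b"<html>please enter your password</html>"

# The user asks the network for the blog post by its identifier.
requested = content_id(blog_post)

print(verify(requested, blog_post))      # genuine data passes the check
print(verify(requested, phishing_page))  # substituted data is rejected
```

Because the identifier is derived from the content, the user does not need to trust whoever delivered the bytes. Anything that does not hash to the requested identifier is simply thrown away.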
Fetching data from a particular place, as HTTP does, is called LOCATION addressing. Fetching data by its hash, as IPFS does, is called CONTENT addressing (because the identifier is a hash of the content).
Since users query data by a hash of its content rather than by its location, how do we know where to find it? Where is the data? On which server exactly?
The answer is that the data can be anywhere. IPFS is a peer-to-peer network that anyone can participate in. You can think of it as similar to BitTorrent, a protocol commonly associated with distributing pirated movies and songs.
Since anyone can distribute data, a user can fetch it from peers who are geographically nearby and serve it to them in turn. Fetching data from right next door is far more efficient than fetching it from a distant server.
Suppose there is a room with 100 HTTP users and 100 IPFS users, and they all want to access the same content. How will their experiences differ?
1. HTTP users
These 100 HTTP users will send requests to the location of that URL. Each of these requests will go through the internet, bounce through a bunch of routers until it finally reaches the server, from which the requested data is then sent, which then bounces back through a bunch of routers again, and finally reaches the user.
2. IPFS users
What do requests look like from an IPFS user's perspective?
100 IPFS users request a data hash from the IPFS network. What if someone in the room has the file? Why bother going through a router and reaching a possibly remote server? A user in geographical proximity can share it with another user, and that user can share it with another user and another user.
In this case, content addressing is clearly more efficient than location addressing!
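The room of 100 users can be turned into a toy simulation. This is only a sketch under strong assumptions: the Origin class, http_room, and p2p_room are invented for illustration, the "room" is a single shared dictionary, and real IPFS peer discovery is far more involved. It simply counts how many requests have to travel all the way to the remote server.

```python
import hashlib


class Origin:
    """A remote server that counts how many times it is asked for the data."""

    def __init__(self, data: bytes):
        self.data = data
        self.requests = 0

    def fetch(self) -> bytes:
        self.requests += 1
        return self.data


def http_room(origin: Origin, users: int) -> int:
    # Location addressing: every user goes to the server.
    for _ in range(users):
        origin.fetch()
    return origin.requests


def p2p_room(origin: Origin, users: int) -> int:
    # Content addressing: ask the room first, go remote only if nobody has it.
    wanted = hashlib.sha256(origin.data).hexdigest()
    room_copies = {}  # hash -> data already held by someone in the room
    for _ in range(users):
        data = room_copies.get(wanted)
        if data is None:
            data = origin.fetch()
        # Keep (and re-share) the data only if it matches the requested hash.
        if hashlib.sha256(data).hexdigest() == wanted:
            room_copies[wanted] = data
    return origin.requests


http_requests = http_room(Origin(b"a popular video"), 100)
p2p_requests = p2p_room(Origin(b"a popular video"), 100)
print(http_requests)  # 100 trips to the remote server
print(p2p_requests)   # 1 trip; the other 99 users are served inside the room
```

In this simplified model, the hash check is what makes it safe to accept the file from a stranger in the room rather than from the original server.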
3
The future is decentralized networks
As the saying goes, don’t put all your eggs in one basket. Sadly, this is exactly how the modern internet is organized. All the eggs and data are stored in these giant baskets (servers, which clients must connect to).
This arrangement makes the system fragile, because a problem with the server means that the client cannot access anything at all. It also means that if there is a sudden influx of egg-hungry connoisseurs, the basket's throughput will not be enough to feed everyone. We can imagine that there are many people waiting to be fed, and everyone must wait for the person in front to pick their own eggs.
Therefore, the secret of IPFS is not to put all your eggs in one basket.
IPFS is a distributed network, so it falls into the same category as other peer-to-peer protocols such as BitTorrent.
Because the network is not dependent on a single server, computers can come online and go offline and the network will still function.
For example, suppose you want to open a web page but the server is down. You cannot connect, because the connection depends on that one server being available.
If the web page were built on IPFS, private data could be protected with an encryption scheme: users would hold the keys to their own private data and could freely access other users' public data.
In that case, a single failure cannot stop users from reaching the site and accessing their data. If one peer goes offline, another peer can still have the data.
This kind of resilience comes from the peer-to-peer design and is out of reach for a single-server setup.
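The effect of peers going offline can be sketched with a tiny in-memory model. Everything here is hypothetical (the Peer class, the fetch helper, the peer names), and it ignores how real IPFS nodes find each other. It only shows that data stays reachable as long as at least one online peer holds it.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Peer:
    name: str
    online: bool = True
    blocks: Dict[str, bytes] = field(default_factory=dict)  # hash -> data

    def add(self, data: bytes) -> str:
        """Store the data under its SHA-256 hash and return that hash."""
        key = hashlib.sha256(data).hexdigest()
        self.blocks[key] = data
        return key


def fetch(peers: List[Peer], key: str) -> Optional[bytes]:
    """Return the data from the first online peer that has it, else None."""
    for peer in peers:
        if peer.online and key in peer.blocks:
            return peer.blocks[key]
    return None


alice, bob = Peer("alice"), Peer("bob")
key = alice.add(b"public page")
bob.add(b"public page")  # same content, so it is stored under the same hash

alice.online = False
print(fetch([alice, bob], key))  # still available from bob

bob.online = False
print(fetch([alice, bob], key))  # None: nobody online has the data
```

The second lookup is a reminder of the limit of this property: content survives only while somebody who holds it is online.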
Likewise, if a large number of users want to access the same file in our current version of the web, it could cause a massive surge in demand that could overwhelm the servers. But in IPFS, the file can be shared peer-to-peer. Once a peer has a file, it can share it with another peer.
The file remains accessible even when demand is high.
This is just like BitTorrent, where popular files are easier to obtain than unpopular ones because more peers hold and share their data.
Speaking of BitTorrent, that brings us to the next point: censorship.
Since there is no central server to shut down, there is no single entity for an authority to attack. If one peer is stopped, another can take its place, and attacking every peer at once is not feasible.
Of course, this means that IPFS can become a haven for illegal activity. Some ideas have been proposed to curb the negative effects of distributed file networks (such as blacklists). But whether these solutions will be effective remains to be seen.
For example, if someone blacklisted the hashes of illegal files, what would stop someone from simply changing the pixels and therefore the hashes?
An infinite number of illegal hashes could be generated, making blacklisting impractical.
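The blacklist problem is easy to reproduce. The sketch below assumes a blacklist that is just a set of SHA-256 hashes and an "image" that is a short run of raw bytes; both are stand-ins invented for illustration. Flipping a single bit of one pixel is enough to slip past the list.

```python
import hashlib

# Pretend image: four RGB pixels stored as raw bytes (illustrative only).
image = bytes([10, 20, 30] * 4)

# A hypothetical blacklist keyed by content hash.
blacklist = {hashlib.sha256(image).hexdigest()}


def is_blocked(data: bytes) -> bool:
    return hashlib.sha256(data).hexdigest() in blacklist


# Nudge one color channel of one pixel by flipping its lowest bit.
altered = bytearray(image)
altered[0] ^= 1
altered = bytes(altered)

print(is_blocked(image))    # the original file is caught
print(is_blocked(altered))  # the visually identical copy is not
```

The same property that makes hashes good at detecting tampering makes exact-hash blacklists easy to evade.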
Nonetheless, IPFS brings an important benefit: censorship resistance.
Since files cannot easily be deleted, will false information drown out everything else? Will lies obscure the truth?
My personal belief is that things are moving in a favorable direction.
Some innovations may be used in conjunction with IPFS to verify the authenticity of a piece of data. What if we save important IPFS hashes to a blockchain system like Ethereum?
This would mean that the hash would also be associated with a true and unchangeable timestamp. We could associate a file with a verifiable time that is uncensorable.
There is a lot of fake news on the Internet now. Individuals and companies alike modify pictures and videos to distort reality.
With so many conflicting images and videos, it is becoming increasingly difficult to determine the truth. But what if we had a timestamp on the file of the original picture or video?
We could then show that any modified version of the file appeared after that timestamp. This allows for more verifiable facts.
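The timestamp idea can be sketched with an in-memory stand-in for the blockchain. To be clear about the assumptions: TimestampRegistry is a made-up class, not an Ethereum API, and the timestamps are passed in by hand so the example is repeatable, whereas on a real chain the time would come from the block that records the hash. The one rule it models is that the first registration of a hash can never be overwritten.

```python
import hashlib
from typing import Dict, Optional


class TimestampRegistry:
    """Toy append-only ledger: the first timestamp recorded for a hash wins."""

    def __init__(self) -> None:
        self._first_seen: Dict[str, int] = {}

    def register(self, data: bytes, timestamp: int) -> int:
        key = hashlib.sha256(data).hexdigest()
        # setdefault keeps the existing entry, so later calls cannot backdate it.
        return self._first_seen.setdefault(key, timestamp)

    def first_seen(self, data: bytes) -> Optional[int]:
        return self._first_seen.get(hashlib.sha256(data).hexdigest())


registry = TimestampRegistry()
original = b"original photo bytes"
edited = b"edited photo bytes"

registry.register(original, 1_700_000_000)  # original recorded first
registry.register(edited, 1_700_086_400)    # edited copy shows up a day later
registry.register(original, 1_600_000_000)  # attempt to rewrite history is ignored

print(registry.first_seen(original) < registry.first_seen(edited))
```

With a record like this, the edited file can be shown to have been registered after the original, which is the kind of verifiable fact described above.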