IPFS: The InterPlanetary FileSystem which can save the web

by mark · Published 2 August 2017 · Updated 26 July 2017

The IPFS project aims to revolutionize the Internet.

IPFS, the InterPlanetary FileSystem, is a decentralized, peer-to-peer file system that aims to connect all devices under a single shared filesystem, combining a distributed model (like BitTorrent or Bitcoin) with a versioning system (like Git). Caught your interest? You’re not alone.

Is the Internet… broken?

Before we start diving into IPFS and its objectives, let’s take a step back to understand how the current Internet is structured. We are pretty comfortable with the client-server model that presumes there is one server that serves many clients, a so-called centralized system.

HTTP (HyperText Transfer Protocol), the foundation of today’s World Wide Web, is based on the client-server model, so much of the Web works this way. Everything seems to be working well, or is it? To tell the truth, it is, but only on the surface.

Under the hood, the Web is ever-expanding, demanding more and more throughput, bandwidth and resources. Each day, more and more individuals access the Internet and the Web, creating a large amount of traffic.

Mobile devices and growing Internet coverage in developing countries are pushing the Internet and the Web to their limits. According to the Cisco Visual Networking Index, annual Internet traffic was expected to reach about 1 zettabyte by the end of 2016. By the end of 2020, that figure is currently estimated to reach 2.3 zettabytes.

All of this is possible thanks to HTTP and the client-server model, but the system as it stands is frail and has a few downsides.

HTTP downsides

HTTP is inefficient: a file is normally downloaded in its entirety from a single server. Connection issues, server downtime and server errors can interrupt the download and slow down page loads. On top of that, transferring data is costly: each hop a packet passes through costs money.
HTTP is forgetful: how many times have you seen a 404 error? Chances are, even if you’re not a technical person, you know exactly what it means: not found. HTTP relies on the server to serve content; if that content was removed or the server is down, you won’t get it. If a resource is gone, it is gone. Think of a removed YouTube video: the link still works, but you can’t view the video. Not much of a problem? What if Google or YouTube went down? Of course, this is unlikely to happen any time soon, but it might happen.
HTTP encourages centralized authority: since content is stored on specific servers, taking those servers down, whether through censorship, DDoS attacks or any other means, will slow down page loads or even make the site unavailable. Although this can be partially mitigated with CDNs, that solution once again relies on other servers that might go down at any time.
HTTP is highly dependent on the Internet backbone: the backbone is the physical foundation of the Internet, a collection of high-speed cables, networking equipment, routers and routes. The backbone is rock-solid, but it isn’t immune to natural disasters or ship anchors. And remember, when the server is down, the content isn’t reachable (unless a CDN is employed).

As you can see, HTTP is far from perfect, and its capabilities are tied to design principles from more than twenty years ago. Although a partial solution has taken shape in HTTP/2, we’re far from resolving all the problems associated with the current Web. A complete rethink is needed, and IPFS may be the solution the Web needs.

What is IPFS

IPFS is a peer-to-peer distributed file system that seeks to connect all computing devices with the same system of files. In some ways, IPFS is similar to the Web, but IPFS could be seen as a single BitTorrent swarm, exchanging objects within one Git repository.
In other words, IPFS provides a high throughput content-addressed block storage model, with content-addressed hyperlinks. This forms a generalized Merkle DAG, a data structure upon which one can build versioned file systems, blockchains, and even a Permanent Web.
IPFS combines a distributed hashtable, an incentivized block exchange, and a self-certifying namespace. IPFS has no single point of failure, and nodes do not need to trust each other.
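
To make the Merkle DAG idea a little more concrete, here is a minimal Python sketch. It assumes SHA-256 and a JSON encoding for nodes, which is not what IPFS actually uses on the wire; the names `store` and `put` are hypothetical and exist only for this illustration. The point is simply that a node’s address depends on its own data and on the addresses of everything it links to.

```python
import hashlib
import json

# Hypothetical in-memory block store: address (hex string) -> serialized node.
store = {}

def put(data: bytes, links=()) -> str:
    """Store a node made of raw data plus links (addresses of child nodes).

    The address is the SHA-256 of the serialized node, so it changes
    whenever the data or any linked address changes.
    """
    node = json.dumps({"data": data.hex(), "links": list(links)},
                      sort_keys=True).encode()
    address = hashlib.sha256(node).hexdigest()
    store[address] = node
    return address

# Two leaf nodes and a directory-like parent that links to both.
readme = put(b"hello ipfs")
photo = put(b"fake image bytes")
root_v1 = put(b"", links=[readme, photo])

# Changing one leaf yields a new leaf address and therefore a new root,
# while the untouched leaf keeps its address and is shared by both roots.
readme_v2 = put(b"hello ipfs!")
root_v2 = put(b"", links=[readme_v2, photo])
```

Both `root_v1` and `root_v2` stay in the store side by side, which is roughly how a versioned file system can be layered on top of such a structure.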

How IPFS handles things

The idea behind IPFS is actually pretty simple:

Each file is identified by its content, using a cryptographic hash. For all practical purposes the hash represents only that file: changing even a single bit results in a completely different hash.
If a file is too big (larger than 256 KB), it is automatically split into chunks, and each chunk gets its own hash.
Files are added (made available) to an IPFS network using namespaces and are publicly available in that network.
Each IPFS node can host content, and each node chooses which content it is interested in.
Nodes use a DHT (Distributed Hash Table) to locate hashes (content) and other nodes.
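
The first two points can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it uses plain hex SHA-256 digests rather than real IPFS identifiers, fixed-size chunks of 256 KiB, and the helper names `content_id` and `chunk_ids` are made up for the example.

```python
import hashlib

CHUNK_SIZE = 256 * 1024  # 256 KiB, matching the chunk size mentioned above

def content_id(data: bytes) -> str:
    """Illustrative identifier: hex SHA-256 of the bytes (not a real IPFS hash)."""
    return hashlib.sha256(data).hexdigest()

def chunk_ids(data: bytes, size: int = CHUNK_SIZE) -> list:
    """Split data into fixed-size chunks and return one identifier per chunk."""
    return [content_id(data[i:i + size]) for i in range(0, len(data), size)]

original = b"a" * (600 * 1024)   # 600 KiB -> three chunks (256 + 256 + 88 KiB)
modified = b"b" + original[1:]   # differs from the original in the first byte only

ids_original = chunk_ids(original)
ids_modified = chunk_ids(modified)
```

Only the first chunk identifier differs between `ids_original` and `ids_modified`; the remaining chunks keep their identifiers, so a node that already holds them would not need to fetch or store them again.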

HTTP and IPFS comparison

IPFS identifies content using hashes instead of locations. HTTP uses URLs (Uniform Resource Locators) to address content. This means that in HTTP, for example, www.example.com/dog-picture might show a dog picture today. Tomorrow it might host a cat, or even a hermit crab the day after. In IPFS a hash like QmYwAPJzv5CZsnA625s3Xf2nemtYgPpHdWEz79ojWnPbdG will always represent the same content. That hash represents the IPFS quickstart folder; you’re free to visit it even if you don’t have an IPFS node or client running.

Using hashes decouples content from the servers that host it (as opposed to locations, which tie content to one server). Multiple servers can store the same content under the same hash, so even if one node (server) goes down, another node holding the same hash can be reached to provide the content.

By using hashes, much like Git, IPFS ensures the uniqueness (identity) of files. Distributed Hash Tables let nodes communicate with each other and efficiently find where content lives in the network, much like BitTorrent does.
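
As a rough illustration of how a DHT can decide which nodes are responsible for a given hash, here is a toy Python sketch. It assumes a Kademlia-style XOR distance between keys, and it cheats by giving one function a global view of every node, which a real DHT never has. The node names and the helpers `key_of` and `closest_nodes` are hypothetical.

```python
import hashlib

def key_of(value: bytes) -> int:
    """Map arbitrary bytes to a 256-bit integer key using SHA-256."""
    return int.from_bytes(hashlib.sha256(value).digest(), "big")

def closest_nodes(content_key: int, node_keys: dict, k: int = 2) -> list:
    """Return the names of the k nodes whose keys are XOR-closest to content_key."""
    ranked = sorted(node_keys, key=lambda name: node_keys[name] ^ content_key)
    return ranked[:k]

# Hypothetical node names; in a real network node keys are not chosen like this.
nodes = {name: key_of(name.encode()) for name in ["alice", "bob", "carol", "dave"]}

content = b"a picture of a dog"
responsible = closest_nodes(key_of(content), nodes)
```

Because content and nodes share the same key space, anyone who knows the content hash can work out which nodes to ask about it, without a central index.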

Can IPFS replace HTTP?

At its core, IPFS can work alongside HTTP or even replace it. As a matter of fact, here’s what Kyle Drake, founder of Neocities (a social network of websites), said about IPFS:

The message I want to send couldn’t possibly be more audacious: I strongly believe IPFS is the replacement to HTTP (and many other things), and now’s the time to start trying it out. Replacing HTTP sounds crazy. It is crazy! But HTTP is broken, and the craziest thing we could possibly do is continue to use it forever. We need to apply state-of-the-art computer science to the distribution problem, and design a better protocol for the web.

Using IPFS instead of HTTP can reduce or even resolve many of HTTP’s downsides:

Content may be fetched from a node closer to the client. This can improve the performance of the request and potentially reduce the number of hops a packet needs to reach its destination.
Files are retrieved in parallel from multiple nodes, as opposed to HTTP, which retrieves a file from a single server. (HTTP pipelining can partially address this, but you still won’t be able to download from multiple sources.)
Content may continue to exist on other nodes even after an authority takedown. This is much like how BitTorrent currently works: as long as there is a seed, there’s hope. It helps keep old and abandoned sites up even after linked resources or servers are removed.
Using DDoS to take down a node simply results in another node being asked for that content. Unless the content is hosted on only one node, a DDoS attack would have to take down each and every node holding the content in order to take down the content itself.
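
The takedown and DDoS points above can be sketched as a small failover loop. This is a simplified model, not how an IPFS client is actually implemented: the `Node` class, the `fetch` function and the hex SHA-256 identifiers are all assumptions made for the example. It shows that a client can skip unreachable nodes and, because the identifier is a hash of the content, reject replies that do not match what was asked for.

```python
import hashlib

def content_id(data: bytes) -> str:
    """Illustrative identifier: hex SHA-256 of the bytes."""
    return hashlib.sha256(data).hexdigest()

class Node:
    """Hypothetical peer holding blocks keyed by content identifier."""
    def __init__(self, name, online=True):
        self.name = name
        self.online = online
        self.blocks = {}

    def get(self, cid):
        if not self.online:
            raise ConnectionError(self.name + " is unreachable")
        return self.blocks[cid]  # raises KeyError if this node lacks the block

def fetch(cid, nodes):
    """Try each node in turn; accept the first reply whose hash matches cid."""
    for node in nodes:
        try:
            data = node.get(cid)
        except (ConnectionError, KeyError):
            continue  # node is down or does not hold the content
        if content_id(data) == cid:
            return data, node.name
    raise LookupError("no reachable node holds valid content for " + cid)

page = b"<h1>my old site</h1>"
cid = content_id(page)

node_a, node_b, node_c = Node("a", online=False), Node("b"), Node("c")
node_a.blocks[cid] = page          # holds the content but has been taken down
node_b.blocks[cid] = b"tampered"   # answers with bytes that do not match the hash
node_c.blocks[cid] = page          # holds a valid copy

data, served_by = fetch(cid, [node_a, node_b, node_c])
```

Here the request ends up being served by node `c`. The content becomes unavailable only once every node holding a valid copy is unreachable.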

IPFS caching

On top of all the advantages explained above, much of the static content served by IPFS nodes can be cached locally and automatically. Here’s an example:

Your brother in the other room watches a video. The video took a long time to load, so he had to wait a while to watch it fully. He then comes to your room to tell you how awesome that video was and suggests you watch it right now. You fire up your IPFS browser and the video is fully loaded in a few seconds, so you don’t have to wait.

This is possible since your brother’s machine cached it locally. When your browser asked for that content, the closest source would be right in the adjacent room. This is especially useful when there are pay-as-you-go rates or metered connections. The potential saving in traffic and bandwidth is pretty high.
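
The scenario can be modelled with a short Python sketch. It is deliberately naive: the `CachingPeer` class, its `neighbours` list and the `origin` dictionary standing in for far-away nodes are all hypothetical, and how real IPFS nodes discover nearby peers and decide what to keep is more involved.

```python
import hashlib

def content_id(data: bytes) -> str:
    """Illustrative identifier: hex SHA-256 of the bytes."""
    return hashlib.sha256(data).hexdigest()

class CachingPeer:
    """Hypothetical peer that keeps a local copy of everything it fetches."""
    def __init__(self, name, neighbours=()):
        self.name = name
        self.cache = {}
        self.neighbours = list(neighbours)
        self.remote_fetches = 0  # how often we had to go to the far-away origin

    def get(self, cid, origin):
        """Return content, preferring the local cache, then neighbours, then origin."""
        if cid in self.cache:
            return self.cache[cid]
        for peer in self.neighbours:
            if cid in peer.cache:
                self.cache[cid] = peer.cache[cid]
                return self.cache[cid]
        self.remote_fetches += 1
        self.cache[cid] = origin[cid]
        return self.cache[cid]

video = b"pretend these are many megabytes of video"
cid = content_id(video)
origin = {cid: video}  # stands in for distant nodes on the wider network

brother = CachingPeer("brother")
you = CachingPeer("you", neighbours=[brother])

brother.get(cid, origin)            # slow path: fetched from far away, then cached
your_copy = you.get(cid, origin)    # fast path: served from the adjacent room
```

After both calls, `brother.remote_fetches` is 1 and `you.remote_fetches` is 0, which is where the saving on a metered connection would come from.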

IPFS, dynamic content and IPNS

Up until now we have talked about static content, but the Web is more dynamic than ever. How does IPFS deal with this? The answer is IPNS. The InterPlanetary Naming System is a Public Key Infrastructure-based system that allows anyone to create and publish dynamic content. If you’ve ever used Bitcoin it will be easier to understand:

You generate a private key that will be used to sign your references (IPNS hashes).
You use that private key to sign an IPNS hash.
You publish content under that IPNS reference.

Using this method, you’re able to update the content without changing the hash (the IPNS hash in this case). This solution works well for dynamic content, but it still leaves room for the risk that a lot of older content is forgotten. The developers have said that IPNS references may in the future be made versioned, similar to Git commits.
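
The idea of a stable name pointing at changing content can be sketched as follows. This is a heavily simplified model: the signing step is left out entirely (the Python standard library has no public-key signatures), the "public key" is just random bytes, and the `records` table, the sequence number and the `publish` and `resolve` helpers are assumptions of this sketch rather than the real IPNS record format.

```python
import hashlib
import os

def content_id(data: bytes) -> str:
    """Illustrative identifier: hex SHA-256 of the bytes."""
    return hashlib.sha256(data).hexdigest()

# Stand-in for a public key: 32 random bytes. In the real system the name is
# derived from an actual public key and every update is signed with the
# matching private key; that signature check is omitted here.
public_key = os.urandom(32)
ipns_name = hashlib.sha256(public_key).hexdigest()

# Hypothetical record table: name -> (sequence number, content identifier).
records = {}

def publish(name, data: bytes):
    """Point name at new content; the sequence number grows with each update."""
    seq = records[name][0] + 1 if name in records else 0
    records[name] = (seq, content_id(data))
    return records[name]

def resolve(name):
    """Return the content identifier the name currently points to."""
    return records[name][1]

publish(ipns_name, b"blog, first edition")
first = resolve(ipns_name)

publish(ipns_name, b"blog, second edition")
second = resolve(ipns_name)
```

The name `ipns_name` never changes, while `first` and `second` are different content hashes. Note that the table keeps only the latest record, which mirrors the risk mentioned above that older content can be forgotten.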

IPFS, Filecoin and Ethereum

Some common questions people ask when talking about IPFS:

How does IPFS ensure there are enough nodes with X content?
Won’t X content eventually be forgotten if there aren’t enough nodes?
Since nodes can decide what to host, how does IPFS deal with niche content?

We all know how bad BitTorrent can be with niche content. The IPFS team, however, devised a solution: Filecoin. Filecoin is a blockchain-based cryptocurrency in the same family as Bitcoin and Ethereum, although it is designed around its own blockchain rather than built on top of either. Its fundamental difference from Bitcoin is that it rewards users for hosting files, and users can pay other users to host specific files. In this way users can buy or sell space to host files, ensuring that enough copies exist or that load times are faster. In Filecoin there are two markets available to miners:

Storage Market: here capacity is bought and sold. The more capacity is rented, the higher the cost for the buyer and the revenue for the miner.
Retrieval Market: rather than capacity, this market rewards speed. “Miners get rewarded for delivering content quickly”.

Conclusions

Now you know about IPFS, the file system that may one day revolutionise the World Wide Web. I had wanted to write about IPFS for a while, but never had the chance because of the depth of this particular topic. This project has huge potential, and I honestly think it will be the future of the Web. I would like to thank all the sources I used when creating this post:

The official IPFS website.
An Introduction to IPFS by ConsenSys.
Why The Internet Needs IPFS Before It’s Too Late from TechCrunch.
HTTP vs IPFS: is Peer-to-Peer Sharing the Future of the Web? from SitePoint.
HTTP is obsolete. It’s time for the distributed, permanent web by Kyle Drake.
IPFS Meets Ethereum and They’re Changing the World from ETHNews.


