Encryption and Hashing: differences and use cases
One of the most common misunderstandings among beginners (and sometimes even IT professionals) concerns the difference between encryption and hashing. Let’s take this as a good chance to clear things up.
Encryption and Hashing
As the introduction stated, this is one of the most common misconceptions in the IT world. Both are ways to hide something (information, in this case), but how do they differ? Put simply: one hides things but allows them to be retrieved later, while the other hides things and makes retrieval almost impossible. The first is encryption, the second is hashing.
Hashing is a way to turn information into a value that nobody can read back. What does this mean? If I hash the string test with a hashing algorithm like MD5, I get 098f6bcd4621d373cade4e832627b4f6. No matter how many times I run MD5 with test as input, I will always get the same output. The output is also usually fixed-size, meaning it won’t grow with the input. The result is “almost unique”: the probability that two different inputs produce the same output is low. When that does happen, it is called a collision. Because there are far more possible inputs than possible outputs, collisions cannot be ruled out entirely, but with a good algorithm the chance is normally negligible. Stronger algorithms are usually associated with a lower chance of collision; MD5 and SHA1, for instance, are no longer considered collision-resistant, since practical collisions have been demonstrated for both. Some of the best-known hash functions are CRC (for example CRC32), MD5, SHA1, SHA256 and RIPEMD160. Note that CRC is a checksum designed to catch accidental errors, not to resist a deliberate attacker.
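The properties described above (same input, same output; fixed-size output) are easy to check for yourself. Here is a minimal sketch using Python's standard hashlib module; the helper name md5_hex is just an illustration, and MD5 is used only because it matches the example in the text, not because it is a good choice for security.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the MD5 digest of a string as 32 hexadecimal characters."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# Deterministic: the same input always produces the same output.
print(md5_hex("test"))                     # 098f6bcd4621d373cade4e832627b4f6
print(md5_hex("test") == md5_hex("test"))  # True

# Fixed size: a much longer input still yields 32 hex characters.
print(len(md5_hex("test")), len(md5_hex("test" * 10_000)))  # 32 32

# A tiny change in the input gives a different digest.
print(md5_hex("test") == md5_hex("Test"))  # False
```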
Hash functions have a few uses, but they are really important:
- Storing passwords: so that only the user knows what the real password is. Storing a hash instead of the password itself is a good idea: if an attacker gets access to the database, they will have a hard time recovering the passwords from the various hashes, especially when each password is salted and hashed with a deliberately slow algorithm rather than a fast one like MD5. The user, on the other hand, can still log in: the password they type is re-hashed and compared with the stored hash.
- Identifying files: another common use for hashing is identifying files. If I run a hash function over an entire file, it will always produce the same output. This is particularly useful when distributing software. For example, if you download a Linux distribution, chances are you will also get the file checksum (either MD5 or SHA*) so that you can check whether your file was altered or damaged during download.
- Partitioning data: another common use for hash functions is partitioning data. If you use the hash of an item to decide where it is stored, you get hash partitions. The concept is fairly involved, and it is used by even more complex systems such as databases. A good example is OpenStack Swift.
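The three uses above can be sketched in a few lines of Python with the standard library only. All function names here (hash_password, verify_password, file_sha256, partition_for) are hypothetical helpers made up for illustration. The password part uses PBKDF2 with a random salt, which is my choice for the sketch rather than something the list prescribes, and the partitioning part is a plain "hash modulo N" scheme, not a description of how OpenStack Swift actually places data.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest); a fresh random salt is generated if none is given."""
    if salt is None:
        salt = os.urandom(16)
    # PBKDF2 repeats the hash many times to slow down guessing attacks.
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return salt, digest


def verify_password(password: str, salt: bytes, stored: bytes) -> bool:
    """Re-hash the typed password with the stored salt and compare the results."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, stored)


def file_sha256(path: str) -> str:
    """Hash a file in chunks so that large downloads do not need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def partition_for(key: str, partitions: int) -> int:
    """Map a key to a partition number in the range 0..partitions-1."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partitions


# Only the salt and the digest are stored, never the password itself.
salt, stored = hash_password("correct horse")
print(verify_password("correct horse", salt, stored))  # True
print(verify_password("wrong guess", salt, stored))    # False

# The same key always lands in the same partition.
print(partition_for("user-42", 4) == partition_for("user-42", 4))  # True
```

To check a download, you would compare the value returned by file_sha256 with the checksum published next to the file.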
Encryption, on the other hand, makes retrieval possible for whoever holds the right key (and difficult for everyone else). Imagine a room with only one door. You put a family photo inside the room and, when you leave, you lock the door with a key. That process is very similar to encryption: only those who possess the key can get in. Now duplicate that key and give the copies to your family: they too will have access. This is known as symmetric encryption. There is also asymmetric encryption, but I will talk about it in another article since it is much more complicated.
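To make the "same key locks and unlocks" idea concrete, here is a toy sketch using only Python's standard library. It XORs the message with a random key of the same length (a one-time pad). The function names are made up for this example, and this is a teaching device only: the key must never be reused, and real applications should rely on a vetted cipher such as AES from an established library.

```python
import secrets


def generate_key(length: int) -> bytes:
    """Create a random key that is as long as the message (used only once)."""
    return secrets.token_bytes(length)


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with key; the same call both encrypts and decrypts."""
    if len(key) != len(data):
        raise ValueError("key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


message = b"family photo"
key = generate_key(len(message))

ciphertext = xor_bytes(message, key)    # lock the door
recovered = xor_bytes(ciphertext, key)  # unlock it with the same key
print(recovered == message)             # True

# Anyone holding a copy of the key can decrypt too; a different key cannot.
wrong_key = bytes(k ^ 0xFF for k in key)
print(xor_bytes(ciphertext, wrong_key) == message)  # False
```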
Encryption is widely used in IT:
- Channel encryption: one of the most common uses for encryption is on the web. When you see HTTPS instead of HTTP in the address bar of your browser, you are communicating over an encrypted channel. This kind of encryption is both symmetric and asymmetric depending on the phase you are in: asymmetric cryptography is used while the connection is being set up, and symmetric encryption protects the data afterwards. SSL/TLS is a good example of this.
- File encryption: another common use for encryption is simply encrypting files: if you want your files to be hidden from the eyes of others, you just encrypt them. There is plenty of software built around this concept, like TrueCrypt (and its successor VeraCrypt) or Microsoft BitLocker.
- Digital signatures: through the use of asymmetric cryptography it is possible to sign and verify messages, including emails. Prominent examples are GPG and PGP.
- Ransomware: 2016 is the year of “ransomware”, a kind of malware that encrypts the files on a computer and holds them hostage until a ransom has been paid.
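Digital signatures are also a nice place to see hashing and asymmetric keys working together: the message is hashed, and the hash is transformed with a private key that only the signer holds. Below is a deliberately simplified "textbook RSA" sketch in Python. Everything in it is an assumption made for illustration: the two primes are small, well-known Mersenne primes, there is no padding, and the function names are invented. It is not how GPG or PGP are implemented, and it must not be used to protect anything real.

```python
import hashlib

# Toy key material: real keys use much larger, randomly generated primes.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                              # public modulus
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (needs Python 3.8+)


def digest_as_int(message: bytes) -> int:
    """Hash the message and reduce the result so that it fits below N."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Only the holder of the private exponent D can produce this value."""
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Anyone who knows the public values (N, E) can check the signature."""
    return pow(signature, E, N) == digest_as_int(message)


signature = sign(b"hello from Alice")
print(verify(b"hello from Alice", signature))    # True
print(verify(b"hello from Mallory", signature))  # False
```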
Next time you face the problem, think twice. So let’s recap:
- Encryption: makes retrieval possible using keys, and difficult otherwise. There are two main ways to perform encryption: symmetric and asymmetric.
- Hashing: retrieval from the output is very difficult. Hashes are usually fixed-size. Hashing the same data always produces the same hash.
Image courtesy of r2hox