While the word encryption has a shared general meaning, that doesn’t mean that all encryption is the same. In fact, the field of cryptography is as wide and diverse as it is hard for outsiders to understand.
Just like physics and other brainy subjects, you have to be some sort of weird genius to really understand how it all works.
The basic premise is easy enough for most people. That includes me, and it goes something like this:
Encryption is a method of taking a plainly readable bit of information and then scrambling it using a special mathematical recipe known as a cipher. This is where the word “decipher” comes from, which is what we say when trying to figure out a cryptic piece of information. Hey, the word “cryptic” is related to cryptography too!
While there have been many types of ciphers over the millennia, there are basically only three encryption approaches that have relevance in the modern day when it comes to computers and the internet. In this article, I’m going to discuss all three of these approaches and by the end, you should have a good idea of how they work. At least in general terms.
The first approach is hashing. The proper term to use for this type of encryption is the cryptographic hash function. That’s quite a mouthful, so let’s break it down a bit first.
The word "cryptographic" simply means that the hash function is built for security work, the same family of jobs a cipher does, rather than just for organizing data. So that means we are dealing with encryption of some sort. Not a huge revelation, but OK. The hash function part is what’s really interesting here.
A hash function is a method of mapping data of any size to a short, fixed-size value so that you can quickly look up information in a computerized system.
In a way, it’s useful to think of a hash function as a way to create a fingerprint for a piece of data. Be that a string of text, a file, a photo or anything else.
Hashing a piece of data involves running it through a particular recipe that transforms it into something known as a digest. Changing even a single bit of the data will result in a very different hash digest.
The hash digest is always the same length, no matter how long or short the original data is. In practice, only the original data run through the hash will produce that specific digest. Two different inputs with the same digest (a collision) can exist in theory, but a good hash function makes finding one practically impossible. The process is also one-way. There is no way of getting the original data back by reversing the process. The only way to know if a specific hash digest belongs to a specific piece of data is to run the hash function on that piece of data again.
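To make the fingerprint idea a bit more concrete, here is a minimal sketch using Python's built-in hashlib module. The choice of SHA-256 and the `fingerprint` helper name are my own assumptions for illustration, not something this article prescribes.

```python
import hashlib


def fingerprint(text):
    """Return the SHA-256 digest of a string as hexadecimal text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


short_digest = fingerprint("hello")
changed_digest = fingerprint("hellp")         # one character different
long_digest = fingerprint("hello " * 10_000)  # a much longer input

print(short_digest)
print(changed_digest)
print(long_digest)

# The digest length is the same for the short and the long input.
print(len(short_digest), len(long_digest))
```

All three lines of output are the same length, and the digests for "hello" and "hellp" should look completely unrelated even though the inputs differ by one letter. Hashing "hello" again always gives the first digest back, which is how you check a piece of data against its fingerprint.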
This also makes it a fantastic technology for checking file integrity, in other words whether a given file has been tampered with. If even the smallest change is made to the data, it will no longer match the hash digest. OK, but how does that work as a form of encryption? I’m glad you asked!
How many passwords do you use every day? Hopefully, you don’t share the same password across multiple services, but that probably means you have a dozen different passwords for all the different internet services you use.
Have you ever wondered where those passwords are stored? When you type in your password to log into a service, the server needs some way to check if that password is the correct one. Obviously, they could just keep a list of passwords in a database. However, if someone hacks the service and gets a copy of that database, everyone will be in real trouble.
The best solution at present is not to keep a copy of the password at all! As you’ve probably figured out, this is achieved using a cryptographic hash function. You create a password when signing up for something. The text of the password runs through a hash function. The hash digest is stored and associated with your username. When you try to log in, the password you enter is hashed again and the two digests are compared. If they aren’t exactly the same then you are out of luck. Even if a hacker gets hold of the hashes, there is no way to run them backward into the passwords that generated them. The best an attacker can do is guess passwords and hash each guess, which is one more reason to avoid short or common passwords.
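The sign-up and log-in flow above can be sketched in a few lines of Python. Treat this as an illustration under some assumptions of mine. The article only says "a hash function". I've swapped in PBKDF2, a deliberately slow password hash from the standard library, and added a random salt per user, because a plain fast hash makes guessing too cheap. The `users` dictionary stands in for a real database, and the function names and iteration count are hypothetical.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor, not a recommendation


def make_record(password):
    """Hash a new password; return the (salt, digest) pair to store."""
    salt = os.urandom(16)  # random per-user salt (my addition)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def check_password(password, salt, stored_digest):
    """Hash the login attempt the same way and compare the digests."""
    attempt = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    # compare_digest avoids leaking information through timing.
    return hmac.compare_digest(attempt, stored_digest)


# A stand-in for the service's database: username -> (salt, digest).
users = {}
users["alice"] = make_record("correct horse battery staple")

print(check_password("correct horse battery staple", *users["alice"]))
print(check_password("password123", *users["alice"]))
```

Notice that the password itself is never stored. Only the salt and the digest end up in `users`.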
Just as we have various ciphers which offer different ways to encrypt data, there are various ways to perform a hash function. When it comes to cryptographic hashes, the best-known family of algorithms is SHA, the Secure Hash Algorithm.
SHA-0 was actually withdrawn before going into general use, mainly thanks to flaws found in it. SHA-1, originally designed by the NSA, was the standard all the way up to 2010. It now has exposed weaknesses of its own and in most cases shouldn’t be used to secure anything anymore.
The current standard is SHA-2, which comes in various types. The NSA designed all of the SHA-2 variants. SHA-2 algorithms are usually referred to by the length of their digests in bits: SHA-256 and SHA-512 are common examples.
There’s also SHA-3, which was not designed by the NSA. It offers the same digest lengths as SHA-2. However, it looks nothing like it internally.
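If you want to see those digest lengths for yourself, Python's hashlib module ships with all of these algorithms. This is just a quick sketch, and the `bits` dictionary is a name I made up for the example.

```python
import hashlib

bits = {}
for name in ("sha1", "sha256", "sha512", "sha3_256", "sha3_512"):
    digest = hashlib.new(name, b"hello").digest()
    bits[name] = len(digest) * 8  # digest length in bits
    print(f"{name}: {bits[name]} bits")

# Same input, same digest length, but unrelated output.
same = hashlib.sha256(b"hello").digest() == hashlib.sha3_256(b"hello").digest()
print("SHA-256 and SHA3-256 agree:", same)
```

SHA-256 and SHA3-256 both produce 256-bit digests, yet the digests themselves differ for the same input, since the two algorithms work differently on the inside.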
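It's worth mentioning MD5 hashes here as well. This hash function was created to be a cryptographic tool. Unfortunately, it was broken: researchers found practical ways to produce two different inputs with the same digest. These days it's still used for file checking, where the worry is accidental corruption rather than deliberate tampering.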
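File checking is easy to try out. The sketch below writes a small temporary file, records its digest, then changes the file and checks again. I'm using SHA-256 rather than MD5 here, and the `file_digest` helper and the file name are hypothetical.

```python
import hashlib
import os
import tempfile


def file_digest(path):
    """Hash a file in chunks so large files don't need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "download.bin")
    with open(path, "wb") as f:
        f.write(b"original contents")

    published = file_digest(path)  # the digest a site might publish
    intact = file_digest(path) == published

    with open(path, "ab") as f:
        f.write(b"!")  # change the file by a single byte
    tampered = file_digest(path) != published

print("untouched file matches:", intact)
print("modified file detected:", tampered)
```

This is the same comparison you do by hand when a download page lists a checksum next to the file.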
In the history of cryptography, the usual way that you could "crack the code" of a cipher was to figure out how the cipher works and then reverse engineer it. With a hash function, it doesn't really matter if you know how the hash is created. You simply cannot walk the process backward and get the original string back.
Other types of modern encryption work on a similar principle. A key is worked into the encryption algorithm, which produces a uniquely encrypted output. Unlike a hash function, the actual information is all there in the encrypted code. If you have the right key you can get it all back again, in exactly the same shape it was in before it was encrypted.
In symmetric key encryption, there is only one key. The same key that locks the data away will also unlock it. This means that you have to be very careful about who you share that key with.
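Here is a toy illustration of the "one key locks and unlocks" idea. To be clear, this is not a real cipher like AES and you should never protect anything with it. It is a teaching sketch I put together from the standard library: the key is stretched into a stream of bytes with SHA-256 and mixed into the message with XOR. All the names are hypothetical.

```python
import hashlib


def keystream(key, length):
    """Stretch the key into `length` pseudo-random bytes (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_with_key(data, key):
    """Scramble or unscramble data. XOR undoes itself, so one function does both."""
    stream = keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


key = b"shared secret key"
message = b"meet me at noon"

scrambled = xor_with_key(message, key)               # lock
restored = xor_with_key(scrambled, key)              # unlock with the same key
wrong = xor_with_key(scrambled, b"some other key")   # unlock with a different key

print(scrambled.hex())
print(restored)
print(wrong)
```

The same key gets the message back exactly, while a different key only produces more gibberish. Unlike a hash, nothing is lost along the way.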
You can't send your symmetric key over the internet unprotected, for fear of it being intercepted. This means you need a secure way to install it on both machines that are going to be communicating with each other.
That doesn't make symmetric encryption very useful for anything long distance, but it is practical if you have easy access to both machines that need to talk to each other. For example, a router and a computer.
WiFi passwords are an example of a symmetric key. The WiFi signal is encrypted with it so that no one listening in can read any of the data. When you type the WiFi password into your laptop, phone or tablet, you are installing a copy of the key on it. Then the scrambled data coming from the WiFi router makes sense.
Symmetric encryption like this is favored for low processing power requirements. You don't need much to decrypt the information. It's fast and lightweight. Which makes it a good fit for devices that have to run off a battery or might not have spare processing power lying around. Just change your WiFi password often!
So symmetric encryption works well enough for something like WiFi, but how do we solve the problem of sending your precious encryption key far away where someone might snag it?
It turns out the answer is not to use one key, but two. This is asymmetric encryption. Two keys are linked mathematically. What exactly does that mean? On a technical level it's way above my head, but in practice it means that if you encrypt data with one key you can only decrypt it with the other.
Of the two keys, one always stays with you. It is never sent anywhere. That's your private key. The other key is publicly available to anyone. This is, unsurprisingly, called the public key.
This means if someone encrypts a message with your public key, you are the only one that can decrypt and read it. Typically there isn't just one key pair in play, however. The participants in the transaction each use the other's public key to encrypt their messages. The public key can't be used to decrypt anything encrypted with it. This solves our problem.
Except, now you need to know that the public key you are using really belongs to the person you want to send the data to, and is paired with their private key rather than an impostor's. This is why we have central certificate authorities or other verification methods in place.
This type of encryption is also heavier on processing power, takes a bit longer to decrypt and (as you can see) is quite a bit more complex than the symmetric method. Still, the significant security benefits of asymmetric encryption have won it universal use on the web, wherever reputable sites do business.
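For the curious, the "two keys linked mathematically" idea can be shown with a textbook version of RSA, one well-known asymmetric algorithm. This is only a sketch under big assumptions: the primes are tiny so the numbers stay readable, there is no padding, and the message is a single small number. Real keys are hundreds of digits long and come from a proper library. The helper name `apply_key` is hypothetical, and `pow(e, -1, phi)` needs Python 3.8 or newer.

```python
# Toy key pair built from two tiny primes.
p, q = 61, 53
n = p * q                  # the modulus, shared by both keys
phi = (p - 1) * (q - 1)
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent: the mathematical link to e

public_key = (e, n)        # handed out to anyone
private_key = (d, n)       # never leaves your machine


def apply_key(number, key):
    """Raise the number to the key's exponent, modulo n."""
    exponent, modulus = key
    return pow(number, exponent, modulus)


message = 65  # must be smaller than n in this toy version
ciphertext = apply_key(message, public_key)     # anyone can encrypt
recovered = apply_key(ciphertext, private_key)  # only the private key decrypts
wrong = apply_key(ciphertext, public_key)       # the public key can't undo it

print(ciphertext, recovered, wrong)
```

Encrypting with the public key and then applying the private key returns the original number, while applying the public key a second time does not.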
Wouldn't it be nice if we never had to lock anything? In a perfect world where you could trust everyone else, you wouldn't need all this clever trickery. The bad news is that this isn't a perfect world. There are plenty of people out there that want nothing more than to cheat and exploit people for their own gain.
With these encryption technologies, we can act as if people on the web can be trusted. Not because everyone is honest, but because these technologies keep them honest. Without that technologically-enforced trust, we couldn't do so much as sell a pack of gum online. So thank goodness for encryption!