You may have seen MD5 hashes listed next to downloads during your internet travels, but what exactly are they? Let’s take a look at what these cryptic strings are and how you can use them to verify your downloads.
What Are Hashes and What Are They Used For?
(Image credit: Wikimedia Commons)
Hashes, also called “digests,” are the products of cryptographic algorithms. If you’re not sure what an algorithm is, be sure to check out our article on what they are and how they work. In short, though, an algorithm is a set of instructions used by computers to manipulate data. Hash functions are designed to produce a fixed-length digest, regardless of the size of the input data. Take a look at the above chart and you’ll see that both “Fox” and “The red fox jumps over the blue dog” yield output of the same length.
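To see the fixed-length property for yourself, here is a minimal Python sketch using the standard hashlib module. It uses MD5 as the example algorithm (the chart above may use a different one), and the helper name md5_hex is just an illustration.

```python
import hashlib

def md5_hex(text: str) -> str:
    """Return the MD5 digest of a string as hexadecimal text."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

short_digest = md5_hex("Fox")
long_digest = md5_hex("The red fox jumps over the blue dog")

# An MD5 digest is 128 bits, so both lines print 32 hex characters
# no matter how long the input was.
print(len(short_digest), short_digest)
print(len(long_digest), long_digest)
```

Swapping hashlib.md5 for hashlib.sha1 should give 40 hex characters for both inputs instead.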
Another factor is complexity. Compare the second example in the above chart to the third, fourth, and fifth. You’ll see that despite a very minor change in the input data, the resulting hashes are all very different from one another. This is a sign of the complexity of the algorithm (at least to our non-programmer eyes) and helps make it very difficult to work backwards from the hash to the data. Passwords are often stored as hashes for this reason; it’s easy to hash the password entered during a login attempt and compare the result to the stored hash. On the other hand, if someone has the hash, it’s very difficult to work backwards to the original input. When people try to crack passwords, they usually don’t work backwards, but instead use a dictionary of known hashes (usually of common passwords and key patterns) to compare the stolen ones with.
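Both ideas in this paragraph can be sketched in a few lines of Python. This is only an illustration: the word list is a made-up stand-in for an attacker’s dictionary, the names md5_hex, lookup, and crack are hypothetical, and plain MD5 is used purely to keep the example short.

```python
import hashlib

def md5_hex(text: str) -> str:
    """Return the MD5 digest of a string as hexadecimal text."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

# A one-letter change in the input gives an unrelated-looking digest.
print(md5_hex("The red fox jumps over the blue dog"))
print(md5_hex("The red fox jumps ouer the blue dog"))

# Hypothetical word list standing in for a dictionary of common passwords.
common_passwords = ["123456", "password", "qwerty", "letmein"]
lookup = {md5_hex(word): word for word in common_passwords}

def crack(stolen_hash: str):
    """Return the matching dictionary word, or None if the hash is unknown."""
    return lookup.get(stolen_hash)

stolen = md5_hex("letmein")  # pretend this came from a leaked database
print(crack(stolen))                    # found by lookup, not by reversing
print(crack(md5_hex("correct horse")))  # not in the dictionary, so None
```

Notice that nothing here reverses the hash; the lookup only succeeds when the password was already in the list.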
Data Verification
MD5, the Message-Digest Algorithm, has been used in multiple types of security-based programs in the past, but it’s also widely employed for another purpose: data verification. These types of algorithms work great to verify your downloads. Imagine, if you will, you’re online trying to grab the latest Ubuntu release from BitTorrent. Some horrible troublemaker starts distributing a version of the .iso you need but with malicious code embedded into it. Not just that, he’s clever, so he makes sure the files are exactly the same size. You wouldn’t know you had the bad file until you tried to boot the CD, and by then, permanent damage could have already occurred!
Thankfully for us, Canonical posts the MD5 checksums for its images online. You can run a hash check yourself with any number of tools, and then check it against the posted checksum. If there are any differences at all, you know that the file you have was tampered with, did not complete properly, or something else prevented the data from matching. This way you prevent any damage to your system before you run anything, and you can just re-download the appropriate file.
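The check itself is simple enough to sketch in Python. This is a rough outline rather than any particular tool’s implementation; the function names and the chunk size are assumptions chosen for the example.

```python
import hashlib

def md5_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so a large image never has to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # read() returns b"" at end of file, which stops the loop.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_posted_checksum(path: str, posted: str) -> bool:
    """Compare the computed digest with the one copied from the website."""
    # Normalize stray whitespace and letter case from copy and paste.
    return md5_of_file(path) == posted.strip().lower()
```

You would call matches_posted_checksum("path/to/file.iso", posted) with the checksum string copied from the download page; a False result means the file should be downloaded again.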
This comes in handy not just for Linux distros, but for other things like BIOS files, third-party Android ROMs, and router firmware – all things that could potentially “brick” your devices if the data is tainted. In general, larger files run a greater risk of data corruption, so you may want to compute your own checksums if your archives are important.
MD5 is no longer considered completely secure, so people have started to migrate to other commonly used hash algorithms like SHA-1. SHA-1 in particular is used for data verification more and more often, so most tools will work with both of these algorithms.
Calculating File Hashes via Command-Line
Linux and OS X
When you’re downloading files, you’ll see the checksums listed on the website somewhere. But how do you calculate one yourself to compare?
Linux distros will have this ability built-in. Just pop open a terminal and enter the following command:
md5sum path/to/file.iso
I used the example of an .iso, but you can get hashes of any type of file.
If you’re on a Mac, you can open up Terminal.app and use this:
md5 path/to/file.7z
To check the SHA-1 hash instead, use ‘sha1sum’ in place of ‘md5sum’ on Linux. On a Mac, the ‘shasum’ command computes SHA-1 by default.
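If you would rather have one script that behaves the same on every platform, a small Python function can stand in for these commands. This is only a sketch: hash_file is a hypothetical name, and the algorithm is chosen by the names Python’s hashlib accepts, such as "md5" or "sha1".

```python
import hashlib

def hash_file(path: str, algorithm: str = "md5") -> str:
    """Return the hex digest of a file using the named algorithm."""
    digest = hashlib.new(algorithm)  # raises ValueError for unknown names
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling hash_file("path/to/file.iso") gives the MD5 digest, and hash_file("path/to/file.iso", "sha1") gives the SHA-1 digest of the same file.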
Windows
Windows doesn’t include a dedicated checksum utility like md5sum, although the built-in certutil -hashfile command can compute hashes. Microsoft also provides a small standalone tool from their website.
Microsoft File Checksum Integrity Verifier Utility
Once you download and extract the file (there’s no installation), open up a Command Prompt window. Navigate to where the file is, and then use the following command to check MD5 hashes:
fciv.exe C:\path\to\file.bin
FCIV checks MD5 by default, but it can check SHA-1 hashes, too:
fciv.exe -sha1 C:\path\to\file.zip
GUI-Based Tools for Checking Hashes
If you’re on Windows or OS X and you don’t like using the command line, there’s a really great graphical utility you can use called HashTab.
Head on over to the HashTab webpage, download the app, and install it.
Now, just right-click on your file, and go to Properties.
You’ll see a new “Hashes” tab.
Your computer will automatically start calculating the three default hashes: CRC32, MD5, and SHA-1.
If you click the Settings link, you can customize which hashes show up. There are quite a few to choose from!
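For comparison, the same three default values can be computed in a few lines of Python. This is not how HashTab itself works internally, just a sketch with the standard zlib and hashlib modules; the function name default_hashes and the uppercase formatting are assumptions.

```python
import hashlib
import zlib

def default_hashes(path: str) -> dict:
    """Compute CRC32, MD5, and SHA-1 for a file in a single pass."""
    crc = 0
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            crc = zlib.crc32(chunk, crc)  # running CRC over all chunks
            md5.update(chunk)
            sha1.update(chunk)
    return {
        "CRC32": format(crc & 0xFFFFFFFF, "08X"),
        "MD5": md5.hexdigest().upper(),
        "SHA-1": sha1.hexdigest().upper(),
    }
```

Keep in mind that CRC32 is only a 32-bit error-detecting checksum, useful for spotting accidental corruption but not a cryptographic hash like the other two.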
As you can see, there are plenty of benefits to checking your downloads, particularly for firmware images and the like. Now that you know what to do with the checksums you find online, you can rest easy.
Have a favorite hashing algorithm? Found checksums used for a really weird purpose? Love cryptography? Share your thoughts in the comments!