When many people look at a hash and understand that there is no way of getting the original value back, they often think: well, what use is that?
What a hash is
Every time the same text or file is hashed it should produce the same hash, but if even a single character or digit changes in the original data, a completely different hash will be produced. A hash is like a verification stamp for a set of data.
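To make that concrete, here is a minimal sketch using Python's standard hashlib module with MD5 (any hashing algorithm should show the same behaviour). The helper name md5_hex and the sample strings are my own, chosen only for illustration.

```python
import hashlib

def md5_hex(text):
    """Return the MD5 hash of a string as hexadecimal text."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

first = md5_hex("hello")
second = md5_hex("hello")   # exactly the same input again
changed = md5_hex("hellp")  # only the last character differs

print(first == second)   # True: same data, same hash
print(first == changed)  # False: one character changed, different hash
print(first)
print(changed)
```

Printing the two hashes side by side shows that they look nothing alike, even though the inputs differ by a single character.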
Types of hashes
The two main families of hashing algorithms are MD5 (Message Digest 5) and SHA (Secure Hash Algorithm).
Here is an example of “hello” hashed using MD5 (SHA is similar but creates a longer hash, which makes it harder to brute force):
hello (md5): 5d41402abc4b2a76b9719d911017c592
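If you want to reproduce that hash yourself, the following is a small sketch using Python's standard hashlib module. It also hashes the same text with SHA-1 and SHA-256 so the difference in length is visible; the choice of those two SHA variants is mine, not something the example above depends on.

```python
import hashlib

data = "hello".encode("utf-8")

md5_hash = hashlib.md5(data).hexdigest()
sha1_hash = hashlib.sha1(data).hexdigest()
sha256_hash = hashlib.sha256(data).hexdigest()

print("md5:   ", md5_hash)     # 5d41402abc4b2a76b9719d911017c592
print("sha1:  ", sha1_hash)    # 40 hexadecimal characters
print("sha256:", sha256_hash)  # 64 hexadecimal characters
```

Note that the input is lowercase “hello”. Hashing “Hello” with a capital H gives a completely different result.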
Clashes
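Clashes (usually called collisions) happen when the same hash is produced from different text. Because a hash has a fixed length, clashes must exist in theory; the flaw is when an algorithm makes them easy to find. Easy-to-find clashes make forgery and other hacking methods more feasible. Practical clashes have been demonstrated for both MD5 and SHA-1, so neither should be trusted where an attacker could craft the data, whereas the SHA-2 family (for example SHA-256) has no publicly known clashes. In my personal opinion SHA is better for sensitive data such as passwords, and MD5 is still fine for checksums, for example checking that a file has downloaded correctly, as long as you are only guarding against accidental corruption and not deliberate tampering.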
Salting hashes & Rainbow Tables
Hashes can be brute forced using rainbow tables, which are tables that consist of a large number of string and hash pairs, for example:
A = hash
B = hash
C = hash
And so on…
They consist of every combination within a range, for example a-z and A-Z from 1 to 6 characters in length. If a string (text) within that range is hashed, then looking the hash up in the table returns the original text! The tables that contain all the combinations are often referred to as “rainbow tables” (strictly speaking a rainbow table is a space-saving variation on this kind of lookup table, but the idea is the same), and many are available to download free on the internet.
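The lookup idea can be sketched in a few lines of Python. This is a plain precomputed table rather than a true rainbow table, and it only covers lowercase a-z up to 3 characters so that it builds instantly; the function name build_lookup_table and the sample word are hypothetical.

```python
import hashlib
from itertools import product
from string import ascii_lowercase

def build_lookup_table(max_length):
    """Map the MD5 hash of every a-z string up to max_length back to its text."""
    table = {}
    for length in range(1, max_length + 1):
        for letters in product(ascii_lowercase, repeat=length):
            text = "".join(letters)
            table[hashlib.md5(text.encode("utf-8")).hexdigest()] = text
    return table

table = build_lookup_table(3)

# Pretend this hash was found in a database.
found_hash = hashlib.md5(b"cat").hexdigest()
print(table.get(found_hash))  # cat

# "hello" is 5 characters, so it is outside the range this table covers.
print(table.get(hashlib.md5(b"hello").hexdigest()))  # None
```

Each extra character multiplies the size of the table by the number of characters in the range, which is why real tables are limited to short strings and small character sets.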
Now to get around this problem there is a technique known as “salting”. This is mainly used to protect passwords from being brute forced, since passwords often fall in the range of “a to z and 0 to 9 and between 6 and 10 characters in length”, and I am sure rainbow tables exist for such a small range. To allow passwords of such a small range and length, you can salt them: extra text is added to the password before it is hashed. The same salt must be used every time that password is hashed, otherwise the hashes will not match, but this produces a much stronger hash that is more resistant to brute forcing techniques.
Here is an example of a salted MD5 hash of a password:
md5("members_username_here" + "password" + "@_=1")
Now if passwords are restricted to 6+ characters and usernames to 6+ characters, the value being hashed will be at least 12 characters long (excluding @_=1)! Notice I have also added some characters outside of the range I mentioned above: “@_=1”. An attacker would now need a much larger rainbow table, which makes the password much harder to brute force. The longer the salt, the harder it is to brute force the hash to get the original value.
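As a rough sketch of the salting scheme described above, here it is in Python with the standard hashlib module. The function name salted_md5 and the sample usernames and password are made up for illustration; the “@_=1” suffix is the one from the example.

```python
import hashlib

EXTRA_SALT = "@_=1"  # the fixed extra characters from the example above

def salted_md5(username, password):
    """Hash username + password + extra salt, as in the example above."""
    combined = username + password + EXTRA_SALT
    return hashlib.md5(combined.encode("utf-8")).hexdigest()

# Unsalted hash of a short password: the kind a precomputed table may contain.
plain = hashlib.md5(b"secret99").hexdigest()

# The same password, salted for two different (hypothetical) members.
salted_a = salted_md5("alice_smith", "secret99")
salted_b = salted_md5("bobby_jones", "secret99")

print(plain)
print(salted_a)
print(salted_b)
```

Because the username is part of the salt, two members with the same password end up with different hashes, so cracking one does not reveal the other.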
Uses of hashing
The two main uses that I rely on a lot in web development are as follows.
Storing Passwords
Storing a hash of a password in a database is much more secure than storing the original plain text version of the password. If someone hacks the database, or a system admin needs to manually update it, or it is exposed for any other reason, they will not be able to see anyone’s passwords! That means they cannot use the password to access the member’s account, because it is hidden behind the hash.
Now you’re probably wondering: if the original password is not stored, how do I know whether they have entered the correct password? Easy! You hash the password they type in, and if the hash is the same as what’s stored in the database then their details are correct.
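A minimal sketch of that login check in Python might look like this. Everything named here is hypothetical: the stored_hashes dictionary stands in for the database table, SITE_SALT is the fixed salt from the salting example, and SHA-256 is simply my pick from the SHA family.

```python
import hashlib
import hmac

SITE_SALT = "@_=1"  # assumed fixed salt, as in the salting example

def hash_password(username, password):
    """Salt the password with the username and site salt, then hash it."""
    salted = username + password + SITE_SALT
    return hashlib.sha256(salted.encode("utf-8")).hexdigest()

# Stand-in for the database: only hashes are stored, never the passwords.
stored_hashes = {"alice_smith": hash_password("alice_smith", "secret99")}

def check_login(username, typed_password):
    """Hash what was typed and compare it with the stored hash."""
    stored = stored_hashes.get(username)
    if stored is None:
        return False  # unknown username
    attempt = hash_password(username, typed_password)
    # compare_digest avoids leaking how many characters matched via timing.
    return hmac.compare_digest(stored, attempt)

print(check_login("alice_smith", "secret99"))  # True
print(check_login("alice_smith", "Secret99"))  # False
```

The original password never needs to be stored or recovered; only the two hashes are compared.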
File Names
Another great use for hashes is safely storing files that users have uploaded without having to worry about characters in the file names, names being too long or too short, file names clashing, etc.
All you do is hash the file itself, use that hash as the file name, and append the file extension to it. You will still have to check for clashes in case someone else uploads the same file, but you can always salt the file with a random number to lessen the chances of a clash ever happening, even with the same file.
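Here is one way this could look in Python, as a sketch rather than a finished upload handler. The function name hashed_filename, the random_salt option, and the sample file name and contents are all my own assumptions.

```python
import hashlib
import os
import secrets

def hashed_filename(original_name, data, random_salt=False):
    """Build a safe file name from the hash of the file's bytes."""
    hasher = hashlib.sha256()
    if random_salt:
        # Random bytes make the name unique even for identical files.
        hasher.update(secrets.token_bytes(16))
    hasher.update(data)
    # Keep only the extension from the name the user supplied.
    extension = os.path.splitext(original_name)[1].lower()
    return hasher.hexdigest() + extension

upload = b"pretend these are the bytes of an image"

name = hashed_filename("My Holiday Photo (1).JPG", upload)
print(name)  # 64 hexadecimal characters followed by .jpg

# Without a salt the same file always maps to the same name.
print(name == hashed_filename("copy.jpg", upload))  # True

# With a random salt, two uploads of the same file get different names.
print(hashed_filename("a.jpg", upload, random_salt=True))
print(hashed_filename("a.jpg", upload, random_salt=True))
```

The extension still comes from the user, so in a real application it would be worth checking it against a list of allowed types before saving the file.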