Exercise 1: Introduction to Hash Functions

This exercise introduces hash functions and demonstrates how you can compute file hashes using GCHQ’s browser-based tool CyberChef. You can run CyberChef online or download it to run locally. (In the latter case, you’ll need to unzip the archive and then open the HTML file from the CyberChef directory in your browser.)

Using Hash Functions

  1. Run CyberChef in your browser. At the bottom of the UI, make sure that the ‘Auto Bake’ checkbox is ticked.

  2. Examine the contents of message.txt in your browser or by downloading the file and opening it in a text editor. Copy the entire contents of this text file and paste them into the Input panel of CyberChef. You will see the same text appear in the Output panel, because we are not yet doing any processing of the input.

  3. In the Operations menu on the left of the CyberChef UI, click on the ‘Hashing’ menu item to open up a submenu listing a large number of different hash functions. Find the entry for MD5 and hover the cursor over it to read the description. Then click and drag the MD5 operation onto the Recipe panel. The contents of the Output panel will now change to show the MD5 hash of the input, as a string of 32 hexadecimal digits:

    c7cd4529f9ebe5b40ff061188ec6f5c1
    

    (If you see a different hash value, beginning with a3, you’ve probably just included the final newline character of the file in the Input panel, which is not a problem…)

    CyberChef computing an MD5 hash

  4. Alter a single character of the text in the Input panel. You should see that most of the hex digits have changed. For example, if you change the word “park” to “pork”, you should see the hash change to

    e231c7c4f92f282fdebd4419c8629ba9
    

    (Again, you’ll see a different value if you’ve included a trailing newline in the input.)

    This extreme sensitivity of hash function output to changes in input is known as the avalanche effect.

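  5. MD5 is no longer suitable for security purposes: its 128-bit output is short by modern standards and, as the next section shows, collisions can be constructed for it. Try something much more secure, from the SHA-2 family.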

    Remove the MD5 operation by clicking the trashcan icon at the top of the Recipe panel, then find ‘SHA2’ from the ‘Hashing’ submenu and drag it onto the panel. Click on the Size parameter to choose different SHA-2 hash functions.

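    Notice how the output length matches the number in the hash function name (e.g., SHA-256 = 256 bits = an output of 64 hex digits, since each hex digit encodes 4 bits). For a truncated variant such as SHA-512/256, the output length is the number after the slash. Notice also how all of these hashes are significantly longer than MD5's 128 bits.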
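
If you would like to reproduce these steps outside CyberChef, the following is a minimal sketch using Python's standard hashlib module. The input text is a made-up placeholder, not the contents of message.txt, so the digests it prints will not match the values shown above; the helper names hex_digest and differing_bits are purely illustrative.

```python
import hashlib


def hex_digest(algorithm: str, data: bytes) -> str:
    """Return the hex digest of data under the named hashlib algorithm."""
    return hashlib.new(algorithm, data).hexdigest()


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions at which two equal-length hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


# Placeholder input (an assumption): substitute the real text of message.txt.
original = b"Meet me in the park at noon."
altered = b"Meet me in the pork at noon."  # one character changed

md5_original = hex_digest("md5", original)
md5_altered = hex_digest("md5", altered)
print("MD5 original:", md5_original)
print("MD5 altered: ", md5_altered)
print(differing_bits(md5_original, md5_altered), "of 128 bits differ")

# A trailing newline is part of the input, so it also changes the digest.
print("MD5 with newline:", hex_digest("md5", original + b"\n"))

# Digest lengths for several SHA-2 functions (4 bits per hex digit).
for name in ("sha224", "sha256", "sha384", "sha512"):
    digest = hex_digest(name, original)
    print(f"{name}: {len(digest) * 4} bits, {len(digest)} hex digits")
```

For a well-behaved hash function you would expect roughly half of the output bits to flip after a one-character change, which is the avalanche effect seen in step 4.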

Hash Collisions

  1. Download psdocs.zip and unzip it. This will give you two PostScript documents, recommend.ps and order.ps. These can be printed on a PostScript-supporting printer or previewed with a suitable application (e.g., evince or gv on Linux machines, or Adobe Acrobat Reader on Windows). For convenience, we show them side-by-side in the image below. You can see that the documents are entirely different in content.

    Two different letters (recommend.ps on left, order.ps on right)

  2. Replace the SHA2 operation with MD5 in the CyberChef Recipe panel. Then click on the Open file as input button (the middle of the five buttons) at the top of the Input panel and select recommend.ps. The MD5 hash of the file will appear in the Output panel.

  3. Now click on the Add an input tab button (the one with the ‘+’ icon) to add a new input tab. With this new tab selected, repeat the previous step, this time loading order.ps. If you examine the two tabs in the Output panel closely, you will see that the two files have produced the same hash (beginning a25f7f), despite appearing to be entirely different!

    This is an example of an MD5 collision. It was constructed in 2005 by Magnus Daum of Ruhr-Universität Bochum and Stefan Lucks from the University of Mannheim. The increasing feasibility of finding such collisions is the key reason why MD5 – and, subsequently, SHA-1 – are no longer considered usable for security purposes.

  4. Finally, try using CyberChef to compute various SHA-2 hashes of the two documents. (Note: you may need to use the Bake button each time you change the hash function, to ensure that both hashes update correctly.) You’ll find that there are no collisions for any of these functions. A collision for two inputs to one particular hash function does not imply collisions for the same inputs in other hash functions.
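
The same comparison can be scripted. The sketch below, again using Python's hashlib, assumes that recommend.ps and order.ps have been unzipped into the current working directory; the function names file_digest and compare_files are illustrative, not part of any tool used above.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm: str) -> str:
    """Hash a file in chunks and return the hex digest."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def compare_files(path_a, path_b,
                  algorithms=("md5", "sha224", "sha256", "sha384", "sha512")):
    """Map each algorithm name to True if both files have the same digest."""
    return {
        name: file_digest(path_a, name) == file_digest(path_b, name)
        for name in algorithms
    }


# Assumed file locations: adjust the paths to wherever you unzipped psdocs.zip.
files = (Path("recommend.ps"), Path("order.ps"))
if all(p.is_file() for p in files):
    for name, same in compare_files(*files).items():
        print(f"{name:>7}: {'same digest' if same else 'different digests'}")
else:
    print("recommend.ps and order.ps not found in the current directory")
```

Files are read in binary mode and in chunks, so the digest covers the exact bytes on disk and large files need not fit in memory.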

□