Exercise 3: Computing HMACs in Python

This exercise shows you how to compute authentication tags in Python using the HMAC (hash-based message authentication code) algorithm. This can be done with the hmac and secrets modules from the standard library.

  1. Create a directory for this exercise, then download the files message1.txt and message2.txt into it. These two text files differ by a single character, which you can verify using the diff command [1]:

    diff message1.txt message2.txt
    
  2. Start a Python 3 REPL from a terminal window and use the secrets module to create a key consisting of 16 random bytes:

    >>> import secrets
    >>> key = secrets.token_bytes(16)
    >>> key.hex()
    

    The last of the lines above is just a convenient way of examining the key, as it returns the key’s bytes as a string of hexadecimal digits.

  3. Now read the contents of message1.txt and message2.txt like so:

    >>> from pathlib import Path
    >>> message1 = Path("message1.txt").read_bytes()
    >>> message2 = Path("message2.txt").read_bytes()
    
  4. To compute an authentication tag for message1.txt, create an HMAC object from the key and file contents, specifying the required hash function. Then call the digest() method to retrieve the tag as a sequence of bytes:

    >>> import hmac
    >>> h = hmac.new(key, message1, digestmod="sha256")
    >>> tag1 = h.digest()
    >>> tag1.hex()
    

    As before, calling the hex() method allows us to examine the bytes in a convenient way [2].

    We’ve kept things simple here, but note that you can also create the HMAC object without providing any data, and then subsequently feed it chunks of data using the update() method. This is useful when you want to compute an authentication tag over multiple items of data.
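
    The incremental approach described above can be sketched as follows. The chunks list is a hypothetical stand-in for whatever items of data you want to authenticate, and the key is freshly generated rather than being the one from step 2. Repeated calls to update() are equivalent to a single call with all the data concatenated, which the final comparison checks:

```python
import hmac
import secrets

# Hypothetical stand-ins for the key and data used in this exercise.
key = secrets.token_bytes(16)
chunks = [b"first item, ", b"second item, ", b"third item"]

# Create the HMAC object with no initial data, then feed it one chunk at a time.
h = hmac.new(key, digestmod="sha256")
for chunk in chunks:
    h.update(chunk)
incremental_tag = h.digest()

# Computing the tag in one go over the concatenated data gives the same tag.
one_shot_tag = hmac.new(key, b"".join(chunks), digestmod="sha256").digest()

print(hmac.compare_digest(incremental_tag, one_shot_tag))
```

    Note that the tag depends only on the concatenated bytes, not on where the chunk boundaries fall. If the boundaries between items matter to your application, you would need to encode them in the data yourself, for example by prefixing each item with its length.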

  5. Now imagine that Alice wants to send message1.txt to Bob, in such a way that Bob can detect any tampering. We’ll assume that Alice and Bob have already shared the HMAC key in a secure fashion. All Alice needs to do is transmit message1.txt along with tag1.

    Let us suppose that an attacker somehow tampers with the message, such that message2.txt and tag1 are what Bob actually receives. Bob can check whether message2.txt is what Alice sent by computing his own authentication tag for this message and then comparing it with the tag that was received from Alice:

    >>> h = hmac.new(key, message2, digestmod="sha256")
    >>> tag2 = h.digest()
    >>> hmac.compare_digest(tag1, tag2)
    

    If you try this, it should return False, indicating that the two tags differ and hence that message2.txt is not the message Alice sent. Although the attacker has been able to change the message, they cannot forge a valid authentication tag for the altered message because they do not have the key [3].
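
    As a rough sketch, Alice's and Bob's sides of this exchange could be wrapped in two small helper functions. The names make_tag and verify are hypothetical, as are the example messages, which stand in for the contents of message1.txt and message2.txt. The comparison uses hmac.compare_digest() rather than ==, because it is designed to take the same time wherever the two tags first differ, which makes timing attacks harder:

```python
import hmac
import secrets


def make_tag(key: bytes, message: bytes) -> bytes:
    """Return the HMAC-SHA256 authentication tag for message under key."""
    return hmac.new(key, message, digestmod="sha256").digest()


def verify(key: bytes, message: bytes, received_tag: bytes) -> bool:
    """Recompute the tag for message and compare it with the received tag."""
    expected_tag = make_tag(key, message)
    return hmac.compare_digest(expected_tag, received_tag)


# Hypothetical key and messages, standing in for those used in the exercise.
key = secrets.token_bytes(16)
original = b"Pay Bob 100 pounds"
tampered = b"Pay Bob 900 pounds"

tag = make_tag(key, original)         # Alice computes the tag and sends it

print(verify(key, original, tag))     # Bob receives the genuine message: True
print(verify(key, tampered, tag))     # Bob receives the altered message: False
```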


  1. This applies to Linux and macOS. If you are working from the Windows command prompt, you could use the comp command instead.

  2. If a string of hex digits is actually what you want, you can obtain it more directly by calling the hexdigest() method on the HMAC object instead of digest().

  3. Note: this assumes that the key is large enough to resist a brute-force attack, that it is random enough that its value cannot be predicted, and that it has been stored securely by both Alice and Bob.