This exercise shows you how to compute authentication tags in Python
using the HMAC (hash-based message authentication code) algorithm.
This can be done with the hmac and secrets modules from the standard library.
Create a directory for this exercise, then download the files message1.txt and message2.txt into it.
These two text files differ by a single character, which you can verify using the diff command [1]:
diff message1.txt message2.txt
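If diff is not available, a similar check can be sketched in Python. The helper name differing_positions and the sample byte strings below are illustrative assumptions rather than part of the exercise files, and the helper assumes that the two inputs have the same length (that is, that one character was substituted rather than inserted or deleted).

```python
def differing_positions(a: bytes, b: bytes) -> list[int]:
    """Return the indices at which two equal-length byte strings differ."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Stand-in data; in the exercise you would compare the bytes read from
# message1.txt and message2.txt instead.
sample1 = b"Pay Bob 100 pounds"
sample2 = b"Pay Bob 900 pounds"

positions = differing_positions(sample1, sample2)
print(positions)  # a single index means a single differing byte
```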
Start a Python 3 REPL from a terminal window and use the secrets module to create a key consisting of 16 random bytes:
>>> import secrets
>>> key = secrets.token_bytes(16)
>>> key.hex()
The last of the lines above is just a convenient way of examining the key, as it returns the key’s bytes as a string of hexadecimal digits.
Note that we use the secrets module for this, rather than the random module. The latter is typically used for the generation of pseudorandom numbers, but it does not generate these numbers securely and should never be used in cryptographic applications.
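As a quick sanity check on the key, the sketch below confirms that token_bytes(16) yields 16 bytes, that the hex form is twice as long, and that the hex string can be converted back into the original bytes. The variable names key_hex and restored are illustrative.

```python
import secrets

key = secrets.token_bytes(16)      # 16 bytes from the OS's secure random source
key_hex = key.hex()                # two hexadecimal digits per byte

print(len(key), len(key_hex))

restored = bytes.fromhex(key_hex)  # round trip from hex back to raw bytes
print(restored == key)
```

The round trip is handy if you ever need to store or paste a key as text, although a real key should of course be kept somewhere secure.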
Now read the contents of message1.txt and message2.txt like so:
>>> from pathlib import Path
>>> message1 = Path("message1.txt").read_bytes()
>>> message2 = Path("message2.txt").read_bytes()
To compute an authentication tag for message1.txt, create an HMAC object from the key and file contents, specifying the required hash function. Then call the digest() method to retrieve the tag as a sequence of bytes:
>>> import hmac
>>> h = hmac.new(key, message1, digestmod="sha256")
>>> tag1 = h.digest()
>>> tag1.hex()
As before, calling the hex() method allows us to examine the bytes in a convenient way [2].
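The same steps can be tried without the downloaded files. The following is a minimal sketch that uses an in-memory byte string as a stand-in for the file contents (the message text is an assumption for illustration). It also shows that a SHA-256 based tag is 32 bytes long, and that hexdigest() yields the same string as calling hex() on the raw tag.

```python
import hmac
import secrets

key = secrets.token_bytes(16)
message = b"example message"   # stand-in for the contents of message1.txt

h = hmac.new(key, message, digestmod="sha256")
tag = h.digest()               # raw bytes of the authentication tag

print(len(tag))                # tag length matches the SHA-256 output size
print(h.hexdigest() == tag.hex())
```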
We’ve kept things simple here, but note that you can also create the HMAC object without providing any data, and then subsequently feed it chunks of data using the update() method. This is useful when you want to compute an authentication tag over multiple items of data.
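A short sketch of the incremental approach follows. The chunk contents are made up for illustration. Feeding the chunks in order with update() should give the same tag as passing their concatenation in one go.

```python
import hmac
import secrets

key = secrets.token_bytes(16)
chunks = [b"first part, ", b"second part, ", b"third part"]  # illustrative data

# One-shot: supply all of the data when the HMAC object is created.
one_shot = hmac.new(key, b"".join(chunks), digestmod="sha256").digest()

# Incremental: create the object with no data, then feed each chunk in order.
h = hmac.new(key, digestmod="sha256")
for chunk in chunks:
    h.update(chunk)
incremental = h.digest()

print(hmac.compare_digest(one_shot, incremental))
```

Note that the order of the chunks matters: feeding the same chunks in a different order amounts to authenticating a different message.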
Now imagine that Alice wants to send message1.txt to Bob, in such a way that Bob can detect any tampering. We’ll assume that Alice and Bob have already shared the HMAC key in a secure fashion. All Alice needs to do is transmit message1.txt along with tag1.
Let us suppose that an attacker somehow tampers with the message, such that message2.txt and tag1 are what Bob actually receives. Bob can check whether message2.txt is what Alice sent by computing his own authentication tag for this message and then comparing it with the tag that was received from Alice:
>>> h = hmac.new(key, message2, digestmod="sha256")
>>> tag2 = h.digest()
>>> hmac.compare_digest(tag1, tag2)
If you try this, it should return False, indicating that the HMACs are different, hence message2.txt is different from message1.txt.
Although the attacker has been able to change the message, they cannot forge a valid authentication tag for it because they do not have the key needed to generate a valid tag [3].
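The exchange between Alice and Bob can be sketched end to end as follows. The helper names make_tag and verify, the sample messages, and the attacker's separately generated key are all assumptions made for illustration; this is one possible way of organising the check, not a prescribed one.

```python
import hmac
import secrets


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 authentication tag for a message."""
    return hmac.new(key, message, digestmod="sha256").digest()


def verify(key: bytes, message: bytes, received_tag: bytes) -> bool:
    """Recompute the tag for the received message and compare it safely."""
    expected = make_tag(key, message)
    return hmac.compare_digest(expected, received_tag)


key = secrets.token_bytes(16)              # shared secretly by Alice and Bob
original = b"Transfer 10 pounds to Bob"    # what Alice sends
tampered = b"Transfer 90 pounds to Bob"    # what the attacker substitutes

tag = make_tag(key, original)              # Alice sends this with the message

print(verify(key, original, tag))          # untouched message
print(verify(key, tampered, tag))          # altered message, original tag

# An attacker who does not know the key can only guess at one, so a tag
# computed with their own key will not match the one Bob computes.
attacker_key = secrets.token_bytes(16)
forged_tag = make_tag(attacker_key, tampered)
print(verify(key, tampered, forged_tag))
```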
Note that we do not compare the tags using tag1 == tag2. The compare_digest function performs a more careful constant-time comparison, as a defence against timing attacks.
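In terms of the result, compare_digest behaves like an equality test; the difference lies in how the comparison is carried out. The sketch below, which uses a made-up message, shows it applied both to raw tags and to their hexadecimal string forms. Both arguments should be of the same kind: two bytes-like objects or two ASCII strings.

```python
import hmac
import secrets

key = secrets.token_bytes(16)
h = hmac.new(key, b"example message", digestmod="sha256")

tag_bytes = h.digest()      # raw tag
tag_hex = h.hexdigest()     # same tag as a string of hex digits

# Comparing raw tags (bytes with bytes).
same_bytes = hmac.compare_digest(tag_bytes, h.digest())

# Comparing hex forms (ASCII string with ASCII string).
same_hex = hmac.compare_digest(tag_hex, tag_bytes.hex())

# A tag of the right length but with the wrong contents is rejected.
different = hmac.compare_digest(tag_bytes, bytes(len(tag_bytes)))

print(same_bytes, same_hex, different)
```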
[1] This is on Linux or macOS. If you are working from the Windows command prompt, you could use the comp command instead.
[2] If a string of hex digits is actually what you want, you can obtain it more directly by calling the hexdigest() method on the HMAC object instead of digest().
[3] Note: this assumes that the key is sufficiently large to resist brute force attack; that it is sufficiently random that its value cannot be predicted; and that it has been stored securely by both Alice and Bob.