This exercise gives you practice at computing hashes using the facilities offered by the standard libraries of Python and Java. If you are doing this on your own PC, there is nothing else to install, beyond having working installations of Python 3 and the Java Development Kit.
Create a directory for this exercise and download message.txt into it.
If doing this exercise on a SoC Linux machine, activate the Anaconda Python distribution by entering the following in a terminal window:
module load legacy-eng
module add anaconda3/2020.11
This will ensure that the python command runs Python 3.
Enter python in the terminal window to run Python’s REPL (read-eval-print loop). Read the contents of message.txt into a byte string using the following code:
>>> import pathlib
>>> path = pathlib.Path("message.txt")
>>> data = path.read_bytes()
Use the hashlib module to compute and display a SHA-256 hash of the file contents, like so:
>>> import hashlib
>>> h = hashlib.sha256()
>>> h.update(data)
>>> print(h.hexdigest())
Note that the update() method can be called repeatedly, to feed the hash function with multiple items of data. If you have a single chunk of data, this example can be shortened by passing the data directly to the constructor, as in hashlib.sha256(data).
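As a small illustration of both points, the sketch below hashes the same bytes in two ways: incrementally with repeated update() calls, and in one step via the constructor. The sample byte strings part1 and part2 are made up for this example and are not part of the exercise files.

```python
import hashlib

# Hypothetical sample data, split into two chunks for illustration.
part1 = b"Hello, "
part2 = b"world!"

# Feed the hash function in two steps.
h = hashlib.sha256()
h.update(part1)
h.update(part2)
incremental = h.hexdigest()

# Pass all of the data to the constructor in one go.
one_shot = hashlib.sha256(part1 + part2).hexdigest()

# Both approaches hash the same sequence of bytes, so the results agree.
print(incremental == one_shot)
```

The incremental form is the one to reach for when the data arrives in pieces or is too large to hold in memory at once.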
Note also that you can use digest() instead of hexdigest(), to get output from the hash function as raw bytes instead of a string of printable hex digits. Try this now to compare the sizes of the input to and output from the hash function. You can use len(data) to get the former and len(h.digest()) to get the latter. You’ll see that the hash is smaller than the input. The hash size never varies, regardless of how large or small the input gets; for SHA-256 it is always 32 bytes.
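To check the fixed-size property without depending on message.txt, the following sketch hashes inputs of several lengths and records the digest length each time. The inputs are runs of zero bytes chosen purely for illustration.

```python
import hashlib

# Input sizes to try, in bytes (arbitrary choices for this sketch).
input_sizes = (0, 1, 1000, 1_000_000)

digest_sizes = []
for size in input_sizes:
    data = bytes(size)  # `size` zero bytes as stand-in input
    digest = hashlib.sha256(data).digest()
    digest_sizes.append(len(digest))
    print(len(data), len(digest))
```

Every line printed should end in 32, the SHA-256 digest length in bytes, whatever the input size.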
A wide range of hash functions is provided by the hashlib module. You can see what is available on your platform by examining the value of hashlib.algorithms_available in the REPL.
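If you want to pick a hash function by name at run time, hashlib.new() accepts the name as a string. The sketch below assumes that "sha512" appears in hashlib.algorithms_available on your platform (it is checked before use), and the input bytes are a made-up example.

```python
import hashlib

data = b"example data"  # hypothetical input, not part of the exercise
name = "sha512"         # any name listed in hashlib.algorithms_available

if name in hashlib.algorithms_available:
    # hashlib.new() looks up the algorithm by name and hashes the data.
    h = hashlib.new(name, data)
    print(name, h.digest_size, h.hexdigest())
else:
    print(name, "is not available on this platform")
```

The digest_size attribute reports the hash length in bytes, which is a quick way to compare algorithms.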
Hash functions can be accessed in Java using the MessageDigest class, which is part of the java.security package in Java’s standard library.
Download Hash.java to the directory you are working in.
This file contains a Java program that is supposed to compute the hash
of a file named on the command line, using a hash function also
specified on the command line. Currently, all it does is read bytes
from the named file.
Under the ‘Apply hash function’ comment, add the following:
MessageDigest md = MessageDigest.getInstance(args[1]);
md.update(message);
byte[] hash = md.digest();
A MessageDigest object is obtained by calling the getInstance() method, supplying the name of the desired hash function as a string. The Java documentation has a list of supported hash function names. In this code, the name comes from the second command line argument of the program.
As with the Python example, we feed data to the hash function by calling the update() method, then call the digest() method to retrieve the hash as an array of bytes.
Check that the program compiles before continuing. If you see a compiler error, make sure that you’ve imported MessageDigest from the java.security package.
Under the ‘Display hash’ comment, add code that will display the hash as a string of hex digits. Compile and run the program, specifying message.txt and SHA-256 as command line arguments. Compare its output with that obtained from Python.
Hint: loop over the bytes of the hash and use System.out.printf to print each one, with %02x as the formatting directive.
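To see what the %02x directive does, here is a rough Python analogue of byte-by-byte formatting. It is only an illustration of the idea, not the Java solution, and the input bytes are a made-up example.

```python
import hashlib

h = hashlib.sha256(b"example data")  # hypothetical input
raw = h.digest()

# Format each byte as exactly two lowercase hex digits, then join them.
manual = "".join("%02x" % b for b in raw)

# The zero padding matters: without it, small values lose a digit.
print("%x" % 5, "%02x" % 5)

# The manually built string matches what hexdigest() produces.
print(manual == h.hexdigest())
```

The first print shows 5 and 05. Dropping the padding would make the hex string shorter whenever a byte is below 16, so the result would no longer match the Python output.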