Reading zip file content to later compute its sha256 checksum fails

Question:

I have a zip file that contains some regular files. It is uploaded to a file server.
Now I am trying to compute the sha256 checksum of the zip file, write the checksum to a *.sha256sum file, and upload that file to the file server as well.

Anyone who then downloads the zip file and the checksum file (.sha256sum) from the file server can compute the sha256 of the zip file again and compare it with the value stored as text in the downloaded checksum file.

When I try to compute the sha256 checksum of the zip file, I get an error.

with open(filename) as f:
    data = f.read()
    hash_sha256 = hashlib.sha256(data).hexdigest()

The error is the following; it is raised on the line data = f.read():

in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 44: character maps to <undefined>
Asked By: Willy


Answers:

You must open the file in binary mode:

import hashlib

# 'rb' opens the file in binary mode, so read() returns raw bytes
with open(filename, 'rb') as f:
    data = f.read()
    hash_sha256 = hashlib.sha256(data).hexdigest()
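If the zip file is large, reading it in one go may use more memory than you want. A minimal sketch of the same idea that feeds the hash in chunks is shown here; the function name sha256_of_file and the 64 KiB chunk size are just illustrative choices, not anything from the question:

```python
import hashlib

def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path` (hypothetical helper)."""
    digest = hashlib.sha256()
    # Binary mode: no decoding is attempted, read() returns bytes.
    with open(path, 'rb') as f:
        # iter() with a sentinel keeps calling f.read() until it returns b'' (end of file).
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling sha256_of_file(filename) should give the same hex string as hashing the whole content at once, since update() simply appends data to what has been hashed so far.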

Per the "Reading and Writing Files" section of the Python tutorial:

Normally, files are opened in text mode, that means, you read and write strings from and to the file, which are encoded in a specific encoding.

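So, in text mode Python decodes the raw bytes into a string under the hood, which you don’t want here: a zip file is not text, and the traceback shows the decoder hitting a byte (0x90) that has no character in the encoding being used.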

Appending a ‘b’ to the mode opens the file in binary mode. Binary mode data is read and written as bytes objects.
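For the rest of the workflow described in the question (writing the .sha256sum file, then checking a download against it), a rough sketch could look like the following. The helper names are made up for illustration, and the "digest, two spaces, file name" line layout is an assumption that mirrors what the sha256sum command-line tool prints; the question does not say which layout is wanted:

```python
import hashlib
import os

def file_sha256(path):
    # Binary mode, so the zip's bytes are hashed exactly as stored.
    with open(path, 'rb') as f:
        return hashlib.sha256(f.read()).hexdigest()

def write_checksum_file(zip_path):
    """Write '<digest>  <file name>' next to the zip and return the new path."""
    checksum_path = zip_path + '.sha256sum'  # assumed naming scheme
    line = f"{file_sha256(zip_path)}  {os.path.basename(zip_path)}\n"
    # The checksum file itself is plain text, so text mode is fine here.
    with open(checksum_path, 'w', encoding='utf-8') as f:
        f.write(line)
    return checksum_path

def verify_checksum_file(zip_path, checksum_path):
    """Return True if the zip's digest matches the first field of the checksum file."""
    with open(checksum_path, encoding='utf-8') as f:
        expected = f.read().split()[0]
    return file_sha256(zip_path) == expected.lower()
```

Note the asymmetry: the zip is opened with 'rb', while the checksum file holds a hex string and is read and written as ordinary text.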

Answered By: RagingRobot