Sign a byte string with SHA256 in Python

Question:

Currently I have some code that signs a byte string with the SHA256 algorithm using the native OpenSSL binary, the code calls an external process, sends the parameters, and receive the result back into the Python code.

The current code is as follows:

signed_digest_proc = subprocess.Popen(
    ['openssl', 'dgst', '-sha256', '-sign', tmp_path],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE
)
signed_digest_proc.stdin.write(original_string)
signed_digest, _ = signed_digest_proc.communicate()

base64.encodestring(signed_digest).decode().replace('n', '')

When original_string is too big, I might have problems with the result (from the communication with an external process I think), that’s why I’m trying to change it to a Python only solution:

import hmac, hashlib
h = hmac.new(bytes(key_pem(), 'ASCII'), original_string, hashlib.sha256)
result = base64.encodestring(h).decode().replace('n', '')

This result in a completely different string than the first one.

What would be the way to implement the original code without calling an external process?

Answers:

The openssl command you used does three things:

  • Create a hash of the data, using SHA256
  • If RSA is used, pad out the message to a specific length, using PKCS#1 1.5
  • Sign the (padded) hash, using the private key you provided. It’ll depend on the type of key what algorithm was used.

The hmac module does not serve the same function.

You’ll need to install a cryptography package like cryptography to replicate what openssl dgst -sign does. cryptography uses OpenSSL as a backend, so it will produce the same output.

You can then

  • load the key with the load_pem_private_key() function. This returns the right type of object for the algorithm used.
  • use the key to sign the message; each key type has a sign() method, and this method will take care of hashing the message for you if you so wish. See for example the Signing section for RSA.

    However, you’ll need to provide different kinds of config for the different .sign() methods. Only the RSA, DSA and Elliptic Curve keys can be used to create a signed digest.

You’ll have to switch between the types to get the signature right:

from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import dsa, ec, rsa, utils
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric import padding

# configuration per key type, each lambda takes a hashing algorithm
_signing_configs = (
    (dsa.DSAPrivateKey, lambda h: {
        'algorithm': h}),
    (ec.EllipticCurvePrivateKey, lambda h: {
        'signature_algorithm': ec.ECDSA(h)}),
    (rsa.RSAPrivateKey, lambda h: {
        'padding': padding.PKCS1v15(),
        'algorithm': h
    }),
)

def _key_singing_config(key, hashing_algorithm):
    try:
        factory = next(
            config
            for type_, config in _signing_configs
            if isinstance(key, type_)
        )
    except StopIteration:
        raise ValueError('Unsupported key type {!r}'.format(type(key)))
    return factory(hashing_algorithm)

def sign(private_key, data, algorithm=hashes.SHA256()):
    with open(private_key, 'rb') as private_key:
        key = serialization.load_pem_private_key(
            private_key.read(), None, default_backend())

    return key.sign(data, **_key_singing_config(key, algorithm))

If you need to hash a large amount of data, you can hash the data yourself first, in chunks, before passing in just the digest and the special util.Prehashed() object:

def sign_streaming(private_key, data_iterable, algorithm=hashes.SHA256()):
    with open(private_key, 'rb') as private_key:
        key = serialization.load_pem_private_key(
            private_key.read(), None, default_backend())

    hasher = hashes.Hash(algorithm, default_backend())
    for chunk in data_iterable:
        hasher.update(chunk)
    digest = hasher.finalize()
    prehashed = utils.Prehashed(algorithm)

    return key.sign(digest, **_key_singing_config(key, prehashed))

with open(large_file, 'rb') as large_file:
    signature = sign_streaming(private_key_file, iter(lambda: large_file.read(2 ** 16), b''))

This uses the iter() function to read data from a binary file in chunks of 64 kilobytes.

Demo; I’m using an RSA key I generated in /tmp/test_rsa.pem. Using the command-line to produce a signed digest for Hello world!:

$ echo -n 'Hello world!' | openssl dgst -sign /tmp/test_rsa.pem -sha256 | openssl base64
R1bRhzEr+ODNThyYiHbiUackZpx+TCviYR6qPlmiRGd28wpQJZGnOFg9tta0IwkT
HetvITcdggXeiqUqepzzT9rDkIw6CU7mlnDRcRu2g76TA4Uyq+0UzW8Ati8nYCSx
Wyu09YWaKazOQgIQW3no1e1Z4HKdN2LtZfRTvATk7JB9/nReKlXgRjVdwRdE3zl5
x3XSPlaMwnSsCVEhZ8N7Gf1xJf3huV21RKaXZw5zMypHGBIXG5ngyfX0+aznYEve
x1uBrtZQwUGuS7/RuHw67WDIN36aXAK1sRP5Q5CzgeMicD8d9wr8St1w7WtYLXzY
HwzvHWcVy7kPtfIzR4R0vQ==

or using the Python code:

>>> signature = sign(keyfile, b'Hello world!')
>>> import base64
>>> print(base64.encodebytes(signature).decode())
R1bRhzEr+ODNThyYiHbiUackZpx+TCviYR6qPlmiRGd28wpQJZGnOFg9tta0IwkTHetvITcdggXe
iqUqepzzT9rDkIw6CU7mlnDRcRu2g76TA4Uyq+0UzW8Ati8nYCSxWyu09YWaKazOQgIQW3no1e1Z
4HKdN2LtZfRTvATk7JB9/nReKlXgRjVdwRdE3zl5x3XSPlaMwnSsCVEhZ8N7Gf1xJf3huV21RKaX
Zw5zMypHGBIXG5ngyfX0+aznYEvex1uBrtZQwUGuS7/RuHw67WDIN36aXAK1sRP5Q5CzgeMicD8d
9wr8St1w7WtYLXzYHwzvHWcVy7kPtfIzR4R0vQ==

Although the line lengths differ, the base64 data the two output is clearly the same.

Or, using a generated file with random binary data, size 32kb:

$ dd if=/dev/urandom of=/tmp/random_data.bin bs=16k count=2
2+0 records in
2+0 records out
32768 bytes transferred in 0.002227 secs (14713516 bytes/sec)
$ cat /tmp/random_data.bin | openssl dgst -sign /tmp/test_rsa.pem -sha256 | openssl base64
b9sYFdRzpBtJTan7Pnfod0QRon+YfdaQlyhW0aWabia28oTFYKKiC2ksiJq+IhrF
tIMb0Ti60TtBhbdmR3eF5tfRqOfBNHGAzZxSaRMau6BuPf5AWqCIyh8GvqNKpweF
yyzWNaTBYATTt0RF0fkVioE6Q2LdfrOP1q+6zzRvLv4BHC0oW4qg6F6CMPSQqpBy
dU/3P8drJ8XCWiJV/oLhVehPtFeihatMzcZB3IIIDFP6rN0lY1KpFfdBPlXqZlJw
PJQondRBygk3fh+Sd/pGYzjltv7/4mC6CXTKlDQnYUWV+Rqpn6+ojTElGJZXCnn7
Sn0Oh3FidCxIeO/VIhgiuQ==

Processing the same file in Python:

>>> with open('/tmp/random_data.bin', 'rb') as random_data:
...     signature = sign_streaming('/tmp/test_rsa.pem', iter(lambda: random_data.read(2 ** 16), b''))
...
>>> print(base64.encodebytes(signature).decode())
b9sYFdRzpBtJTan7Pnfod0QRon+YfdaQlyhW0aWabia28oTFYKKiC2ksiJq+IhrFtIMb0Ti60TtB
hbdmR3eF5tfRqOfBNHGAzZxSaRMau6BuPf5AWqCIyh8GvqNKpweFyyzWNaTBYATTt0RF0fkVioE6
Q2LdfrOP1q+6zzRvLv4BHC0oW4qg6F6CMPSQqpBydU/3P8drJ8XCWiJV/oLhVehPtFeihatMzcZB
3IIIDFP6rN0lY1KpFfdBPlXqZlJwPJQondRBygk3fh+Sd/pGYzjltv7/4mC6CXTKlDQnYUWV+Rqp
n6+ojTElGJZXCnn7Sn0Oh3FidCxIeO/VIhgiuQ==
Answered By: Martijn Pieters