How to generate a byte string consisting of random nonzero bytes using Python?

Question:

As part of implementing the PKCS #1 v1.5 padding scheme for RSA, I need to generate an octet string of length n consisting of pseudo-randomly generated nonzero octets.
I’m looking for the best way to do this using Python.

This is what my current implementation looks like:

import secrets

def nonzero_random_bytes(n: int) -> bytes:
    # One single-byte bytes object for each nonzero value 1..255
    values = [x.to_bytes(1, "big") for x in range(1, 256)]
    seq = [secrets.choice(values) for _ in range(n)]
    return b"".join(seq)

I’ve looked at generating the byte string with secrets.token_bytes(n), filtering the result, and generating nonzero values to backfill the string. I know I can also do something like secrets.token_bytes(2 * n), filter, and trim the result but that doesn’t strike me as an elegant solution.
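For reference, the filter-and-backfill idea described above could look roughly like the sketch below. The function name `nonzero_random_bytes_backfill` is made up for illustration; it drops zero bytes with `bytes.translate` and fills the gap with individually drawn nonzero values.

```python
import secrets

def nonzero_random_bytes_backfill(n: int) -> bytes:
    # Draw n random bytes, then delete every zero byte.
    filtered = secrets.token_bytes(n).translate(None, b"\x00")
    # Backfill the (usually very few) missing positions one byte at a time.
    missing = n - len(filtered)
    backfill = bytes(secrets.randbelow(255) + 1 for _ in range(missing))
    return filtered + backfill
```

Since every byte is drawn independently, appending the backfill at the end rather than at the positions of the removed zeros should not change the distribution of the result.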

I’ve also looked into how PyCryptodome and python-pkcs1 do this but I’m thinking there must be a better way (I poked around pyca/cryptography but couldn’t find how they did it and it seems they use OpenSSL bindings – here’s where I think that’s implemented).

Disclaimer: I am aware that I shouldn’t use PKCS1 v1.5, much less be rolling out any cryptography code myself. This is purely an academic exercise. 🙂

Asked By: tlgs


Answers:

You didn’t define what "best" means to you. I’d go with this, which is basically a less wordy way of doing what you already did:

from secrets import randbelow

def nonzero_random_bytes(n: int) -> bytes:
    return bytes(randbelow(255) + 1 for _ in range(n))
Answered By: Tim Peters
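
To show where such a function fits in the padding scheme the question mentions, here is a minimal sketch of PKCS #1 v1.5 encryption padding, where the encoded message is 0x00 || 0x02 || PS || 0x00 || M and PS is at least 8 nonzero random bytes. The helper name `pkcs1_v15_pad` and the parameter `k` (the modulus length in bytes) are chosen for illustration only, and this is an academic sketch, not production code.

```python
from secrets import randbelow

def nonzero_random_bytes(n: int) -> bytes:
    return bytes(randbelow(255) + 1 for _ in range(n))

def pkcs1_v15_pad(message: bytes, k: int) -> bytes:
    # PS fills whatever is left after the three fixed bytes and the message.
    ps_len = k - len(message) - 3
    if ps_len < 8:
        raise ValueError("message too long for this modulus size")
    return b"\x00\x02" + nonzero_random_bytes(ps_len) + b"\x00" + message
```

Because PS contains no zero bytes, the first 0x00 after the 0x02 marker unambiguously separates the padding from the message when unpadding.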

Had almost the same as Tim but thought "best" might require speed. Benchmark for n = 250 (middle of the "probably 100-400" range):

471.3 us  nonzero_random_bytes_original
438.3 us  nonzero_random_bytes_randbelow
  4.7 us  nonzero_random_bytes_2n
  3.1 us  nonzero_random_bytes_plus10

Code:

from timeit import timeit
import secrets

def nonzero_random_bytes_original(n: int) -> bytes:
    values = [x.to_bytes(1, "big") for x in range(1, 256)]
    seq = [secrets.choice(values) for _ in range(n)]
    return b"".join(seq)

def nonzero_random_bytes_randbelow(n: int) -> bytes:
    return bytes(1 + secrets.randbelow(255) for _ in range(n))

def nonzero_random_bytes_2n(n: int) -> bytes:
    return secrets.token_bytes(2 * n).replace(b'\0', b'')[:n]

def nonzero_random_bytes_plus10(n: int) -> bytes:
    result = b''
    while need := n - len(result):
        result += secrets.token_bytes(need + 10).replace(b'\0', b'')[:need]
    return result

funcs = [
    nonzero_random_bytes_original,
    nonzero_random_bytes_randbelow,
    nonzero_random_bytes_2n,
    nonzero_random_bytes_plus10,
]

for _ in range(3):
    for func in funcs:
        t = timeit(lambda: func(250), number=1000)
        print('%5.1f us ' % (t * 1e3), func.__name__)
    print()
Answered By: Kelly Bundy
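
As a rough check on why a small surplus like 10 extra bytes is usually enough, the sketch below estimates how often the first pass of the "plus10" approach comes up short. It assumes each byte from `secrets.token_bytes` is independently zero with probability 1/256; the function name `prob_retry` is hypothetical.

```python
from math import comb

def prob_retry(n: int, extra: int = 10) -> float:
    """Probability that n + extra random bytes contain more than `extra` zeros,
    i.e. that fewer than n nonzero bytes remain after filtering."""
    draws = n + extra
    p_zero = 1 / 256
    # Binomial tail: sum over every zero count that leaves too few bytes.
    return sum(
        comb(draws, k) * p_zero**k * (1 - p_zero) ** (draws - k)
        for k in range(extra + 1, draws + 1)
    )

print(prob_retry(250))     # chance that a second round is needed for n = 250
print(prob_retry(250, 0))  # same, but with no surplus bytes at all
```

With no surplus, a retry is needed more often than not, while a handful of extra bytes makes it rare, which is consistent with the loop almost always finishing in one iteration.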