Create a single bytes instance from a sequence of memoryview

Question:

tl;dr Given a Sequence of memoryview, how can I create a single bytes instance without creating intermediate bytes instances?

The naive approach creates many intermediate instances of bytes:

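from typing import Sequence

def create_bytes(seq_mv: Sequence[memoryview]) -> bytes:
    data = bytes()
    for mv in seq_mv:
        # each iteration copies mv into a temporary bytes and builds a new, longer bytes
        data = data + bytes(mv)
    return data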

The function create_bytes creates at least len(seq_mv) + 1 instances of bytes during execution, which is inefficient.
I want create_bytes to create one new bytes instance during execution.

Asked By: JamesThomasMoon

||

Answers:

bytes, as you noted, is an immutable object.

As Tim Peters put it in the comments, you can let Python create a single instance with all parts joined together in a single call: bytes().join(seq_mv).
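A minimal sketch of the join approach, assuming every item in the sequence is a contiguous memoryview over byte data (the sample values in parts are made up for illustration):

```python
from typing import Sequence


def create_bytes(seq_mv: Sequence[memoryview]) -> bytes:
    # bytes.join accepts an iterable of bytes-like objects, memoryview included,
    # so no per-item bytes(mv) copy is needed before joining.
    return b"".join(seq_mv)


# Illustrative input: views over bytes, a sliced view, and a view over a bytearray.
parts = [memoryview(b"abc"), memoryview(b"def")[1:], memoryview(bytearray(b"gh"))]
joined = create_bytes(parts)
```

Here b"" is just the literal spelling of bytes(); both are the empty separator.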

If you need to perform any other operation on your data that involves changing it along the way, you could use the mutable bytearray instead, which not only gives you the flexibility to change the object but also has all the advantages of mutable sequences.

You can then make a single conversion to bytes at the end of your function if its callers can’t deal directly with a bytearray (but maybe you can just return it directly):

from typing import Sequence

def create_bytes(seq_mv: Sequence[memoryview]) -> bytes:
    data = bytearray()
    for mv in seq_mv:
        data.extend(mv)
    return bytes(data)

Or simply:

from functools import reduce

# bytearray.extend returns None, so "or data" hands the accumulator back to reduce
data = reduce(lambda data, mv: data.extend(mv) or data, seq_mv, bytearray())
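
As a rough illustration of why the mutable route can be useful, the sketch below accumulates into one bytearray, changes it in place, and converts to bytes only once at the end. The helper name build_and_patch and the sample data are hypothetical, not part of the question.

```python
from typing import Sequence


def build_and_patch(seq_mv: Sequence[memoryview]) -> bytes:
    # Hypothetical helper: accumulate every view into a single bytearray.
    data = bytearray()
    for mv in seq_mv:
        data.extend(mv)
    # Modify the buffer in place (here: uppercase the first byte, if there is one).
    if data:
        data[0:1] = data[0:1].upper()
    # Single conversion to an immutable bytes object at the end.
    return bytes(data)


parts = [memoryview(b"hello "), memoryview(b"world")]
result = build_and_patch(parts)
```

If callers accept any bytes-like object, returning data directly would skip the final copy.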
Answered By: jsbueno