Memory-conscious way of adding bytes to the beginning of a file

Question:

I am trying to write a byte array at the beginning of a file, and at a (much) later point I want to split them again and retrieve the original file. The byte_array is just a small JPEG.

# write a byte array at the beginning of a file
def write_byte_array_to_beginning_of_file(byte_array, file_path, out_file_path):
    with open(file_path, "rb") as f:
        with open(out_file_path, "wb") as f2:
            f2.write(byte_array)
            # f.read() with no size argument loads the entire file into memory
            f2.write(f.read())

While the function works, it hogs a lot of memory. It seems to read the whole file into memory before doing anything. Some of the files I need to work on are in excess of 40 GB, and the job runs on a small NAS with 8 GB of RAM.

What would be a memory-conscious way to achieve this?

Asked By: globus243


Answers:

You can read from the original file in chunks instead of reading the whole thing.

def write_byte_array_to_beginning_of_file(byte_array, file_path, out_file_path,
                                          chunksize=10 * 1024 * 1024):
    with open(file_path, "rb") as f, open(out_file_path, "wb") as f2:
        f2.write(byte_array)
        while True:
            block = f.read(chunksize)
            if not block:  # an empty bytes object signals end of file
                break
            f2.write(block)

This copies the data in 10 MB chunks by default; only one chunk is ever held in memory at a time, and you can override the size via the chunksize argument.
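
If you would rather not write the loop yourself, the standard library's shutil.copyfileobj does the same chunked copy between two file objects; a minimal sketch of the same function using it (the chunk size is passed as the third argument):

import shutil

def write_byte_array_to_beginning_of_file(byte_array, file_path, out_file_path,
                                          chunksize=10 * 1024 * 1024):
    with open(file_path, "rb") as f, open(out_file_path, "wb") as f2:
        f2.write(byte_array)
        # streams from f to f2 in chunks of `chunksize` bytes,
        # never holding more than one chunk in memory
        shutil.copyfileobj(f, f2, chunksize)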
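
For the splitting step mentioned in the question: the combined file does not record how long the prepended byte array was, so you need to keep that length somewhere (for example, the size of the JPEG). A minimal sketch of the reverse operation, using a hypothetical split_file helper and assuming prefix_length is known:

def split_file(combined_path, out_file_path, prefix_length,
               chunksize=10 * 1024 * 1024):
    # hypothetical helper: prefix_length must be the exact number of
    # bytes that were prepended to the original file
    with open(combined_path, "rb") as f, open(out_file_path, "wb") as f2:
        f.seek(prefix_length)  # skip past the prepended bytes
        while True:
            block = f.read(chunksize)
            if not block:
                break
            f2.write(block)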

Answered By: Barmar