How do I use a file like a memory buffer in Python?

Question:

I don’t know the correct terminology, maybe it’s called page file, but I’m not sure. I need a way to use an on-disk file as a buffer, like bytearray. It should be able to do things like a = buffer[100:200] and buffer[33] = 127 without the code having to be aware that it’s reading from and writing to a file in the background.

Basically I need the opposite of BytesIO, which uses memory with a file interface. I need a way to use a file with a memory buffer interface. Ideally it wouldn't write to the file every time the data is changed (but it's OK if it does).

The reason I need this functionality is because I use packages that expect data to be in a buffer object, but I only have 4MB of memory available. It’s impossible to load the files into memory. So I need an object that acts like a bytearray for example, but reads and writes data directly to a file, not memory.

In my use case I need a MicroPython module, but a standard Python module might work as well. Are there any modules that would do what I need?

Asked By: uzumaki

||

Answers:

Can something like this work for you?

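class Memfile:
    """Wrap a binary file opened with "r+b" so it can be indexed like a buffer."""

    def __init__(self, file):
        self.file = file

    def __getitem__(self, key):
        if isinstance(key, slice):
            # Only contiguous slices with explicit start and stop are supported.
            self.file.seek(key.start)
            return self.file.read(key.stop - key.start)
        # Unlike bytearray, an integer index returns a 1-byte bytes object.
        self.file.seek(key)
        return self.file.read(1)

    def __setitem__(self, key, val):
        if isinstance(val, int):
            val = bytes([val])  # allow mf[33] = 127
        if not isinstance(val, (bytes, bytearray)):
            raise TypeError("value must be an int, bytes, or bytearray")
        if isinstance(key, slice):
            # The write must not change the size of the addressed region.
            if key.stop - key.start != len(val):
                raise ValueError("value length must match the slice length")
            self.file.seek(key.start)
        else:
            if len(val) != 1:
                raise ValueError("a single index takes exactly one byte")
            self.file.seek(key)
        self.file.write(val)

    def close(self):
        self.file.close()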


if __name__ == "__main__":
    mf = Memfile(open("data", "r+b"))  # Assumes the file 'data' has at least 10 bytes
    mf[0:10] = b'\x00'*10
    print(mf[0:10]) # b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
    mf[0:2] = b'\xff\xff'
    print(mf[0:10]) # b'\xff\xff\x00\x00\x00\x00\x00\x00\x00\x00'
    print(mf[2]) # b'\x00'
    print(mf[1]) # b'\xff'
    mf[0:4] = b'\xde\xad\xbe\xef'
    print(mf[0:4]) # b'\xde\xad\xbe\xef'
    mf.close()
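
One difference from bytearray worth knowing: Memfile returns a one-byte bytes object for mf[2], while bytearray returns an int. If the packages you use expect the bytearray behaviour, a variant along these lines might be closer. This is only a sketch: the name FileBuffer, the fixed-size assumption and the demo helper are mine, it is written for standard Python, and I have not checked it against MicroPython (slice.indices in particular may not be available there).

```python
import os
import tempfile


class FileBuffer:
    """File-backed object with bytearray-style indexing (hypothetical name)."""

    def __init__(self, file):
        self.file = file
        # Seek to the end to learn the size; this sketch never grows the file.
        self.size = file.seek(0, 2)

    def __len__(self):
        return self.size

    def _bounds(self, key):
        # Resolve None and negative slice bounds; only step 1 is supported.
        start, stop, step = key.indices(self.size)
        if step != 1:
            raise ValueError("only contiguous slices are supported")
        return start, max(stop, start)

    def _index(self, key):
        # Resolve a negative index and reject out-of-range ones.
        if key < 0:
            key += self.size
        if not 0 <= key < self.size:
            raise IndexError("index out of range")
        return key

    def __getitem__(self, key):
        if isinstance(key, slice):
            start, stop = self._bounds(key)
            self.file.seek(start)
            return self.file.read(stop - start)
        self.file.seek(self._index(key))
        return self.file.read(1)[0]  # an int, as bytearray returns

    def __setitem__(self, key, val):
        if isinstance(key, slice):
            start, stop = self._bounds(key)
            if len(val) != stop - start:
                raise ValueError("slice assignment must not change the size")
            self.file.seek(start)
            self.file.write(val)
        else:
            self.file.seek(self._index(key))
            self.file.write(bytes([val]))


def demo():
    # Build a 256-byte scratch file of zeros, then index it like a buffer.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    with open(path, "wb") as f:
        f.write(bytes(256))
    with open(path, "r+b") as f:
        buf = FileBuffer(f)
        buf[33] = 127
        buf[100:104] = b"\xde\xad\xbe\xef"
        result = (len(buf), buf[33], buf[100:104], buf[-1])
    os.remove(path)
    return result
```

Calling demo() writes through the wrapper and reads the same bytes back. Writes still go through the file object's own buffering, so data may not reach the disk until the file is flushed or closed.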

Note that if this solution fits your needs, you will still need to test it thoroughly.
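
Since you mention that a standard Python module might work as well: on standard Python the built-in mmap module gives a file this kind of buffer interface, and the operating system pages data in on demand instead of loading the whole file. I would not assume it is available on MicroPython, so treat this as a sketch for CPython only. The scratch file and the mmap_demo helper are made up for illustration.

```python
import mmap
import os
import tempfile


def mmap_demo():
    # Create a 4 KiB scratch file of zeros; an empty file cannot be mapped.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    with open(path, "wb") as f:
        f.write(bytes(4096))

    with open(path, "r+b") as f:
        # A length of 0 maps the whole file, readable and writable by default.
        buf = mmap.mmap(f.fileno(), 0)
        buf[33] = 127                       # single-byte assignment takes an int
        buf[100:104] = b"\xde\xad\xbe\xef"  # slice assignment must keep the length
        value = buf[33]
        chunk = buf[100:104]
        buf.flush()                         # push pending changes to the file
        buf.close()

    # Re-read the file normally to confirm the changes landed on disk.
    with open(path, "rb") as f:
        on_disk = f.read()
    os.remove(path)
    return value, chunk, on_disk[33], on_disk[100:104]
```

The mapped object supports indexing and slicing like a bytearray, but it cannot be resized through slice assignment.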

Answered By: jvx8ss