Pickling Python objects to Google Cloud Storage

Question:

I’ve been pickling objects to the filesystem and reading them back whenever I need to work with them. Currently I have this code for that purpose:

def pickle(self, directory, filename):
    if not os.path.exists(directory):
        os.makedirs(directory)
    with open(directory + '/' + filename, 'wb') as handle:
        pickle.dump(self, handle)

@staticmethod
def load(filename):
    with open(filename, 'rb') as handle:
        element = pickle.load(handle)
    return element

Now I’m moving my application (Django) to Google App Engine, and I found that App Engine does not allow me to write to the file system. Google Cloud Storage seems to be my only option, but I can’t figure out how to pickle my objects as Cloud Storage objects and read them back to recreate the original Python objects.

Asked By: Jo Kachikaran

Answers:

You can use the Cloud Storage client library.

Instead of open(), use cloudstorage.open() (or gcs.open() if you import cloudstorage as gcs, as in the above-mentioned doc). Note that the full file path starts with the GCS bucket name, as if the bucket were the top-level directory.

More details in the cloudstorage.open() documentation.
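
As a rough sketch of that substitution, the helpers below take the opener as a parameter, so the same code runs with the built-in open() locally and, presumably, with cloudstorage.open on App Engine. The names gcs_path, save_pickle, and load_pickle are made up for illustration. The leading slash in the path and the assumption that the returned object works in a with statement are my reading of the client library, so check its documentation before relying on them.

```python
import pickle


def gcs_path(bucket, object_name):
    """Build a '/bucket/object' style path (bucket name first, like a directory).

    The leading slash is an assumption about what the opener expects.
    """
    return '/' + bucket.strip('/') + '/' + object_name.lstrip('/')


def save_pickle(obj, path, opener=open, mode='wb'):
    """Serialize obj to bytes and write them through opener(path, mode)."""
    data = pickle.dumps(obj)
    with opener(path, mode) as handle:
        handle.write(data)


def load_pickle(path, opener=open, mode='rb'):
    """Read all bytes through opener(path, mode) and rebuild the object."""
    with opener(path, mode) as handle:
        return pickle.loads(handle.read())
```

On App Engine the call might look like save_pickle(obj, gcs_path('my-bucket', 'obj.pkl'), opener=cloudstorage.open, mode='w'), where 'my-bucket' is a placeholder and the 'w' mode string is an assumption. Going through pickle.dumps() and pickle.loads() means the handle only has to support write() and read().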

Answered By: Dan Cornilescu

For Python 3 users, the gcsfs library (from the creator of Dask) can solve this.

Example of reading a file:

import gcsfs

fs = gcsfs.GCSFileSystem(project='my-google-project')
fs.ls('my-bucket')
# returns a listing such as ['my-file.txt']
with fs.open('my-bucket/my-file.txt', 'rb') as f:
    print(f.read())

Pickling works in much the same way:

with fs.open(directory + '/' + filename, 'wb') as handle:
    pickle.dump(self, handle)

Reading is similar: replace 'wb' with 'rb' and dump with load:

with fs.open(directory + '/' + filename, 'rb') as handle:
    element = pickle.load(handle)

Answered By: LaSul

Another option that also works with Python 3 (I tested it with TensorFlow 2.2.0):

import pickle

from tensorflow.python.lib.io import file_io

with file_io.FileIO('gs://....', mode='rb') as f:
    element = pickle.load(f)

This is very useful if you already use TensorFlow, for example.
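
The reason this works, as with the other approaches, is that pickle.dump() and pickle.load() only need a binary file-like object, not a local file. A small illustration, with an in-memory io.BytesIO buffer standing in for the remote handle (the round_trip helper is hypothetical and not part of any library):

```python
import io
import pickle


def round_trip(obj, handle_factory=io.BytesIO):
    """Dump obj to a binary handle, rewind it, and load it back."""
    handle = handle_factory()
    pickle.dump(obj, handle)  # same call as with a storage-backed handle
    handle.seek(0)            # rewind; a real store would reopen in 'rb' mode
    return pickle.load(handle)


original = {'name': 'example', 'values': [1, 2, 3]}
restored = round_trip(original)
print(restored == original)
```

Running a check like this locally is a cheap way to confirm that an object pickles cleanly before pointing the same code at a bucket.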

Answered By: Dr. Fabien Tarrade