How do I save a pickled file to a GCP bucket from a Jupyter environment?
Question:
I’m trying to save an ArviZ inference object as a pickle file to a GCP Cloud Storage bucket using the following function:
def upload_to_bucket(model, blob_name, bucket_name):
    """
    model: trace object
    Upload data to a bucket"""
    # Explicitly use service account credentials by specifying the private key
    # file.
    storage_client = storage.Client.from_service_account_json(
        'key1.json')
    # print(buckets = list(storage_client.list_buckets())
    bucket = storage_client.get_bucket(bucket_name)
    blob = bucket.blob(blob_name)

    with open(f"gs://{config['GCS_bucket']}//{config['blob_name']}//", 'wb') as filehandler4:
        # Call load method to deserialze
        pickle.dump(model, filehandler4, protocol=4)

    # returns a public url
    return blob.public_url

upload_to_bucket(model=trace, blob_name='korea_hierarchy_seasonal_price.pkl', bucket_name='korea-forecasting-bucket')
However, I keep getting the following error:
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
/tmp/ipykernel_45766/484011007.py in <module>
----> 1 upload_to_bucket(model=trace, blob_name='korea_hierarchy_seasonal_price.pkl', bucket_name='korea-forecasting-bucket')
~/google_bucket.py in upload_to_bucket(model, blob_name, bucket_name)
46 # pickle.dump(model, filehandler4, protocol=4)
47
---> 48 with open(f"gs://{'GCS_bucket'}//{'blob_name'}//", 'wb') as filehandler4:
49 # Call load method to deserialze
50 pickle.dump(model, filehandler4, protocol=4)
FileNotFoundError: [Errno 2] No such file or directory: 'gs://GCS_bucket//blob_name//'
The answers to similar questions use the gs:// prefix when saving to GCP Cloud Storage, so I’m not sure what I’m doing wrong.
Answers:
You need to use the Cloud Storage (GCS) client library’s methods to read and write objects (blobs).
You’re calling the built-in open(), which only understands regular (local) file-system paths, so it can’t write to a gs:// URL; that’s why you get FileNotFoundError.
One option is to use the GCS equivalent of open, Blob.open().
(I’ve not tried this but) you should be able to:
with blob.open("wb") as f:
    pickle.dump(model, f, protocol=4)
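For reference, here is a sketch of the whole upload function rewritten around blob.open. It is untested and assumes the same key1.json service-account file and trace object as in the question:

import pickle

from google.cloud import storage


def upload_to_bucket(model, blob_name, bucket_name):
    """Pickle `model` and upload it as <blob_name> to <bucket_name>."""
    # Explicitly use service account credentials by specifying the private key file.
    storage_client = storage.Client.from_service_account_json('key1.json')
    bucket = storage_client.get_bucket(bucket_name)
    blob = bucket.blob(blob_name)

    # Write the pickle bytes straight to the blob instead of opening a local path.
    with blob.open("wb") as f:
        pickle.dump(model, f, protocol=4)

    # returns a public url
    return blob.public_url


upload_to_bucket(model=trace,
                 blob_name='korea_hierarchy_seasonal_price.pkl',
                 bucket_name='korea-forecasting-bucket')

If your google-cloud-storage version is too old to have Blob.open, serializing to bytes first and calling blob.upload_from_string(pickle.dumps(model, protocol=4)) should achieve the same thing. To load the object back later, the same pattern works in reverse:

with blob.open("rb") as f:
    model = pickle.load(f)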