What causes the error "_pickle.UnpicklingError: invalid load key, ' '."?
Question:
I’m trying to store 5000 data elements in an array. These 5000 elements are stored in an existing file (therefore it’s not empty).
But I’m getting an error.
IN:
import io, os, pickle

def array():
    name = 'puntos.df4'
    m = open(name, 'rb')
    v = []  # note: [] * 5000 is just an empty list, it does not preallocate
    m.seek(-5000, io.SEEK_END)
    fp = m.tell()
    sz = os.path.getsize(name)
    while fp < sz:
        pt = pickle.load(m)
        v.append(pt)
        fp = m.tell()
    m.close()
    return v
OUT:
line 23, in array
pt = pickle.load(m)
_pickle.UnpicklingError: invalid load key, ''.
Answers:
I am not completely sure what you’re trying to achieve by seeking to a specific offset and attempting to load individual values manually; the typical usage of the pickle module is:
# save data to a file
with open('myfile.pickle', 'wb') as fout:
    pickle.dump([1, 2, 3], fout)

# read data from a file (note: must be opened in binary mode)
with open('myfile.pickle', 'rb') as fin:
    print(pickle.load(fin))
# output
>> [1, 2, 3]
If you dumped a list, you’ll load a list, there’s no need to load each item individually.
You’re saying that you got an error even before you were seeking to the -5000 offset, so maybe the file you’re trying to read is corrupted.
If you have access to the original data, I suggest you try saving it to a new file and reading it as in the example.
Pickling is recursive, not sequential. Thus, to pickle a list, pickle will start to pickle the containing list, then pickle the first element… diving into the first element and pickling dependencies and sub-elements until the first element is serialized. It then moves on to the next element of the list, and so on, until it finally finishes the list and finishes serializing the enclosing list. In short, it’s hard to treat a recursive pickle as sequential, except for some special cases. It’s better to use a smarter pattern in your dump, if you want to load in a special way.
The most common pattern is to pickle everything with a single dump to a file, but then you have to load everything at once with a single load. However, if you open a file handle and do multiple dump calls (e.g. one for each element of the list, or a tuple of selected elements), then your load will mirror that: you open the file handle and do multiple load calls until you have all the list elements and can reconstruct the list. It’s still not easy to selectively load only certain list elements, however. To do that, you’d probably have to store your list elements as a dict (with the index of the element or chunk as the key) using a package like klepto, which can break up a pickled dict into several files transparently and enables easy loading of specific elements.
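The multiple-dump pattern described above can be sketched like this (file name and data are illustrative, not from the question):

```python
import os
import pickle
import tempfile

# Dump each list element with its own pickle.dump call...
data = [10, 20, 30]
path = os.path.join(tempfile.gettempdir(), 'items.pkl')
with open(path, 'wb') as fout:
    for item in data:
        pickle.dump(item, fout)

# ...then mirror that on the read side with repeated pickle.load
# calls until the stream is exhausted (EOFError).
items = []
with open(path, 'rb') as fin:
    while True:
        try:
            items.append(pickle.load(fin))
        except EOFError:
            break

print(items)  # [10, 20, 30]
```

This is still sequential access; for random access to individual elements you’d need the dict-per-key approach mentioned above.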
This may not be relevant to your specific issue, but I had a similar problem when the pickle archive had been created using gzip.
For example, if a compressed pickle archive is made like this:
import gzip, pickle
with gzip.open('test.pklz', 'wb') as ofp:
pickle.dump([1,2,3], ofp)
Trying to open it directly with pickle throws an error:
with open('test.pklz', 'rb') as ifp:
print(pickle.load(ifp))
Traceback (most recent call last):
File "<stdin>", line 2, in <module>
_pickle.UnpicklingError: invalid load key, ''.
But if the pickle file is opened using gzip, all is harmonious:
with gzip.open('test.pklz', 'rb') as ifp:
print(pickle.load(ifp))
[1, 2, 3]
If you transferred these files through disk or other means, it is likely they were not saved properly.
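If you don’t know up front whether a file was written through gzip, you can peek at its first bytes before deciding how to open it. A small sketch (the gzip magic bytes 0x1f 0x8b are part of the gzip format; the helper name and file path are made up):

```python
import gzip
import os
import pickle
import tempfile

def load_maybe_gzipped(path):
    """Load a pickle whether or not it was written through gzip."""
    with open(path, 'rb') as f:
        magic = f.read(2)  # gzip streams always start with bytes 0x1f 0x8b
    opener = gzip.open if magic == b'\x1f\x8b' else open
    with opener(path, 'rb') as f:
        return pickle.load(f)

# Write a compressed pickle, then read it back without knowing it is gzipped.
path = os.path.join(tempfile.gettempdir(), 'test.pklz')
with gzip.open(path, 'wb') as ofp:
    pickle.dump([1, 2, 3], ofp)
print(load_maybe_gzipped(path))  # [1, 2, 3]
```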
I had a similar error but with different context when I uploaded a *.p file to Google Drive. I tried to use it later in a Google Colab session, and got this error:
1 with open("/tmp/train.p", mode='rb') as training_data:
----> 2 train = pickle.load(training_data)
UnpicklingError: invalid load key, '<'.
I solved it by compressing the file, uploading it, and then unzipping it in the session.
It looks like the pickle file is not saved correctly when you upload/download it directly, so it gets corrupted.
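The '<' load key is a telltale sign here: the file likely begins with an HTML tag (e.g. a saved sign-in or error page) rather than pickle data. A quick sanity check before unpickling might look like this (the helper name and the simulated file content are hypothetical):

```python
import os
import tempfile

def looks_like_html(path):
    """Heuristic: pickle streams don't begin with '<', but an HTML
    error/login page saved in place of a download does."""
    with open(path, 'rb') as f:
        head = f.read(64)
    return head.lstrip().startswith(b'<')

# Simulate a botched download that saved an HTML page instead of the pickle.
path = os.path.join(tempfile.gettempdir(), 'train.p')
with open(path, 'wb') as f:
    f.write(b'<html><body>Sign in to continue</body></html>')
print(looks_like_html(path))  # True
```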
I solved my issue by:
- Remove the cloned project
- Install git lfs:
sudo apt-get install git-lfs
- Set up git lfs for your user account:
git lfs install
- Clone the project again.
I received a similar error while loading a pickled sklearn model. The problem was that the pickle was created via sklearn.externals.joblib and I was trying to load it via the standard pickle library. Using joblib solved my problem.
I just encountered this issue; it was caused by a bad pickle file (not fully copied).
My solution: check whether the pickle file is corrupted.
- Close the opened file, then try unpickling it again:
filepath = 'model_v1.pkl'
with open(filepath, 'rb') as f:
    p = cPickle.Unpickler(f)
    model = p.load()
- If step 1 doesn’t work, restart the session.
Pickling error – _pickle.UnpicklingError: invalid load key, ‘<‘.
This kind of error occurs when the weights file is incomplete, or when there is some other problem with the weights/pickle file, because of which unpickling the weights fails.
In my case, I ran into this issue due to multiple processes trying to read from the same pickled file. The first of these actually creates the pickle (a write operation), and some quick threads start reading from it too soon. Just by retrying the read when catching these two errors (EOFError, UnpicklingError), I don’t see these errors anymore.
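A retry along those lines might look like this (a minimal sketch; the attempt count, sleep interval, and file path are arbitrary choices, not from the original answer):

```python
import os
import pickle
import tempfile
import time

def load_with_retry(path, attempts=5, delay=0.1):
    """Retry the load when another process may still be writing the file."""
    for i in range(attempts):
        try:
            with open(path, 'rb') as f:
                return pickle.load(f)
        except (EOFError, pickle.UnpicklingError):
            if i == attempts - 1:
                raise  # give up after the last attempt
            time.sleep(delay)

# With a fully written file, the first attempt succeeds.
path = os.path.join(tempfile.gettempdir(), 'shared.pkl')
with open(path, 'wb') as f:
    pickle.dump({'x': 1}, f)
print(load_with_retry(path))  # {'x': 1}
```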