I can "pickle local objects" if I use a derived class?

Question:

The pickle reference states that the set of objects which can be pickled is rather limited. Indeed, I have a function which returns a dynamically generated class, and I found I can’t pickle instances of that class:

>>> import pickle
>>> def f():
...     class A: pass
...     return A
... 
>>> LocalA = f()
>>> la = LocalA()
>>> with open('testing.pickle', 'wb') as f:
...     pickle.dump(la, f, pickle.HIGHEST_PROTOCOL)
... 
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
AttributeError: Can't pickle local object 'f.<locals>.A'

Such objects are too complicated for pickle. Ok. Now, what’s magic is that, if I try to pickle a similar object, but of a derived class, it works!

>>> class DerivedA(LocalA): pass
... 
>>> da = DerivedA()
>>> with open('testing.pickle', 'wb') as f:
...     pickle.dump(da, f, pickle.HIGHEST_PROTOCOL)
...
>>>

What’s happening here? If this is so easy, why doesn’t pickle use this workaround to implement a dump method that allows “local objects” to be pickled?

Asked By: fonini

Answers:

DerivedA instances are pickleable because DerivedA is available through a global variable matching its fully-qualified name, which is how pickle looks for classes when unpickling.
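To make that lookup concrete, here is a rough sketch of it. This is not pickle's actual code: `resolve` is a hypothetical helper that imports the named module and walks the dotted qualified name, which is approximately what unpickling has to do to find a class.

```python
import collections
import importlib


def resolve(module_name, qualname):
    """Hypothetical helper: find an object from its module and qualified name."""
    obj = importlib.import_module(module_name)
    for part in qualname.split("."):
        # For a local class this step fails on the "<locals>" component.
        obj = getattr(obj, part)
    return obj


def f():
    class A:
        pass
    return A


LocalA = f()

# A top-level class is reachable from its module by name.
found = resolve("collections", "OrderedDict")
print(found is collections.OrderedDict)

# A local class's qualified name contains "<locals>", which is not an
# attribute path that can be followed, so the lookup cannot succeed.
try:
    resolve(LocalA.__module__, LocalA.__qualname__)
    local_found = True
except (AttributeError, ImportError):
    local_found = False
print(LocalA.__qualname__, local_found)
```

A class bound to a module-level name of the same spelling passes this kind of lookup, and that is all pickle needs at load time.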

The problem with trying to do something like this with local classes is that there’s nothing identifying which A class an instance corresponds to. If you run f twice, you get two A classes, and there’s no way to tell which one should be the class of unpickled A instances from another run of the program. If you don’t run f at all, you get no A classes, and then what the heck do you do about the type of unpickled instances?
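A minimal illustration of that ambiguity, reusing the `f` from the question (the variable names `A1` and `A2` are made up for this sketch): each call builds a brand-new class, yet both report the same qualified name, so a name alone cannot say which one an unpickled instance should belong to.

```python
def f():
    class A:
        pass
    return A


A1 = f()
A2 = f()

# Same qualified name, but two unrelated class objects.
same_name = A1.__qualname__ == A2.__qualname__
same_class = A1 is A2
cross_instance = isinstance(A1(), A2)

print(same_name, same_class, cross_instance)  # True False False
```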

Answered By: user2357112

I think you did not read the reference you cite carefully. Among the objects it lists as pickleable, the entries that cover functions and classes are:

  • functions defined at the top level of a module (using def, not lambda)
  • built-in functions defined at the top level of a module
  • classes that are defined at the top level of a module

Your example

>>> def f():
...     class A: pass
...     return A

does not define a class at the top level of a module, it defines a class within the scope of f(). pickle works on global classes, not local classes. This automatically fails the pickleable test.

DerivedA is a global class, so all is well.

As for why only top-level (global, in your terms) classes and functions can be pickled, the reference answers that question as well:

Note that functions (built-in and user-defined) are pickled by “fully qualified” name reference, not by value. This means that only the function name is pickled, along with the name of the module the function is defined in. Neither the function’s code, nor any of its function attributes are pickled. Thus the defining module must be importable in the unpickling environment, and the module must contain the named object, otherwise an exception will be raised.

Similarly, classes are pickled by named reference, so the same restrictions in the unpickling environment apply.
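You can see the by-reference behaviour with objects from the standard library. The snippet below is only an illustration (the choice of `Fraction` and `len` is arbitrary): unpickling a class or function gives back the very same object, and the payload contains the module and class names rather than any class body.

```python
import pickle
from fractions import Fraction

# Pickling a class or a function stores a reference (module + name).
class_payload = pickle.dumps(Fraction)
func_payload = pickle.dumps(len)

# Unpickling resolves the reference, so we get the identical object back.
same_class = pickle.loads(class_payload) is Fraction
same_func = pickle.loads(func_payload) is len

# The payload names the module and the class; it carries no class body.
names_present = b"fractions" in class_payload and b"Fraction" in class_payload

# An instance is different: its data is stored, its class is referenced.
restored = pickle.loads(pickle.dumps(Fraction(1, 3)))

print(same_class, same_func, names_present, restored == Fraction(1, 3))
```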

So there you have it. pickle serialises classes and functions by name reference, not by the raw instructions contained within them; an instance is stored as its state plus a reference to its class. This is because pickle's job is to serialise object hierarchies, and nothing else.

Answered By: Akshat Mahajan

I disagree, you can pickle both. You just need to use a better serializer, like dill. dill (by default) pickles classes by saving the class definition instead of pickling by reference, so it won’t fail your first case. You can even use dill to get the source code, if you like.

>>> import dill as pickle
>>> def f():
...   class A: pass
...   return A
... 
>>> localA = f()
>>> la = localA()
>>> 
>>> _la = pickle.dumps(la)
>>> la_ = pickle.loads(_la)
>>>    
>>> class DerivedA(localA): pass
... 
>>> da = DerivedA()
>>> _da = pickle.dumps(da)
>>> da_ = pickle.loads(_da)
>>> 
>>> print(pickle.source.getsource(la_.__class__))
  class A: pass

>>> 

Answered By: Mike McKerns

You can only pickle instances of classes defined at a module’s top level.

However, you can pickle instances of locally-defined classes if you promote them to top level.

You must set the __qualname__ class attribute of the local class. Then you must assign the class to a top-level variable of the same name.

def define_class(name):
    class local_class:
        pass
    local_class.__qualname__ = name
    return local_class

class_A = define_class('class_A') # picklable
class_B = define_class('class_B') # picklable
class_X = define_class('class_Y') # unpicklable, names don't match
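
For a full round trip, here is a sketch of the same idea with one extra assumption: instead of relying on the script's own top level, it registers a throwaway module (the name `promoted_demo` is made up) in `sys.modules` and points each class's `__module__` at it, so the result does not depend on how the script is launched. Setting an attribute on that module plays the role of the top-level assignment.

```python
import pickle
import sys
import types

# Assumption: a throwaway module stands in for "the top level of a module".
demo = types.ModuleType("promoted_demo")
sys.modules["promoted_demo"] = demo


def define_class(name):
    class local_class:
        pass
    local_class.__qualname__ = name          # the name pickle will record
    local_class.__module__ = demo.__name__   # the module pickle will look in
    return local_class


# Names match: the class is bound in the module under its qualified name.
demo.class_A = define_class("class_A")
# Names do not match: bound as class_X, but it claims to be class_Y.
demo.class_X = define_class("class_Y")

restored = pickle.loads(pickle.dumps(demo.class_A()))
ok = type(restored) is demo.class_A

try:
    pickle.dumps(demo.class_X())
    mismatch_error = None
except (pickle.PicklingError, AttributeError) as exc:
    mismatch_error = exc

print(ok, type(mismatch_error).__name__)
```

The mismatched class fails at dump time, because pickle checks that the recorded name actually leads back to the same class before writing anything.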

Answered By: haael