pickle and dill can't load objects with overridden __hash__ function (AttributeError)

Question:

In the next few lines of code I’ll replicate on a smaller scale what’s happening with my program.

Class A must store a dictionary with keys that have type A (values can be any type to replicate the error).

class A:
    def __init__(self, name):
        self.name = name
        self.dic = dict()   # it'll be a mapping from A objects to <?> objects

    def __repr__(self): return self.name

    def __hash__(self): return hash(self.name)

The same is needed with class B. Besides, class B is a more complex object that takes a while to build, and thus I need to store it locally and load it when I need it.

import pickle

class B:
    def __init__(self, dic):
        self.dic = dic      # it'll be a mapping from A objects to <?> objects

    def __repr__(self): return str(self.dic)

    # saving the model with pickle
    def save(self, filename):
        with open("objects/" + filename + ".fan", "wb+") as filehandler:
            pickle.dump(self, filehandler)

    # loading the model with pickle
    @staticmethod
    def load(filename):
        with open("objects/" + filename + ".fan", "rb") as filehandler:
            return pickle.load(filehandler)

Let’s instantiate some objects:

# instantiate two A objects
obj1 = A("name")
obj2 = A("name2")

# fill their dic field
obj1.dic[obj2] = 0
obj2.dic[obj1] = 1

# create a dictionary object with type(key) = A
# and instantiate a B object with that
dic = {obj1: (0, 0), obj2: (1, 4)}
obj3 = B(dic)

Now if I try to dump and load B with pickle/dill:

obj3.save("try")    # all goes well
B.load("try")       # nothing goes well

I get the following error:

Traceback (most recent call last):
  File "C:\Users\SimoneZannini\Documents\Fantacalcio\try.py", line 40, in <module>
    B.load("try")
  File "C:\Users\SimoneZannini\Documents\Fantacalcio\try.py", line 29, in load
    return pickle.load(filehandler)
  File "C:\Users\SimoneZannini\Documents\Fantacalcio\try.py", line 11, in __hash__
    def __hash__(self): return hash(self.name)
AttributeError: 'A' object has no attribute 'name'

Process finished with exit code 1

I know there’s a similar problem that was solved, but this isn’t exactly my case and the __getstate__ and __setstate__ workaround doesn’t seem to work. I think this is due to A class having a dict object inside of it, but it’s just an assumption.

Thanks in advance for your time.

Asked By: simone


Answers:

Two things:

1

I’m not sure exactly why the error occurs, but you can avoid it by declaring name as a class attribute, like so:

class A:
    name = ""
    def __init__(self, name):
        self.name = name
        self.dic = dict()   # it'll be a mapping from A objects to <?> objects

    def __repr__(self): return self.name

    def __hash__(self): return hash(self.name)

2

Objects that keep a reference to other objects of the same class are often (though not always) an indicator of sub-optimal design. Why keep a dict inside each A when you could simply keep a dict (or dicts) outside the class?
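
On the cause of the AttributeError in point 1: a likely explanation is the reference cycle between the two A objects. When pickle restores a cycle, it creates an object empty, restores its attributes afterwards, and in between may insert that still-empty object into a dict as a key. Inserting a key calls __hash__, which reads self.name before it exists. The sketch below reproduces this in memory with pickle.dumps and pickle.loads instead of files; build_cycle and load_error are hypothetical helper names, not part of the question.

```python
import pickle

class A:
    def __init__(self, name):
        self.name = name
        self.dic = {}

    def __hash__(self):
        return hash(self.name)

def build_cycle():
    # Hypothetical helper: two A objects that use each other as dict keys.
    first, second = A("name"), A("name2")
    first.dic[second] = 0
    second.dic[first] = 1
    return {first: (0, 0), second: (1, 4)}

def load_error():
    # Hypothetical helper: dumping succeeds, loading hashes a half-built A.
    payload = pickle.dumps(build_cycle())
    try:
        pickle.loads(payload)
    except AttributeError as exc:
        return exc
    return None

print(repr(load_error()))
```

This also suggests why the class-level default is fragile: keys inserted during loading are hashed from the default "" rather than from the real name, so later lookups can miss.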

To address the comments:

"Now I get KeyError with the dic field of A, something that doesn’t happen without dumping and loading the object B"

Consider the following:

class C:
    def __hash__(self):
        return 1

c1 = C()
c2 = C()

mydict = {c1:1}

print(mydict[c1])  # 1

print(mydict[c2])  # raises KeyError

When you un-pickle a B, its self.dic contains new A objects (not the original ones), so using the old A objects as keys in the new B’s dic doesn’t work: without __eq__(), two objects with the same hash are still compared by identity. Again, you could work around this, but I think redesigning your app will be easier in the long run. For the lookup to work, you need to override __eq__() in A as well as __hash__():

class D:
    def __hash__(self):
        return 1

    def __eq__(self, other):
        return True

d1 = D()
d2 = D()

mydict = {d1:1}

print(mydict[d1])  # 1

print(mydict[d2])  # 1
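
Applied to the original A, a minimal sketch might look like the following. It assumes that two A objects with the same name should count as the same key, which the question does not state. The __reduce__ method is an addition of mine rather than something from the question: it asks pickle to rebuild each object by calling A(name), so name is set before the object is used as a dict key, and dic is restored afterwards.

```python
import pickle

class A:
    def __init__(self, name):
        self.name = name
        self.dic = {}

    def __repr__(self):
        return self.name

    def __hash__(self):
        return hash(self.name)

    def __eq__(self, other):
        # Assumption: objects with the same name are the same key.
        if not isinstance(other, A):
            return NotImplemented
        return self.name == other.name

    def __reduce__(self):
        # Rebuild as A(name) first, then restore dic as state.
        return (A, (self.name,), {"dic": self.dic})

obj1, obj2 = A("name"), A("name2")
obj1.dic[obj2] = 0
obj2.dic[obj1] = 1

restored = pickle.loads(pickle.dumps({obj1: (0, 0), obj2: (1, 4)}))
print(restored[obj1])  # (0, 0)
print(restored[obj2])  # (1, 4)
```

With this, the objects created before dumping can be used to look up entries in the loaded dict, because equality is based on name rather than identity.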
Answered By: FiddleStix