Is there a hash of a class instance in Python?
Question:
Let’s suppose I have a class like this:
class MyClass:
    def __init__(self, a):
        self._a = a
And I construct such instances:
obj1 = MyClass(5)
obj2 = MyClass(12)
obj3 = MyClass(5)
Is there a general way to hash my objects such that objects constructed with the same values have equal hashes? In this case:
myhash(obj1) != myhash(obj2)
myhash(obj1) == myhash(obj3)
By general I mean a Python function that works with objects created by any class I can define. For different classes and the same values the hash function must return different results, of course; otherwise this question would be about hashing several arguments instead.
Answers:
def myhash(obj):
    items = sorted(obj.__dict__.items(), key=lambda it: it[0])
    return hash((type(obj),) + tuple(items))
This solution obviously has limitations:
- It assumes that all fields in __dict__ are important.
- It assumes that __dict__ is present, e.g. this won’t work with __slots__.
- It assumes that all values are hashable.
- It breaks the Liskov substitution principle.
The question is badly formed for a couple of reasons:
- Hashes don’t test equality, just inequality. That is, they guarantee that hash(a) != hash(b) implies a != b, but the reverse does not hold. For example, checking "aKey" in myDict will do a linear search through all keys in myDict that have the same hash as "aKey", comparing each one with ==.
- You seem to want to do something with storage. Note that the hash of "aKey" will change between runs (string hashing is randomized per process unless PYTHONHASHSEED is set), so don’t write it to a file. See the bottom of the __hash__ documentation for more information.
- In general, you need to think carefully about subclasses, hashes, and equality. There is a pitfall here, so even the official documentation quietly sidesteps what the hash of an instance means. Do note that each instance has a __dict__ holding its attributes and a __class__ with more information.
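The first two points can be seen in a short sketch. Key and stable_digest are hypothetical names introduced for illustration: Key has a deliberately constant hash to force collisions, and stable_digest uses hashlib as one possible run-independent alternative to hash() when a value has to be written to a file.

```python
import hashlib


class Key:
    """Hypothetical key type whose hash is deliberately constant."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return isinstance(other, Key) and self.name == other.name

    def __hash__(self):
        return 42  # every Key collides with every other Key


table = {Key("a"): 1, Key("b"): 2}

# The hashes are equal, yet the dict keeps the keys apart because
# lookups fall back to __eq__ among colliding entries.
same_hash = hash(Key("a")) == hash(Key("b"))
lookup_a = table[Key("a")]
lookup_b = table[Key("b")]
missing = Key("c") in table


def stable_digest(text):
    # Unlike hash(str), a SHA-256 digest does not depend on PYTHONHASHSEED,
    # so it is the same in every run.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()
```

Here same_hash is True while the two lookups still return different values, which is exactly the "inequality only" guarantee described above.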
Hope this helps those who come after you.