Creating a namedtuple with a custom hash function

Question:

Say I have a namedtuple like this:

FooTuple = namedtuple("FooTuple", "item1, item2")

And I want the following function to be used for hashing:

def foo_hash(self):
    return hash(self.item1) * hash(self.item2)

I want this because the order of item1 and item2 should be irrelevant (I will do the same for the comparison operator). I thought of two ways to do this. The first would be:

FooTuple.__hash__ = foo_hash

This works, but it feels hacked. So I tried subclassing FooTuple:

class EnhancedFooTuple(FooTuple):
    def __init__(self, item1, item2):
        FooTuple.__init__(self, item1, item2)

    # custom hash function here

But then I get this:

DeprecationWarning: object.__init__() takes no parameters

So, what can I do? Or is this a bad idea altogether and I should just write my own class from scratch?

Asked By: Björn Pollex

Answers:

I think there is something wrong with your code (my guess is that you created an instance of the tuple with the same name, so FooTuple is now a tuple instance, not a tuple class), because subclassing the named tuple like that should work. In any case, you don't need to redefine the constructor; you can just add the hash function:

In [1]: from collections import namedtuple

In [2]: Foo = namedtuple('Foo', ['item1', 'item2'])

In [3]: class ExtendedFoo(Foo):
   ...:     def __hash__(self):
   ...:         return hash(self.item1) * hash(self.item2)
   ...: 

In [4]: foo = ExtendedFoo(1, 2)

In [5]: hash(foo)
Out[5]: 2
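
Since the question also asks for an order-insensitive comparison, here is a minimal sketch of how the hash and the comparison could be overridden together. The class name UnorderedFoo is made up for illustration, and the __eq__ logic is one possible reading of "order is irrelevant", not something taken from the answers:

```python
from collections import namedtuple

Foo = namedtuple("Foo", ["item1", "item2"])


class UnorderedFoo(Foo):
    """Hypothetical subclass: (a, b) and (b, a) compare and hash alike."""

    __slots__ = ()  # avoid a per-instance __dict__, like the base namedtuple

    def __eq__(self, other):
        if not isinstance(other, Foo):
            return NotImplemented
        mine, theirs = tuple(self), tuple(other)
        return mine == theirs or mine == theirs[::-1]

    def __ne__(self, other):
        # tuple defines its own __ne__, so it has to be overridden as well
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    def __hash__(self):
        # Multiplication is commutative, so the field order does not matter
        return hash(self.item1) * hash(self.item2)


a = UnorderedFoo(1, 2)
b = UnorderedFoo(2, 1)
print(a == b, hash(a) == hash(b), len({a, b}))  # True True 1
```

Equal objects must have equal hashes, which is why the two methods are changed together. The product is a fairly weak hash (any field that hashes to 0 makes the whole hash 0), so something like hash(frozenset(self)) may be a reasonable alternative if collisions matter.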

Starting in Python 3.6.1, this can be achieved more cleanly with the typing.NamedTuple class (as long as you are OK with type hints):

from typing import NamedTuple, Any


class FooTuple(NamedTuple):
    item1: Any
    item2: Any

    def __hash__(self):
        return hash(self.item1) * hash(self.item2)

Answered By: ipetrik
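
A quick check of what the typing.NamedTuple version does and does not change may help here. This sketch repeats the class from the answer above; the variable names a and b are only for illustration:

```python
from typing import Any, NamedTuple


class FooTuple(NamedTuple):
    item1: Any
    item2: Any

    def __hash__(self):
        return hash(self.item1) * hash(self.item2)


a = FooTuple(1, 2)
b = FooTuple(2, 1)

print(hash(a) == hash(b))  # True: the custom hash ignores field order
print(a == b)              # False: equality is still the ordered tuple comparison
print(len({a, b}))         # 2: a set keeps both, because they are not equal
```

So overriding __hash__ alone is not enough to make the two orderings interchangeable in a set or dict; __eq__ would need to be overridden as well.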

A namedtuple with a custom __hash__ function is useful for storing immutable data models in a dict or set.

For example:

from collections import namedtuple


class Point(namedtuple('Point', ['label', 'lat', 'lng'])):
    def __eq__(self, other):
        return self.label == other.label

    def __hash__(self):
        return hash(self.label)

    def __str__(self):
        return ", ".join([str(self.lat), str(self.lng)])

Overriding both __eq__ and __hash__ allows grouping businesses into a set, ensuring that each business line appears only once in the collection:

walgreens = Point(label='Drugstore', lat=37.78735890, lng=-122.40822700)
mcdonalds = Point(label='Restaurant', lat=37.78735890, lng=-122.40822700)
pizza_hut = Point(label='Restaurant', lat=37.78735881, lng=-122.40822713)

businesses = [walgreens, mcdonalds, pizza_hut]
businesses_by_line = set(businesses)

assert len(businesses) == 3
assert len(businesses_by_line) == 2

Answered By: JP Ventura
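
The same Point class also works as a dict key. As a rough sketch, with a made-up counts dictionary that tallies businesses per line:

```python
from collections import namedtuple


class Point(namedtuple("Point", ["label", "lat", "lng"])):
    def __eq__(self, other):
        return self.label == other.label

    def __hash__(self):
        return hash(self.label)


walgreens = Point(label="Drugstore", lat=37.78735890, lng=-122.40822700)
mcdonalds = Point(label="Restaurant", lat=37.78735890, lng=-122.40822700)
pizza_hut = Point(label="Restaurant", lat=37.78735881, lng=-122.40822713)

businesses = [walgreens, mcdonalds, pizza_hut]

# Points with the same label land on the same key
counts = {}
for business in businesses:
    counts[business] = counts.get(business, 0) + 1

print(len(counts))        # 2
print(counts[walgreens])  # 1
print(counts[pizza_hut])  # 2, shared with mcdonalds
```

The dict keeps the first Point it saw for each label as the stored key, so the coordinates of later duplicates are not retained.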