List unhashable, but tuple hashable?

Question:

In How to hash lists? I was told that I should convert to a tuple first, e.g. [1,2,3,4,5] to (1,2,3,4,5).

So the first cannot be hashed, but the second can. Why*?


*I am not really looking for a detailed technical explanation, but rather for an intuition

Asked By: gsamaras

||

Answers:

Since a list is mutable, if you modify it you would modify its hash too, which ruins the point of having a hash (like in a set or a dict key).

Edit: I’m surprised this answer regularly get new upvotes, it was really quickly written. I feel I need to make it better now.

So the set and the dict native data structures are implemented with a hashmap. Data types in Python may have a magic method __hash__() that will be used in hashmap construction and lookups.

Only immutable data types (int, string, tuple, …) have this method, and the hash value is based on the data and not the identity of the object.
You can check this by

>>> a = (0,1)
>>> b = (0,1)
>>> a is b
False # Different objects
>>> hash(a) == hash(b)
True # Same hash

If we follow this logic, mutating the data would mutate the hash, but then what’s the point of a changing hash ? It defeats the whole purpose of sets and dicts or other hashes usages.

Fun fact : if you try the example with strings or ints -5 <= i <= 256, a is b returns True because of micro-optimizations (in CPython at least).

Answered By: polku

Because lists are mutable and tuples aren’t.

Answered By: Daniel Roseman

Mainly, because tuples are immutable. Assume the following works:

>>> l = [1, 2, 3]
>>> t = (1, 2, 3)
>>> x = {l: 'a list', t: 'a tuple'}

Now, what happens when you do l.append(4)? You’ve modified the key in your dictionary! From afar! If you’re familiar with how hashing algorithms work, this should frighten you. Tuples, on the other hand, are absolutely immutable. t += (1,) might look like it’s modifying the tuple, but really it’s not: it simply creating a new tuple, leaving your dictionary key unchanged.

Answered By: val

You could totally make that work, but I bet you wouldn’t like the effects.

from functools import reduce
from operator import xor

class List(list):
    def __hash__(self):
        return reduce(xor, self)

Now let’s see what happens:

>>> l = List([23,42,99])
>>> hash(l)
94
>>> d = {l: "Hello"}
>>> d[l]
'Hello'
>>> l.append(7)
>>> d
{[23, 42, 99, 7]: 'Hello'}
>>> l
[23, 42, 99, 7]
>>> d[l]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: [23, 42, 99, 7]

edit: So I thought about this some more. You could make the above example work, if you return the list’s id as its hash value:

class List(list):
    def __hash__(self):
        return id(self)

In that case, d[l] will give you 'Hello', but neither d[[23,42,99,7]] nor d[List([23,42,99,7])] will (because you’re creating a new [Ll]ist.

Answered By: L3viathan

Not every tuple is hashable.For example tuple contains list as an element.

x = (1,[2,3])
print(type(x))
print(hash(x))
Answered By: user1436465

The answers are good. The reason is the mutability. If we could use list in dicts as keys; (or any mutable object) then we would be able to change the key by mutating that key (either accidentally or intentionally). This would cause change in the hash value of the key in dictionary due to which we would not be able to retrace the value from that data structure by that key.
Hash values and Hash tables are used to map the large data with ease by mapping them to indices which stores the real value entries.

Read more about them here:-

Hash Tables & Hash Functions & Assosiative Arrays

Answered By: Vicrobot
Categories: questions Tags: , , , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.