How to access an element of a set using an equivalent object?

Question:

If I have an object that compares equal to an element of a Python set, but is not the same object, is there a reasonable way to get a reference to the object in the set? The use case would be using the set to identify and share duplicated data.

Example (Python 2.7):

>>> a = "This is a string"
>>> b = "This is a string"
>>> a is b
False
>>> a == b
True
>>> s = set((a,))
>>> b in s
True

How to get a reference to a using b and s? I can think of one way, but I’m not sure if it is not implementation-dependent whether you get a or b. EDIT: This does not work when s has more than one element; intersection is quite naturally implemented something like [x for x in smaller_set if x in larger_set]

>>> for x in set((b,)).intersection(s): c = x
...
>>> c is a
True

Perhaps a good workaround would be to use a dict that maps each key to itself, instead of the set.

Asked By: Janne Karila

||

Answers:

Your use case sounds like it is a use case for dictionaries. Use, as keys, the attribute of the object that compares equal to the "foreign" object, and as values the desired objects themselves.

If it is a simple use case, and you can have a linear search, however, you could do the obvious – it would not be bad:

def get_equal(in_set, in_element):
   for element in in_set:
       if element == in_element:
           return element
   return None 

If you need what exactly what you ar asking for (I can wonder some use cases for that) – the way to go is to create a custom dictionary class that has a set as one of its members, do implement proxy methods to the member set, and in both dictionary and set methods, keeps sync of both the dictionary and set contents. This would be time consuming to implement right, but relatively straightforward, and have a O(1) time.

If having to copy references to all data around is not an issue (this is linear, but probably worst than the straightforward search above), you can use the expression

(data - (data - {key})).pop()

as in:

In [40]: class A:
    ...:     def __init__(self, id, extra):
    ...:         self.id = id
    ...:         self.extra = extra
    ...:     def __eq__(self, other):
    ...:         return self.id == other.id
    ...:     def __hash__(self):
    ...:         return hash(self.id)
    ...:     def __repr__(self):
    ...:         return f"({self.id}, {self.extra})"
    ...: 
    ...: 

In [41]: data = set(A(i, "initial") for i in range(10))

In [42]: (data - (data - {A(5, None)})).pop()
Out[42]: (5, initial)
Answered By: jsbueno

I found a similar question on python-list: Get item from set. There is a clever answer with reference to get_equivalent(container, item) (Python recipe).

The trick is to construct a wrapper object for the ‘key’ object, and check if wrapper is in the set using the in operator. If the wrapper hashes equal to the key, its __eq__ method can gain access to the object in the set, and save a reference to it. An important point from the discussion is that the __eq__ method of the set elements must return NotImplemented for unrecognized types, otherwise the wrapper’s __eq__ may not get called.

Answered By: Janne Karila

Here’s a quick solution I made by exploiting the behavior of the ‘eq‘ and ‘contains‘ methods. Code comments are (hopefully) self documenting, but it’s otherwise pretty simple.

As the example shows, it works for set, dict and list, and in theory could work for any object that implements ‘contains‘.

import typing as _ts
from typing import Any


class Getter:
    __slots__ = "key", "value"

    def __init__(self, key, value=None):
        self.key = key
        self.value = value

    def __repr__(self):
        return "{}({}, {})".format(
            type(self).__name__,
            repr(self.key), repr(self.value),
        )

    def __hash__(self):
        return hash(self.key)

    def __eq__(self, other):
        self.value = other
        return self.key == other


RAISES = object()


def getkey(keyed: _ts.Container, key: Any, default: Any = RAISES):
    getter = Getter(key)
    if getter in keyed:
        # providing '__contains__' is implemented to call
        #  the '__eq__' method (which in any sane case it
        #  should be), this results in our special
        #  'Getter.__eq__' method being called with the
        #  element we're trying to get as the 'other' argument
        return getter.value
    if default is RAISES:
        raise KeyError(key)
    return default


if __name__ == '__main__':
    # testing

    class T(int):
        def __repr__(self):
            return "T({})".format(int.__repr__(self))

    def main():
        # works for both builtin set and dict
        hm1 = {T(1), T(2), T(3)}
        hm2 = {T(1): 1, T(2): 2, T(3): 3}
        print(getkey(hm1, 2))
        print(getkey(hm2, 2))
        # should print "T(2)"

        # even works for list
        lst = [T(1), T(2), T(3)]
        print(getkey(lst, 3))
        # should print "T(3)"

        # in theory could work for any type that
        #  implements '__contains__' by calling '__eq__'

    main()
Answered By: tex
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.