Understanding the set() function

Question:

In python, set() is an unordered collection with no duplicate elements. However, I am not able to understand how it generates the output.

For example, consider the following:

>>> x = [1, 1, 2, 2, 2, 2, 2, 3, 3]
>>> set(x)
set([1, 2, 3])

>>> y = [1, 1, 6, 6, 6, 6, 6, 8, 8]
>>> set(y)
set([8, 1, 6])

>>> z = [1, 1, 6, 6, 6, 6, 6, 7, 7]
>>> set(z)
set([1, 6, 7])

Shouldn’t the output of set(y) be: set([1, 6, 8])? I tried the above two in Python 2.6.

Asked By: Prakhar Mehrotra

||

Answers:

As +Volatility and yourself pointed out, sets are unordered. If you need the elements to be in order, just call sorted on the set:

>>> y = [1, 1, 6, 6, 6, 6, 6, 8, 8]
>>> sorted(set(y))
[1, 6, 8]
Answered By: I82Much

Sets are unordered, as you say. Even though one way to implement sets is using a tree, they can also be implemented using a hash table (meaning getting the keys in sorted order may not be that trivial).

If you’d like to sort them, you can simply perform:

sorted(set(y))

which will produce a sorted list containing the set’s elements. (Not a set. Again, sets are unordered.)

Otherwise, the only thing guaranteed by set is that it makes the elements unique (nothing will be there more than once).

Hope this helps!

Answered By: user

As an unordered collection type, set([8, 1, 6]) is equivalent to set([1, 6, 8]).

While it might be nicer to display the set contents in sorted order, that would make the repr() call more expensive.

Internally, the set type is implemented using a hash table: a hash function is used to separate items into a number of buckets to reduce the number of equality operations needed to check if an item is part of the set.

To produce the repr() output it just outputs the items from each bucket in turn, which is unlikely to be the sorted order.

Answered By: James Henstridge

Python’s sets (and dictionaries) will iterate and print out in some order, but exactly what that order will be is arbitrary, and not guaranteed to remain the same after additions and removals.

Here’s an example of a set changing order after a lot of values are added and then removed:

>>> s = set([1,6,8])
>>> print(s)
{8, 1, 6}
>>> s.update(range(10,100000))
>>> for v in range(10, 100000):
    s.remove(v)
>>> print(s)
{1, 6, 8}

This is implementation dependent though, and so you should not rely upon it.

Answered By: Blckknght

After reading the other answers, I still had trouble understanding why the set comes out un-ordered.

Mentioned this to my partner and he came up with this metaphor: take marbles. You put them in a tube a tad wider than marble width : you have a list. A set, however, is a bag. Even though you feed the marbles one-by-one into the bag; when you pour them from a bag back into the tube, they will not be in the same order (because they got all mixed up in a bag).

Answered By: Ania
Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.