Random seed for order of elements in Python's Set to List conversion

Question:

I was executing some code in a Jupyter notebook and noticed that each time I ran it, the output was different despite not explicitly putting randomness in my program.

I narrowed it down to a line that removes all repeated elements from a list.

l = list(set(l))

I noticed two things:

  • If I re-run the same code in the same Jupyter kernel, I always get the same output for l, but

  • If I open up another notebook, I get a different output.

Is there some kind of hidden random seed that is used for the set -> list conversion for a given kernel? How does it work under the hood, and what would I do if I wanted deterministic output from the above code?

Asked By: Paradox


Answers:

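A set works almost the same way as a dict: elements are placed in a hash table according to their hash. In CPython, the default __hash__ of an object is derived from its id, which in turn is its address in memory. Strings and bytes are hashed by content instead, but CPython salts those hashes with a random value chosen at interpreter startup unless PYTHONHASHSEED is set to a fixed integer.

A new kernel is a new process, so objects get different addresses (and strings get a different salt), which means different hashes and therefore a different order from the iterator that the set gives.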

This is implementation-dependent, so you cannot rely on it; all I can say is that CPython, so far, works this way. The one thing you can rely on is that a set is not (usefully) ordered.
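To observe the per-process effect for strings, here is a minimal sketch that starts fresh interpreters with a fixed PYTHONHASHSEED. The helper name set_order_in_fresh_process and the sample strings are made up for illustration, and the behaviour shown is CPython's, not a language guarantee.

```python
import os
import subprocess
import sys

def set_order_in_fresh_process(hash_seed):
    """Return the printed iteration order of a small set of strings
    as seen by a brand-new interpreter started with the given hash seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(hash_seed))
    code = "print(list(set(['apple', 'banana', 'cherry', 'date'])))"
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

# The same fixed seed gives the same string hashes, hence the same order.
first = set_order_in_fresh_process(0)
second = set_order_in_fresh_process(0)
print(first == second)  # prints True

# Different seeds may, but are not guaranteed to, produce different orders.
print({set_order_in_fresh_process(seed) for seed in range(1, 6)})
```

Fixing PYTHONHASHSEED before the kernel starts makes runs repeatable, but the resulting order is still arbitrary, so it is usually better to sort or to preserve the original order explicitly.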

If you need ordering, keep both the list and the set. If you want to remove repeats while preserving order, something like this will work:

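def could_add(s, x):
    # Return True and record x if it has not been seen yet; otherwise return False.
    if x in s:
        return False
    else:
        s.add(x)
        return True

seen = set()
# Keep only the first occurrence of each element, in the original order.
l = [x for x in l if could_add(seen, x)]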

(Though I fully agree with Barmar’s comment — if order matters, they should be sortable.)
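If the elements are mutually comparable, sorting gives an order that does not depend on hash values at all. A minimal sketch, with made-up sample data:

```python
# Hypothetical sample data; any list of hashable, mutually comparable items works.
l = ["pear", "apple", "fig", "apple", "pear"]

# set() drops the duplicates; sorted() then imposes an order
# that is the same in every process, regardless of hashing.
unique_sorted = sorted(set(l))
print(unique_sorted)  # ['apple', 'fig', 'pear']
```

Note that this yields sorted order, not the order in which the elements first appeared in the list.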

Answered By: Amadan

You can use OrderedDict instead of set to remove all repeated elements from a list while keeping their order.
If you use Python >= 3.6, a plain dict also keeps insertion order, just like OrderedDict (an implementation detail in CPython 3.6, guaranteed by the language from 3.7).

# Python < 3.6
from collections import OrderedDict
res = list(OrderedDict.fromkeys(yourlist))
# Python >= 3.6
res = list(dict.fromkeys(yourlist))
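
As a quick check of the two variants above, here is a small sketch with a hypothetical sample list (the variable names res_dict and res_ordered are made up for illustration):

```python
from collections import OrderedDict

yourlist = ["b", "a", "b", "c", "a"]  # hypothetical sample input

# Keys keep the position of their first occurrence; later repeats are ignored.
res_dict = list(dict.fromkeys(yourlist))
res_ordered = list(OrderedDict.fromkeys(yourlist))

print(res_dict)                 # ['b', 'a', 'c']
print(res_dict == res_ordered)  # True
```
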
Answered By: Fenris Elric