# Determine if 2 lists have the same elements, regardless of order?

## Question:

Sorry for the simple question, but I’m having a hard time finding the answer.

When I compare 2 lists, I want to know if they are “equal” in that they have the same contents, but in different order.

Ex:

```
x = ['a', 'b']
y = ['b', 'a']
```

I want `x == y` to evaluate to `True`.

## Answers:

You can simply check whether the multisets built from the elements of `x` and `y` are equal:

```
import collections
collections.Counter(x) == collections.Counter(y)
```

This requires the elements to be hashable; runtime will be in `O(n)`, where `n` is the size of the lists.
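As a quick illustration (the lists here are invented for the demo), `Counter` ignores order but keeps duplicate counts:

```
from collections import Counter

xs = ['a', 'b', 'b']   # hypothetical lists containing a duplicate
ys = ['b', 'a', 'b']

# Order is ignored, but multiplicity matters:
print(Counter(xs) == Counter(ys))           # True
print(Counter(xs) == Counter(['a', 'b']))   # False: 'b' appears twice in xs
```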

If the elements are also unique, you can also convert to sets (same asymptotic runtime, may be a little bit faster in practice):

```
set(x) == set(y)
```
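One caveat worth keeping in mind: sets discard duplicates, so this check silently ignores multiplicity. A small sketch with made-up lists:

```
xs = ['a', 'a', 'b']   # hypothetical list with a duplicate
ys = ['a', 'b']

# The set comparison succeeds even though the lists differ in length,
# because converting to a set loses the duplicate 'a':
print(set(xs) == set(ys))  # True
```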

If the elements are not hashable, but sortable, another alternative (runtime in `O(n log n)`) is

```
sorted(x) == sorted(y)
```
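For example, lists are sortable but not hashable, so `set()` or `Counter()` would raise `TypeError` while the `sorted` comparison still works (example data is made up):

```
xs = [[1, 2], [3, 4]]  # inner lists are unhashable, but compare lexicographically
ys = [[3, 4], [1, 2]]

# set(xs) or Counter(xs) would raise TypeError: unhashable type: 'list'
print(sorted(xs) == sorted(ys))  # True
```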

If the elements are neither hashable nor sortable, you can use the following helper function. Note that it will be quite slow (`O(n²)`) and should generally **not** be used outside of the esoteric case of unhashable and unsortable elements.

```
def equal_ignore_order(a, b):
    """ Use only when elements are neither hashable nor sortable! """
    unmatched = list(b)
    for element in a:
        try:
            unmatched.remove(element)
        except ValueError:
            return False
    return not unmatched
```
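For instance, dicts are neither hashable nor orderable, which is exactly the case this helper covers (the function is repeated here so the snippet is self-contained):

```
def equal_ignore_order(a, b):
    """ Use only when elements are neither hashable nor sortable! """
    unmatched = list(b)
    for element in a:
        try:
            unmatched.remove(element)
        except ValueError:
            return False
    return not unmatched

# dicts can't go into a set and can't be sorted, so the O(n^2) helper applies:
a = [{'x': 1}, {'y': 2}]
b = [{'y': 2}, {'x': 1}]
print(equal_ignore_order(a, b))      # True
print(equal_ignore_order(a, b[:1]))  # False: lengths differ
```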


Inferring from your example:

```
x = ['a', 'b']
y = ['b', 'a']
```

that the elements of the lists won’t be repeated (they are unique) and are hashable (which strings and certain other immutable Python objects are), **the most direct and computationally efficient answer** uses Python’s built-in sets (which are semantically like the mathematical sets you may have learned about in school).

```
set(x) == set(y) # prefer this if elements are hashable
```

In the case that the elements are hashable but non-unique, `collections.Counter` also works semantically as a multiset, but *it is far slower*:

```
from collections import Counter
Counter(x) == Counter(y)
```

Prefer to use `sorted`:

```
sorted(x) == sorted(y)
```

if the elements are orderable. This accounts for non-unique or non-hashable elements, but it can be much slower than using sets.

# Empirical Experiment

An empirical experiment concludes that one should prefer `set`, then `sorted`. Only opt for `Counter` if you need other things like counts or further usage as a multiset.
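If you do need the counts afterwards, `Counter` gives them directly and supports multiset arithmetic (the data below is invented for the demo):

```
from collections import Counter

votes = Counter(['a', 'b', 'b'])    # hypothetical tallies
print(votes['b'])                   # 2

# Multiset difference keeps only positive counts:
remaining = votes - Counter(['b'])
print(remaining['b'])               # 1
```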

First setup:

```
import timeit
import random
from collections import Counter

data = [str(random.randint(0, 100000)) for i in range(100)]
data2 = data[:]  # copy the list into a new one

def sets_equal():
    return set(data) == set(data2)

def counters_equal():
    return Counter(data) == Counter(data2)

def sorted_lists_equal():
    return sorted(data) == sorted(data2)
```

And testing:

```
>>> min(timeit.repeat(sets_equal))
13.976069927215576
>>> min(timeit.repeat(counters_equal))
73.17287588119507
>>> min(timeit.repeat(sorted_lists_equal))
36.177085876464844
```

So we see that comparing sets is the fastest solution, and comparing sorted lists is second fastest.

This seems to work, though possibly cumbersome for large lists.

```
>>> A = [0, 1]
>>> B = [1, 0]
>>> C = [0, 2]
>>> not sum([not i in A for i in B])
True
>>> not sum([not i in A for i in C])
False
```

However, if each list *must* contain all the elements of the other, then the above code is problematic.

```
>>> A = [0, 1, 2]
>>> not sum([not i in A for i in B])
True
```

The problem arises when `len(A) != len(B)` and, in this example, `len(A) > len(B)`. To avoid this, you can add one more statement.

```
>>> not sum([not i in A for i in B]) if len(A) == len(B) else False
False
```

One more thing: I benchmarked my solution with `timeit.repeat`, under the same conditions used by Aaron Hall in his post. As suspected, the results are disappointing. My method is the slowest one. `set(x) == set(y)` it is.

```
>>> def foocomprehend(): return not sum([not i in data for i in data2])
>>> min(timeit.repeat('fooset()', 'from __main__ import fooset, foocount, foocomprehend'))
25.2893661496
>>> min(timeit.repeat('foosort()', 'from __main__ import fooset, foocount, foocomprehend'))
94.3974742993
>>> min(timeit.repeat('foocomprehend()', 'from __main__ import fooset, foocount, foocomprehend'))
187.224562545
```

As mentioned in the comments above, the general case is a pain. It is fairly easy if all items are hashable or all items are sortable. However, I recently had to try to solve the general case. Here is my solution. I realised after posting that it duplicates a solution above that I missed on the first pass. Anyway, if you use slices rather than `list.remove()` you can compare immutable sequences.

```
def sequences_contain_same_items(a, b):
    for item in a:
        try:
            i = b.index(item)
        except ValueError:
            return False
        b = b[:i] + b[i+1:]
    return not b
```