Python: remove list elements that share the same value at a given position

Question:

Let’s assume I have a list, structured like this with approx 1 million elements:

a = [["a","a"],["b","a"],["c","a"],["d","a"],["a","a"],["a","a"]]

What is the fastest way to remove all elements from a that have the same value at index 0?
The result should be

b = [["a","a"],["b","a"],["c","a"],["d","a"]]

Is there a faster way than this:

processed = []
no_duplicates = []

for elem in a:
    if elem[0] not in processed:
        no_duplicates.append(elem)
        processed.append(elem[0])

This works but the appending operations take ages.

Asked By: dmuensterer


Answers:

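You can use a list comprehension with a condition and add the first element back to the result, like so. Note that this only drops later copies of the first element (comparing whole sublists); it does not remove duplicates among the remaining elements.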

[a[0]] + [n for n in a if n != a[0]]

Output

[['a', 'a'], ['b', 'a'], ['c', 'a'], ['d', 'a']]
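
To see how far this expression goes, here is a small check on an input where an element other than the first is also repeated. The input list is made up for illustration and is not from the question.

```python
# Hypothetical input: "b" is repeated as well as "a".
a = [["a", "a"], ["b", "a"], ["b", "a"], ["a", "a"]]

# Keeps the first element and drops every later sublist equal to it.
result = [a[0]] + [n for n in a if n != a[0]]

print(result)  # [['a', 'a'], ['b', 'a'], ['b', 'a']]
```

The repeated ["b", "a"] survives, so this approach fits only when the first element is the only one that can repeat.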
Answered By: Patrick

You can use a dictionary comprehension using the first item as key. This will work in O(n) time.

out = list({x[0]: x for x in a}.values())

output: [['a', 'a'], ['b', 'a'], ['c', 'a'], ['d', 'a']]
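
One detail worth checking: when several sublists share the same first item but differ elsewhere, the dictionary comprehension keeps the last one seen for each key, placed at the position of the first. If the first one should win instead, `dict.setdefault` is one option. A minimal sketch, assuming Python 3.7+ (where dicts preserve insertion order); the input and the names `last_wins` and `first_wins` are made up for illustration:

```python
# Hypothetical input: sublists sharing index 0 differ at index 1.
a = [["a", "x"], ["b", "y"], ["a", "z"]]

# Dict comprehension: a later sublist overwrites an earlier one with the same key.
last_wins = list({x[0]: x for x in a}.values())

# setdefault stores a sublist only if its key has not been seen yet.
first = {}
for x in a:
    first.setdefault(x[0], x)
first_wins = list(first.values())

print(last_wins)   # [['a', 'z'], ['b', 'y']]
print(first_wins)  # [['a', 'x'], ['b', 'y']]
```

For the list in the question, where duplicates are identical, both variants give the same result.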

Answered By: mozway

You can use a set to keep a record of the first elements you have already seen, and check for each sublist whether its first element is in that set. A set lookup takes O(1) time on average, compared to the O(n) search through a list in your solution.

>>> a = [["a","a"],["b","a"],["c","a"],["d","a"],["a","a"],["a","a"]]
>>> seen = set()
>>> new_a = []
>>> for i in a:
...     if i[0] not in seen:
...         new_a.append(i)
...         seen.add(i[0])
...
>>> new_a
[['a', 'a'], ['b', 'a'], ['c', 'a'], ['d', 'a']]

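Space complexity: O(N)
Time complexity: O(N)
Checking whether a first element has already been seen: O(1) on average

If you do not want to create a new list, you can remove the duplicates from the original list with del, but this increases the time complexity, because each deletion from the middle of a list is itself O(N).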
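
One way to avoid both a second list and repeated del calls is to compact the list in place and truncate it once at the end. This is a minimal sketch of that idea, not part of the original answer; it assumes the first items are hashable, and the name `write` is illustrative. It still needs the `seen` set, so the extra space is proportional to the number of distinct first items.

```python
a = [["a", "a"], ["b", "a"], ["c", "a"], ["d", "a"], ["a", "a"], ["a", "a"]]

seen = set()
write = 0  # index of the next slot to fill with a kept element

for elem in a:
    if elem[0] not in seen:
        seen.add(elem[0])
        a[write] = elem  # write never runs ahead of the read position
        write += 1

# Drop the leftover tail in a single operation.
del a[write:]

print(a)  # [['a', 'a'], ['b', 'a'], ['c', 'a'], ['d', 'a']]
```

The loop only assigns to positions at or before the one currently being read, and the list length does not change until the final del, so iterating over `a` while writing to it is safe here.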

Answered By: sahasrara62