Removing elements that have consecutive duplicates
Question:
I was curious about the question "Eliminate consecutive duplicates of list elements" and how it should be implemented in Python.
What I came up with is this:
list = [1,1,1,1,1,1,2,3,4,4,5,1,2]
i = 0
while i < len(list)-1:
    if list[i] == list[i+1]:
        del list[i]
    else:
        i = i+1
Output:
[1, 2, 3, 4, 5, 1, 2]
Which I guess is ok.
So I got curious, and wanted to see if I could delete the elements that had consecutive duplicates and get this output:
[2, 3, 5, 1, 2]
For that I did this:
list = [1,1,1,1,1,1,2,3,4,4,5,1,2]
i = 0
dupe = False
while i < len(list)-1:
    if list[i] == list[i+1]:
        del list[i]
        dupe = True
    elif dupe:
        del list[i]
        dupe = False
    else:
        i += 1
But it seems sort of clumsy and not Pythonic. Do you have a smarter, more elegant, or more efficient way to implement this?
Answers:
>>> L = [1,1,1,1,1,1,2,3,4,4,5,1,2]
>>> from itertools import groupby
>>> [key for key, _group in groupby(L)]
[1, 2, 3, 4, 5, 1, 2]
For the second part
>>> [k for k, g in groupby(L) if len(list(g)) < 2]
[2, 3, 5, 1, 2]
If you don't want to create the temporary list just to take the length, you can use sum() over a generator expression:
>>> [k for k, g in groupby(L) if sum(1 for i in g) < 2]
[2, 3, 5, 1, 2]
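To see why the `< 2` test works, it can help to look at the run lengths that groupby produces. A minimal sketch (the helper name run_lengths is made up here for illustration; it is not part of itertools):

```python
from itertools import groupby


def run_lengths(items):
    """Return (value, run_length) pairs, one per run of equal neighbours."""
    # sum(1 for _ in group) counts the run without building a temporary list
    return [(key, sum(1 for _ in group)) for key, group in groupby(items)]


L = [1, 1, 1, 1, 1, 1, 2, 3, 4, 4, 5, 1, 2]
runs = run_lengths(L)
print(runs)
# [(1, 6), (2, 1), (3, 1), (4, 2), (5, 1), (1, 1), (2, 1)]

# Keeping only the runs of length 1 gives the second result from the question
print([key for key, length in runs if length == 1])
# [2, 3, 5, 1, 2]
```

Every value still shows up once per run, so dropping the length filter gives back the first result.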
To eliminate consecutive duplicates of list elements, you can alternatively use itertools.zip_longest() with a list comprehension:
>>> from itertools import zip_longest
>>> my_list = [1,1,1,1,1,1,2,3,4,4,5,1,2]
>>> [i for i, j in zip_longest(my_list, my_list[1:]) if i!=j]
[1, 2, 3, 4, 5, 1, 2]
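One caveat with the zip_longest() version: the default fill value is None, so if the list itself ends with None, the last pair is (None, None) and that element gets dropped. A sketch that pads with a private sentinel instead (dedupe_pairs and _MISSING are hypothetical names, and the input is assumed to be a list so that it can be sliced):

```python
from itertools import zip_longest

# A fresh object() compares unequal to every ordinary value, including None
_MISSING = object()


def dedupe_pairs(items):
    """Keep each element that differs from its right-hand neighbour."""
    pairs = zip_longest(items, items[1:], fillvalue=_MISSING)
    return [current for current, following in pairs if current != following]


print(dedupe_pairs([1, 1, 2, None]))
# [1, 2, None]
```

With the default fill value the same input would come back as [1, 2].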
A one-liner in pure Python:
[v for i, v in enumerate(your_list) if i == 0 or v != your_list[i-1]]
Here is a solution without dependence on outside packages:
list = [1,1,1,1,1,1,2,3,4,4,5,1,2]
# Append a dummy element so that L[-1] (what L[i - 1] reads when i == 0) never
# matches the first real element. This assumes 999 does not occur in the data.
L = list + [999]
[l for i, l in enumerate(L) if l != L[i - 1]][:-1]  # drop the dummy element
Then I noted that Ulf Aslak’s similar solution is cleaner 🙂
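If the data might contain 999, the same trick can be done with a fresh object() as the dummy, since it compares unequal to every ordinary value. A sketch under that assumption (dedupe_with_sentinel is a name chosen here for illustration):

```python
def dedupe_with_sentinel(items):
    """Dummy-element trick, but with a sentinel that cannot collide with data."""
    sentinel = object()
    padded = list(items) + [sentinel]
    # At i == 0, padded[i - 1] is padded[-1], i.e. the sentinel, so the first
    # real element is always kept. The sentinel itself is kept last and then
    # sliced off again with [:-1].
    return [x for i, x in enumerate(padded) if x != padded[i - 1]][:-1]


print(dedupe_with_sentinel([999, 999, 1, 1, 999]))
# [999, 1, 999]
```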
Another possible one-liner uses functools.reduce (excluding the import), with the downside that strings and lists require slightly different implementations:
>>> from functools import reduce
>>> reduce(lambda a, b: a if a[-1:] == [b] else a + [b], [1,1,2,3,4,4,5,1,2], [])
[1, 2, 3, 4, 5, 1, 2]
>>> reduce(lambda a, b: a if a[-1:] == b else a+b, 'aa bbb cc')
'a b c'
A "lazy" approach would be to use itertools.groupby:
import itertools
list1 = [1, 2, 3, 3, 4, 3, 5, 5]
list1 = [g for g, _ in itertools.groupby(list1)]
print(list1)
outputs
[1, 2, 3, 4, 3, 5]
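groupby is also lazy in the iterator sense: it only reads as much input as it needs. So if you skip the list comprehension and yield the keys one at a time, this works on generators and even endless streams. A minimal sketch (iter_dedupe is a made-up name, and the cycle() input is just for demonstration):

```python
from itertools import cycle, groupby, islice


def iter_dedupe(iterable):
    """Yield the first item of each run of equal neighbours, one at a time."""
    for key, _group in groupby(iterable):
        yield key


# An endless stream: 1, 1, 2, 1, 1, 2, ...
stream = cycle([1, 1, 2])

# Only as much of the stream is consumed as the five results require
print(list(islice(iter_dedupe(stream), 5)))
# [1, 2, 1, 2, 1]
```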
You can do this by using zip_longest() with a list comprehension.
from itertools import zip_longest
list1 = [1, 2, 3, 3, 4, 3, 5, 5]
# using zip_longest() + list comprehension
res = [i for i, j in zip_longest(list1, list1[1:])
       if i != j]
print("List after removing consecutive duplicates : " + str(res))
If you use Python 3.8+, you can use an assignment expression (:=):
list1 = [1, 2, 3, 3, 4, 3, 5, 5]
prev = object()
list1 = [prev:=v for v in list1 if prev!=v]
print(list1)
Prints:
[1, 2, 3, 4, 3, 5]
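Note that prev stays bound in the surrounding scope after the comprehension has run. One way to contain it, sketched here with a hypothetical helper name, is to wrap the comprehension in a function (still Python 3.8+ only):

```python
def dedupe_walrus(items):
    """Same := trick, with prev kept local to this function."""
    prev = object()  # sentinel: unequal to any real first element
    # The comprehension rebinds prev in this function's scope on every kept item
    return [prev := v for v in items if v != prev]


print(dedupe_walrus([1, 2, 3, 3, 4, 3, 5, 5]))
# [1, 2, 3, 4, 3, 5]
```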
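Plenty of better/more Pythonic answers above; however, one could also accomplish this task using list.pop():
my_list = [1, 2, 3, 3, 4, 3, 5, 5]
for x in my_list[:-1]:
    # Caveat: index() returns the first occurrence of x, so a run of a value
    # that already appeared earlier in the list is not detected.
    next_index = my_list.index(x) + 1
    if my_list[next_index] == x:
        my_list.pop(next_index)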
outputs
[1, 2, 3, 4, 3, 5]
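Because index() always finds the first occurrence, the loop above leaves something like [1, 2, 1, 1] unchanged. If you want to stay with list.pop(), one possible fix is to work with positions and walk backwards, so that popping never shifts an index that still has to be visited. A sketch (dedupe_in_place is a name invented for this example):

```python
def dedupe_in_place(items):
    """Remove consecutive duplicates from items in place using list.pop()."""
    # Go from the last index down to 1; popping index i only moves elements
    # to the right of i, which have already been checked.
    for i in range(len(items) - 1, 0, -1):
        if items[i] == items[i - 1]:
            items.pop(i)
    return items


print(dedupe_in_place([1, 2, 3, 3, 4, 3, 5, 5]))
# [1, 2, 3, 4, 3, 5]
print(dedupe_in_place([1, 2, 1, 1]))
# [1, 2, 1]
```

Each pop() from the middle of a list is O(n), so for long lists the groupby answers are likely the better choice.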