Combining two lists and removing duplicates, without removing duplicates in original list
Question:
I have two lists that I need to combine, where any elements of the second list that duplicate elements of the first list are ignored. It’s a bit hard to explain, so let me show an example of what the code looks like and what I want as a result.
first_list = [1, 2, 2, 5]
second_list = [2, 5, 7, 9]
# The result of combining the two lists should result in this list:
resulting_list = [1, 2, 2, 5, 7, 9]
You’ll notice that the result contains the first list, including its two “2” values, but the additional 2 and 5 values from second_list are not added to it.
Normally for something like this I would use sets, but a set on first_list would purge the duplicate values it already has. So I’m simply wondering what the best/fastest way to achieve this combination is.
Thanks.
Answers:
You need to append to the first list those elements of the second list that aren’t in the first – sets are the easiest way of determining which elements they are, like this:
first_list = [1, 2, 2, 5]
second_list = [2, 5, 7, 9]
in_first = set(first_list)
in_second = set(second_list)
in_second_but_not_in_first = in_second - in_first
result = first_list + list(in_second_but_not_in_first)
print(result)  # Prints [1, 2, 2, 5, 9, 7] (the order of the appended elements is not guaranteed)
Or if you prefer one-liners 😎
print(first_list + list(set(second_list) - set(first_list)))
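Because sets are unordered, the elements appended from the second list can come out in any order. If sorted order is acceptable for the new elements, one possible sketch is to sort the set difference before appending it. This is only an illustration of that idea, not part of the original answer:

```python
first_list = [1, 2, 2, 5]
second_list = [2, 5, 7, 9]

# Elements that appear only in second_list, sorted so the output is deterministic.
only_in_second = sorted(set(second_list) - set(first_list))

# first_list is kept untouched, duplicates included.
result = first_list + only_in_second
print(result)  # [1, 2, 2, 5, 7, 9]
```

Note that this sorts the new elements rather than keeping their order from second_list, so it only fits when that ordering is acceptable.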
resulting_list = first_list + [i for i in second_list if i not in first_list]
resulting_list = list(first_list)
resulting_list.extend(x for x in second_list if x not in resulting_list)
This might help
def union(a, b):
    for e in b:
        if e not in a:
            a.append(e)
The union function merges the second list into the first without duplicating any element that is already in a, similar to the set union operator. It modifies a in place and does not change b. For example, if a=[1,2,3] and b=[2,3,4], then after union(a,b), a=[1,2,3,4] and b=[2,3,4].
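Since union changes a in place and returns nothing, a caller who needs the original list later has to copy it first. A minimal sketch of a non-mutating variant follows; the name union_copy is hypothetical and not from the answer above:

```python
def union_copy(a, b):
    """Return a new list: a, followed by the elements of b not already present."""
    result = list(a)  # copy, so the caller's list is left unchanged
    for e in b:
        if e not in result:
            result.append(e)
    return result


a = [1, 2, 3]
b = [2, 3, 4]
merged = union_copy(a, b)
print(merged)  # [1, 2, 3, 4]
print(a)       # [1, 2, 3]
```

Duplicates already present in a are kept, so union_copy([1, 2, 2, 5], [2, 5, 7, 9]) gives the list asked for in the question.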
You can use sets:
first_list = [1, 2, 2, 5]
second_list = [2, 5, 7, 9]
resultList = list(set(first_list) | set(second_list))
print(resultList)
# Results in: resultList = [1, 2, 5, 7, 9] (the duplicate 2 from first_list is dropped)
first_list = [1, 2, 2, 5]
second_list = [2, 5, 7, 9]
newList = []
for i in first_list:
    newList.append(i)
for z in second_list:
    if z not in newList:
        newList.append(z)
newList.sort()
print(newList)
# Prints [1, 2, 2, 5, 7, 9]
You can also combine RichieHindle’s and Ned Batchelder’s responses for an average-case O(m+n) algorithm that preserves order:
first_list = [1, 2, 2, 5]
second_list = [2, 5, 7, 9]
fs = set(first_list)
resulting_list = first_list + [x for x in second_list if x not in fs]
assert(resulting_list == [1, 2, 2, 5, 7, 9])
Note that x in s has a worst-case complexity of O(m), so the worst-case complexity of this code is still O(m*n).
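The set-lookup approach above can be wrapped in a reusable function. The sketch below uses the hypothetical name merge_keep_first and assumes all elements are hashable. It also adds each new element to the set as it goes, so a value repeated within the second list is appended only once, which goes slightly beyond the snippet above:

```python
def merge_keep_first(first, second):
    """Return first (duplicates kept) plus the unseen elements of second, in order."""
    seen = set(first)      # average-case O(1) membership tests
    result = list(first)   # copy so the input list is not modified
    for x in second:
        if x not in seen:
            seen.add(x)    # remember x so a later repeat in second is skipped
            result.append(x)
    return result


print(merge_keep_first([1, 2, 2, 5], [2, 5, 7, 9]))     # [1, 2, 2, 5, 7, 9]
print(merge_keep_first([1, 2, 2, 5], [7, 2, 7, 9, 5]))  # [1, 2, 2, 5, 7, 9]
```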
first_list = [1, 2, 2, 5]
second_list = [2, 5, 7, 9]
print( set( first_list + second_list ) )
Simplest to me is:
first_list = [1, 2, 2, 5]
second_list = [2, 5, 7, 9]
merged_list = list(set(first_list+second_list))
print(merged_list)
#prints [1, 2, 5, 7, 9]
You can bring this down to a single line of code if you use numpy:
import numpy as np
a = [1,2,3,4,5,6,7]
b = [2,4,7,8,9,10,11,12]
np.unique(a + b).tolist()  # np.unique already returns the values sorted
# [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
Based on the recipe:
resulting_list = list(set().union(first_list, second_list))
You can use dict.fromkeys to return a list with no duplicates:
def mergeTwoListNoDuplicates(list1, list2):
    """
    Merges two lists together without duplicates
    :param list1: first list
    :param list2: second list
    :return: merged list with duplicates removed, in order of first occurrence
    """
    merged_list = list1 + list2
    merged_list = list(dict.fromkeys(merged_list))
    return merged_list
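Applying dict.fromkeys to the concatenated list also removes the duplicates that were already in the first list, which the question wants to keep. One possible variant, sketched here under the assumption that all elements are hashable, applies dict.fromkeys to the second list only; the function name merge_preserving_first is hypothetical:

```python
def merge_preserving_first(list1, list2):
    """Keep list1 as-is and append the elements of list2 that list1 lacks."""
    in_first = set(list1)
    # dict.fromkeys de-duplicates list2 while keeping first-occurrence order
    # (dicts preserve insertion order in Python 3.7+).
    new_items = [x for x in dict.fromkeys(list2) if x not in in_first]
    return list1 + new_items


first_list = [1, 2, 2, 5]
second_list = [2, 5, 7, 9]
print(list(dict.fromkeys(first_list + second_list)))    # [1, 2, 5, 7, 9]
print(merge_preserving_first(first_list, second_list))  # [1, 2, 2, 5, 7, 9]
```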