Get difference from two lists in Python
Question:
I have two lists, l1 and l2. I need items from l1 which are not in l2.
l1 = [2, 3, 4, 5]
l2 = [0, 1, 2, 3]
I want to get only [4,5] – only new values in l1.
[i for i in l1 if not i in l2 ]
Can I do that without iteration?
Answers:
If you don’t care about the order of the elements, you can use sets:
l1 = set([2, 3, 4, 5])
l2 = set([0, 1, 2, 3])
print(l1 - l2)
prints
{4, 5}
You can use set_1.difference_update(set_2) for an in-place difference:
>>> sl1 = set([2, 3, 4, 5])
>>> sl2 = set([0, 1, 2, 3])
>>> sl1.difference_update(sl2)
>>> sl1
{4, 5}
Convert them to sets and use the difference operator:
l1=[2,3,4,5]
l2=[0,1,2,3]
answer = set(l1) - set(l2)
You can’t do it without iteration. Even if you call a single method, internally that will iterate.
Your approach is fine for a small list, but you could use this approach instead for larger lists:
s2 = set(l2)
result = [i for i in l1 if i not in s2]
This will be fast and will also preserve the original order of the elements in l1.
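As a minimal sketch, the same idea can be wrapped in a small helper. The name ordered_difference is made up for illustration, and the code assumes the elements of the second list are hashable:

```python
def ordered_difference(first, second):
    """Return the items of `first` that are not in `second`, keeping order and duplicates."""
    # Build the set once so each membership test is O(1) on average.
    exclude = set(second)
    return [item for item in first if item not in exclude]


l1 = [2, 3, 4, 5]
l2 = [0, 1, 2, 3]
print(ordered_difference(l1, l2))  # [4, 5]
```

Building the set outside the comprehension matters: writing `set(second)` inside the condition would rebuild it for every element.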
Short answer, yes: list(set(l1) - set(l2)), though this will not keep order.
Long answer, no, since some iteration always happens internally. If you use set(), though, that iteration is highly optimized and will be much faster than your list comprehension, mainly because checking membership in a set takes constant time on average, while checking membership in a list is a linear scan.
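To see the difference on your own machine, a rough timing sketch like the one below can help. The list sizes and the function names are arbitrary choices for illustration, and the actual numbers will vary by machine and Python version:

```python
import timeit

l1 = list(range(2000))
l2 = list(range(1000, 3000))


def with_list_lookup():
    # Each `in l2` scans the list, so the work grows with len(l1) * len(l2).
    return [i for i in l1 if i not in l2]


def with_set_lookup():
    # One pass to build the set, then O(1) average lookups.
    s2 = set(l2)
    return [i for i in l1 if i not in s2]


# Both versions give the same result; only the running time differs.
assert with_list_lookup() == with_set_lookup()

list_time = timeit.timeit(with_list_lookup, number=5)
set_time = timeit.timeit(with_set_lookup, number=5)
print(f"list lookup: {list_time:.4f}s, set lookup: {set_time:.4f}s")
```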
You can do this simply as follows:
list( set(l1) - set(l2) )
This should do the trick.
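One thing to keep in mind with the set-based one-liner is that it also drops duplicates and gives no guaranteed order. A small sketch, with made-up input values, that sorts the result when a predictable order is needed:

```python
l1 = [5, 4, 4, 2, 3]
l2 = [0, 1, 2, 3]

# The set difference drops the duplicate 4 and has no guaranteed order,
# so sort it if a predictable order is needed.
result = sorted(set(l1) - set(l2))
print(result)  # [4, 5]
```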
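The conversion to sets is great when your list elements are hashable, so that they can be stored in a set. Otherwise you’ll need something like Mark Byers’ solution, with the list itself used for the membership test. If you have large lists and would rather update l1 in place than keep a separate result list, assign the filtered items back into l1. Do not call l1.remove() while looping over l1: that skips elements and gives the wrong result.
l1[:] = [m for m in l1 if m not in l2]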
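A short sketch of the case where sets do not work, assuming the elements are nested lists (the values are made up for illustration):

```python
l1 = [[1, 2], [3, 4], [5, 6]]
l2 = [[3, 4]]

try:
    set(l1)
except TypeError as exc:
    # Lists are unhashable, so they cannot be stored in a set.
    print("set() failed:", exc)

# The plain comprehension only needs ==, so it still works.
result = [item for item in l1 if item not in l2]
print(result)  # [[1, 2], [5, 6]]
```

This keeps order and duplicates, at the cost of a linear scan of l2 for every item in l1.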
Example:
>>> a = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
>>> b = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
We can concatenate as well to get the complete difference, i.e. the items that appear in only one of the two lists (the order of the elements in the output may vary):
>>> list(set(a) - set(b)) + list(set(b) - set(a))
[89, 34, 21, 55, 4, 6, 7, 9, 10, 11, 12]
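The complete difference in both directions is what sets call the symmetric difference, so the `^` operator or the `symmetric_difference` method should give the same elements in one step. A sketch using the same two lists; the variable name complete_difference is only for illustration:

```python
a = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
b = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]

# Elements that are in exactly one of the two lists.
complete_difference = set(a) ^ set(b)

# The method form gives the same set and accepts any iterable as its argument.
assert complete_difference == set(a).symmetric_difference(b)

# Sorted only to make the printed order predictable.
print(sorted(complete_difference))
# [4, 6, 7, 9, 10, 11, 12, 21, 34, 55, 89]
```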
Using the built-in set type:
>>> a = set([1,2,3,4,5])
>>> b = set([1,3,5])
>>> a.difference(b)
{2, 4}
Another approach, passing a list directly to difference():
>>> a = set([1,2,3,4,5])
>>> b = [1,3,5]
>>> a.difference(b)
{2, 4}
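The second variant works because the difference method takes any iterable, and it can take several at once, whereas the `-` operator needs a set on both sides. A small sketch with illustrative values:

```python
a = {1, 2, 3, 4, 5}
b = [1, 3, 5]

# The method accepts any iterable, and more than one of them.
print(a.difference(b))        # {2, 4}
print(a.difference(b, (4,)))  # {2}

# The operator form requires both operands to be sets.
try:
    a - b
except TypeError as exc:
    print("operator failed:", exc)
```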
As usual in programming, a simple task can be done in a variety of ways.
We can use a list comprehension like this for exactly the same problem:
fruits = ["apple", "banana", "cherry", "kiwi", "mango"]
ddd = ["apple", "banana", "mango"]
newlist = [x for x in fruits if x not in ddd]
print(newlist)