Why is my Python code for finding the intersection between lists not working as intended?
Question:
First, sorry if this question is too basic; I am still a beginner in programming. I was trying to write code that generates two random lists and finds their intersection, without duplicates. My idea, which uses the following setup with a list comprehension, is not working as intended:
import random
a = random.choices(range(0, 10), k = 7)
b = random.choices(range(0, 10), k = 7)
print(a)
print(b)
c = []
c = [i for i in a if i in b if i not in c]
Here are some of the results:
a = [1, 4, 1, 7, 2, 3, 8]
b = [5, 6, 4, 9, 4, 4, 1]
c = [1, 4, 1]

a = [7, 3, 8, 4, 7, 5, 3]
b = [3, 3, 7, 8, 1, 4, 7]
c = [7, 3, 8, 4, 7, 3]
Clearly, there are duplicates being included. Why is it happening? Shouldn’t list c be updated after each loop and the code check if the duplicate is already there and hence not include it?
Answers:
Shouldn’t list c be updated after each loop and the code check if the duplicate is already there and hence not include it?
No.
In an assignment like c = [1, 2, 3], the expression on the right-hand side of the = is evaluated first (in its entirety), and then the name on the left-hand side of the = is bound to that value.
The fact that you had to do c = [] to prevent an exception from being raised in the body of your list comprehension is a clue: if you didn’t have that, c wouldn’t be bound to anything until the list comprehension completed. With your code as you have it, c is bound to an empty list, and then it gets rebound to the finished list once the comprehension completes. There is no in-between state where c is bound to the partially built list while the comprehension is still in progress, so every i not in c check runs against the empty list.
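To make this visible, here is a small sketch that records the length of c each time the filter runs. The helper not_in_c and the list seen_lengths are hypothetical names introduced only for this illustration, and the input lists are fixed to the first example from the question rather than generated randomly.

```python
a = [1, 4, 1, 7, 2, 3, 8]
b = [5, 6, 4, 9, 4, 4, 1]

c = []
seen_lengths = []  # hypothetical: records len(c) every time the filter runs

def not_in_c(i):
    # Look at c at the moment the comprehension evaluates its condition.
    seen_lengths.append(len(c))
    return i not in c

# Same shape as the comprehension in the question, with the last test wrapped.
c = [i for i in a if i in b if not_in_c(i)]

print(seen_lengths)  # [0, 0, 0] -> c was still the empty list at every check
print(c)             # [1, 4, 1] -> duplicates slip through
```

Only after the whole comprehension has finished is the name c rebound to the new list.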
If you were to write this as a for loop:
c = []
for i in a:
    if i in b:
        if i not in c:
            c.append(i)
you get the behavior you want, because now c is actually being modified (via the c.append call) on each iteration of the loop.
It is simpler to avoid duplicates by using a set instead of a list; this is often as straightforward as using a set comprehension instead of a list comprehension:
c = {i for i in a if i in b}
but since you’re trying to find an intersection, it’s simpler yet if you make sets out of a and b and then use the set intersection operator:
c = set(a) & set(b)
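One thing to keep in mind is that a set has no defined order, so the result above is a set rather than a list in the order of a. If you need a list that keeps the order of first appearance in a, one possible sketch (assuming Python 3.7+, where dicts preserve insertion order; b_set is a name introduced here for illustration) is:

```python
a = [7, 3, 8, 4, 7, 5, 3]
b = [3, 3, 7, 8, 1, 4, 7]

b_set = set(b)  # membership tests on a set are fast on average

# dict.fromkeys keeps only the first occurrence of each key, in insertion order.
c = list(dict.fromkeys(i for i in a if i in b_set))

print(c)  # [7, 3, 8, 4]

# Same elements as the set intersection, just ordered as they appear in a.
assert set(c) == set(a) & set(b)
```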
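If you instead pass a generator expression (the same comprehension without the square brackets) to c.extend, the items are appended as they are produced, so the i not in c check sees the earlier additions and your code will work as expected: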
c = []
c.extend(i for i in a if i in b if i not in c)
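The difference between the lazy generator expression and an eagerly built list can be seen side by side in the sketch below. The names lazy and eager are introduced here for illustration. Note that this relies on extend consuming the generator item by item and updating the list as it goes, which is how CPython behaves; treat it as an implementation detail rather than a documented guarantee.

```python
a = [1, 4, 1, 7, 2, 3, 8]
b = [5, 6, 4, 9, 4, 4, 1]

# Generator expression: each item is appended before the next one is tested,
# so the "not in lazy" check sees earlier additions (CPython behavior).
lazy = []
lazy.extend(i for i in a if i in b if i not in lazy)

# List comprehension: the whole list is built first, while eager is still
# empty, and only then handed to extend.
eager = []
eager.extend([i for i in a if i in b if i not in eager])

print(lazy)   # [1, 4]
print(eager)  # [1, 4, 1]
```

Because the behavior is subtle, the explicit for loop or the set-based versions are probably easier for a reader to follow.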