Understanding list comprehensions in Python
Question:
When reading the official tutorial, I encountered this example:
>>> vec = [[1,2,3], [4,5,6], [7,8,9]]
>>> [num for elem in vec for num in elem]
[1, 2, 3, 4, 5, 6, 7, 8, 9]
I couldn’t understand this, so I did some experiments:
>>> [num for elem in vec]
[9, 9, 9]
>>> [num for elem in (vec for num in elem)]
[9, 9, 9]
And I’m even more confused now!
In which order should I read a list comprehension?
I’m sure I haven’t defined a num variable with the value 9 anywhere.
python
Output:
Python 2.7.10 (default, Oct 23 2015, 19:19:21)
[GCC 4.2.1 Compatible Apple LLVM 7.0.0 (clang-700.0.59.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
REPL session:
>>> num
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'num' is not defined
>>> vec = [[1,2,3], [4,5,6], [7,8,9]]
>>> [num for elem in vec for num in elem]
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> [num for elem in vec]
[9, 9, 9]
>>> [num for elem in (vec for num in elem)]
[9, 9, 9]
Answers:
When you execute the first list comprehension, num is left with the value 9, so the next time you iterate through vec you get a list of 9s.
See this.
In [1]: vec = [[1,2,3], [4,5,6], [7,8,9]]
In [2]: [num for elem in vec for num in elem]
Out[2]: [1, 2, 3, 4, 5, 6, 7, 8, 9]
In [3]: num
Out[3]: 9
In [4]: [num for elem in vec]
Out[4]: [9, 9, 9]
list1 = [num for elem in vec for num in elem]
is equivalent to:
list1 = []
for elem in vec:
    for num in elem:
        list1.append(num)
The loops in a list comprehension are read from left to right: the leftmost for clause is the outermost loop. Written as ordinary loops, your list comprehension would look like this:
>>> vec = [[1,2,3], [4,5,6], [7,8,9]]
>>> l = []
>>> for elem in vec:
...     for num in elem:
...         l.append(num)
...
>>> l
[1, 2, 3, 4, 5, 6, 7, 8, 9]
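To make the left-to-right reading order visible, here is a minimal sketch that records which outer element each number came from. The smaller vec and the names pairs and pairs_loop are made up for this illustration and are not part of the question.

```python
# Illustrative data, smaller than the vec in the question
vec = [[1, 2], [3, 4]]

# Comprehension: the leftmost `for` clause is the outermost loop
pairs = [(elem, num) for elem in vec for num in elem]

# The same thing written as ordinary nested loops
pairs_loop = []
for elem in vec:          # first `for` clause
    for num in elem:      # second `for` clause, nested inside the first
        pairs_loop.append((elem, num))

print(pairs == pairs_loop)  # True
for pair in pairs:
    print(pair)
# ([1, 2], 1)
# ([1, 2], 2)
# ([3, 4], 3)
# ([3, 4], 4)
```

The outer element changes slowest and the inner number changes fastest, which is exactly how the nested loops behave.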
In Python 2, the loop variables of a list comprehension leak into the enclosing scope, so num is still available afterwards:
>>> vec = [[1,2,3], [4,5,6], [7,8,9]]
>>> [num for elem in vec for num in elem]
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> num
9
Note that on Python 3, the behavior is different:
>>> vec = [[1,2,3], [4,5,6], [7,8,9]]
>>> [num for elem in vec for num in elem]
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> num
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'num' is not defined
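As a rough sketch of what this means for the experiments in the question, the following assumes Python 3 and that no global variable called num exists. The function name run_experiments and the None placeholder are only for illustration.

```python
def run_experiments():
    vec = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

    # The nested comprehension works the same way in Python 2 and 3
    flat = [num for elem in vec for num in elem]

    # On Python 3, `num` did not leak out of the comprehension above,
    # so this second comprehension cannot find it
    try:
        repeated = [num for elem in vec]
    except NameError:
        repeated = None  # placeholder meaning "num was not defined"

    return flat, repeated


flat, repeated = run_experiments()
print(flat)      # [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(repeated)  # None on Python 3 (Python 2 would give [9, 9, 9])
```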
Let me try to make the answer clearer. The for clauses are evaluated from left to right, and once the comprehension finishes (in Python 2), the loop variables num and elem keep the last values they were assigned.
Initial data:
vec = [[1,2,3], [4,5,6], [7,8,9]]
num # Undefined
elem # Undefined
Step 1: After execution of line [num for elem in vec for num in elem]
# The previously undefined variables now hold:
num = 9 # Keeps the last value of the loop, as per Python 2.7
elem = [7, 8, 9] # The same applies here (the loop
# for 'elem' in 'vec' is
# executed first, followed by
# for 'num' in 'elem')
Step 2: After execution of line [num for elem in vec]
# Output: [9, 9, 9]
# num is still 9, and it is repeated three times because
# 'vec' has three elements (three lists inside a list), so
# the 'for' loop runs three times.
# The variables now hold:
num = 9 # No change
elem = [7, 8, 9] # The last list in vec
Step 3: After execution of line [num for elem in (vec for num in elem)]
-
The rightmost part, i.e., (vec for num in elem), is evaluated first.
It is a generator that yields vec three times, since elem is still [7, 8, 9] and has length 3. The num inside the generator expression is local to the generator and does not change the outer num.
-
The outer for loop then iterates over that generator (three items), and since num is still 9, the result is [9, 9, 9]: the value of num repeated three times.
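The three steps above can be replayed with ordinary for loops, which leave their loop variables bound afterwards in both Python 2 and Python 3. This is only a sketch of the Python 2.7 behaviour described here. The names flat, step2, inner, and step3 are introduced for illustration.

```python
vec = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Step 1: same as [num for elem in vec for num in elem] in Python 2
flat = []
for elem in vec:
    for num in elem:
        flat.append(num)
# Afterwards: num == 9 and elem == [7, 8, 9]

# Step 2: same as [num for elem in vec]
step2 = []
for elem in vec:
    step2.append(num)  # num is still 9 on every pass
# Afterwards: elem == [7, 8, 9] again, num unchanged

# Step 3: same as [num for elem in (vec for num in elem)]
# The generator iterates over the current elem ([7, 8, 9]) and keeps
# its own private num, so the outer num stays 9.
inner = (vec for num in elem)
step3 = []
for elem in inner:     # elem is rebound to vec on each of the three passes
    step3.append(num)

print(flat)   # [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(step2)  # [9, 9, 9]
print(step3)  # [9, 9, 9]
```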