Associativity of "in" in Python?
Question:
I’m making a Python parser, and this is really confusing me:
>>> 1 in [] in 'a'
False
>>> (1 in []) in 'a'
TypeError: 'in <string>' requires string as left operand, not bool
>>> 1 in ([] in 'a')
TypeError: 'in <string>' requires string as left operand, not list
How exactly does in
work in Python, with regards to associativity, etc.?
Why do no two of these expressions behave the same way?
Answers:
1 in [] in 'a'
is evaluated as (1 in []) and ([] in 'a')
.ยน
Since the first condition (1 in []
) is False
, the whole condition is evaluated as False
; ([] in 'a')
is never actually evaluated, so no error is raised.
We can see how Python executes each statement using the dis
module:
>>> from dis import dis
>>> dis("1 in [] in 'a'")
1 0 LOAD_CONST 0 (1)
2 BUILD_LIST 0
4 DUP_TOP
6 ROT_THREE
8 CONTAINS_OP 0 # `in` is the contains operator
10 JUMP_IF_FALSE_OR_POP 18 # skip to 18 if the first
# comparison is false
12 LOAD_CONST 1 ('a') # 12-16 are never executed
14 CONTAINS_OP 0 # so no error here (14)
16 RETURN_VALUE
>> 18 ROT_TWO
20 POP_TOP
22 RETURN_VALUE
>>> dis("(1 in []) in 'a'")
1 0 LOAD_CONST 0 (1)
2 LOAD_CONST 1 (())
4 CONTAINS_OP 0 # perform 1 in []
6 LOAD_CONST 2 ('a') # now load 'a'
8 CONTAINS_OP 0 # check if result of (1 in []) is in 'a'
# throws Error because (False in 'a')
# is a TypeError
10 RETURN_VALUE
>>> dis("1 in ([] in 'a')")
1 0 LOAD_CONST 0 (1)
2 BUILD_LIST 0
4 LOAD_CONST 1 ('a')
6 CONTAINS_OP 0 # perform ([] in 'a'), which is
# incorrect, so it throws a TypeError
8 CONTAINS_OP 0 # if no Error then this would
# check if 1 is in the result of ([] in 'a')
10 RETURN_VALUE
- Except that
[]
is only evaluated once. This doesn’t matter in this example but if you (for example) replaced []
with a function that returned a list, that function would only be called once (at most). The documentation explains also this.
Python does special things with chained comparisons.
The following are evaluated differently:
x > y > z # in this case, if x > y evaluates to true, then
# the value of y is used, again, and compared with z
(x > y) > z # the parenthesized form, on the other hand, will first
# evaluate x > y. And, compare the evaluated result
# with z, which can be "True > z" or "False > z"
In both cases though, if the first comparison is False
, the rest of the statement won’t be looked at.
For your particular case,
1 in [] in 'a' # this is false because 1 is not in []
(1 in []) in a # this gives an error because we are
# essentially doing this: False in 'a'
1 in ([] in 'a') # this fails because you cannot do
# [] in 'a'
Also to demonstrate the first rule above, these are statements that evaluate to True.
1 in [1,2] in [4,[1,2]] # But "1 in [4,[1,2]]" is False
2 < 4 > 1 # and note "2 < 1" is also not true
Precedence of Python operators: https://docs.python.org/3/reference/expressions.html#comparisons
Comparisons can be chained arbitrarily, e.g., x < y <= z is equivalent to x < y and y <= z, except that y is evaluated only once (but in both cases z is not evaluated at all when x < y is found to be false).
What this means is, that there no associativity in x in y in z
!
The following are equivalent:
1 in [] in 'a'
# <=>
middle = []
# False not evaluated
result = (1 in middle) and (middle in 'a')
(1 in []) in 'a'
# <=>
lhs = (1 in []) # False
result = lhs in 'a' # False in 'a' - TypeError
1 in ([] in 'a')
# <=>
rhs = ([] in 'a') # TypeError
result = 1 in rhs
The short answer, since the long one is already given several times here and in excellent ways, is that the boolean expression is short-circuited, this is has stopped evaluation when a change of true in false or vice versa cannot happen by further evaluation.
(see http://en.wikipedia.org/wiki/Short-circuit_evaluation)
It might be a little short (no pun intended) as an answer, but as mentioned, all other explanation is allready done quite well here, but I thought the term deserved to be mentioned.
I’m making a Python parser, and this is really confusing me:
>>> 1 in [] in 'a'
False
>>> (1 in []) in 'a'
TypeError: 'in <string>' requires string as left operand, not bool
>>> 1 in ([] in 'a')
TypeError: 'in <string>' requires string as left operand, not list
How exactly does in
work in Python, with regards to associativity, etc.?
Why do no two of these expressions behave the same way?
1 in [] in 'a'
is evaluated as (1 in []) and ([] in 'a')
.ยน
Since the first condition (1 in []
) is False
, the whole condition is evaluated as False
; ([] in 'a')
is never actually evaluated, so no error is raised.
We can see how Python executes each statement using the dis
module:
>>> from dis import dis
>>> dis("1 in [] in 'a'")
1 0 LOAD_CONST 0 (1)
2 BUILD_LIST 0
4 DUP_TOP
6 ROT_THREE
8 CONTAINS_OP 0 # `in` is the contains operator
10 JUMP_IF_FALSE_OR_POP 18 # skip to 18 if the first
# comparison is false
12 LOAD_CONST 1 ('a') # 12-16 are never executed
14 CONTAINS_OP 0 # so no error here (14)
16 RETURN_VALUE
>> 18 ROT_TWO
20 POP_TOP
22 RETURN_VALUE
>>> dis("(1 in []) in 'a'")
1 0 LOAD_CONST 0 (1)
2 LOAD_CONST 1 (())
4 CONTAINS_OP 0 # perform 1 in []
6 LOAD_CONST 2 ('a') # now load 'a'
8 CONTAINS_OP 0 # check if result of (1 in []) is in 'a'
# throws Error because (False in 'a')
# is a TypeError
10 RETURN_VALUE
>>> dis("1 in ([] in 'a')")
1 0 LOAD_CONST 0 (1)
2 BUILD_LIST 0
4 LOAD_CONST 1 ('a')
6 CONTAINS_OP 0 # perform ([] in 'a'), which is
# incorrect, so it throws a TypeError
8 CONTAINS_OP 0 # if no Error then this would
# check if 1 is in the result of ([] in 'a')
10 RETURN_VALUE
- Except that
[]
is only evaluated once. This doesn’t matter in this example but if you (for example) replaced[]
with a function that returned a list, that function would only be called once (at most). The documentation explains also this.
Python does special things with chained comparisons.
The following are evaluated differently:
x > y > z # in this case, if x > y evaluates to true, then
# the value of y is used, again, and compared with z
(x > y) > z # the parenthesized form, on the other hand, will first
# evaluate x > y. And, compare the evaluated result
# with z, which can be "True > z" or "False > z"
In both cases though, if the first comparison is False
, the rest of the statement won’t be looked at.
For your particular case,
1 in [] in 'a' # this is false because 1 is not in []
(1 in []) in a # this gives an error because we are
# essentially doing this: False in 'a'
1 in ([] in 'a') # this fails because you cannot do
# [] in 'a'
Also to demonstrate the first rule above, these are statements that evaluate to True.
1 in [1,2] in [4,[1,2]] # But "1 in [4,[1,2]]" is False
2 < 4 > 1 # and note "2 < 1" is also not true
Precedence of Python operators: https://docs.python.org/3/reference/expressions.html#comparisons
Comparisons can be chained arbitrarily, e.g., x < y <= z is equivalent to x < y and y <= z, except that y is evaluated only once (but in both cases z is not evaluated at all when x < y is found to be false).
What this means is, that there no associativity in x in y in z
!
The following are equivalent:
1 in [] in 'a'
# <=>
middle = []
# False not evaluated
result = (1 in middle) and (middle in 'a')
(1 in []) in 'a'
# <=>
lhs = (1 in []) # False
result = lhs in 'a' # False in 'a' - TypeError
1 in ([] in 'a')
# <=>
rhs = ([] in 'a') # TypeError
result = 1 in rhs
The short answer, since the long one is already given several times here and in excellent ways, is that the boolean expression is short-circuited, this is has stopped evaluation when a change of true in false or vice versa cannot happen by further evaluation.
(see http://en.wikipedia.org/wiki/Short-circuit_evaluation)
It might be a little short (no pun intended) as an answer, but as mentioned, all other explanation is allready done quite well here, but I thought the term deserved to be mentioned.