Python 3.6.5 "is" and "==" for integers beyond caching interval
Question:
I want to preface this by saying that I know the difference between "==" and "is": one compares values and the other compares object identity (references). I also know that Python caches the integers from -5 to 256 inclusive at startup, so comparing them with "is" should work.
However, I have seen some strange behaviour.
>>> 2**7 is 2**7
True
>>> 2**10 is 2**10
False
This is to be expected: 2**7 evaluates to 128 and 2**10 to 1024, so one is in the interval [-5, 256] and the other is not.
However…
>>> 10000000000000000000000000000000000000000 is 10000000000000000000000000000000000000000
True
Why does this return True? It is obviously a value WAY above any kind of caching interval, and 2**10 is 2**10 clearly showed that "is" does not actually work on integers above 256. So… why does this happen?
Answers:
Remember that Python source is compiled to bytecode before it runs. The whole expression was compiled at once, and equal literals within one compilation are shared when possible. An operation, like your exponentiation, or adding and subtracting 1 on one side, can break the identity. (Whether the optimizer folds such constant expressions into a single shared object is a CPython implementation detail that varies between versions, so it should not be relied on.)
Performing multiple compilations will also break this, and the interactive prompt compiles each statement separately:
>>> x=300
>>> x is 300
False
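The effect of the compilation unit can be reproduced outside the interactive prompt. The following is a minimal sketch, assuming CPython; the helper names compiled_together and compiled_separately are made up for illustration. Each helper returns a pair (a is b, a == b).

```python
# A literal far outside the small-integer cache, as in the question.
BIG_LITERAL = "1" + "0" * 40


def compiled_together():
    """Compile both assignments in one unit, then compare the results."""
    namespace = {}
    source = f"a = {BIG_LITERAL}\nb = {BIG_LITERAL}\n"
    exec(compile(source, "<together>", "exec"), namespace)
    return namespace["a"] is namespace["b"], namespace["a"] == namespace["b"]


def compiled_separately():
    """Compile each assignment on its own, as the interactive prompt does."""
    namespace = {}
    exec(compile(f"a = {BIG_LITERAL}\n", "<first>", "exec"), namespace)
    exec(compile(f"b = {BIG_LITERAL}\n", "<second>", "exec"), namespace)
    return namespace["a"] is namespace["b"], namespace["a"] == namespace["b"]


if __name__ == "__main__":
    # One compilation unit: the two literals share a single constant.
    print("together:  ", compiled_together())
    # Two compilation units: identity is typically lost on CPython,
    # but equality always holds.
    print("separately:", compiled_separately())
```

The values are equal in both cases; only the identity depends on how the source was compiled, which is why "==" is the right tool for comparing numbers.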
CPython detects constant values in your code and reuses them to save memory. These constants are stored on code objects, and can even be accessed from within Python:
>>> codeobj = compile('999 is 999', '<stdin>', 'exec')
>>> codeobj
<code object <module> at 0x7fec489ef420, file "<stdin>", line 1>
>>> codeobj.co_consts
(999, None)
Both operands of your "is" refer to this very same 999 integer. We can confirm this by disassembling the code:
>>> import dis
>>> dis.dis(codeobj)
  1           0 LOAD_CONST               0 (999)
              2 LOAD_CONST               0 (999)
              4 COMPARE_OP               8 (is)
              6 POP_TOP
              8 LOAD_CONST               1 (None)
             10 RETURN_VALUE
As you can see, the first two LOAD_CONST instructions both load the constant with index 0, which is the 999 number.
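The same check can be done programmatically instead of by reading the disassembly. This is a rough sketch, assuming CPython; constant_indices is a hypothetical helper, and it uses two assignments rather than an "is" expression so that newer Python versions do not warn about "is" with a literal.

```python
import dis

# A value well outside the small-integer cache.
BIG = 10 ** 40
SOURCE = f"x = {BIG}\ny = {BIG}\n"


def constant_indices(source, value):
    """Return how many co_consts slots hold `value`, and the co_consts
    index used by each LOAD_CONST instruction that loads it."""
    code = compile(source, "<example>", "exec")
    slots = code.co_consts.count(value)
    indices = [
        ins.arg
        for ins in dis.get_instructions(code)
        if ins.opname == "LOAD_CONST" and ins.argval == value
    ]
    return slots, indices


if __name__ == "__main__":
    slots, indices = constant_indices(SOURCE, BIG)
    # The literal appears twice in the source but is stored only once,
    # and both loads refer to that single slot.
    print("slots holding the value:", slots)
    print("LOAD_CONST indices:", indices)
```

Because both loads use the same index, both names end up bound to the same object, which is what makes "is" return True here.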
However, this only happens if the two numbers are compiled at the same time. If you create each number in a separate code object, they will no longer be identical:
>>> x = 999
>>> x is 999
False