How to read Python bytecode?

Question:

I am having a lot of difficulty understanding Python’s bytecode and its dis module.

import dis
def func():
   x = 1
dis.dis(func)

The above code when typed in the interpreter produces the following output:

    0 LOAD_CONST                  1(1)
    3 STORE_FAST                  0(x)
    6 LOAD_CONST                  0(NONE)
    9 RETURN_VALUE

For example, what do LOAD_CONST and STORE_FAST mean, and what are the numbers 0, 3, 6 and 9?

A specific resource where I can find this information would be much appreciated.

Asked By: Pratik Singhal


Answers:

The numbers before the bytecodes are offsets into the original binary bytecodes:

>>> func.__code__.co_code
'd\x01\x00}\x00\x00d\x00\x00S'

Some bytecodes come with additional information (arguments) that influences how the bytecode works; the offset tells you at which position in the bytestream the bytecode was found.

For example, the LOAD_CONST bytecode (ASCII d, hex 64) is followed by two additional bytes that encode a reference to a constant associated with the bytecode. As a result, the next opcode, STORE_FAST (ASCII }, hex 7D), is found at index 3.
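If you would rather inspect the offsets and arguments from code than read the raw bytes, here is a minimal sketch using dis.get_instructions (Python 3.4 and later). The exact offsets and opcode names depend on the CPython version: the output above is from Python 2, where an instruction with an argument takes three bytes, while CPython 3.6 and later use two-byte units. The sketch therefore does not hard-code any offsets.

```python
import dis

def func():
    x = 1

# Each Instruction records the byte offset, the opcode name,
# the raw argument (None if there is none) and a readable form of it.
instructions = list(dis.get_instructions(func))
for instr in instructions:
    print(instr.offset, instr.opname, instr.arg, instr.argrepr)

# Offsets only ever grow as you walk through the bytestream.
offsets = [instr.offset for instr in instructions]
```

The first column printed corresponds to the 0, 3, 6, 9 column in the question, although the actual numbers will differ on a modern interpreter.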

The dis module documentation lists what each instruction means. For LOAD_CONST, it says:

Pushes co_consts[consti] onto the stack.

which refers to the co_consts structure that is always present with a code object; the compiler constructs that:

>>> func.__code__.co_consts
(None, 1)

The opcode loads index 1 from that structure (the 01 00 bytes in the bytecode encode a 1), and dis has looked that up for you; it is the value 1.
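You can check that lookup yourself. The following is a sketch, assuming Python 3.4 or later for dis.get_instructions: it compares the raw index (arg) of every constant-using opcode with the value dis resolved for it (argval). Which opcodes appear, and the exact contents of co_consts, vary between CPython versions, so the sketch only relies on the relationship between the two.

```python
import dis

def func():
    x = 1

code = func.__code__
print(code.co_consts)

# dis.hasconst lists the opcodes whose argument is an index into co_consts.
const_loads = [
    (instr.opname, instr.arg, instr.argval)
    for instr in dis.get_instructions(func)
    if instr.opcode in dis.hasconst
]
for opname, index, value in const_loads:
    # The value dis shows is simply co_consts[index].
    print(opname, index, value, code.co_consts[index] == value)
```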

The next instruction, STORE_FAST is described as:

Stores TOS into the local co_varnames[var_num].

Here TOS refers to Top Of Stack; note that the LOAD_CONST just pushed something onto the stack, the 1 value. co_varnames is another structure; it holds the local variable names, and the opcode references index 0:

>>> func.__code__.co_varnames
('x',)

dis looked that up too, and the name you used in your code is x. Thus, this opcode stored 1 into x.
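The same check works for the local variable table. This sketch (again assuming Python 3.4+ for dis.get_instructions) collects every STORE_FAST in the function and shows that its argument indexes into co_varnames:

```python
import dis

def func():
    x = 1

code = func.__code__
print(code.co_varnames)

# For STORE_FAST, `arg` is the index and `argval` is the resolved name.
stores = [
    (instr.arg, instr.argval)
    for instr in dis.get_instructions(func)
    if instr.opname == "STORE_FAST"
]
for index, name in stores:
    print(index, name, code.co_varnames[index])
```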

Another LOAD_CONST loads None onto the stack from index 0, followed by RETURN_VALUE:

Returns with TOS to the caller of the function.

so this instruction takes the top of the stack (with the None constant) and returns from this code block. None is the default return value for functions without an explicit return statement.
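A quick way to convince yourself of the implicit return is to call the function and look at which return opcode the compiler emitted. This is only a sketch: the opcode name is matched by prefix because some newer CPython releases fold the load and the return into a single instruction with a different name.

```python
import dis

def func():
    x = 1

# No explicit return statement, so the call evaluates to None.
result = func()
print(result)

# Collect whatever return opcode(s) the compiler generated.
return_ops = [
    instr.opname
    for instr in dis.get_instructions(func)
    if instr.opname.startswith("RETURN")
]
print(return_ops)
```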

You omitted one thing from the dis output, namely the line numbers:

>>> dis.dis(func)
  2           0 LOAD_CONST               1 (1)
              3 STORE_FAST               0 (x)
              6 LOAD_CONST               0 (None)
              9 RETURN_VALUE        

Note the 2 on the first line; that’s the line number in the original source that contains the Python code that was used for these instructions. Python code objects have co_lnotab and co_firstlineno attributes that let you map bytecodes back to line numbers in the original source. dis does this for you when displaying a disassembly.
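To look at that mapping directly, here is a small sketch. It compiles the function from a string so that the line numbers are predictable, and it uses dis.findlinestarts rather than decoding co_lnotab by hand, since co_lnotab is deprecated in recent Python 3 releases. The filename "<example>" is just a placeholder.

```python
import dis

# Compile from a string so the line numbers are known in advance:
# `def` is on line 1 and the assignment is on line 2.
source = "def func():\n    x = 1\n"
namespace = {}
exec(compile(source, "<example>", "exec"), namespace)
code = namespace["func"].__code__

print(code.co_firstlineno)

# findlinestarts yields (offset, line number) pairs, one for each
# point where the bytecode moves on to a different source line.
line_starts = list(dis.findlinestarts(code))
print(line_starts)
```

The line number paired with the assignment's offset is the 2 that dis prints in the left-hand column.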

Answered By: Martijn Pieters