"Precompile" function in python, compile-time computations

Question:

If we write a function in python, whose execution depends on external variables that are already known, python will still check this every time

I mean

def fun():
    return 1 if False else 5

This "False" will be rechecked each time this function is called

This can be simply checked in a simple way:

def pp(t, s):
    print(s)
    return t

def returner(t):
    return t if pp(True, "YAY") else 0

list(map(returner,[1,2,3,4,5]))

Will print "YAY" 5 times

Is there a way to "precompile" a function so that such conditions are not checked every time it is called, but only when it is compiled

That is, a way to tell the compiler, "This expression is already known at compile time, figure it out and substitute it.

Added in response to a post by @Chepner:

You’re right, but there is a third party: speed of execution.

In your example:

X = 5

def fun():
    return __precompute__(X * 5)

X = 7

I could really just write return 25, but what if the function is defined inside another function and its definition depends on the parameters passed to the function in which this definition takes place? I can’t handwrite the return value right away, because I don’t know the parameters passed in, but the defining function does.

In this case, it is assumed that the compile-time parameter is the parameter passed to the parent function, and it will never change again.

Example

def parent(parent_1, parent_2, parent_3, parent_4, long_array):
    def child(x):
        s = 0
        if parent_1:
            s += x**2
        if parent_2:
            s += x**x
        if parent_3:
            s += 2*x
        if parent_4:
            s += 74
        return s

    return sum(list(map(child, long_array)))

In this case, these parameters will not change, and each check will take a long time, when they can not be done at all

In "fast languages", such as C++, in this case, the child function absolutely cannot be optimized during program execution. But we deal with Python, where a program can change itself at runtime, so optimization and compilation of such a child function can theoretically happen at runtime.

My question is, are there technical means for such optimization and compilation of a function at runtime?

Asked By: MPEI_stud

||

Answers:

There is always a cost to identify expressions that can safely be reduced in a dynamic language like Python. For instance,

X = 5

def fun():
    return X * 5

cannot be optimized to

def fun():
    return 25

without whole-program analysis that verifies that X is never assigned a different value. Python itself does not specify any such optimizations, but individual implementations are free to add optimizations as they see fit. For example, CPython will optimize certain constant expressions and eliminate dead code under some conditions. You can see this using the dis module to see the byte code generated for a given function.

>>> def fun(): return 5*5
...
>>> dis.dis(fun)
  1           0 LOAD_CONST               1 (25)
              2 RETURN_VALUE

or

>>> def fun():
...     if False:
...         return 1
...     else:
...         return 5
...
>>> dis.dis(fun)
  5           0 LOAD_CONST               1 (5)
              2 RETURN_VALUE

The conditional expression is not similarly optimized.

>>> def fun():
...     return 1 if False else 5
...
>>> dis.dis(fun)
  2           0 LOAD_CONST               1 (False)
              2 POP_JUMP_IF_FALSE        8
              4 LOAD_CONST               2 (1)
              6 RETURN_VALUE
        >>    8 LOAD_CONST               3 (5)
             10 RETURN_VALUE

I suspect this is because the optimizer doesn’t want to dive into expressions to identify dead code, only statements (which exist at a higher level in the AST). You can stop the optimization by replacing False with an expression that could be evaluated to False at compile time, but the optimizer similarly does not.

>>> def fun():
...     if 3 == 5:
...         return 1
...     else: return 5
...
>>> dis.dis(fun)
  2           0 LOAD_CONST               1 (3)
              2 LOAD_CONST               2 (5)
              4 COMPARE_OP               2 (==)
              6 POP_JUMP_IF_FALSE       12

  3           8 LOAD_CONST               3 (1)
             10 RETURN_VALUE

  4     >>   12 LOAD_CONST               2 (5)
             14 RETURN_VALUE
             16 LOAD_CONST               0 (None)
             18 RETURN_VALUE

OK, automatically identifying code that is safe to simplify is hard; maybe there could be a way to mark code to be simplified. But even such syntax would still need to verify the optimization is safe. Consider the first example above; you could say that it’s safe to optimize X * 5, but what is Python supposed to do about the following?

X = 5

def fun():
    return __precompute__(X * 5)

X = 7

If you precompute fun, it will always return 25, even though you changed the value of X before fun is ever called. This just makes the definition of fun confusing, and you could have just written return 25 in the first place if that was your intent. Remember, the code generator is constrained by what could be done, not what you intend to be done.

Given that such marked code still has to be analyzed for safety, is it really worth complicating the grammar for something that isn’t really any more useful than automatic optimization?

Answered By: chepner

I think that you should think about the problem from another point of view,

Yes expressions with only constants are evaluated by the interpreter and might not be optimized by it, but why care about this?

If a function given a number of arguments foo(x0, x1, x3, x4, ...) = C always returns the same value is cachable hence it will be evaluated only once

>>> from math import factorial as fact
>>> from functools import lru_cache
>>> import time
>>> 
>>> @lru_cache
... def foo(n):
...     return fact(n)
... 
>>> def timefoo(foo, args):
...     start = time.time()
...     foo(*args)
...     return time.time()-start
>>> timefoo(foo, (998,))
6.508827209472656e-05
>>> timefoo(foo, (998,))
2.1457672119140625e-06

Answered By: Axeltherabbit