Is there a performance cost to putting Python imports inside functions?
Question:
I build quite complex python apps, often with Django. To simplify inter-application interfaces I sometimes use service.py modules that abstract away from the models.
As these ‘aggregate functionality’, they frequently end up with circular imports which are easily eliminated by placing the import statements inside the service functions.
Is there a significant performance or memory cost associated with generally moving imports as close to their point of use as possible? For example, if I only use a particular imported name in one function in a file, it seems natural to place the import in that particular function rather than at the top of the file in its conventional place.
This issue is subtly different to this question because each import is in the function namespace.
Answers:
See this question.
Basically, whenever you import a module, if it has already been imported, Python uses the cached module object.
This means the performance cost is paid only the first time the module is loaded; once loaded, the cached module is reused by every subsequent import.
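This caching is easy to observe directly through sys.modules, where Python stores every loaded module:

```python
import sys

import json                     # first import: the module is loaded and cached

cached = sys.modules["json"]    # the cache entry created by that import

import json as json_again       # a later import is just a dictionary lookup
print(json_again is cached)     # True: the very same singleton module object
```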
The point at which you import a module is not expected to cause a performance penalty, if that’s what you’re worried about. Modules are singletons and will not be imported every single time an import statement is encountered. However, how you do the import, and subsequent attribute lookups, does have an impact.
For example, if you import math and then every time you need to use the sin(...) function you have to do math.sin(...), this will generally be slower than doing from math import sin and using sin(...) directly, as the system does not have to keep looking up the function name within the module.
This lookup penalty applies to anything that is accessed using the dot (.) operator and will be particularly noticeable in a loop. It’s therefore advisable to get a local reference to anything you need to use or invoke frequently in a performance-critical loop or section.
For example, using the original import math example, right before a critical loop you could do something like this:
# ... within some function
sin = math.sin
for i in range(0, REALLY_BIG_NUMBER):
    x = sin(i)  # faster than: x = math.sin(i)
# ...
This is a trivial example, but note that you could do something similar with methods on other objects (e.g. lists, dictionaries, etc).
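For instance, the same trick works for a bound method such as list.append, hoisted into a local name before the loop:

```python
data = []
append = data.append    # one attribute lookup instead of one per iteration
for i in range(5):
    append(i)           # avoids re-evaluating data.append inside the loop
print(data)             # [0, 1, 2, 3, 4]
```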
I’m probably a bit more concerned about the circular imports you mention. If your intention is to "fix" circular imports by moving the import statements into more "local" places (e.g. within a specific function, or block of code, etc) you probably have a deeper issue that you need to address.
Personally, I’d keep the imports at the top of the module, as it’s normally done. Straying from that pattern for no good reason is likely to make your code more difficult to read, because the dependencies of your module will not be immediately apparent (i.e. there are import statements scattered throughout the code instead of in a single location).
It might also make the circular dependency issue you seem to be having more difficult to debug and easier to fall into. After all, if the import is not listed at the top, someone might happily think your module A has no dependency on module B and then end up adding an import A in B when A already has an import B hidden in some deep dark corner.
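To illustrate the trap, here is a self-contained sketch with two hypothetical modules, mod_a and mod_b (written to a temp directory so it can be run as-is), each importing a name from the other at import time; the second from-import fails because the first module is still only partially initialized:

```python
import os
import sys
import tempfile
import textwrap

tmp = tempfile.mkdtemp()

# mod_a needs a name from mod_b at import time ...
with open(os.path.join(tmp, "mod_a.py"), "w") as f:
    f.write(textwrap.dedent("""
        from mod_b import thing

        def helper():
            return "a" + thing
    """))

# ... and mod_b needs a name from mod_a at import time
with open(os.path.join(tmp, "mod_b.py"), "w") as f:
    f.write(textwrap.dedent("""
        from mod_a import helper

        thing = "b"
    """))

sys.path.insert(0, tmp)
error = None
try:
    import mod_a            # mod_a -> mod_b -> mod_a (partially initialized)
except ImportError as exc:
    error = type(exc).__name__
print(error)                # ImportError
```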
Benchmark Sample
Here’s a benchmark using the lookup notation:
>>> timeit('for i in range(0, 10000): x = math.sin(i)', setup='import math', number=50000)
89.7203312900001
And another benchmark not using the lookup notation:
>>> timeit('for i in range(0, 10000): x = sin(i)', setup='from math import sin', number=50000)
78.27029322999988
Here there’s a 10+ second difference.
Note that how much you gain depends on how much time the program spends running this code, i.e. whether it is a performance-critical section or just sporadic function calls.
As ray said, importing specific functions is slightly faster:
1.62852311134 seconds for sin()
1.89815092087 seconds for math.sin()
using the following code:
import math
from time import time

sin = math.sin
t1 = time()
for i in range(10000000):
    x = sin(i)
t2 = time()
for i in range(10000000):
    z = math.sin(i)
t3 = time()
print(t2 - t1)
print(t3 - t2)
As per timeit, there is a significant cost to an import statement, even when the module has already been imported in the same namespace:
$ python -m timeit -s 'import sys
def foo():
    import sys
    assert sys is not None
' -- 'foo()'
500000 loops, best of 5: 824 nsec per loop
$ python -m timeit -s 'import sys
def foo():
    assert sys is not None
' -- 'foo()'
2000000 loops, best of 5: 96.3 nsec per loop
(Timing figures from Python 3.10.6 on Termux running on a phone.)
Instead of imports within functions, I’ve found that I can take advantage of Python’s support for partially initialized modules and do a "tail import", pushing the import statement to the very bottom of the file (with a # isort:skip comment to get isort to leave it alone). This allows circular imports, as long as the tail import is needed not at module or class level but only at function or method level.
All of these options are valid and pretty fast. It’s likely that even the slowest one, which is about 17x slower than the fastest one, won’t make much of a difference unless you are very CPU-bound:
python -m timeit -s '
def foo():
    import string
    assert string is not None
' -- 'foo()'
2000000 loops, best of 5: 129 nsec per loop
python -m timeit -s '
def foo():
    if getattr(foo, "string", None) is None:
        import string
        foo.string = string
    assert foo.string is not None
' -- 'foo()'
5000000 loops, best of 5: 95.5 nsec per loop
python -m timeit -s '
import string
def foo():
    assert string is not None
' -- 'foo()'
5000000 loops, best of 5: 41.9 nsec per loop
But if you are using the from x import y style, the results are a bit different:
python -m timeit -s '
def foo():
    if getattr(foo, "capwords", None) is None:
        from string import capwords
        foo.capwords = capwords
    assert foo.capwords is not None
' -- 'foo()'
500000 loops, best of 5: 733 nsec per loop
python -m timeit -s '
def foo():
    from string import capwords
    assert capwords is not None
' -- 'foo()'
500000 loops, best of 5: 630 nsec per loop
python -m timeit -s '
from string import capwords
def foo():
    assert capwords is not None
' -- 'foo()'
5000000 loops, best of 5: 41.9 nsec per loop
I also tried:
python -m timeit -s '
sys = None
def foo():
    global sys
    if sys is None:
        import sys
    assert sys is not None
' -- 'foo()'
5000000 loops, best of 5: 44.4 nsec per loop
but it’s strange that
python -m timeit -s '
string = None
def foo():
    global string
    if string is None:
        import string
    assert string is not None
' -- 'foo()'
-- ERROR !!
raises an error…
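A plausible explanation (an assumption based on timeit internals, not something the original post states): timeit pastes the setup code into the body of a generated timing function, so string = None becomes a local variable there, not a global. foo’s global string therefore finds no such global and fails with a NameError. The sys variant only appears to work because the timeit module itself imports sys, so a module-level global named sys already exists in the namespace foo runs against. A sketch that reproduces the failure:

```python
import timeit

# Same setup as the failing command-line benchmark above
setup = """
string = None
def foo():
    global string
    if string is None:
        import string
    assert string is not None
"""

error = None
try:
    timeit.timeit("foo()", setup=setup, number=10)
except NameError as exc:
    error = type(exc).__name__     # 'string' is a local of timeit's inner(),
                                   # so the global lookup finds nothing
print(error)
```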