Prevent Python packages from re-exporting imported names
Question:
In a Python package, I have the file structure
package/
    __init__.py
    import_me.py
The file import_me.py is meant to provide snippets of functionality:
import re
import sys

def hello():
    pass
so that package.import_me.hello can be imported dynamically via import. Unfortunately, this also allows re and sys to be imported as package.import_me.re and package.import_me.sys, respectively.
Is there a way to prevent the modules imported in import_me.py from being re-exported? Preferably this should go beyond name mangling or underscore-prefixing the imported modules, since in my case re-exporting them might pose a security problem in certain circumstances.
Answers:
There are a couple of options:
- Put None in sys.modules for the module:
>>> import sys
>>> import re
>>> del re
>>> sys.modules['re'] = None
>>> import re
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named re
- Use the RestrictedPython package or the pysandbox package.
Be sure to check out this article on Sandboxed Python as well.
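The session above looks like Python 2 output. On Python 3 the same trick still blocks the import, but the exception raised is ModuleNotFoundError (a subclass of ImportError). Here is a minimal sketch, using colorsys purely as a stand-in module (an assumption for the demo) and undoing the change afterwards. Note that the block applies to the whole process, not only to one module:

```python
import sys

MODULE = "colorsys"  # stand-in module, chosen only for this demo

saved = sys.modules.pop(MODULE, None)   # remember any already-loaded copy
sys.modules[MODULE] = None              # None marks the module as unimportable
blocked = False
error_type = None
try:
    import colorsys
except ImportError as exc:              # ModuleNotFoundError on Python 3
    blocked = True
    error_type = type(exc).__name__
finally:
    del sys.modules[MODULE]             # undo the block
    if saved is not None:
        sys.modules[MODULE] = saved

print(blocked, error_type)
```

Code that already holds a reference to the module object (for example package.import_me.re) is not affected by this; only new import statements fail.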
There is no easy way to forbid importing a global name from a module; Python simply is not built that way.
You could possibly achieve this by writing your own __import__ function and shadowing the built-in one, but I doubt it would be worth the cost in time and testing, or that it would be completely effective.
What you can do is import the dependent modules with a leading underscore, which is a standard Python idiom for communicating "implementation detail, use at your own risk":
import re as _re
import sys as _sys

def hello():
    pass
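To see what the underscore alias does and does not do, here is a small sketch that builds a throwaway module in memory (the module name underscore_demo and its body are made up for the demo). A star import skips the underscore-prefixed name, but the alias is still reachable as a plain attribute:

```python
import sys
import types

# Hypothetical module body, built in memory for the demo.
source = """
import re as _re

def hello():
    return bool(_re.match("h", "hello"))
"""

demo = types.ModuleType("underscore_demo")
exec(source, demo.__dict__)
sys.modules["underscore_demo"] = demo

namespace = {}
exec("from underscore_demo import *", namespace)
star_names = sorted(name for name in namespace if name != "__builtins__")

print(star_names)             # names copied by the star import
print(hasattr(demo, "_re"))   # the alias is still a module attribute

del sys.modules["underscore_demo"]  # clean up the demo module
```

So the convention communicates intent and keeps star imports clean, but it is not an access restriction.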
Note
Deleting the imported modules at the end of the file might look like a way to stop them from being imported, but it breaks any function that still uses them:
import re
import sys

def hello():
    sys
    print('hello')

del re
del sys
and then importing and using hello:
>>> import del_mod
>>> del_mod.hello()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "del_mod.py", line 5, in hello
    sys
NameError: global name 'sys' is not defined
EDIT: This doesn’t work in most cases. Please see other answers.
I know this question is old, but if you simply put
import re
import sys

def hello():
    pass

del re
del sys
then you shouldn’t be able to import re or sys from import_me.py, and this way you don’t have to move your imports from the start of the file.
Update. Some experience later, I’d strongly encourage the use of __all__, and discourage the initializer-function idea. There is a lot of tooling that will be confused by it.
1. Initializer function
An alternative might be wrapping definitions into an initializer function.
## --- exporttest.py ---
def _init():
    import os                         # effectively hidden
    global get_ext                    # effectively exports it
    def get_ext(filename):
        return _pointless_subfunc(filename)

                                      # underscore optional, but good
    def _pointless_subfunc(filename): # for the sake of documentation
        return os.path.splitext(filename)[1]

    if __name__ == '__main__':        # for interactive debugging (or doctest)
        globals().update(locals())    # we want all definitions accessible
        import doctest
        doctest.testmod()

_init()

print('is ``get_ext`` accessible?           ', 'get_ext' in globals())
print('is ``_pointless_subfunc`` accessible?', '_pointless_subfunc' in globals())
print('is ``os`` accessible?                ', 'os' in globals())
For comparison:
$ python3 -m exporttest
is ``get_ext`` accessible?            True
is ``_pointless_subfunc`` accessible? True
is ``os`` accessible?                 True
$ python3 -c "import exporttest"
is ``get_ext`` accessible?            True
is ``_pointless_subfunc`` accessible? False
is ``os`` accessible?                 False
1.1. Advantages
- Actual hiding of the imports.
- More convenient for interactive code-exploration, as dir(exporttest) is clutter-free.
1.2. Disadvantages
- Sadly, unlike the import MODULE as _MODULE pattern, it doesn’t play nicely with pylint.
    C:  4, 4: Invalid constant name "get_ext" (invalid-name)
    W:  4, 4: Using global for 'get_ext' but no assignment is done (global-variable-not-assigned)
    W:  5, 4: Unused variable 'get_ext' (unused-variable)
- It also doesn’t play nicely with IDE intellisense features.
2. Embrace __all__
Upon further reading, I’ve found that the pythonic way to do it is to rely on __all__. It controls not only what is exported on from MODULE import *, but also what shows up in help(MODULE), and according to the "We are all adults here" mantra, it is the user’s own fault if they use anything not documented as public.
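A quick way to see the effect of __all__ is to compare a star import from the same module body with and without it. This is only a sketch: the helper star_import_names and the in-memory module names are made up for the demo.

```python
import sys
import types

BODY = """
import re
import sys

def hello():
    pass
"""


def star_import_names(module_name, source):
    """Build a throwaway module and list what `from module import *` copies."""
    module = types.ModuleType(module_name)
    exec(source, module.__dict__)
    sys.modules[module_name] = module
    try:
        namespace = {}
        exec(f"from {module_name} import *", namespace)
    finally:
        del sys.modules[module_name]
    names = sorted(name for name in namespace if name != "__builtins__")
    return module, names


plain_module, without_all = star_import_names("plain_demo", BODY)
all_module, with_all = star_import_names("all_demo", BODY + '__all__ = ["hello"]\n')

print(without_all)               # star import also copies re and sys
print(with_all)                  # only the names listed in __all__
print(hasattr(all_module, "re")) # attribute access is unaffected by __all__
```

In other words, __all__ documents and narrows the public surface, but all_demo.re would remain reachable for anyone who asks for it explicitly.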
2.1. Advantages
Tooling has the best support for this approach (e.g. editor support for auto-imports via the importmagic library).
2.2. Disadvantages
Personally, I find that whole "we are all adults" mantra quite naive; when working under time pressure with no chance to fully understand a code-base before delivering a change, we can do with any help we can get to prevent "shoot yourself in the foot" scenarios. Plus, even many popular packages don’t really follow best practices like providing useful interactive docstrings or defining __all__. But it is the pythonic way.
There is no easy way to forbid importing a global name from a module; but in fact, you don’t need to.
Python lets you use local imports instead of global ones:
def foo():
    import sys
    print(sys.copyright)

sys.copyright  # Throws NameError
Neat and simple.
Actually, I think that using local imports should be considered good practice, and that global ones are just a tribute to C or a heritage from it.
UPD: An obvious downside is that the import sys statement will be executed each time the function is called, which can be unaffordable.
But you can create a callable object instead:
class Callable(object):
    import sys as _sys

    def __call__(self):
        print(self._sys.copyright)

foo = Callable()
foo()
Though I personally don’t like this approach, it may work better with generic classes.
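On the cost of function-level imports mentioned above: the import statement does run on every call, but once a module has been loaded, later imports return the cached object from sys.modules rather than executing the module again. A small sketch, using fractions only as an arbitrary stdlib module for the demo:

```python
import sys


def get_fractions():
    import fractions   # function-level import: the name stays local
    return fractions


first = get_fractions()
second = get_fractions()

# Both calls return the single cached module object.
same_object = first is second is sys.modules["fractions"]
print(same_object)
```

Whether the remaining per-call lookup matters depends on how hot the function is; that is something to measure rather than assume.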
The solution might be to rethink your file structure and create a submodule. Use a file such as __init__.py to expose the variables that you want.
package/
    __init__.py
    import_me/
        __init__.py
        code.py
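'''
import_me/code.py
'''
# imports that you do not want to expose
import re
import sys

def hello():
    # stand-ins for whatever the function really does with re and sys
    re.compile('hello')
    sys.stdout.flush()
    print('hello')

'''
import_me/__init__.py
'''
# explicit relative import; a bare "from code import hello" would pick up
# the standard-library code module on Python 3
from .code import hello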
I know this is a 4-year-old question, but I found myself having to deal with this same problem recently and came up with a solution that seems to work.
This is the file structure we’ll be using:
package/
    __init__.py
    import_me.py
    import_me_no_re_export.py
The contents of package/import_me.py (used only for comparison) are as per the question:
import re
import sys

def hello():
    pass
Those of package/import_me_no_re_export.py are:
def _():
    global hello

    import re
    import sys

    def hello():
        pass

_()
del _
The following python CLI session (Python 3.8.5) shows the results of this approach:
>>> import package.import_me
>>> import package.import_me_no_re_export
>>> dir(package.import_me)
['__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'hello', 're', 'sys']
>>> dir(package.import_me_no_re_export)
['__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'hello']
>>> set(dir(package.import_me)) - set(dir(package.import_me_no_re_export))
{'sys', 're'}
As you can see, the import_me_no_re_export module does not re-export the sys and re modules.
Pros:
- It seems to work like a charm, no side effects whatsoever.
- Both mypy and flake8 seem to like this approach alright.
Cons:
- At least Visual Studio Code seems to not like it too much (IntelliSense will simply not work and only suggest the deleted _ function for a module so altered).
- You need to sacrifice an indentation level.
- Boilerplate
As other answers pointed out, embracing __all__ is probably the best bet, although it does not in fact address the issue; other than that, you probably have to accept that these things are bound to happen.
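One reason the wrapper-function pattern does not fully address the security concern either: as soon as hello actually uses a hidden module, that module object stays reachable through the function's closure. A minimal sketch, assuming a hello that uses re (the body in the answer above is just pass):

```python
def _():
    global hello

    import re

    def hello():
        # Using re here makes it a free variable captured in a closure cell.
        return re.sub("l+", "L", "hello")

_()
del _

cell = hello.__closure__[0]       # closure cell holding the hidden module
recovered = cell.cell_contents
print(recovered.__name__)
```

So the module no longer shows up in dir() of the module, but it is hidden from casual discovery rather than protected from a determined caller.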