Prevent Python packages from re-exporting imported names

Question:

In a Python package, I have the file structure

package/
    __init__.py
    import_me.py

The file import_me.py is meant to provide snippets of functionality:

import re
import sys

def hello():
    pass

so that package.import_me.hello can be imported dynamically via import. Unfortunately, this also makes re and sys importable as package.import_me.re and package.import_me.sys, respectively.

Is there a way to prevent the modules imported in import_me.py from being re-exported? Preferably this should go beyond name mangling or underscore-prefixing the imported modules, since in my case re-exporting them might pose a security problem in certain circumstances.

Asked By: Boldewyn


Answers:

There are a couple of options:

  1. Put None in sys.modules for the module:

    >>> import sys
    >>> import re
    >>> del re
    >>> sys.modules['re'] = None
    >>> import re
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ImportError: No module named re
    
  2. Use the RestrictedPython package or the pysandbox package.

Be sure to check out this article on Sandboxed Python as well.
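Option 1 can be sketched for Python 3 as follows. This is only an illustration under stated assumptions: colorsys is an arbitrary standard-library module picked so the demo does not disturb re, and the entry is restored afterwards. Note that the block is process-wide: it stops every later import of that module, not just re-exports from one file, and it does nothing about references that already exist.

```python
import sys

# 'colorsys' is an arbitrary stdlib module chosen for this sketch;
# the answer above uses 're'.
name = "colorsys"
saved = sys.modules.pop(name, None)   # remember any existing entry
sys.modules[name] = None              # None marks the module as blocked

try:
    import colorsys
    blocked = False
except ImportError as exc:            # ModuleNotFoundError in Python 3
    blocked = True
    message = str(exc)
finally:
    # Undo the change so the rest of the process is unaffected.
    del sys.modules[name]
    if saved is not None:
        sys.modules[name] = saved

print(blocked)
```

In Python 3 the exception is a ModuleNotFoundError (a subclass of ImportError), so the wording of the message differs from the Python 2 traceback shown above.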

Answered By: Bob Dylan

There is no easy way to forbid importing a global name from a module; Python simply is not built that way.

You could possibly achieve this by writing your own __import__ function and shadowing the built-in one, but I doubt it would be worth the cost in time and testing, or that it would be completely effective.

What you can do is import the dependent modules with a leading underscore, which is a standard Python idiom for communicating “implementation detail, use at your own risk”:

import re as _re
import sys as _sys

def hello():
    pass
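What the underscore buys you can be checked with a small sketch. The module name import_me_demo and the body of hello are made up for the demo, and the module is built in memory rather than loaded from a file. Without __all__, a star import skips names that start with an underscore, but the modules remain reachable as attributes.

```python
import sys
import types

# Hypothetical module source, built in memory for the demo.
source = '''
import re as _re
import sys as _sys

def hello():
    return _re.sub("l+", "l", "hello")
'''

module = types.ModuleType("import_me_demo")
exec(source, module.__dict__)
sys.modules["import_me_demo"] = module   # make it importable by name

# Run a star import in a fresh namespace and see what arrives.
namespace = {}
exec("from import_me_demo import *", namespace)
star_names = sorted(n for n in namespace if not n.startswith("__"))

print(star_names)               # only the public name is pulled in
print(hasattr(module, "_re"))   # the module is still reachable by attribute
```

So the prefix is a convention that tools and star imports respect, not an access barrier.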

Note

While just deleting the imported modules as a way of not allowing them to be imported seems like it might work, it actually does not:

import re
import sys

def hello():
    sys
    print('hello')

del re
del sys

and then importing and using hello:

>>> import del_mod
>>> del_mod.hello()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "del_mod.py", line 5, in hello
    sys
NameError: global name 'sys' is not defined

Answered By: Ethan Furman

EDIT: This doesn’t work in most cases. Please see other answers.


I know this question is old, but if you simply put

import re
import sys

def hello():
    pass

del re
del sys

then you shouldn’t be able to import re or sys from import_me.py, and this way you don’t have to move your imports from the start of the file.

Answered By: Andrew Dean

Update. Some experience later, I’d strongly encourage the use of __all__ and discourage the initializer-function idea. A lot of tooling will be confused by it.

1. Initializer function

An alternative might be wrapping definitions into an initializer function.

## --- exporttest.py ---
def _init():
    import os                       # effectively hidden

    global get_ext                  # effectively exports it
    def get_ext(filename):
        return _pointless_subfunc(filename)

                                      # underscore optional, but good 
    def _pointless_subfunc(filename): # for the sake of documentation
        return os.path.splitext(filename)[1]

    if __name__ == '__main__':      # for interactive debugging (or doctest)  
        globals().update(locals())  # we want all definitions accessible
        import doctest
        doctest.testmod()
        
_init()

print('is ``get_ext`` accessible?           ', 'get_ext' in globals())
print('is ``_pointless_subfunc`` accessible?', '_pointless_subfunc' in globals())
print('is ``os`` accessible?                ', 'os' in globals())

For comparison:

$ python3 -m exporttest
is ``get_ext`` accessible?            True
is ``_pointless_subfunc`` accessible? True
is ``os`` accessible?                 True

$ python3 -c "import exporttest"
is ``get_ext`` accessible?            True
is ``_pointless_subfunc`` accessible? False
is ``os`` accessible?                 False

1.1. Advantages

  • Actual hiding of the imports.
  • More convenient for interactive code-exploration, as
    dir(exporttest) is clutter-free.

1.2. Disadvantages

  • Sadly, unlike the import MODULE as _MODULE pattern, it doesn’t play
    nicely with pylint.

    C:  4, 4: Invalid constant name "get_ext" (invalid-name)
    W:  4, 4: Using global for 'get_ext' but no assignment is done (global-variable-not-assigned)
    W:  5, 4: Unused variable 'get_ext' (unused-variable)
    
  • It also doesn’t play nicely with IDE intellisense features.

2. Embrace __all__

Upon further reading, I’ve found that the pythonic way to do it is to rely on __all__. It controls not only what is exported by from MODULE import *, but also what shows up in help(MODULE), and according to the "We are all adults here" mantra, it is the user’s own fault if they use anything not documented as public.
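A minimal sketch of what __all__ does and does not do. The module name all_demo and the helper function are hypothetical, and the module is built in memory for the demo. The star import honours __all__, while plain attribute access is unaffected.

```python
import sys
import types

# Hypothetical module source, built in memory for the demo.
source = '''
import re
import sys

__all__ = ["hello"]

def hello():
    return "hello"

def helper():
    return "not listed in __all__"
'''

module = types.ModuleType("all_demo")
exec(source, module.__dict__)
sys.modules["all_demo"] = module   # make it importable by name

# Run a star import in a fresh namespace and see what arrives.
namespace = {}
exec("from all_demo import *", namespace)
star_names = sorted(n for n in namespace if not n.startswith("__"))

print(star_names)              # only names listed in __all__
print(hasattr(module, "re"))   # all_demo.re is still accessible
```

This is why __all__ documents the public interface but is not a way to keep anyone away from the imported modules.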

2.1. Advantages

Tooling has best support for this approach (e.g. through editor support for autoimports through the importmagic library).

2.2. Disadvantages

Personally, I find that whole "we are all adults" mantra quite naive: when working under time pressure with no chance to fully understand a code base before delivering a change, we can do with any help we can get to prevent "shooting yourself in the foot" scenarios. Plus, even many popular packages don’t really follow best practices like providing useful interactive docstrings, or defining __all__. But it is the pythonic way.

Answered By: kdb

There is no easy way to forbid importing a global name from a module, but in fact you don’t need to.
Python allows local imports instead of global ones:

def foo():
    import sys
    print(sys.copyright)

sys.copyright  # raises NameError: sys is not bound at module level

Neat and simple.
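To check the effect of a local import, here is a small sketch. The module name local_import_demo is made up, and colorsys stands in for any imported module. The name is bound only inside the function, so it never shows up in the module namespace, while the module itself is still cached in sys.modules.

```python
import sys
import types

# Hypothetical module source, built in memory for the demo.
source = '''
def foo():
    import colorsys          # bound only in foo's local scope
    return colorsys.__name__
'''

module = types.ModuleType("local_import_demo")
exec(source, module.__dict__)

result = module.foo()
leaked = "colorsys" in vars(module)   # is the name visible on the module?
cached = "colorsys" in sys.modules    # is the module object cached?

print(result)
print(leaked)
print(cached)
```

Because of that cache, later calls to foo re-run the import statement but only look the module up instead of loading it again.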

Actually, I think local imports should be considered good practice; global ones are just a tribute to C, or a heritage from it.

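UPD: An obvious downside is that the import sys statement is executed each time the function is called, which can be unaffordable in hot code (after the first import it is only a lookup in sys.modules, but that lookup is still not free).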
But you can create a callable object instead:

class Callable(object):
    import sys as _sys
    def __call__(self):
        print(self._sys.copyright)

foo = Callable()
foo()

Though I personally don’t like this approach, it may work better with generic classes.

Answered By: Montreal

The solution might be to rethink your file structure and create a subpackage.
Use a file such as __init__.py to expose only the names that you want.

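package/
    __init__.py
    import_me/
         __init__.py
         code.py

'''
import_me/code.py
'''

# imports that you do not want to expose
import re
import sys


def hello():
    # ... use re and sys here (placeholder for the real work) ...
    print('hello')

'''
import_me/__init__.py
'''

# Relative import: in Python 3 a bare "from code import hello" would
# pick up the standard library's code module instead.
from .code import hello

Answered By: Tomas G.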

I know this is a 4-year-old question, but I found myself having to deal with this same problem recently and came up with a solution that seems to work.

This is the file structure we’ll be using:

package/
    __init__.py
    import_me.py
    import_me_no_re_export.py

The contents of package/import_me.py (used only for comparison) are as per the question:

import re
import sys

def hello():
    pass

Those of package/import_me_no_re_export.py are:

def _():
    global hello

    import re
    import sys

    def hello():
        pass

_()
del _

The following python CLI session (Python 3.8.5) shows the results of this approach:

>>> import package.import_me
>>> import package.import_me_no_re_export
>>> dir(package.import_me)
['__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'hello', 're', 'sys']
>>> dir(package.import_me_no_re_export)
['__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'hello']
>>> set(dir(package.import_me)) - set(dir(package.import_me_no_re_export))
{'sys', 're'}

As you can see, the import_me_no_re_export module does not re-export the sys and re modules.

Pros:

  • It seems to work like a charm, no side effects whatsoever.
  • Both mypy and flake8 seem to like this approach alright.

Cons:

  • At least Visual Studio Code seems to not like it too much (IntelliSense will simply not work and only suggest the deleted _ function for a module so altered).
  • You need to sacrifice an indentation level.
  • Boilerplate

As other answers pointed out, embracing __all__ is probably the best bet, although it does not actually address the issue; beyond that, you probably have to accept that re-exports like these are bound to happen.
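One caveat if the hiding is meant as a security boundary: as soon as hello actually uses one of the hidden modules, that module travels with the function as a closure cell. The sketch below assumes a variant of import_me_no_re_export.py in which hello calls re.sub; the module name no_re_export_demo is made up and the module is built in memory.

```python
import types

# Hypothetical variant of import_me_no_re_export.py in which hello uses re.
source = '''
def _():
    global hello

    import re

    def hello():
        return re.sub("l+", "l", "hello")

_()
del _
'''

module = types.ModuleType("no_re_export_demo")
exec(source, module.__dict__)

hidden = not hasattr(module, "re")    # re is not a module attribute

# hello closes over the local name re, so the module object sits in a cell.
cells = module.hello.__closure__
recovered = cells[0].cell_contents

print(hidden)
print(recovered.__name__)
```

So the trick keeps dir() tidy, but anyone who can import hello can still reach the module through the function object.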

Answered By: mpr