Why do I need __init__.py at every level?

Question:

Given that I have the following directory structure with . being the current working directory

.
---foo
    ---bar
        ---__init__.py
        ---baz.py

When I run python -c "import foo.bar.baz" I get

Traceback (most recent call last):
  File "<string>", line 1
ImportError: No module named foo.bar.baz

If I echo "" > foo/__init__.py, the above command works.

Am I doing something wrong or do I misunderstand the point of __init__.py? I thought it was to stop modules existing where they shouldn’t, e.g. a directory named string, but if you replace foo with string in my example, I’m seemingly forced to create the module that should never be used, just so I can reference a file deeper in the hierarchy.

Update

I’m working with a build system that’s generating the __init__.py files for me and enforcing the directory structure, and while I could mess with the hierarchy, I’d prefer to just add the __init__.py myself. To change the question slightly: why do I need a Python package at every level instead of just at the top? Is it just a rule that you can only import modules from the Python path or from a chain of packages off of the Python path?

Asked By: quittle


Answers:

Yes, this file is required if you want a directory to be treated as a package.

The __init__.py files are required to make Python treat the directories as containing packages; this is done to prevent directories with a common name, such as string, from unintentionally hiding valid modules that occur later on the module search path. In the simplest case, __init__.py can just be an empty file, but it can also execute initialization code for the package or set the __all__ variable, described later.

https://docs.python.org/3/tutorial/modules.html#packages

An __init__.py file is also a good place to document the package and to spare users and developers nested imports, by re-exporting the most useful objects (classes and functions) at the top level, keeping the package as simple to use as possible.
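
For example, foo/__init__.py could re-export a name so users can write from foo import process instead of from foo.helpers import process (the helpers module and process function here are hypothetical, just to illustrate):

"""foo: an example package."""

# Hypothetical re-export: lift the most useful object to the package's
# top level so users do not need to know the internal module layout.
from foo.helpers import process

__all__ = ["process"]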

Edit after question update

The default finders (examine sys.meta_path) are:

  1. BuiltinImporter – finds and loads built-in modules
  2. FrozenImporter – finds and loads frozen modules (byte-compiled modules baked into the interpreter binary)
  3. PathFinder – the one you are interested in; it finds and loads modules from the file system

The third one, PathFinder, is where __init__.py comes into play.
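
You can inspect them on your own interpreter; on CPython this typically prints the three finders above, in order:

import sys

# The finders in sys.meta_path are consulted in order on every import
for finder in sys.meta_path:
    print(finder)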

The PathFinder searches for a module in the paths from sys.path (and in the __path__ defined in a package). The module can be either a standalone Python file (if it sits directly in one of those directories) or a directory with an __init__.py.

Referring to your example:

foo/
  bar/
    __init__.py
    baz.py
  • If you create __init__.py in foo/, foo.bar.baz will be available (as you said).

  • If you add foo/ to sys.path or pass it via PYTHONPATH=foo/, bar.baz will be available (note: without the parent module foo; see the first sketch after this list).

  • If you write your own finder (and loader) you can load practically any file you want, regardless of where it lives; a minimal sketch follows below. That gives you great power. For example, take a look at stack-overflow-import, which exposes code based on Stack Overflow search results.
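
For the second bullet, a minimal sketch of the sys.path route, assuming it is run from the directory containing foo/:

import sys

# Make foo/ itself a search-path entry; bar then becomes a top-level
# package and there is no importable parent module named foo at all.
sys.path.insert(0, "foo")

import bar.baz
print(bar.baz.__file__)  # .../foo/bar/baz.py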
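
For the third bullet, a minimal sketch of a custom finder and loader that serves module source from an in-memory dict instead of the file system (the names DictFinder, DictLoader and the greetings module are made up for illustration):

import sys
import importlib.abc
import importlib.util

# Hypothetical registry: module name -> source code
SOURCES = {
    "greetings": "def hello():\n    return 'hello from an in-memory module'\n",
}

class DictLoader(importlib.abc.Loader):
    def create_module(self, spec):
        return None  # use the default module creation

    def exec_module(self, module):
        # Execute the registered source in the fresh module's namespace
        exec(SOURCES[module.__name__], module.__dict__)

class DictFinder(importlib.abc.MetaPathFinder):
    def find_spec(self, fullname, path, target=None):
        if fullname in SOURCES:
            return importlib.util.spec_from_loader(fullname, DictLoader())
        return None  # defer to the remaining finders

sys.meta_path.insert(0, DictFinder())

import greetings
print(greetings.hello())  # hello from an in-memory module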

Answered By: kwarunek