Is there a Python function to get the "data_files" root directory?

Question:

This must be a very common question for developers who use “setup.py” to build installation packages, and it has probably been asked before, but I couldn’t find a proper answer anywhere.

In setup.py

from distutils.core import setup
setup(
    ....,
    ....,
    data_files=[('MyApp/CBV', ['myapp/data/CBV/training.cbv', 'myapp/data/CBV/test.cbv'])],
    ....,
    ....,
    )

Result of sudo python setup.py install

running install
running build
running build_py
running build_scripts
running install_lib
running install_scripts
changing mode of /usr/local/bin/MyApp_trainer to 755
changing mode of /usr/local/bin/MyApp_reference_updater to 755
changing mode of /usr/local/bin/MyApp_predictor to 755
changing mode of /usr/local/bin/reference_updater to 755
running install_data
creating /usr/local/MyApp/CBV
copying MyApp/data/CBV/training.cbv -> /usr/local/MyApp/CBV
copying MyApp/data/CBV/test.cbv -> /usr/local/MyApp/CBV
running install_egg_info
Removing /usr/local/lib/python2.7/dist-packages/MyApp-0.1.0.egg-info
Writing /usr/local/lib/python2.7/dist-packages/MyApp-0.1.0.egg-info

Judging from the output above, “/usr/local” is the “data_files” root directory. Other than hardcoding it, is there a Python function that can give me this “data_files” root directory?

Answers:

By default, when installing a package as root, relative directory names in the data_files list are resolved against either sys.prefix (for pure-Python libraries) or sys.exec_prefix (for libraries with a compiled extension), so you can locate your files based on that. Quoting from the distutils documentation:

If directory is a relative path, it is interpreted relative to the installation prefix (Python’s sys.prefix for pure-Python packages, sys.exec_prefix for packages that contain extension modules).

So for your example, you’ll find your files in os.path.join(sys.prefix, 'MyApp', 'CBV').
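As a rough sketch of that lookup, the helpers below build the expected directory from sys.prefix and, as an alternative, from the "data" path that the standard sysconfig module reports for the default install scheme. The function names are hypothetical, and both assume a default install (no --prefix, --user or similar options). Note that the install log in the question shows /usr/local, which may differ from sys.prefix on some distributions, so it is worth printing both values on your system before relying on either.

```python
import os
import sys
import sysconfig


def data_files_dir(*parts):
    """Directory for relative data_files entries, assuming a default
    install of a pure-Python package (hypothetical helper)."""
    return os.path.join(sys.prefix, *parts)


def scheme_data_dir(*parts):
    """Same idea, but based on the 'data' path of the default
    install scheme as reported by sysconfig (hypothetical helper)."""
    return os.path.join(sysconfig.get_path("data"), *parts)


if __name__ == "__main__":
    # Compare both candidates against where your files actually landed.
    print(data_files_dir("MyApp", "CBV"))
    print(scheme_data_dir("MyApp", "CBV"))
```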

However, you would be better off using the importlib.resources library (Python 3.7 and up) to load package data. For that to work best, your data files should be included in the package itself. That means you would not use data_files; instead, either list file patterns in a MANIFEST.in file and set include_package_data=True, or list file patterns in package_data. See Including data files in the setuptools documentation.

For earlier Python versions, you can do the same with the Resource API of the pkg_resources module, which is part of the setuptools library and exists for this very purpose.

You can then load such resource files straight from the package into a string, for example with read_text() (falling back to resource_string() on older versions):

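try:
    from importlib.resources import read_text  # Python 3.7+
except ImportError:
    from pkg_resources import resource_string

    def read_text(package, resource):
        # resource_string() returns bytes; decode so both paths return str
        return resource_string(package, resource).decode('utf-8')

# foo.conf must be shipped inside the package this module belongs to
foo_config = read_text(__name__, 'foo.conf')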
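On Python 3.9 and later, importlib.resources also offers files(), which can reach data in subdirectories of a package. The following is only a self-contained sketch: demo_pkg, its data/training.cbv file and the read_package_text() helper are made up for illustration, and the package is created in a temporary directory so the example can run without installing anything.

```python
import importlib
import sys
import tempfile
from importlib.resources import files  # Python 3.9+
from pathlib import Path


def read_package_text(package, *parts, encoding="utf-8"):
    """Read a text file shipped inside an importable package
    (hypothetical helper, not part of any library)."""
    resource = files(package)
    for part in parts:
        resource = resource.joinpath(part)
    return resource.read_text(encoding=encoding)


def demo():
    """Build a throwaway package on disk and read its data file."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "demo_pkg"  # hypothetical package name
        (pkg / "data").mkdir(parents=True)
        (pkg / "__init__.py").write_text("")
        (pkg / "data" / "training.cbv").write_text("sample\n")
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            return read_package_text("demo_pkg", "data", "training.cbv")
        finally:
            # Undo the temporary import path and cached module
            sys.path.remove(tmp)
            sys.modules.pop("demo_pkg", None)


if __name__ == "__main__":
    print(repr(demo()))
```

In a real project the package would be installed normally and only read_package_text() (or the equivalent inline calls) would be needed.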
Answered By: Martijn Pieters