pybind11 module import or link .so dependencies macOS
Question:
TLDR: How do I link a .so/import a dependency when importing my pybind11 module in python?
I am attempting to build a pybind11 module that, in part, depends on the C++ part of a different Python library. On Linux, I can simply link that library in CMake using target_link_libraries, but this does not work for .so libraries on macOS (can't link with bundle (MH_BUNDLE) only dylibs (MH_DYLIB) file).
When importing the pybind11-generated module without linking in Python on macOS, I get an ImportError: dlopen(/path/to/my_module.cpython-38-darwin.so, 0x0002): symbol not found in flat namespace (__<mangled symbol that is part of the library my module depends on>). This can be prevented by importing the dependency itself in Python before importing my own module.
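Concretely, the manual workaround looks like this (the module names are placeholders; in-memory stand-ins replace the real extension modules so the snippet is self-contained):

```python
import sys
import types

# Stand-ins for the real extension modules, created in-memory so this
# snippet runs on its own; in practice these are the compiled .so files.
sys.modules["the_dependency"] = types.ModuleType("the_dependency")
sys.modules["my_module"] = types.ModuleType("my_module")

# The workaround: importing the dependency first loads its C++ symbols
# into the process, so the subsequent load of my_module can resolve them.
import the_dependency
import my_module
```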
Is there a way to either link that library, or to ensure that Python imports the dependency before loading my binary when running import my_module?
I attempted putting the shared library file in a folder with an __init__.py that first imports the dependency and then imports * from the .so, but that resulted in some imports no longer working (e.g., import my_module.my_submodule fails).
EDIT: A working, although cumbersome, drop-in solution is to add a dummy module to the pipeline: rename the original my_module to _my_module, and create a dummy my_module that does nothing besides importing the dependency:
#include <Python.h>

PyMODINIT_FUNC
PyInit_my_module(void)
{
    PyImport_ImportModule("the_dependency");
    return PyImport_ImportModule("_my_module");
}
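The forwarding idea behind this dummy can be sketched in pure Python (stand-in modules are created in-memory so the snippet is self-contained; in the real setup both names refer to compiled extensions):

```python
import importlib
import sys
import types

# Stand-ins for the real extension modules, created in-memory so this
# sketch runs on its own; in the real setup both names are .so files.
sys.modules["the_dependency"] = types.ModuleType("the_dependency")
stand_in = types.ModuleType("_my_module")
stand_in.answer = 42
sys.modules["_my_module"] = stand_in

def init_my_module():
    # Mirrors PyInit_my_module: import the dependency for its side
    # effects, then hand back the real module in place of the dummy.
    importlib.import_module("the_dependency")
    return importlib.import_module("_my_module")

my_module = init_my_module()
print(my_module.answer)  # prints 42
```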
Answers:
This is not an ideal solution, but it seems to be the best way to solve the import-before-binary problem while retaining the ability to use the imported module exactly as in the normal case. The idea is to use a dummy module that imports the Python dependency (which contains the associated C++ dependency as a .so) before loading the original module.
So here’s how it is done, assuming CMake is used to compile the project.
- Conditionally set the module name to _my_module instead of my_module if it is compiled for macOS:
if (APPLE)
set(MAIN_LIB_NAME _my_module)
else()
set(MAIN_LIB_NAME my_module)
endif()
pybind11_add_module(${MAIN_LIB_NAME}
src/source1.cpp
# your source files, as before
)
- Add a dummy module that takes the original name; it is used to import the dependency and then load the actual module:
if (APPLE)
pybind11_add_module(my_module macos_dummy.h macos_dummy.cpp)
elseif (UNIX)
# in my case, on linux I just linked against the .so
target_link_libraries(my_module PUBLIC my_dependency)
endif()
- Define a PYBIND11_MODULE in your original module that takes the dummy name, so that it can be properly imported by Python later on (i.e., let pybind11 declare the PyInit_ function). Do this while keeping your original PYBIND11_MODULE (with the original name):
#ifdef __APPLE__ // If apple, a dummy module is added, so that the dependency can be imported before loading the actual binary
PYBIND11_MODULE(_my_module, m) {
m.doc() = "dummy module; doesn't do anything; if you see this instead of the actual module, something went wrong.";
}
#endif
PYBIND11_MODULE(my_module, m) { // the original module, left unchanged
    // ...
}
- Implement the actual dummy module, which uses Python’s import mechanics to import the dependency, find the original module, and pretend to have been that original module all along:
#include <dlfcn.h>
#include "macos_dummy.h"

typedef PyObject* (*PyInitFunc)(void);

PyMODINIT_FUNC PyInit_my_module(void)
{
    // Import the dependency first; this is the entire reason this dummy exists.
    if (!PyImport_ImportModule("my_dependency"))
        return NULL;
    // Let Python find the correct binary.
    PyObject* obj = PyImport_ImportModule("_my_module");
    if (!obj)
        return NULL;
    // Get the path of the binary Python found.
    PyObject* file_attr = PyObject_GetAttrString(obj, "__file__");
    const char* actual_module_path = PyUnicode_AsUTF8(file_attr);
    // Open that binary directly.
    void* actual_module = dlopen(actual_module_path, RTLD_LAZY | RTLD_GLOBAL);
    if (!actual_module) {
        PyErr_Format(PyExc_ImportError, "module %s not found", actual_module_path);
        Py_DECREF(file_attr);
        return NULL;
    }
    Py_DECREF(file_attr);
    // Retrieve the actual init function and let it build the real module.
    PyInitFunc actual_pyinit = (PyInitFunc)dlsym(actual_module, "PyInit_my_module");
    return actual_pyinit();
}
and the associated header:
#ifndef MY_MODULE_MACOS_DUMMY_H
#define MY_MODULE_MACOS_DUMMY_H
#include <Python.h>
__attribute__((visibility("default"))) PyMODINIT_FUNC PyInit_my_module(void);
#endif //MY_MODULE_MACOS_DUMMY_H
That’s it. As long as both generated .so files can be found by Python, importing the module under its original name will pull in the dependency too.