Is there a standard way to list names of Python modules in a package?
Question:
Is there a straightforward way to list the names of all modules in a package, without using __all__?
For example, given this package:
/testpkg
/testpkg/__init__.py
/testpkg/modulea.py
/testpkg/moduleb.py
I’m wondering if there is a standard or built-in way to do something like this:
>>> package_contents("testpkg")
['modulea', 'moduleb']
The manual approach would be to iterate through the module search paths in order to find the package’s directory. One could then list all the files in that directory, filter out the uniquely-named py/pyc/pyo files, strip the extensions, and return that list. But this seems like a fair amount of work for something the module import mechanism is already doing internally. Is that functionality exposed anywhere?
Answers:
import module
help(module)
Maybe this will do what you’re looking for?
# Note: imp is deprecated since Python 3.4 and was removed in Python 3.12.
import imp
import os

MODULE_EXTENSIONS = ('.py', '.pyc', '.pyo')

def package_contents(package_name):
    file, pathname, description = imp.find_module(package_name)
    if file:
        # find_module returns an open file only for plain modules, not packages.
        raise ImportError('Not a package: %r' % package_name)
    # Use a set because some may be both source and compiled.
    return set([os.path.splitext(module)[0]
                for module in os.listdir(pathname)
                if module.endswith(MODULE_EXTENSIONS)])
def package_contents(package_name):
    package = __import__(package_name)
    return [module_name for module_name in dir(package) if not module_name.startswith("__")]
Using Python 2.3 and above, you could also use the pkgutil module:
>>> import pkgutil
>>> [name for _, name, _ in pkgutil.iter_modules(['testpkg'])]
['modulea', 'moduleb']
EDIT: Note that the parameter for pkgutil.iter_modules is not a list of modules, but a list of paths, so you might want to do something like this:
>>> import os.path, pkgutil
>>> import testpkg
>>> pkgpath = os.path.dirname(testpkg.__file__)
>>> print([name for _, name, _ in pkgutil.iter_modules([pkgpath])])
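The same idea can be wrapped in a helper shaped like the package_contents call from the question. This is only a minimal sketch: it assumes the package is importable, and it uses the package's __path__ attribute instead of os.path.dirname(testpkg.__file__). The function name comes from the question, not from any library.

```python
import importlib
import pkgutil


def package_contents(package_name):
    """Return the sorted names of the direct submodules of a package."""
    # Importing runs the package's __init__.py, but not its submodules.
    package = importlib.import_module(package_name)
    # Only packages have a __path__ attribute; plain modules do not.
    search_path = getattr(package, "__path__", None)
    if search_path is None:
        raise ImportError("Not a package: %r" % package_name)
    return sorted(name for _, name, _ in pkgutil.iter_modules(search_path))
```

Calling package_contents("testpkg") on the layout from the question should give the two module names; subpackages are listed too, since iter_modules reports them alongside plain modules.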
Don’t know if I’m overlooking something, or if the answers are just outdated, but listing modules in a package seems really easy using inspect:
>>> import inspect, testpkg
>>> [name for name, _ in inspect.getmembers(testpkg, inspect.ismodule)]
['modulea', 'moduleb']
As stated by user815423426, this only works for live objects: the listed modules are only those that were already imported.
Based on cdleary’s example, here’s a recursive version listing the paths of all submodules:
import imp, os

def iter_submodules(package):
    file, pathname, description = imp.find_module(package)
    for dirpath, _, filenames in os.walk(pathname):
        for filename in filenames:
            if os.path.splitext(filename)[1] == ".py":
                yield os.path.join(dirpath, filename)
This is a recursive version that works with python 3.6 and above:
import importlib.util
from pathlib import Path
import os

MODULE_EXTENSIONS = '.py'

def package_contents(package_name):
    spec = importlib.util.find_spec(package_name)
    if spec is None:
        return set()

    pathname = Path(spec.origin).parent
    ret = set()
    with os.scandir(pathname) as entries:
        for entry in entries:
            if entry.name.startswith('__'):
                continue
            current = '.'.join((package_name, entry.name.partition('.')[0]))
            if entry.is_file():
                if entry.name.endswith(MODULE_EXTENSIONS):
                    ret.add(current)
            elif entry.is_dir():
                ret.add(current)
                ret |= package_contents(current)

    return ret
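If scanning directories by hand feels like too much, the standard library's pkgutil.walk_packages does a similar recursive walk. A rough sketch, assuming the package is importable; the helper name iter_module_names is made up for this example:

```python
import importlib
import pkgutil


def iter_module_names(package_name):
    """Yield dotted names of all modules and subpackages below a package."""
    package = importlib.import_module(package_name)
    prefix = package.__name__ + "."
    # walk_packages imports each subpackage it finds in order to read its
    # __path__; plain modules are listed without being imported.
    for _, name, _ in pkgutil.walk_packages(package.__path__, prefix):
        yield name
```

Unlike the directory scan above, this only descends into real subpackages, so data directories are not reported as modules.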
This should list the modules:
help("modules")
If you would like to view information about your package from outside Python code (from a command prompt), you can use pydoc:
# get a full list of the modules you have installed on your machine
$ python -m pydoc modules
# get information about a specific package
$ python -m pydoc <your package>
You get the same result inside the interpreter by using help:
>>> import <my package>
>>> help(<my package>)
The other answers here will run the code in the package as they inspect it. If you don’t want that, you can search the source files instead, like this answer does:
import re
from typing import List

def _get_class_names(file_name: str) -> List[str]:
    """Get the python class name defined in a file without running code
    file_name: the name of the file to search for class definitions in
    return: all the classes defined in that python file, empty list if no matches"""
    defined_class_names = []
    # search the file for class definitions
    with open(file_name, "r") as file:
        for line in file:
            # regular expression for class defined in the file
            # searches for text that starts with "class" and ends with ( or :,
            # whichever comes first
            match = re.search(r"^class\s+(.+?)(\(|:)", line)  # noqa
            if match:
                # add the cleaned match to the list if there is one
                defined_class_name = match.group(1).strip()
                defined_class_names.append(defined_class_name)
    return defined_class_names
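For module names rather than class names, a similar no-execution approach seems possible with importlib.util.find_spec, which locates a top-level package without importing it. This is a sketch under that assumption; the helper name is invented here, and for dotted names such as "a.b" find_spec does import the parent package "a".

```python
import importlib.util
import pkgutil


def package_contents_no_import(package_name):
    """List direct submodule names of a top-level package without importing it."""
    spec = importlib.util.find_spec(package_name)
    if spec is None:
        raise ImportError("No module named %r" % package_name)
    # submodule_search_locations is None for plain modules.
    if spec.submodule_search_locations is None:
        raise ImportError("Not a package: %r" % package_name)
    locations = spec.submodule_search_locations
    return sorted(name for _, name, _ in pkgutil.iter_modules(locations))
```

Because nothing is imported, a package whose __init__.py has side effects (or raises) can still be listed.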
There is a __loader__ variable inside each package instance. So, if you import the package, you can find the "module resources" inside the package:
import testpkg  # replace with your package name

for mod in testpkg.__loader__.get_resource_reader().contents():
    print(mod)
You can of course improve the loop to find the "module" name:
import testpkg
from pathlib import Path

for mod in testpkg.__loader__.get_resource_reader().contents():
    # You can filter the names with a check like
    # Path(mod).suffix not in (".py", ".pyc")
    print(Path(mod).stem)
Inside the package, you can of course find your modules by using __loader__ directly.
To complete @Metal3d's answer: yes, you can call testpkg.__loader__.get_resource_reader().contents() to list the "module resources", but it only works if you imported your package in the "normal" way and your loader is a _frozen_importlib_external.SourceFileLoader object.
But if you imported your library with zipimport (e.g. to load your package in memory), your loader will be a zipimporter object, and its get_resource_reader function differs from importlib's: it requires a "fullname" argument.
To make it work with both loaders, just pass your package name as the argument to get_resource_reader:
# An example with CrackMapExec tool
import importlib
import cme.protocols as cme_protocols

class ProtocolLoader:
    def get_protocols(self):
        protocols = {}
        protocols_names = [x for x in cme_protocols.__loader__.get_resource_reader("cme.protocols").contents()]
        for prot_name in protocols_names:
            prot = importlib.import_module(f"cme.protocols.{prot_name}")
            protocols[prot_name] = prot
        return protocols
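On Python 3.9 and later, importlib.resources.files gives a loader-independent way to do roughly the same listing, so you do not have to call get_resource_reader yourself. A minimal sketch, assuming the package is importable and that only .py source files count as modules; the function name module_names is made up for illustration:

```python
import importlib.resources
from pathlib import PurePath


def module_names(package_name):
    """Return names of .py modules directly inside an importable package."""
    names = set()
    # files() imports the package and returns a Traversable for its contents.
    for entry in importlib.resources.files(package_name).iterdir():
        path = PurePath(entry.name)
        if entry.is_file() and path.suffix == ".py" and path.stem != "__init__":
            names.add(path.stem)
    return sorted(names)
```

Subpackages and compiled-only modules are ignored here; extend the filter if you need them.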