How to get a list of all the Python standard library modules?

Question:

I want something like sys.builtin_module_names except for the standard library. Other things that didn’t work:

  • sys.modules – only shows modules that have already been loaded
  • sys.prefix – a path that would include non-standard library modules and doesn’t seem to work inside a virtualenv.

The reason I want this list is so that I can pass it to the --ignore-module or --ignore-dir command line options of trace.

So ultimately, I want to know how to ignore all the standard library modules when using trace or sys.settrace.

Asked By: saltycrane

Answers:

This will get you close:

import glob
import sys

glob.glob(sys.prefix + "/lib/python%d.%d" % sys.version_info[0:2] + "/*.py")

Another possibility for the ignore-dir option:

os.pathsep.join(sys.path)
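
For the trace use case, the same idea can also be applied from Python rather than the command line. This is only a minimal sketch: it assumes the standard library lives under the directories reported by sysconfig.get_path("stdlib") and sysconfig.get_path("platstdlib"), and the helper names stdlib_dirs and make_tracer are made up for illustration.

```python
import sysconfig
import trace


def stdlib_dirs():
    """Directories that hold the standard library for this interpreter."""
    candidates = {
        sysconfig.get_path("stdlib"),      # pure-Python modules
        sysconfig.get_path("platstdlib"),  # platform-specific modules
    }
    return sorted(d for d in candidates if d)


def make_tracer():
    # ignoredirs makes trace skip modules whose files live under these paths
    return trace.Trace(ignoredirs=stdlib_dirs(), trace=False, count=True)
```

Note that on many installs site-packages sits inside the stdlib directory, so third-party code there would likely be ignored as well; inside a virtualenv it usually is not.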
Answered By: Keith

Why not work out what’s part of the standard library yourself?

import distutils.sysconfig as sysconfig
import os
std_lib = sysconfig.get_python_lib(standard_lib=True)
for top, dirs, files in os.walk(std_lib):
    for nm in files:
        if nm != '__init__.py' and nm[-3:] == '.py':
            print(os.path.join(top, nm)[len(std_lib)+1:-3].replace(os.sep, '.'))

gives

abc
aifc
antigravity
--- a bunch of other files ----
xml.parsers.expat
xml.sax.expatreader
xml.sax.handler
xml.sax.saxutils
xml.sax.xmlreader
xml.sax._exceptions

Edit: You’ll probably want to add a check to avoid site-packages if you need to avoid non-standard library modules.

Answered By: Caspar

Caspar’s answer is not cross-platform, and it misses top-level packages (e.g. email), dynamically loaded modules (e.g. array), and core built-in modules (e.g. sys). Here’s an improvement:

import distutils.sysconfig as sysconfig
import os
import sys

std_lib = sysconfig.get_python_lib(standard_lib=True)

for top, dirs, files in os.walk(std_lib):
    for nm in files:
        prefix = top[len(std_lib)+1:]
        if prefix[:13] == 'site-packages':
            continue
        if nm == '__init__.py':
            print(top[len(std_lib)+1:].replace(os.path.sep,'.'))
        elif nm[-3:] == '.py':
            print(os.path.join(prefix, nm)[:-3].replace(os.path.sep,'.'))
        elif nm[-3:] == '.so' and top[-11:] == 'lib-dynload':
            print(nm[0:-3])

for builtin in sys.builtin_module_names:
    print(builtin)

This is still not perfect because it will miss things like os.path which is defined from within os.py in a platform-dependent manner via code such as import posixpath as path, but it’s probably as good as you’ll get, bearing in mind that Python is a dynamic language and you can’t ever really know which modules are defined until they’re actually defined at runtime.
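
On Python 3, roughly the same walk can be done without distutils. The following is a sketch rather than a definitive list: it assumes sysconfig.get_path("stdlib") points at the standard library, that extension modules (if any) sit in a lib-dynload subdirectory as on typical POSIX builds, and it only collects top-level names. The function name stdlib_top_level_names is my own.

```python
import os
import pkgutil
import sys
import sysconfig


def stdlib_top_level_names():
    """Best-effort set of top-level standard library module names."""
    stdlib_dir = sysconfig.get_path("stdlib")
    search_dirs = [
        stdlib_dir,
        os.path.join(stdlib_dir, "lib-dynload"),  # may not exist (e.g. Windows)
    ]
    # iter_modules only reports real modules and packages, so a plain
    # directory such as site-packages (no __init__.py) is skipped.
    names = {info.name for info in pkgutil.iter_modules(search_dirs)}
    # Modules compiled into the interpreter have no file on disk.
    names.update(sys.builtin_module_names)
    return names
```

Like the code above, this still misses names such as os.path that only exist once os has been imported.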

Answered By: Adam Spiers

Python >= 3.10:

sys.stdlib_module_names

Python < 3.10:

The author of isort, a tool which cleans up imports, had to grapple with this same problem in order to satisfy the PEP 8 requirement that standard library imports be ordered before third-party imports.

I have been using this tool and it seems to work well. You can use the place_module function from isort:

>>> from isort import place_module
>>> place_module("json")
'STDLIB'
>>> place_module("requests")
'THIRDPARTY'

Or you can get the set of module names directly. The set depends on the Python version, for example:

>>> from isort.stdlibs.py39 import stdlib
>>> for name in sorted(stdlib): print(name)
... <200+ lines>
xml
xmlrpc
zipapp
zipfile
zipimport
zlib
zoneinfo
Answered By: wim

Take a look at this,
https://docs.python.org/3/py-modindex.html
They made an index page for the standard modules.

Answered By: Edmund

I brute-forced it by writing some code to scrape the TOC of the Standard Library page in the official Python docs. I also built a simple API for getting a list of standard library modules (for Python versions 2.6, 2.7, 3.2, 3.3, and 3.4).

The package is here, and its usage is fairly simple:

>>> from stdlib_list import stdlib_list
>>> libraries = stdlib_list("2.7")
>>> libraries[:10]
['AL', 'BaseHTTPServer', 'Bastion', 'CGIHTTPServer', 'ColorPicker', 'ConfigParser', 'Cookie', 'DEVICE', 'DocXMLRPCServer', 'EasyDialogs']
Answered By: user554546

On Python 3.10 there is now sys.stdlib_module_names.

Answered By: CCCC_David

Building on @Edmund’s answer, this solution pulls the list from the official website:

def standard_libs(version=None, top_level_only=True):
    import re
    from urllib.request import urlopen
    if version is None:
        import sys
        version = sys.version_info
        version = f"{version.major}.{version.minor}"
    url = f"https://docs.python.org/{version}/py-modindex.html"
    with urlopen(url) as f:
        page = f.read()
    modules = set()
    for module in re.findall(r'#module-(.*?)[\'"]',
                             page.decode('ascii', 'replace')):
        if top_level_only:
            module = module.split(".")[0]
        modules.add(module)
    return modules

It returns a set. For example, here are the modules that were added between 3.5 and 3.10:

>>> standard_libs("3.10") - standard_libs("3.5")
{'contextvars', 'dataclasses', 'graphlib', 'secrets', 'zoneinfo'}

Since this is based on the official documentation, it doesn’t include undocumented modules, such as:

  • Easter eggs, namely this and antigravity
  • Internal modules, such as genericpath, posixpath or ntpath, which are not supposed to be used directly (you should use os.path instead). Other internal modules: idlelib (which implements the IDLE editor), opcode, sre_constants, sre_compile, sre_parse, pyexpat, pydoc_data, nt.
  • All modules with a name starting with an underscore (which are also internal), except for __main__, _thread, and __future__, which are public and documented.

If you’re concerned that the website may be down, you can just cache the list locally. For example, you can use the following function to create a small Python module containing all the module names:

def create_stdlib_module_names(
        module_name="stdlib_module_names",
        variable="stdlibs",
        version=None,
        top_level_only=True):
    stdlibs = standard_libs(
        version=version, top_level_only=top_level_only)
    with open(f"{module_name}.py", "w") as f:
        f.write(f"{variable} = {stdlibs!r}\n")

Here’s how to use it:

>>> create_stdlib_module_names()  # run this just once
>>> from stdlib_module_names import stdlibs
>>> len(stdlibs)
207
>>> "collections" in stdlibs
True
>>> "numpy" in stdlibs
False
Answered By: MiniQuark

This isn’t perfect, but should get you pretty close if you can’t run 3.10:

import os
import distutils.sysconfig

def get_stdlib_module_names():
    stdlib_dir = distutils.sysconfig.get_python_lib(standard_lib=True)
    return {os.path.splitext(f)[0] for f in os.listdir(stdlib_dir)}

This misses some modules such as sys, math, time, and itertools.

My use case is logging which modules were imported during an app run, so having a rough filter for stdlib modules is fine. Also I return it as a set rather than a list so membership checks are faster.
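
For that logging use case, the filtering step might look like the sketch below. The function name non_stdlib_imports and its parameters are illustrative only; stdlib_names can be any set of names, for example the one returned by get_stdlib_module_names() above.

```python
import sys


def non_stdlib_imports(stdlib_names, loaded=None):
    """Top-level names from `loaded` that are not in `stdlib_names`."""
    if loaded is None:
        loaded = sys.modules  # everything imported so far in this process
    # Copy to a list first so imports happening meanwhile cannot break iteration.
    top_level = {name.partition(".")[0] for name in list(loaded)}
    return sorted(top_level - set(stdlib_names))
```

Calling it with only the first argument at the end of an app run reports what was imported from outside the given set.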

Answered By: xjcl

This works on Anaconda on Windows, and I suspect it will work on Linux distros.

It goes to your Anaconda directory, e.g.:
C:\Users\{user}\anaconda3\Lib, where the standard libraries are installed. It then pulls folder names and filenames (dropping extensions).

import sys
import os

standard_libs = []
standard_lib_path = os.path.join(sys.prefix, "Lib")
for file in os.listdir(standard_lib_path):
    standard_libs.append(file.split(".py")[0].strip().lower())

NB: Builtins, viewable via print(dir(__builtins__)), are automatically loaded, whereas standard libs are not.

Answered By: MinneapolisCoder9