How many pip packages does one have installed on Google Colab?
Question:
I can run !pip list to see a list of all the packages.
I can do this to count all the subfolders in the Python 3.7 dist-packages folder:
import os
containing_folder = '/usr/local/lib/python3.7/dist-packages'
f = []
for (dirpath, dirnames, filenames) in os.walk(containing_folder):
    f.extend(dirnames)
    break
print('there are', len(f), 'folders in the python 3.7 module')
But the number of folders does not equal the number of modules, as there appear to be more folders than modules.
So how can I identify all the modules (and not folders), i.e. count all the pip-installed packages?
Answers:
Python packages are denoted by the existence of a file named __init__.py. So if you want to count packages in terms of that formal definition, you can just count the number of files you find with this name. Here’s code that will do that:
import os
containing_folder = '/usr/local/lib/python3.7/dist-packages'
f = []
for (dirpath, dirnames, filenames) in os.walk(containing_folder):
    if '__init__.py' in filenames:
        f.append(os.path.basename(dirpath))
print(f)
print('there are', len(f), 'folders in the python 3.7 module')
If you just want to count the number of packages at the first level of a directory, which is probably what you want, here’s code that does that:
import os
containing_folder = '/usr/local/lib/python3.7/dist-packages'
r = []
for entity in os.listdir(containing_folder):
    f = os.path.join(containing_folder, entity, '__init__.py')
    # A top-level package is a directory that contains an __init__.py file
    if os.path.isdir(os.path.join(containing_folder, entity)) and os.path.exists(f):
        r.append(entity)
print(len(r))
When I ran this code on one of my Python installs, and compared it against what I get when I do pip list | wc -l on that same version of Python, I got almost the same result: 125 for the Python code, 129 for pip.
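Part of the gap may come from the two approaches counting different things: the loop above counts importable top-level packages, while pip lists installed distributions, which are recorded as *.dist-info (or older *.egg-info) entries next to the code. The following is a minimal sketch that counts those metadata entries instead. The function name count_distributions is made up for illustration, and it assumes entry names of the form name-version.dist-info.

```python
import os

def count_distributions(site_dir):
    """Count distinct distributions by their metadata entries in site_dir."""
    names = set()
    for entry in os.listdir(site_dir):
        stem, ext = os.path.splitext(entry)
        if ext in ('.dist-info', '.egg-info'):
            # Assumption: entries look like "<name>-<version>.dist-info",
            # so the text before the first hyphen is the distribution name.
            names.add(stem.split('-')[0].lower())
    return len(names)

# Example call, using the path from the question:
# print(count_distributions('/usr/local/lib/python3.7/dist-packages'))
```

This number should usually sit closer to what pip reports than the __init__.py count does, since one distribution can ship several top-level packages or none at all.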
You could use pip programmatically (though frankly, finding the documentation for this is a bit daunting).
In theory, you could import pip and use its internal APIs to fetch the information you want, but the documentation discourages this, and instead suggests you use pip like any non-native external utility.
import subprocess
packages = subprocess.run(
    ['pip', 'list'],
    check=True, text=True, capture_output=True)
packagelist = packages.stdout.splitlines()
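To turn packagelist into a count, keep in mind that pip list in its default column layout starts with a header row followed by a row of dashes. Here is a small sketch, assuming that default layout; count_listed_packages is a hypothetical helper name and the sample rows are made up for illustration.

```python
def count_listed_packages(lines):
    """Count package rows in the default column output of `pip list`."""
    rows = [line for line in lines if line.strip()]
    # Drop the "Package Version" header and the dashed separator, if present
    if len(rows) >= 2 and set(rows[1].replace(' ', '')) == {'-'}:
        rows = rows[2:]
    return len(rows)

# Made-up sample in the shape of pip's column output
sample = """Package    Version
---------- -------
numpy      1.21.6
pip        21.1.3
requests   2.23.0
""".splitlines()
print(count_listed_packages(sample))
```

With the snippet above, count_listed_packages(packagelist) would give the number of listed packages without the header lines.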
There are many details about subprocess which could benefit from a more detailed explanation, but it has been done many times before. Perhaps for this discussion, mainly note that passing the first argument as a list of tokens is required on Unix-like systems when you want to avoid shell=True (which you want, when you can).
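As an illustration of the list-of-tokens form, the sketch below runs a command without a shell and parses its output as JSON. pip list accepts --format=json, which avoids parsing columns. Invoking pip as sys.executable -m pip is a choice made here, not something from the answer above, so that the pip belonging to the running interpreter is used. Both helper names are hypothetical.

```python
import json
import subprocess
import sys

def run_json(cmd):
    """Run cmd (a list of tokens, no shell) and parse its stdout as JSON."""
    result = subprocess.run(cmd, check=True, text=True, capture_output=True)
    return json.loads(result.stdout)

def count_pip_distributions():
    """Count the entries reported by `pip list --format=json`."""
    cmd = [sys.executable, '-m', 'pip', 'list', '--format=json']
    return len(run_json(cmd))
```

In a notebook cell, print(count_pip_distributions()) should then print the number of installed distributions as pip sees them.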