What is pip's `--no-cache-dir` good for?
Question:
I recently saw --no-cache-dir being used in a Dockerfile. I had never seen that flag before, and the help text does not explain it:
--no-cache-dir Disable the cache.
- Question: What is cached?
- Question: What is the cache used for?
- Question: Why would I want to disable it?
Answers:
- What is cached: to cache something means to store it away for future use.
- What the cache is used for:
  - storing the installation files (.whl, etc.) of the modules that you install through pip
  - storing the source files (.tar.gz, etc.) so they are not downloaded again while the cached copy has not expired
- Possible reasons you might want to disable the cache:
  - you don’t have enough space on your hard drive
  - you previously ran pip install with unexpected settings, e.g.:
    - previously ran: export PYCURL_SSL_LIBRARY=nss and pip install pycurl
    - now want to run: export PYCURL_SSL_LIBRARY=openssl and pip install pycurl --compile --no-cache-dir
  - you want to keep a Docker image as small as possible
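To judge whether the disk-space and image-size reasons apply, it can help to measure how much the cache actually holds. The following is a minimal sketch, assuming the cache lives at ~/.cache/pip (the usual default on Linux; `pip cache dir` reports the real location on other setups). `directory_size_bytes` is a hypothetical helper, not part of pip.

```python
import os
from pathlib import Path


def directory_size_bytes(root):
    """Return the total size in bytes of all regular files under root."""
    total = 0
    # os.walk yields nothing for a missing directory, so the result is 0.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so a link target is not counted.
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


if __name__ == "__main__":
    # Assumption: default pip cache location on Linux.
    cache_dir = Path.home() / ".cache" / "pip"
    size_mib = directory_size_bytes(cache_dir) / (1024 * 1024)
    print(f"{cache_dir}: {size_mib:.1f} MiB")
```

Running this before and after an install gives a rough idea of how much the cache would add to a disk or an image layer.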
Links to documentation
https://pip.pypa.io/en/stable/reference/pip_install/#caching – @emredjan
https://pip.pypa.io/en/stable/reference/pip_install/ – @mikea
Another reason to disable the pip cache – if you run pip as a user that does not yet exist, their home directory will be created, but owned by root.
This happens to us when building Amazon AMIs in a chroot – pip is being run as a user that exists on the builder machine, but not in the chroot jail where the AMI is being constructed. This is a problem because that user then cannot ssh into what was just built: their .ssh directory is not readable by them.
I can’t think of any other reason pip would be run as a user that doesn’t exist though, so it’s very much an edge case.
I think there is a good reason to use --no-cache-dir when you are building Docker images. The cache is usually useless in a Docker image, and you can definitely shrink the image size by disabling the cache.
It reduces your Docker image size if you install Python dependencies in your Dockerfile, which matters because your private registries/artifactories or your deployment services may have size limits.
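If the install step of an image build is driven from a Python script rather than typed directly into the Dockerfile, the flag can be passed in the same way. This is only a sketch, assuming pip is importable as a module of the running interpreter; `pip_install_command` is a hypothetical helper and `requirements.txt` is a placeholder file name.

```python
import sys


def pip_install_command(requirements_file, use_cache=False):
    """Build the argument list for installing a requirements file with pip."""
    cmd = [sys.executable, "-m", "pip", "install"]
    if not use_cache:
        # Ask pip to disable its cache for this run.
        cmd.append("--no-cache-dir")
    cmd += ["-r", requirements_file]
    return cmd


if __name__ == "__main__":
    # Placeholder file name; this only prints the command, it does not run it.
    print(" ".join(pip_install_command("requirements.txt")))
```

The returned list can be handed to `subprocess.run(cmd, check=True)` when the script should actually perform the install.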
I get a permission error when installing some pip packages if I don’t use the --no-cache-dir option:
Building wheels for collected packages: pyyaml, bottleneck, nvidia-ml-py3
WARNING: Building wheel for pyyaml failed: [Errno 13] Permission denied: '/home/user/.cache/pip/wheels/b1'
WARNING: Building wheel for bottleneck failed: [Errno 13] Permission denied: '/home/user/.cache/pip/wheels/92'
WARNING: Building wheel for nvidia-ml-py3 failed: [Errno 13] Permission denied: '/home/user/.cache/pip/wheels/7f'
Running chown on the /.cache folder didn’t help for some reason, but with --no-cache-dir it works OK.
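The warnings above say that pip could not write into its wheel cache. One way to confirm that diagnosis before reaching for the flag is to check whether the current user can write there. A minimal sketch, assuming the cache path shown in the log output; `cache_dir_is_writable` is a hypothetical helper, not something pip provides.

```python
import os


def cache_dir_is_writable(path):
    """Return True if the current user could create files under path.

    If path does not exist yet, the nearest existing ancestor is checked,
    because that is where the missing directories would have to be created.
    """
    path = os.path.abspath(path)
    while not os.path.exists(path):
        parent = os.path.dirname(path)
        if parent == path:  # reached the filesystem root
            break
        path = parent
    # Writing into a directory needs both write and search (execute) permission.
    return os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK)


if __name__ == "__main__":
    # Path taken from the log output above; adjust it for your own user.
    print(cache_dir_is_writable("/home/user/.cache/pip/wheels"))
```

If this reports False for the user that runs pip, the cache directory (or one of its parents) is probably owned by someone else, which matches the Errno 13 messages.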
From the official FastAPI docs:
The --no-cache-dir option tells pip to not save the downloaded packages locally, as that is only if pip was going to be run again to install the same packages, but that’s not the case when working with containers.
Basically, there is no need to keep a local cache of the packages you’re installing, since Docker containers do not need it.