Is it safe to manually delete all files in pkgs folder in anaconda python?

Question:

I ran this command to release disk space on anaconda

$ conda clean --all

However, there are still some big files that remain in pkgs folder in anaconda python.

Is it safe to manually delete all the files in pkgs folder? Any risk of corrupting my anaconda environment? What are some side effects, if any?

I am using anaconda 2018 on windows 10.

Asked By: guagay_wk


Answers:

Edit Commentary

After reviewing the documentation pointed out in @Robert’s answer, I must admit my initial response was overly alarmist and, in parts, blatantly incorrect. My apologies for the misleading response.

Nevertheless, I do believe some of what I raised still has some merit for this thread, and so I am deciding to retain the answer with amendments. In particular, I think it worth emphasizing that deleting the pkgs directory may not actually achieve what OP was hoping for (to save space) and that removing the package cache undermines Conda’s redundancy minimization strategy going forward by making it impossible to share already installed packages.

Instead, my final recommendation concurs with what @Robert suggested: use conda clean -p to delete unused packages, but keep the cache (pkgs dir) so that future environments can still leverage hardlinks. One last point: some tools, such as conda-pack, rely on the integrity of the package cache in order to work, so deleting pkgs will prevent their use.


Amended Original Response

No, it is definitely not safe, and in fact the only way you would actually free disk space is if you broke your base env. The issue is that all envs use hardlinks to the pkgs directory, so even if you delete the link located in the pkgs directory, the ones in the envs will still be there, and so you won’t delete any physical files on the disk. The only real deletion you might do is of something that is only referenced by base, i.e., the only copy is in pkgs, hence the potential for breaking base.
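
The hardlink behavior described above can be illustrated without touching a real Conda install. The following is a minimal sketch, assuming a filesystem that supports hardlinks (NTFS and most Linux filesystems do); the file names are hypothetical stand-ins for a file in pkgs and its link in an env.

```python
import os
import tempfile


def hardlink_demo():
    """Return (links_before, links_after, data_survives) for a hard-linked file."""
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical names: one stands in for the copy in pkgs, one for the env.
        cache_copy = os.path.join(root, "pkgs_copy.bin")
        env_copy = os.path.join(root, "env_copy.bin")

        with open(cache_copy, "wb") as fh:
            fh.write(b"x" * 1024)

        # Create a second directory entry pointing at the same data on disk.
        os.link(cache_copy, env_copy)
        links_before = os.stat(env_copy).st_nlink

        # Deleting one name only removes that directory entry.
        os.remove(cache_copy)
        links_after = os.stat(env_copy).st_nlink

        with open(env_copy, "rb") as fh:
            data_survives = len(fh.read()) == 1024

        return links_before, links_after, data_survives


if __name__ == "__main__":
    print(hardlink_demo())
```

The data is only released once the last link to it is gone, which is why removing the pkgs side alone frees little or nothing for packages that are still linked into an env.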

Correction: The base env still links packages to other locations, so deleting pkgs will not impact base as I originally concluded.

I’d highly recommend looking at this other post on estimating the real disk usage of Conda. You may be overestimating how much space is really being used. For most files in pkgs, there is only one physical copy, so there isn’t any additional manual optimization to be done.
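
As a rough way to check this yourself, the sketch below walks a directory tree and counts each physical file only once, comparing that with the naive sum of file sizes. It is an illustration rather than the method from the linked post: the function name disk_usage is made up here, and it assumes the OS reports stable (st_dev, st_ino) identifiers for hard-linked files.

```python
import os
import stat


def disk_usage(root):
    """Return (apparent_bytes, unique_bytes) for regular files under root.

    apparent_bytes sums every directory entry; unique_bytes counts each
    physical file once, even if several hardlinks point at it.
    """
    apparent = 0
    unique = 0
    seen = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(st.st_mode):
                continue
            apparent += st.st_size
            key = (st.st_dev, st.st_ino)  # identifies the physical file
            if key not in seen:
                seen.add(key)
                unique += st.st_size
    return apparent, unique


if __name__ == "__main__":
    import sys

    total, real = disk_usage(sys.argv[1] if len(sys.argv) > 1 else ".")
    print(f"apparent: {total} bytes, unique: {real} bytes")
```

Running it on the Anaconda root (so that both pkgs and envs are covered in one pass) gives an idea of how much of the apparent size is shared through hardlinks.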

Answered By: merv

Actually, under certain conditions, removing the pkgs subdirs is an option. As stated here by Anaconda Community Support, "the pkgs directory is only a cache. You can remove it completely if you want to. However, when creating new environments, it is more efficient to leave whatever packages are in the cache around."

According to the documentation, you can use conda clean --packages to remove unused packages in pkgs (which will move them to pkgs/.trash, from which you can then safely delete them). While this does not check for packages installed using symlinks back to the package cache, that is not an issue if you don’t use such environments or if you work under Windows. I guess that’s why conda clean --packages is included in conda clean --all.

To save space more aggressively, you can use conda clean --force-pkgs-dirs to remove all writable package caches (with the same caveat that there could be environments linked to these dirs). If you don’t use environments, or if you use Anaconda under Windows, you’re probably safe. Personally, I use this option without issues.
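
Before running either command for real, it may be worth previewing what would be removed. The sketch below is one way to do that from Python; it assumes conda is on PATH and that the installed version accepts the --dry-run flag for conda clean. The helper names conda_clean_command and preview_clean are made up for this example.

```python
import shutil
import subprocess


def conda_clean_command(executable="conda", force_pkgs_dirs=False, dry_run=True):
    """Build the argument list for a conda clean call (nothing is executed here)."""
    cmd = [executable, "clean"]
    # --packages removes unused packages; --force-pkgs-dirs removes whole caches.
    cmd.append("--force-pkgs-dirs" if force_pkgs_dirs else "--packages")
    # --dry-run only reports what would happen; --yes skips the confirmation prompt.
    cmd.append("--dry-run" if dry_run else "--yes")
    return cmd


def preview_clean():
    """Run a dry-run clean of unused packages and return conda's output, or None."""
    conda = shutil.which("conda")
    if conda is None:
        return None  # conda not found on PATH
    result = subprocess.run(
        conda_clean_command(executable=conda),
        capture_output=True,
        text=True,
    )
    return result.stdout


if __name__ == "__main__":
    print(preview_clean())
```

Keeping dry_run=True as the default means nothing is deleted unless the caller explicitly asks for it.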

Answered By: Robert