Pythonic way to delete files/folders older than X days

Question:

In my Python3 program I need to delete files and folders that are older than X days. I know there are many similar questions here, but in my case I don’t need to check the modification times of these folders and files. Instead I have the following folder structure:

/root_folder/<year>/<month>/<day>/<files>

So for example something like this:

.
└── 2020
    ├── 04
    │   └── 30
    │       ├── file.1
    │       └── file.2
    └── 05
        ├── 14
        │   ├── file.1
        │   └── file.2
        ├── 19
        ├── 21
        │   └── file.1
        └── 22
            ├── file.1
            ├── file.2
            └── file.3

What I want now is to delete all the folders and their files that represent a date older than X days. I have created a solution, but coming from Java it seems to me that it is not very Pythonic, or that it might be easier to solve in Python. Can you Python experts guide me a bit here, of course taking into account “jumps” over months and years?

Asked By: Matthias


Answers:

Not a Python expert here either, but here’s something simple:

  1. Find the oldest date that you want to keep. Anything older than this will be deleted. Let’s say it is 28/04/2020.

  2. From that date, you can build a string “/root_folder/2020/04/28”

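  3. List all the files. If a file’s path (as a string) is less than the string from the previous step, you can delete it. This works because the zero-padded year/month/day folder names sort as strings in the same order as the dates they represent.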
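
Steps 1 and 2 can be sketched as follows. This is only a minimal illustration: the function name cutoff_path and its today parameter are made up for this example, and it assumes the month and day folders are always zero-padded to two digits as in the question.

```python
import os
from datetime import date, timedelta


def cutoff_path(root, days, today=None):
    """Return the path of the oldest day folder that should be kept."""
    today = today or date.today()
    # timedelta handles month and year boundaries for us
    oldest_kept = today - timedelta(days=days)
    return os.path.join(
        root,
        f"{oldest_kept.year:04d}",
        f"{oldest_kept.month:02d}",
        f"{oldest_kept.day:02d}",
    )


print(cutoff_path("/root_folder", 3, today=date(2020, 5, 1)))  # on POSIX: /root_folder/2020/04/28
```

Because the date arithmetic is done on date objects, the “jumps” over months and years need no special handling.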

Example:

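import os

path = "/root_folder"
files = []
# r = current directory, d = its subdirectories, f = its files
for r, d, f in os.walk(path):
    for file in f:
        # every file counts here; the linked snippet kept only '.txt' files
        files.append(os.path.join(r, file))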

Source of that snippet: https://mkyong.com/python/python-how-to-list-all-files-in-a-directory/

Now, you can do:

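# date_limit is the string from step 2, built from the same root as path
date_limit = "/root_folder/2020/04/28"
for f in files:
    # plain string comparison; relies on zero-padded folder names
    if f < date_limit:
        os.remove(f)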

Note: This is not optimal

  1. It deletes file by file. As soon as the if matches, you could delete the whole folder containing that file instead (but then the list would still contain paths to files that are already gone).

  2. You actually don’t care about the files. You could apply the logic to folders alone and remove them recursively.
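
One way to apply the logic to folders alone is to parse each <year>/<month>/<day> folder into a real date and compare dates instead of strings. The sketch below is an assumption about how that could look, not a tested production tool: the function name delete_old_day_folders and the today and dry_run parameters are invented for illustration.

```python
import shutil
from datetime import date, timedelta
from pathlib import Path


def delete_old_day_folders(root, days, today=None, dry_run=True):
    """Remove <root>/<year>/<month>/<day> folders older than `days` days.

    Returns the folders that were removed (or would be, when dry_run is True).
    """
    root = Path(root)
    today = today or date.today()
    cutoff = today - timedelta(days=days)  # oldest date that is kept
    removed = []
    for day_dir in sorted(root.glob("*/*/*")):
        if not day_dir.is_dir():
            continue
        try:
            year, month, day = (int(part) for part in day_dir.relative_to(root).parts)
            folder_date = date(year, month, day)
        except ValueError:
            continue  # not named like a date, leave it alone
        if folder_date < cutoff:
            removed.append(day_dir)
            if not dry_run:
                shutil.rmtree(day_dir)
    return removed
```

Calling it with dry_run=True first shows what would be deleted without touching anything. Month and year folders that end up empty are left in place by this sketch.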


Update: doing both steps as we browse the folders:

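import shutil
for r, d, f in os.walk(path):
    # compare day folders only: shorter paths such as the root or a year folder also sort before date_limit
    if r.count(os.sep) == date_limit.count(os.sep) and r < date_limit:
        print(f"Deleting {r}")
        shutil.rmtree(r)
        d.clear()  # the folder is gone, so do not descend into it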
Answered By: Pedro Loureiro

Glob your paths to get the file paths into a list, then run each one through something like this (below). Good luck!

import os
import time

def is_file_access_older_than(file_path, seconds, from_time=None):
    """based on st_atime --> https://docs.python.org/3/library/os.html#os.stat_result.st_atime"""
    if from_time is None:
        from_time = time.time()
    # returns the path when the last access is older than `seconds`, otherwise False
    if (from_time - os.stat(file_path).st_atime) > seconds:
        return file_path
    return False
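
The globbing step could look roughly like the sketch below, which applies the same st_atime test inline. The function name files_not_accessed_for and its parameters are made up for this example. Keep in mind that this goes by access time rather than by the dated folder names the question asks about, and that some filesystems are mounted so that access times are not updated.

```python
import glob
import os
import time


def files_not_accessed_for(root, days, now=None):
    """List files under `root` whose last access is more than `days` days ago."""
    now = time.time() if now is None else now
    max_age = days * 24 * 60 * 60  # days expressed in seconds
    pattern = os.path.join(root, "**", "*")
    return [
        path
        for path in glob.glob(pattern, recursive=True)
        if os.path.isfile(path) and now - os.stat(path).st_atime > max_age
    ]
```

The returned paths can then be passed to os.remove one by one.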
Answered By: Goran B.