Find leaf folders that aren't hidden folders

Question:

I have a folder structure with some epubs and json files in the deepest folders (not counting the .ts folders). I’m exporting tags from the json files to TagSpaces by creating a .ts folder containing some other json files. I’ve already processed some of the files, and now I want to find the leaf folders that don’t have a .ts folder in their path, so that I can handle the remaining files without processing the others twice.

So for this example I only want to do something for the folder t5:

test
├── t1
│   ├── t2
│   │   └── t5
│   └── t3
│       └── .ts
└── .ts
    └── t4

This is what I’ve tried:

import os
import shutil


def get_files_in_leaf_subdirectories(dir: str) -> list[str]:
    dirs = []
    for root, subdirs, filenames in os.walk(dir):
        if subdirs or '.ts' in root:
            continue
        dirs.append(root)
    return dirs


def test_process_files_in_leaf_subdirectories():
    os.makedirs('tmp/t1/t2/t5', exist_ok=True)
    os.makedirs('tmp/t1/t3/.ts', exist_ok=True)
    os.makedirs('tmp/.ts/t4', exist_ok=True)
    assert get_files_in_leaf_subdirectories('tmp') == ['tmp/t1/t2/t5']
    shutil.rmtree('tmp')


Asked By: ajr-dev


Answers:

Since you want leaf directories and don’t want to count .ts directories, it is enough to recursively visit the non-hidden paths and yield every directory that has no subdirectory. A folder that already contains a .ts folder has a subdirectory, so it is never yielded.

For such path operations in Python, I’d recommend using pathlib.Path.

Here’s a generator that yields the leaf directories, i.e. those without any subdirectory:

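import pathlib
from typing import Iterator


def find_leaf_dir_gen(root_path: pathlib.Path) -> Iterator[pathlib.Path]:
    """Yield directories under root_path that have no subdirectories, skipping hidden ones."""

    # collect all subdirectories, hidden ones included
    child_dirs = [path for path in root_path.iterdir() if path.is_dir()]

    # no subdirectory at all: this is a leaf, so yield it and stop
    if not child_dirs:
        yield root_path
        return

    # otherwise iterate through the subdirectories
    for path in child_dirs:
        # ignore hidden directories such as .ts
        if path.name.startswith("."):
            continue

        # step in and yield recursively
        yield from find_leaf_dir_gen(path)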

Sample usage

>>> leaves = list(find_leaf_dir_gen(ROOT))
>>> leaves
[WindowsPath('X:/test/t1/t2/t5'), WindowsPath('X:/test/t1/t3/t6')]

>>> for path in leaves:
...     ts_path = path.joinpath(".ts")
...     ts_path.mkdir()

Test directory structure – Before:

X:TEST
├─.ts
│  └─t4
└─t1
    ├─t2
    │  └─t5
    └─t3
        ├─.ts
        └─t6

After:

X:TEST
├─.ts
│  └─t4
└─t1
    ├─t2
    │  └─t5
    │      └─.ts
    └─t3
        ├─.ts
        └─t6
            └─.ts
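
For comparison, the same idea can probably also be expressed with os.walk, which is closer to the attempt in the question. This is only a minimal sketch: the name find_leaf_dirs_walk and the throwaway test tree are made up for illustration. It relies on the documented behaviour that, in top-down mode, editing the subdirectory list in place stops os.walk from descending into the removed entries.

```python
import os
import tempfile
from typing import Iterator


def find_leaf_dirs_walk(root: str) -> Iterator[str]:
    """Yield leaf directories under root, never descending into hidden ones."""
    for current, subdirs, _files in os.walk(root):
        # A directory with no subdirectories at all (hidden or not) is a leaf.
        if not subdirs:
            yield current
        # Prune hidden directories in place so os.walk skips them.
        subdirs[:] = [d for d in subdirs if not d.startswith(".")]


# Rebuild the tree from the question inside a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for rel in ("t1/t2/t5", "t1/t3/.ts", ".ts/t4"):
        os.makedirs(os.path.join(tmp, *rel.split("/")))
    leaves = [os.path.relpath(p, tmp) for p in find_leaf_dirs_walk(tmp)]

print(leaves)  # only t5 remains; the path separator depends on the platform
```

Unlike the substring test '.ts' in root, this only skips directories whose own name starts with a dot, so a folder such as notes.ts would still be visited.
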
Answered By: jupiterbjy