List directory tree structure in Python from a list of file paths

Question:

This question broadens the scope of one already answered on Stack Overflow under the topic "List directory tree structure in python?".

The goal is to build a list of strings that visually represents a directory tree, with branches.

But instead of taking a valid directory path as input (as in the already answered topic),
the task is to produce the same result from a list of file paths.

The function has to handle files at any depth, so a recursive approach is the natural fit.

Example

input:

[r'main_folder\file01.txt',
 r'main_folder\file02.txt',
 r'main_folder\folder_sub1\file03.txt',
 r'main_folder\folder_sub1\file04.txt',
 r'main_folder\folder_sub1\file05.txt',
 r'main_folder\folder_sub1\folder_sub1-1\file06.txt',
 r'main_folder\folder_sub1\folder_sub1-1\file07.txt',
 r'main_folder\folder_sub1\folder_sub1-1\file08.txt',
 r'main_folder\folder_sub2\file09.txt',
 r'main_folder\folder_sub2\file10.txt',
 r'main_folder\folder_sub2\file11.txt']

output:

├── file01.txt
├── file02.txt
├── folder_sub1
│   ├── file03.txt
│   ├── file04.txt
│   ├── file05.txt
│   └── folder_sub1-1
│       ├── file06.txt
│       ├── file07.txt
│       └── file08.txt
└── folder_sub2
    ├── file09.txt
    ├── file10.txt
    └── file11.txt

Transforming the list of file paths into nested dictionaries representing the structure of a directory has been answered in the topic "Python convert path to dict".
With this output:

{'main_folder': {'file01.txt': 'txt',
                 'file02.txt': 'txt',
                 'folder_sub1': {'file03.txt': 'txt',
                                 'file04.txt': 'txt',
                                 'file05.txt': 'txt',
                                 'folder_sub1-1': {'file06.txt': 'txt',
                                                   'file07.txt': 'txt',
                                                   'file08.txt': 'txt'}},
                 'folder_sub2': {'file09.txt': 'txt',
                                 'file10.txt': 'txt',
                                 'file11.txt': 'txt'}}}
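
For reference, a minimal sketch of that conversion step, assuming every entry in the list is a backslash-separated path to a file (not to an empty folder). The helper name `paths_to_dict` is hypothetical and is not taken from the linked topic; the leaf values are the file extension without the dot, to mirror the 'txt' values above.

```python
from pathlib import PureWindowsPath

def paths_to_dict(path_list):
    """Nest backslash-separated file paths into dictionaries.

    Folders map to dicts; files map to their extension without the dot.
    Assumption: the last component of every path is a file name.
    """
    root = {}
    for raw in path_list:
        path = PureWindowsPath(raw)  # splits on '\' on any operating system
        *folders, name = path.parts
        node = root
        for folder in folders:
            # Reuse the folder's dict if it exists, otherwise create it.
            node = node.setdefault(folder, {})
        node[name] = path.suffix.lstrip('.')
    return root

sample = [
    r'main_folder\file01.txt',
    r'main_folder\folder_sub1\file03.txt',
    r'main_folder\folder_sub1\folder_sub1-1\file06.txt',
]
nested = paths_to_dict(sample)
print(nested)
```

The raw-string prefix matters here: without it, Python reads `\f` inside a normal string literal as a form-feed character rather than a backslash followed by `f`.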

But generating the tidy layout with branches remains unsolved.

Asked By: the_RR

||

Answers:

Possible solution:

paths = {
    'main_folder': {
        'file01.txt': 'txt',
        'file02.txt': 'txt',
        'folder_sub1': {
            'file03.txt': 'txt',
            'file04.txt': 'txt',
            'file05.txt': 'txt',
            'folder_sub1-1': {
                'file06.txt': 'txt',
                'file07.txt': 'txt',
                'file08.txt': 'txt'
            }
        },
        'folder_sub2': {
            'file09.txt': 'txt',
            'file10.txt': 'txt',
            'file11.txt': 'txt'
        }
    }
}


# prefix components:
space =  '    '
branch = '│   '
# pointers:
tee =    '├── '
last =   '└── '

def tree(paths: dict, prefix: str = ''):
    """A recursive generator: given a nested dict of paths,
    yields a visual tree structure line by line,
    each line prefixed by the branch characters for its depth
    """
    # contents each get pointers that are ├── with a final └── :
    pointers = [tee] * (len(paths) - 1) + [last]
    for pointer, path in zip(pointers, paths):
        yield prefix + pointer + path
        if isinstance(paths[path], dict): # extend the prefix and recurse:
            extension = branch if pointer == tee else space
            # i.e. space because last, └── , above so no more │
            yield from tree(paths[path], prefix=prefix+extension)


for line in tree(paths):
    print(line)

Ref: List directory tree structure in python?
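
As a self-contained illustration, the sketch below goes from the path list straight to the layout requested in the question, where the top-level folder itself is not drawn. It follows the same recursive idea as the code above, but the names `build` and `render` are hypothetical, and it assumes non-empty, backslash-separated file paths that all share a single top-level folder.

```python
from pathlib import PureWindowsPath

TEE, LAST = '├── ', '└── '
BRANCH, SPACE = '│   ', '    '

def build(path_list):
    """Nest paths into dicts; folders map to dicts, files map to None."""
    root = {}
    for raw in path_list:
        *folders, name = PureWindowsPath(raw).parts
        node = root
        for folder in folders:
            node = node.setdefault(folder, {})
        node[name] = None
    return root

def render(node, prefix=''):
    """Yield one tree line per entry, recursing into sub-dictionaries."""
    names = list(node)
    for i, name in enumerate(names):
        is_last = i == len(names) - 1
        yield prefix + (LAST if is_last else TEE) + name
        if isinstance(node[name], dict):
            # Below a last entry no vertical bar is needed, only spaces.
            yield from render(node[name], prefix + (SPACE if is_last else BRANCH))

path_list = [
    r'main_folder\file01.txt',
    r'main_folder\folder_sub1\file03.txt',
    r'main_folder\folder_sub1\folder_sub1-1\file06.txt',
    r'main_folder\folder_sub2\file09.txt',
]
nested = build(path_list)
# Render only the children of the top-level folder so the root is not drawn.
lines = list(render(nested['main_folder']))
print('\n'.join(lines))
```

Passing `nested` instead of `nested['main_folder']` would draw the root as well, as a single `└── main_folder` line with everything else indented beneath it.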

Answered By: RafaelDSS

A small modification to @rafaeldss’s answer, so that the top-level entry is printed without a pointer and its children are not indented:

# prefix components:
space =  '    '
branch = '│   '
# pointers:
tee =    '├── '
last =   '└── '

def tree(paths: dict, prefix: str = '', first: bool = True):
    """A recursive generator: given a nested dict of paths,
    yields a visual tree structure line by line,
    each line prefixed by the branch characters for its depth;
    `first` marks the top-level call, whose entries get no pointer
    """
    # contents each get pointers that are ├── with a final └── :
    pointers = [tee] * (len(paths) - 1) + [last]
    for pointer, path in zip(pointers, paths):
        if first:
            yield prefix + path
        else:
            yield prefix + pointer + path
        if isinstance(paths[path], dict): # extend the prefix and recurse:
            if first:
                extension = ''
            else:
                extension = branch if pointer == tee else space
                # i.e. space because last, └── , above so no more │
            yield from tree(paths[path], prefix=prefix+extension, first=False)

for line in tree(paths):
    print(line)


Answered By: Guido Modarelli

bigtree is a Python tree implementation that integrates with Python lists, dictionaries, and pandas DataFrame.

For this scenario, three lines of code are enough:

path_list = [
    r'main_folder\file01.txt',
    r'main_folder\file02.txt',
    r'main_folder\folder_sub1\file03.txt',
    r'main_folder\folder_sub1\file04.txt',
    r'main_folder\folder_sub1\file05.txt',
    r'main_folder\folder_sub1\folder_sub1-1\file06.txt',
    r'main_folder\folder_sub1\folder_sub1-1\file07.txt',
    r'main_folder\folder_sub1\folder_sub1-1\file08.txt',
    r'main_folder\folder_sub2\file09.txt',
    r'main_folder\folder_sub2\file10.txt',
    r'main_folder\folder_sub2\file11.txt']

from bigtree import list_to_tree, print_tree
root = list_to_tree(path_list, sep="\\")
print_tree(root)

This results in the following output:

main_folder
├── file01.txt
├── file02.txt
├── folder_sub1
│   ├── file03.txt
│   ├── file04.txt
│   ├── file05.txt
│   └── folder_sub1-1
│       ├── file06.txt
│       ├── file07.txt
│       └── file08.txt
└── folder_sub2
    ├── file09.txt
    ├── file10.txt
    └── file11.txt

Source/Disclaimer: I’m the creator of bigtree 😉

Answered By: Kay Jan