How to parse a directory tree in Python?

Question:

I have a directory called “notes”. Inside it are category folders named “science”, “maths”, and so on, and within those folders are sub-categories such as “Quantum Mechanics” and “Linear Algebra”.

./notes
--> ./notes/maths
------> ./notes/maths/linear_algebra
--> ./notes/physics/
------> ./notes/physics/quantum_mechanics

My problem is that I don’t know how to put the categories and subcategories into two separate lists/arrays.

Asked By: chutsu


Answers:

You could utilize os.walk.

#!/usr/bin/env python

import os
for root, dirs, files in os.walk('notes'):
    print(root, dirs, files)
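
To see what os.walk yields for a tree shaped like the one in the question, here is a minimal sketch. It assumes a throwaway copy of the layout built in a temporary directory; walk_dirs is a hypothetical helper name, not part of os.

```python
import os
import tempfile


def walk_dirs(top):
    """Return (path relative to top, subdirectory names) for each directory visited."""
    seen = []
    for root, dirs, files in os.walk(top):
        dirs.sort()  # sorting in place makes the traversal order deterministic
        seen.append((os.path.relpath(root, top), list(dirs)))
    return seen


# Hypothetical stand-in for the "notes" tree from the question.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "maths", "linear_algebra"))
    os.makedirs(os.path.join(tmp, "physics", "quantum_mechanics"))
    for rel, subdirs in walk_dirs(tmp):
        print(rel, subdirs)
```

The first tuple (relative path ".") lists the categories; the tuples for each category directory list its subcategories.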

A naive two-level traversal:

import os
from os.path import isdir, join

def cats_and_subs(root='notes'):
    """
    Collect categories and subcategories.
    """
    # Use list comprehensions: in Python 3, filter() returns a one-shot
    # iterator, which the loop below would exhaust before it is returned.
    categories = [d for d in os.listdir(root) if isdir(join(root, d))]
    sub_categories = []
    for c in categories:
        sub_categories += [d for d in os.listdir(join(root, c))
                           if isdir(join(root, c, d))]

    # categories and sub_categories are lists:
    # categories would hold names like 'science', 'maths'
    # sub_categories would contain 'Quantum Mechanics', 'Linear Algebra', ...
    return (categories, sub_categories)

if __name__ == '__main__':
    print(cats_and_subs(root='/path/to/your/notes'))
Answered By: miku

os.walk is pretty much ideal for this. By default it does a top-down walk, and you can easily stop it at the second level by emptying 'dirnames' in place at that point.

import os
pth = "/path/to/notes"
def getCats(pth):
    cats = []
    subcats = []
    for (dirpath, dirnames, filenames) in os.walk(pth):
        # print(dirpath + "\n\t", "\n\t".join(dirnames), "\n%d files" % len(filenames))
        if dirpath == pth:
            cats = dirnames
        else:
            subcats.extend(dirnames)
            dirnames[:]=[] # don't walk any further downwards
    # subcats = list(set(subcats)) # uncomment this if you want 'subcats' to be unique
    return (cats, subcats)
Answered By: pycruft
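
To check the pruning behaviour without touching real notes, here is a minimal sketch that runs the same idea against a temporary tree. The function name two_levels, the extra "drafts" folder, and the visited list are all hypothetical additions for illustration; the only os.walk behaviour relied on is that clearing dirnames in place during a top-down walk stops it from descending.

```python
import os
import tempfile


def two_levels(top):
    """Collect first- and second-level directory names, pruning anything deeper."""
    cats, subcats, visited = [], [], []
    for dirpath, dirnames, _ in os.walk(top):
        dirnames.sort()  # deterministic order
        visited.append(os.path.relpath(dirpath, top))
        if dirpath == top:
            cats = list(dirnames)  # copy, so later changes cannot alias
        else:
            subcats.extend(dirnames)
            dirnames[:] = []  # in-place clear: os.walk will not go deeper
    return cats, subcats, visited


with tempfile.TemporaryDirectory() as tmp:
    # "drafts" is a made-up third level that should never be reported.
    os.makedirs(os.path.join(tmp, "maths", "linear_algebra", "drafts"))
    os.makedirs(os.path.join(tmp, "physics", "quantum_mechanics"))
    cats, subcats, visited = two_levels(tmp)
    print(cats)
    print(subcats)
    print(visited)
```

Note that the slice assignment matters: rebinding with dirnames = [] would only change the local name and os.walk would keep descending.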
