How to recursively go through all subdirectories and read files?

Question:

I have a root-ish directory containing multiple subdirectories, each of which contains a file named data.txt. I would like to write a script that takes the “root” directory, walks through all of its subdirectories, reads every “data.txt” it finds, and writes something from each data.txt file to an output file.

Here’s a snippet of my code:

import os
import sys
rootdir = sys.argv[1]

with open('output.txt','w') as fout:
    for root, subFolders, files in os.walk(rootdir):
        for file in files:
            if (file == 'data.txt'):
                #print file
                with open(file,'r') as fin:
                    for lines in fin:
                        dosomething()

I’ve tested my dosomething() part and confirmed that it works when I run it on a single file. I’ve also confirmed that if I tell the script to print the file name instead (the commented-out line), it prints ‘data.txt’.

Right now if I run it Python gives me this error:

File "recursive.py", line 11, in <module>
    with open(file,'r') as fin:
IOError: [Errno 2] No such file or directory: 'data.txt'

I’m not sure why it can’t find the file; after all, it prints data.txt if I uncomment the ‘print file’ line. What am I doing wrong?

Asked By: Joe

Answers:

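You need to open the file by its full path. Your file variable is just a bare filename with no directory component, so open() looks for it in the current working directory rather than in the subdirectory where os.walk() found it. The root variable holds that directory path: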

with open('output.txt','w') as fout:
    for root, subFolders, files in os.walk(rootdir):
        if 'data.txt' in files:
            with open(os.path.join(root, 'data.txt'), 'r') as fin:
                for lines in fin:
                    dosomething()
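
To make the fix easy to try out, here is a minimal self-contained sketch that builds a throwaway directory tree and reads every data.txt in it by joining root with the filename. The helper names read_named_files and make_demo_tree are hypothetical, introduced only for this illustration; in real code the yield would be replaced by your own dosomething() logic.

```python
import os
import tempfile


def read_named_files(rootdir, filename="data.txt"):
    """Yield (full_path, line) for every `filename` found under rootdir."""
    for root, _subfolders, files in os.walk(rootdir):
        if filename in files:
            # `filename` alone has no directory part; `root` supplies it.
            full_path = os.path.join(root, filename)
            with open(full_path, "r") as fin:
                for line in fin:
                    yield full_path, line.rstrip("\n")


def make_demo_tree():
    """Create a temporary tree with two data.txt files and one decoy file."""
    base = tempfile.mkdtemp()
    contents = {
        ("a", "data.txt"): "one\ntwo\n",
        ("a", "other.txt"): "ignored\n",
        ("b", "data.txt"): "three\n",
    }
    for (subdir, name), text in contents.items():
        os.makedirs(os.path.join(base, subdir), exist_ok=True)
        with open(os.path.join(base, subdir, name), "w") as fout:
            fout.write(text)
    return base
```

Calling read_named_files(make_demo_tree()) should yield the three lines from the two data.txt files and skip other.txt, regardless of which directory the script is run from.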
Answered By: Martijn Pieters

[os.path.join(dirpath, filename) for dirpath, dirnames, filenames in os.walk(rootdir)
                                 for filename in filenames]

A list comprehension that collects the full path of every file in the tree is shorter and, arguably, cleaner and more Pythonic.

You can wrap os.path.join(dirpath, filename) in any function to process each file as it is found, or save the list of paths for further processing.
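
As a rough sketch of both options, the snippet below saves the list of paths and then passes each matching path to a processing function. The names list_files, count_lines, line_counts, and make_demo_tree are hypothetical helpers made up for this example, and counting lines merely stands in for whatever processing you actually need.

```python
import os
import tempfile


def list_files(rootdir):
    """Return the full path of every file under rootdir."""
    return [os.path.join(dirpath, filename)
            for dirpath, _dirnames, filenames in os.walk(rootdir)
            for filename in filenames]


def count_lines(path):
    """Example per-file processing step: count the lines in one file."""
    with open(path, "r") as fin:
        return sum(1 for _ in fin)


def line_counts(rootdir, wanted="data.txt"):
    """Map the path of each file named `wanted` to its line count."""
    return {path: count_lines(path)
            for path in list_files(rootdir)
            if os.path.basename(path) == wanted}


def make_demo_tree():
    """Create a temporary tree with two data.txt files and one other file."""
    base = tempfile.mkdtemp()
    contents = {
        ("a", "data.txt"): "one\ntwo\n",
        ("a", "notes.txt"): "skip\n",
        ("b", "data.txt"): "three\n",
    }
    for (subdir, name), text in contents.items():
        os.makedirs(os.path.join(base, subdir), exist_ok=True)
        with open(os.path.join(base, subdir, name), "w") as fout:
            fout.write(text)
    return base
```

Note that the comprehension returns every file in the tree, so filtering by name (as line_counts does here) is a separate step.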

Answered By: Himura