Run the for loop for each file in directory using Python

Question:

I want to run a for loop in Python over each file in a directory. The directory names will be passed in through a separate file (folderlist.txt).

Inside my main folder (/user/), new folders get added daily. I want to run the for loop over each file in a given folder, but not over folders whose files have already been run through the loop. I'm thinking of maintaining folderlist.txt so that it holds only the names of the folders newly added each day, which will then be passed to the for loop.
For example, under my main path (/user/) we see the folders below:
(the files present inside each folder are listed below the folder name, just to give the idea)

(day 1)
folder1
file1, file2, file3

folder2
file4, file5

folder3
file6

(day 2)
folder4
file7, file8, file9, file10

folder5
file11, file12

import os
with open('/user/folderlist.txt') as f:
    for line in f:
        line=line.strip("\n")

        dir='/user/'+line
        for files in os.walk (dir):
            for file in files:
                print(file)
#        for filename in glob.glob(os.path.join (dir, '*.json')):
#                print(filename)

I tried using os.walk and the glob module in the code above, but it looks like the loop runs more times than there are files in the folder. Please advise.

Asked By: more09

Answers:

Try replacing os.walk(dir) with os.listdir(dir). This gives you a list of the names of all the entries (files and subfolders) in the directory.

import os
with open('/user/folderlist.txt') as f:
    for line in f:
        line = line.strip("\n")

        dir = '/user/' + line
        for file in os.listdir(dir):
            if file.endswith("fileExtension"):
                print(file)
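
Note that os.listdir returns bare names, not full paths, and it includes subfolders. If you need paths you can open, something along these lines may help. It is only a sketch: the helper name list_files and its suffix parameter are made up for illustration and are not part of the os module.

```python
import os

def list_files(folder, suffix=""):
    """Return full paths of regular files directly inside folder (no recursion)."""
    paths = []
    for name in sorted(os.listdir(folder)):
        full_path = os.path.join(folder, name)
        # os.listdir yields names of files and subfolders alike,
        # so join with the folder and keep only regular files.
        if os.path.isfile(full_path) and name.endswith(suffix):
            paths.append(full_path)
    return paths
```

Inside the loop over folderlist.txt you could then call list_files('/user/' + line, '.json'), assuming .json is the extension you are after, as in the commented-out glob line in the question.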

Hope it helps

Answered By: TavoGLC

*Help on function walk in module os:

walk(top, topdown=True, onerror=None, followlinks=False)
Directory tree generator.

For each directory in the directory tree rooted at top (including top
itself, but excluding '.' and '..'), yields a 3-tuple

    dirpath, dirnames, filenames

dirpath is a string, the path to the directory.  dirnames is a list of
the names of the subdirectories in dirpath (excluding '.' and '..').
filenames is a list of the names of the non-directory files in dirpath.
Note that the names in the lists are just names, with no path components.
To get a full path (which begins with top) to a file or directory in
dirpath, do os.path.join(dirpath, name).*

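Therefore, in the question's code, files is that 3-tuple, so the second loop iterates over dirpath (a string), dirnames (a list) and filenames (a list) rather than over individual files.
os.listdir(dir), by contrast, returns a list of the names of all the files and folders directly inside dir.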
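
If you do want to keep os.walk (for example, because the folders may contain subfolders), unpack the 3-tuple and loop over filenames only. A minimal sketch; walk_files is a hypothetical helper name, not something provided by os:

```python
import os

def walk_files(top):
    """Yield the full path of every file under top, including subfolders."""
    # Each item from os.walk is (dirpath, dirnames, filenames).
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            # Names carry no path component, so join them with dirpath.
            yield os.path.join(dirpath, name)
```

With this, for path in walk_files(dir): print(path) runs once per file instead of three times per directory.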

Answered By: NIKESH SINGH