In what order does os.walk iterate?

Question:

I am concerned about the order of files and directories given by os.walk(). If I have these directories, 1, 10, 11, 12, 2, 20, 21, 22, 3, 30, 31, 32, what is the order of the output list?

Is it sorted by numeric values?

1 2 3 10 20 30 11 21 31 12 22 32

Or sorted by ASCII values, like what is given by ls?

1 10 11 12 2 20 21 22 3 30 31 32

Additionally, how can I get a specific sort?

Asked By: Vahid Mirjalili


Answers:

os.walk uses os.listdir (or, since Python 3.5, os.scandir, which also yields entries in arbitrary order). Here is the docstring for os.listdir:

listdir(path) -> list_of_strings

Return a list containing the names of the entries in the directory.

path: path of directory to list

The list is in arbitrary order. It does not include the special
entries ‘.’ and ‘..’ even if they are present in the directory.

(my emphasis on "arbitrary order").

You could, however, use sorted() to get the order you want.

for root, dirs, files in os.walk(path):
    for dirname in sorted(dirs):
        print(dirname)

(Note that the dirnames are strings, not ints, so sorted(dirs) sorts them as strings, which is desirable for once.)

As Alfe and Ciro Santilli point out, if you want the directories to be recursed in sorted order, then modify dirs in-place:

for root, dirs, files in os.walk(path):
    dirs.sort()
    for dirname in dirs:
        print(os.path.join(root, dirname))

You can test this yourself:

import os

os.chdir('/tmp/tmp')  # assumes this scratch directory already exists
for dirname in '1 10 11 12 2 20 21 22 3 30 31 32'.split():
    try:
        os.makedirs(dirname)
    except OSError:
        pass  # the directory already exists

for root, dirs, files in os.walk('.'):
    for dirname in sorted(dirs):
        print(dirname)

prints

1
10
11
12
2
20
21
22
3
30
31
32

If you wanted to list them in numeric order use:

for dirname in sorted(dirs, key=int):

To sort alphanumeric strings, use natural sort.
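
As a minimal sketch of what a natural sort key could look like (the name natural_key is hypothetical, not a standard library function), the idea is to split each name into text and digit runs and compare the digit runs as integers:

```python
import re

def natural_key(name):
    """Build a sort key in which digit runs compare as integers."""
    # re.split with a capturing group keeps the digit runs, so the result
    # alternates: text, digits, text, digits, ..., text.
    parts = re.split(r'(\d+)', name)
    # Odd positions are always digit runs; even positions are always text.
    return [int(part) if index % 2 else part.lower()
            for index, part in enumerate(parts)]

names = ['10', '1', '2', 'dir10', 'dir2', '20']
print(sorted(names, key=natural_key))
# ['1', '2', '10', '20', 'dir2', 'dir10']
```

The same key can be passed to dirs.sort(key=natural_key) inside the os.walk loop. Unlike key=int, it does not raise ValueError on names that are not purely numeric.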

Answered By: unutbu

os.walk() yields in each step what it will do in the next steps. You can in each step influence the order of the next steps by sorting the lists the way you want them. Quoting the 2.7 manual:

When topdown is True, the caller can modify the dirnames list in-place (perhaps using del or slice assignment), and walk() will only recurse into the subdirectories whose names remain in dirnames; this can be used to prune the search, impose a specific order of visiting

So sorting the dirNames will influence the order in which they will be visited:

for rootName, dirNames, fileNames in os.walk(path):
  dirNames.sort()  # you may want to use the args key and reverse here (cmp exists only in Python 2)

After this, the dirNames are sorted in-place and walk will visit the subdirectories in that order.
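
Here is a small sketch of the slice-assignment variant the manual mentions, assuming you only care about numerically named directories. The helper name walk_numeric and the demo directory names are made up for illustration. Note that rebinding the name (dirNames = sorted(dirNames)) would not work, because walk keeps a reference to the original list.

```python
import os
import shutil
import tempfile

def walk_numeric(top):
    """Walk top, descending only into numerically named subdirectories,
    in ascending numeric order."""
    for root, dirs, files in os.walk(top):  # topdown=True is the default
        # Slice assignment mutates the very list that os.walk holds, so it
        # both prunes non-numeric names and fixes the visiting order.
        dirs[:] = sorted((d for d in dirs if d.isdecimal()), key=int)
        yield root, dirs, files

# Demo in a throwaway directory.
demo_top = tempfile.mkdtemp()
for name in ['10', '2', '1', 'notes']:
    os.makedirs(os.path.join(demo_top, name))

numeric_order = [os.path.relpath(root, demo_top)
                 for root, dirs, files in walk_numeric(demo_top)]
print(numeric_order)  # ['.', '1', '2', '10']
shutil.rmtree(demo_top)
```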

Of course you can also sort the list of fileNames, but that won’t influence any further steps (because files have no descendants for walk to visit).

And of course you can iterate through sorted versions of these lists as unutbu’s answer proposes, but that won’t influence the further progress of the walk itself.

os.walk does not define the order of the unmodified lists, so it may be “any” order, and you should not rely on what you observe today. In practice it will probably be whatever the underlying file system returns, which on some file systems happens to be alphabetical.
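
If you need a reproducible order regardless of the file system, one option is a thin wrapper around os.walk that sorts both lists at every step. This is only a sketch; walk_sorted and its key parameter are hypothetical names, not part of the os module.

```python
import os
import shutil
import tempfile

def walk_sorted(top, key=None):
    """Like os.walk(top), but with directories and files in sorted order."""
    for root, dirs, files in os.walk(top):
        dirs.sort(key=key)   # in-place: decides which subdirectory comes next
        files.sort(key=key)  # only affects the list yielded for this directory
        yield root, dirs, files

# Demo in a throwaway directory: three subdirectories, each with a child,
# plus two files at the top level.
tree_top = tempfile.mkdtemp()
for name in ['2', '10', '1']:
    os.makedirs(os.path.join(tree_top, name, 'sub'))
for filename in ['b.txt', 'a.txt']:
    open(os.path.join(tree_top, filename), 'w').close()

sorted_visit = [(os.path.relpath(root, tree_top), list(files))
                for root, dirs, files in walk_sorted(tree_top)]
for relative_root, files in sorted_visit:
    print(relative_root, files)
shutil.rmtree(tree_top)
```

With the default key the names are compared as strings, so 10 is visited before 2. Pass a key function if you want a different order.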

Answered By: Alfe

The simplest way is to sort the return values of os.walk(), e.g. using:

for rootName, dirNames, fileNames in sorted(os.walk(path)):
    # the (root, dirs, files) tuples arrive sorted by rootName;
    # dirNames and fileNames themselves are not sorted
    print(rootName)

Answered By: vpuente