What is the difference between pathlib glob('*') and iterdir?

Question:

Suppose I’m writing code using pathlib and I want to iterate over all the files at the top level of a directory.

I can do this in two ways:

import pathlib

# Option 1: iterdir()
p = pathlib.Path('/some/path')
for f in p.iterdir():
    print(f)

# Option 2: glob('*')
p = pathlib.Path('/some/path')
for f in p.glob('*'):
    print(f)

Is one of the options better in any way?

Asked By: kaki gadol

||

Answers:

Expansion of my comment: Why put the API to extra work parsing and testing against a filter pattern when you could just… not?

glob is better when you need the filtering feature and the filter is simple and string-based, because it simplifies the work. Sure, hand-writing simple matches (filtering iterdir via if path.name.endswith('.txt'): instead of glob('*.txt')) might be more efficient than the regex-based pattern matching glob hides, but it’s generally not worth the trouble of reinventing the wheel given that disk I/O is orders of magnitude slower.

But if you don’t need the filtering functionality at all, don’t use it. glob is gaining you nothing in terms of code simplicity or functionality, and hurting performance, so just use iterdir.
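To make the trade-off concrete, here is a minimal sketch that compares the two filtering styles on a throwaway directory. The helper names txt_via_glob and txt_via_iterdir and the sample file names are made up for illustration; they are not part of pathlib.

```python
import pathlib
import tempfile


def txt_via_glob(directory):
    """Let glob do the filtering (hypothetical helper)."""
    return sorted(p.name for p in directory.glob('*.txt'))


def txt_via_iterdir(directory):
    """Filter iterdir() results by hand (hypothetical helper)."""
    return sorted(
        p.name for p in directory.iterdir() if p.name.endswith('.txt')
    )


# Build a temporary directory with a few sample files (names are arbitrary).
with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    for name in ('a.txt', 'b.txt', 'c.log'):
        (root / name).touch()

    glob_result = txt_via_glob(root)
    iterdir_result = txt_via_iterdir(root)
    # No filter needed: plain iterdir() lists every entry.
    all_entries = sorted(p.name for p in root.iterdir())

print(glob_result)
print(iterdir_result)
print(all_entries)
```

Both helpers should select the same files here; the glob version is shorter to write, while the unfiltered listing needs no pattern at all.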

Answered By: ShadowRanger

In addition to the excellent existing answer, there’s at least one difference in behavior:

If the directory doesn’t exist, iterdir() raises a FileNotFoundError. glob('*') treats this case like an empty folder, returning an empty iterable.

>>> import pathlib
>>> path = pathlib.Path('/some/path')
>>> list(path.glob('*'))
[]
>>> list(path.iterdir())
Traceback (most recent call last):
  [...]
FileNotFoundError: [Errno 2] No such file or directory: '/some/path'
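If you prefer iterdir() but want a missing directory to behave like an empty one, one possible approach is to catch the exception yourself. This is only a sketch: list_entries is a hypothetical helper, and the directory name does_not_exist is an arbitrary placeholder.

```python
import pathlib
import tempfile


def list_entries(directory):
    """Return the entries of `directory`, or [] if it does not exist.

    Hypothetical helper: wraps iterdir() so a missing directory is
    treated like an empty one, mirroring what glob('*') does.
    """
    try:
        # The try covers both the call and the iteration, since the
        # error may surface at either point.
        return sorted(directory.iterdir())
    except FileNotFoundError:
        return []


with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    missing = root / 'does_not_exist'  # never created
    (root / 'present.txt').touch()

    missing_entries = list_entries(missing)
    glob_on_missing = list(missing.glob('*'))
    existing_names = [p.name for p in list_entries(root)]

print(missing_entries)
print(glob_on_missing)
print(existing_names)
```

Whether silently returning an empty list is desirable depends on the caller; if a missing directory indicates a bug, letting FileNotFoundError propagate is usually the safer choice.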

Answered By: flornquake