Simplest way to get the equivalent of "find ." in python?
Question:
What is the simplest way to get the full recursive list of files inside a folder with Python? I know about os.walk(), but it seems overkill for just getting the unfiltered list of all files. Is it really the only option?
Answers:
Either that, or recurse manually with isdir() / isfile() and listdir(), or use subprocess.check_output() to call find . directly. Basically, os.walk() is the highest-level option, and a semi-manual solution based on listdir() is slightly lower level. If for some reason you want exactly the output that find . would give you, you can make a system call with subprocess.
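For illustration, the semi-manual route could look roughly like the sketch below. The name find_paths is made up for this example, and the entries are sorted only to make the output predictable (find itself does not sort). It assumes you want directories listed too and symlinked directories not followed, which is what find . does by default.

```python
import os

def find_paths(folder):
    """Yield folder and every path below it, directories included."""
    yield folder  # find prints the starting point first
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if os.path.isdir(path) and not os.path.islink(path):
            # Recurse into real directories only; a symlinked directory is
            # yielded as a single entry and not followed.
            yield from find_paths(path)
        else:
            yield path
```

Calling list(find_paths('.')) gives the whole tree as a list. A very deep tree could hit Python's recursion limit, which is one more reason to prefer os.walk().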
There’s nothing preventing you from creating your own function:
import os
def listfiles(folder):
    for root, folders, files in os.walk(folder):
        for filename in folders + files:
            yield os.path.join(root, filename)
You can use it like so:
for filename in listfiles('/etc/'):
    print(filename)
import os
path = "path/to/your/dir"
# Use a separate loop variable so the starting path is not overwritten.
for (dirpath, dirs, files) in os.walk(path):
    print(files)
Is this overkill, or am I missing something?
os.walk() is not overkill by any means. It can generate your list of files and directories in a jiffy:
import os
# The result is named paths so it does not clash with the files loop variable.
paths = [os.path.join(dirpath, filename)
         for (dirpath, dirs, files) in os.walk('.')
         for filename in (dirs + files)]
You can turn this into a generator to process only one path at a time and save on memory.
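A minimal sketch of the generator form: the same expression in parentheses instead of square brackets. The names paths and first_five are just illustrative, and itertools.islice is used here only to show that you can stop early without walking the whole tree.

```python
import itertools
import os

# Same expression as the list version, but in parentheses: nothing is walked
# until the generator is iterated.
paths = (os.path.join(dirpath, name)
         for dirpath, dirs, files in os.walk('.')
         for name in dirs + files)

# Take at most the first five paths without building the full list.
first_five = list(itertools.islice(paths, 5))
```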
You could also use the find program itself from Python by using sh (a third-party package):
import sh
text_files = sh.find(".", "-iname", "*.txt")
pathlib.Path.rglob is pretty simple. It lists the entire directory tree. (The argument is a filepath search pattern; "*" means list everything.)
import pathlib
for path in pathlib.Path("directory_to_list/").rglob("*"):
    print(path)
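Note that rglob("*") yields directories as well as files. If you only want the files, as the question asks, something along these lines should work. The helper name list_files_only is hypothetical, and sorting is optional.

```python
import pathlib

def list_files_only(top):
    """Return the sorted paths of files below top, skipping directories."""
    return sorted(p for p in pathlib.Path(top).rglob("*") if p.is_file())
```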
os.walk() is awkward to use; just drop it and use pathlib instead.
Here is a Python function that mimics the list.files function in R.
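import pathlib
def list_files(path, pattern, full_names=False, recursive=True):
    # rglob searches the whole tree; glob searches only the top level.
    if recursive:
        files = pathlib.Path(path).rglob(pattern)
    else:
        files = pathlib.Path(path).glob(pattern)
    # Return full paths as strings, or just the final name of each entry.
    if full_names:
        files = [str(f) for f in files]
    else:
        files = [f.name for f in files]
    return files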