Recursively search a directory, subdirectories, and files and print any with an uppercase letter in Python
Question:
I’m trying to create a function that will recursively search a directory, its subdirectories, and its files, and print out any whose name contains an uppercase letter. I’ve searched SO and other forums for a way to do this and I’ve got most of it working. However, I can’t get it to print only the names that contain an uppercase letter.
The folder structure would look like this:
rootfolder
    |-subfolder1
        |-filelower1
        |-filelower2
        |-fileUpper1
    |-subfolder2
        |-filelower4
    |-subFolder3
        |-fileUpper2
        |-fileupPer3
code:
    import os

    def check_name():
        directory = r'Folder Path'
        for d in next(os.walk(directory))[1]:
            itempath = os.path.join(directory, d)
            path = d
            #this recursion is working as intended, this prints out if
            # 'rootfolder' has an uppercase.
            if any(letter.isupper() for letter in path):
                print("Found uppercase at:\n" + itempath)
            else:
                pass
        #this works correctly, but obviously this will print the entire path for
        #the directories and files, including the top level one, which isn't
        #how I would want this loop to work either way.
        for root, directories, files in os.walk(directory):
            for name in files:
                print(os.path.join(root, name))
            for name in directories:
                print(os.path.join(root, name))
I tried adjusting the second for loop to look like this:
    for root, directories, files in os.walk(directory):
        for name in files:
            if any(letter.isupper() for letter in files):
                print(os.path.join(root, name))
            else:
                pass
        for name in directories:
            if any(letter.isupper() for letter in directories):
                print(os.path.join(root, name))
            else:
                pass
but nothing printed. I assumed it was because I wasn’t pointing inside the desired folder, so I adjusted the beginning to
    for root, directories, files in os.walk(directory)[2]:
but I got TypeError: 'generator' object is not subscriptable.
Then I tried adjusting it to
    for root, directories, files in next(os.walk(directory))[2]:
The code ran, but nothing was printed.
I’m not sure where else to go from here to get it working as intended.
Answers:
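Try replacing files and directories with name inside the any(...) checks, so that each test looks at the characters of one name rather than at the whole list:
    for root, directories, files in os.walk(directory):
        for name in files:
            if any(letter.isupper() for letter in name):
                print(os.path.join(root, name))
        for name in directories:
            if any(letter.isupper() for letter in name):
                print(os.path.join(root, name))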
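Why the original check printed nothing can be sketched as follows: iterating over files yields whole file names, and str.isupper() is only true for a string whose cased characters are all uppercase. The sample names are taken from the folder listing in the question, and has_upper is a hypothetical helper name used only for illustration.

```python
files = ["filelower1", "filelower2", "fileUpper1"]

# Original check: each "letter" is really a whole file name, and
# "fileUpper1".isupper() is False because not every cased character is uppercase.
whole_name_check = any(letter.isupper() for letter in files)

def has_upper(name):
    """Return True if at least one character of name is uppercase."""
    return any(letter.isupper() for letter in name)

# Corrected check: test the characters of each individual name.
matches = [name for name in files if has_upper(name)]

print(whole_name_check)  # False
print(matches)           # ['fileUpper1']
```

Because the per-character test relies on str.isupper(), it should also pick up non-ASCII capitals such as Ñ.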
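I would recommend a simple solution using the pathlib.Path.rglob method:
    from pathlib import Path

    directory = Path(r'Folder Path')
    # Recursively yield every entry whose name contains an ASCII capital letter
    for i in directory.rglob("*[A-Z]*"):
        ...
The argument to rglob is a glob pattern (not a regular expression) that each file or folder name must match. The leading and trailing * let the capital letter appear anywhere in the name; a pattern such as [A-Z]*.* would only match names that start with a capital letter and contain a dot.
A .* suffix does not reliably separate files from folders, because a folder name can contain a dot and a file name can lack one. To keep only files or only folders, test i.is_file() or i.is_dir() inside the loop.
It is important to notice that [A-Z] only covers the capital letters of the English alphabet. A name whose only capitals are characters such as Ñ or Ç will not match. On Windows, pathlib globbing is case-insensitive by default, so [A-Z] may also match lowercase names there; in that case a str.isupper() check on each name is the safer option.
To get an absolute path you can use Path.absolute, but I think what you want is the relative path, so use relative_to:
    for i in directory.rglob("*[A-Z]*"):
        print(i.relative_to(directory))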
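If non-ASCII capitals or platform-dependent glob matching are a concern, one possible variation is to let rglob("*") list everything and do the uppercase test in Python. This is only a sketch: find_uppercase is a hypothetical helper name, and the demo tree is a throwaway directory that reuses a few names from the question.

```python
import tempfile
from pathlib import Path

def find_uppercase(directory):
    """Return paths below directory (relative to it) whose own name has an uppercase character."""
    root = Path(directory)
    return sorted(
        p.relative_to(root)
        for p in root.rglob("*")               # every file and folder, recursively
        if any(ch.isupper() for ch in p.name)  # test only the last path component
    )

# Demo on a temporary tree so the example does not depend on a real folder.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "subFolder3").mkdir()
    (base / "subFolder3" / "fileUpper2").touch()
    (base / "subfolder2").mkdir()
    (base / "subfolder2" / "filelower4").touch()
    found = [p.as_posix() for p in find_uppercase(base)]

print(found)  # ['subFolder3', 'subFolder3/fileUpper2']
```

Checking p.name rather than the whole path means a lowercase file inside an uppercase folder is not reported just because of its parent.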
You could use os.walk(), as you already do:
    import os

    for path, dirs, files in os.walk(mypath):
        for name in (files + dirs):
            if name != name.lower():
                print(os.path.join(path, name))
os.walk() also accepts a relative path, which is resolved against the current working directory. If you want the printed paths to be absolute, pass os.path.join(os.getcwd(), mypath) instead.
The code would then look like:
    import os

    for path, dirs, files in os.walk(os.path.join(os.getcwd(), mypath)):
        for name in (files + dirs):
            if name != name.lower():
                print(os.path.join(path, name))
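A small self-contained check of this approach, run with a relative path, might look like the sketch below. The function name walk_uppercase and the temporary directory layout are assumptions made for the demo; only the os.walk() loop and the name != name.lower() test come from the answer.

```python
import os
import tempfile

def walk_uppercase(top):
    """Collect paths under top whose own name contains an uppercase character."""
    hits = []
    for path, dirs, files in os.walk(top):
        for name in files + dirs:
            if name != name.lower():
                hits.append(os.path.join(path, name))
    return sorted(hits)

with tempfile.TemporaryDirectory() as tmp:
    # Build rootfolder/subFolder3/fileUpper2 and rootfolder/subfolder2/filelower4.
    os.makedirs(os.path.join(tmp, "rootfolder", "subFolder3"))
    os.makedirs(os.path.join(tmp, "rootfolder", "subfolder2"))
    open(os.path.join(tmp, "rootfolder", "subFolder3", "fileUpper2"), "w").close()
    open(os.path.join(tmp, "rootfolder", "subfolder2", "filelower4"), "w").close()

    old_cwd = os.getcwd()
    os.chdir(tmp)
    try:
        # "rootfolder" is relative to the current working directory.
        relative_hits = walk_uppercase("rootfolder")
    finally:
        os.chdir(old_cwd)

print(relative_hits)
```

The returned paths keep whatever prefix was passed in, so a relative argument gives relative results and an absolute argument gives absolute ones.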