Filter out elements from list if matching one of multiple patterns
Question:
I have found solutions in different languages, but not Python. Having limited experience in coding except Python and R, I can’t translate them properly.
I have a list of files like this:
file_list1 = ['/home/qrs/sample1.csv',
'/home/abc/sample1.csv',
'/home/mno/sample13.csv',
'/home/xyz/sample2.csv']
I also have a list of folders like:
change_folder = ['mno', 'xyz']
Now I want to filter the file list, keeping only the files whose folder is not in the folder list. The intended result would be:
file_list2 = ['/home/qrs/sample1.csv',
'/home/abc/sample1.csv']
I have tried:
file_list2 = [x for x in file_list1 if change_folder not in x]
This only works if I have just one pattern, not for multiple patterns. Please help.
Edit: Part of the answer is covered in another question. However, the answers here also include elements that cater specifically to filtering path strings, which will be valuable for future visitors.
That is why this question is not a duplicate, even if it seems similar to other generic matching questions.
Answers:
I would strongly recommend using the Path.parts API to check whether a folder is part of a path. You can also use any with a generator expression to test the condition against multiple patterns.
>>> from pathlib import Path
>>>
>>>
>>> file_list1 = [
... "/home/qrs/sample1.csv",
... "/home/abc/sample1.csv",
... "/home/mno/sample13.csv",
... "/home/xyz/sample2.csv",
... ]
>>>
>>> change_folder = ["mno", "xyz"]
>>>
>>> r = [
... path
... for path in file_list1
... if not any(folder in Path(path).parts for folder in change_folder)
... ]
>>> print(r)
['/home/qrs/sample1.csv', '/home/abc/sample1.csv']
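To illustrate why comparing path components is safer than a plain substring test, here is a minimal sketch. The helper name `filter_paths` and the extra path `/home/mnop/sample4.csv` are hypothetical, added only for the demonstration. It uses `PurePosixPath` rather than `Path` so the POSIX-style strings are split the same way on any operating system, and a set with `isdisjoint` in place of `any`.

```python
from pathlib import PurePosixPath


def filter_paths(paths, excluded_folders):
    """Keep paths that have no component equal to an excluded folder name."""
    excluded = set(excluded_folders)
    # .parts splits "/home/mno/a.csv" into ("/", "home", "mno", "a.csv")
    return [p for p in paths if excluded.isdisjoint(PurePosixPath(p).parts)]


paths = [
    "/home/qrs/sample1.csv",
    "/home/mno/sample13.csv",
    "/home/mnop/sample4.csv",  # hypothetical: contains "mno" only as a substring
]

by_parts = filter_paths(paths, ["mno", "xyz"])
by_substring = [p for p in paths if not any(f in p for f in ["mno", "xyz"])]

print(by_parts)      # ['/home/qrs/sample1.csv', '/home/mnop/sample4.csv']
print(by_substring)  # ['/home/qrs/sample1.csv']
```

Note that `.parts` also includes the file name, so a file literally named `mno` would be excluded as well. If only directories should count, `PurePosixPath(p).parent.parts` is one option.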
Without any libraries, using only built-in functions, I came up with this:
file_list1 = ['/home/qrs/sample1.csv', '/home/abc/sample1.csv', '/home/mno/sample13.csv', '/home/xyz/sample2.csv']
change_folder = ['mno', 'xyz']
output_list = file_list1.copy()
for path in file_list1:  # loop through paths
    for folder in change_folder:  # loop through folders
        if folder in path:  # substring check: is the folder name inside the path string?
            output_list.remove(path)  # drop the path from the output list
            break  # stop after the first match so the path is not removed twice
print(output_list)
Output:
['/home/qrs/sample1.csv', '/home/abc/sample1.csv']
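The same substring idea can also be written as a single list comprehension, which is closest to the attempt in the question. This is only a sketch: wrapping each folder name in slashes is an assumption that the paths use `/` as the separator, and it is there to avoid matching a folder name that merely appears inside a longer name.

```python
file_list1 = [
    "/home/qrs/sample1.csv",
    "/home/abc/sample1.csv",
    "/home/mno/sample13.csv",
    "/home/xyz/sample2.csv",
]
change_folder = ["mno", "xyz"]

# Keep a path only if none of the folder names appears as "/name/" inside it.
file_list2 = [
    x
    for x in file_list1
    if not any(f"/{folder}/" in x for folder in change_folder)
]

print(file_list2)  # ['/home/qrs/sample1.csv', '/home/abc/sample1.csv']
```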