Extract string in list based on character in Python
Question:
I have a list of filenames in Python that looks like this, except that the list is much longer:
filenames = ['BETON\map (120).png',
'BETON\map (125).png',
'BETON\map (134).png',
'BETON\map (137).png',
'TUILES\map (885).png',
'TUILES\map (892).png',
'TUILES\map (924).png',
'TUILES\map (936).png',
'TUILES\map (954).png',
'TUILES\map (957).png',
'TUILES\map (97).png',
'TUILES\map (974).png',
'TUILES\map (987).png']
I would like to only keep the first part of the filename strings in my list in order to only keep its type, like so:
filenames = ['BETON',
'BETON',
'BETON',
'BETON',
'TUILES',
'TUILES',
'TUILES',
'TUILES',
'TUILES',
'TUILES',
'TUILES',
'TUILES',
'TUILES']
I have been using a workaround that grabs the first 5 characters:
def Extract(files):
    return [item[:5] for item in files]
# Driver code
files2 = Extract(filenames)
However, this is becoming an issue: I have many more types (indicated in the list of filenames), with varying lengths, so I cannot just take the first few characters.
How can I extract everything up to the first backslash (\)?
Many thanks!
Answers:
Split each filename on the backslash and take only the first item from the split. The backslash has to be escaped inside the string literal:
filenames = [n.split('\\')[0] for n in filenames]
See the documentation for str.split().
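As a minimal sketch of the same idea, the snippet below uses str.partition(), which splits only once at the first separator. The helper name extract_type and the shortened sample list are hypothetical and only there for illustration. The sample strings are written as raw strings (r'...') so that each backslash is kept literally; a bare '\' is not a valid string literal, which is why the separator is written as '\\'.

```python
# Shortened sample data; raw strings keep each backslash literal.
filenames = [
    r'BETON\map (120).png',
    r'TUILES\map (885).png',
    r'TUILES\map (97).png',
]


def extract_type(path, sep='\\'):
    """Return the text before the first separator (hypothetical helper)."""
    # partition() splits at the first occurrence only; if sep is absent,
    # the whole string comes back as the first element.
    return path.partition(sep)[0]


types = [extract_type(name) for name in filenames]
print(types)  # ['BETON', 'TUILES', 'TUILES']
```

Unlike slicing with a fixed length, this does not depend on how long the type name is.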
Yes, you can split each string and keep only the part you need.
Try this:
for index in range(len(filenames)):
    # Keep only the part before the first backslash
    filenames[index] = filenames[index].split('\\')[0]
This code does not depend on the length of the name; it simply takes the text before the character you pass to split(), which is the backslash in your case.
Alternate solution producing the output desired by the OP.
Use Python's operator.methodcaller() function.
methodcaller() helps keep your code maintainable once you start defining functions and your use case expands. With methodcaller(), you can use a list comprehension, map(), or even a lambda (if required).
Please note that you'll need to import methodcaller from the operator module. It is part of the standard library, so there is nothing to install!
methodcaller
Return a callable object that calls the method name on its operand. If additional arguments and/or keyword arguments are given, they will be given to the method as well.
## import methodcaller
from operator import methodcaller
method_to_use = 'split'
arg1 = '\\'  # a single backslash, escaped
## use methodcaller with a list comprehension to split filenames and keep the first piece
## similar to @John Gordon above
# [methodcaller('split', '\\')(n)[0] for n in filenames]
[methodcaller(method_to_use, arg1)(n)[0] for n in filenames]
## alternatively, build the callable once and reuse it
# f = methodcaller('split', '\\')
f = methodcaller(method_to_use, arg1)
filenames_split_caller = [f(n)[0] for n in filenames]
filenames_split_caller
methodcaller() with map():
## split returns one list per filename; keep the first element of each
[e[0] for e in map(methodcaller(method_to_use, arg1), filenames)]
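To make the methodcaller() behaviour concrete, here is a small sketch showing that methodcaller('split', sep, 1) is equivalent to calling s.split(sep, 1) directly. The names split_once, first, and via_lambda, and the shortened sample list, are assumptions made for this illustration. Passing 1 as maxsplit is an optional refinement: it stops after the first backslash, which is enough when only the leading part is needed.

```python
from operator import itemgetter, methodcaller

# Shortened sample data; raw strings keep each backslash literal.
filenames = [r'BETON\map (120).png', r'TUILES\map (885).png']

# Callable equivalent to: lambda s: s.split('\\', 1)
split_once = methodcaller('split', '\\', 1)
# Callable equivalent to: lambda seq: seq[0]
first = itemgetter(0)

types = [first(split_once(name)) for name in filenames]

# The same result written with a plain lambda, for comparison.
via_lambda = list(map(lambda s: s.split('\\', 1)[0], filenames))

print(types)  # ['BETON', 'TUILES']
```

Whether this reads better than a plain list comprehension is a matter of taste; the result is the same.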
[Edit]
Each of the three snippets above produces the output the question asks for:
"I would like to only keep the first part of the filename strings in my list in order to only keep its type."
PS: The author’s context is not properly addressed by the ‘duplicate’.
import os
# Works on Windows, where the backslash is the path separator
list(map(os.path.dirname, filenames))
['BETON', 'BETON', 'BETON', 'BETON', 'TUILES', 'TUILES', 'TUILES', 'TUILES', 'TUILES', 'TUILES', 'TUILES', 'TUILES', 'TUILES']
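One caveat with os.path.dirname: it treats the backslash as a separator only on Windows. On Linux or macOS the same call returns an empty string for these names. A possible platform-independent sketch, assuming the names always use Windows-style separators, is to use ntpath or pathlib.PureWindowsPath explicitly. The variable names and the shortened sample list below are illustrative.

```python
import ntpath
from pathlib import PureWindowsPath

# Shortened sample data; raw strings keep each backslash literal.
filenames = [r'BETON\map (120).png', r'TUILES\map (97).png']

# ntpath is the Windows flavour of os.path and is importable on any platform.
types_ntpath = [ntpath.dirname(name) for name in filenames]

# PureWindowsPath parses Windows-style paths without touching the filesystem;
# parts[0] is the first path component.
types_pathlib = [PureWindowsPath(name).parts[0] for name in filenames]

print(types_ntpath)   # ['BETON', 'TUILES']
print(types_pathlib)  # ['BETON', 'TUILES']
```

Note that the two differ for deeper paths: dirname returns everything before the last separator, while parts[0] returns only the first component.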