Python: How to copy elements from list that start with a certain letter into new list or remove elements from list that do not start with letter
Question:
I am getting a list from an Excel file using pandas.
start_path = 'C:scratch\\'  # a raw string literal cannot end with a single backslash
File = 'test.xlsx'
import pandas as pd
mylist = []
df = pd.read_excel(start_path + File, sheet_name='GIS')
mylist = df['Column A'].tolist()
List:
mylist = ['LB-52/LP-7', 'LB-53/LI-5', 'LB-54/LP-8', 'LB-55', 'LB-56', 'ABC']
My goal is to then create a new list from this list, only with the elements that start with LB. So the new list would then be:
newlist = ['LB-52/LP-7', 'LB-53/LI-5', 'LB-54/LP-8', 'LB-55', 'LB-56']
Or just remove all elements from a list that do not start with ‘LB’ (thus removing ABC from the list).
newlist = [str(x for x in mylist if "LB" in x)]
I tried the above and this just spits out:
['<generator object <genexpr> at 0x0000024B5B8F62C8>']
I also have tried the following:
approved = ['LB']
mylist[:] = [str(x for x in mylist if any(sub in x for sub in approved))]
This gets the same generator object message as before.
I feel like this is really simple but cannot figure it out.
Answers:
You can use str.startswith in a list comprehension:
mylist = ["LB-52/LP-7", "LB-53/LI-5", "LB-54/LP-8", "LB-55", "LB-56", "ABC"]
newlist = [value for value in mylist if value.startswith("LB")]
print(newlist)
Prints:
['LB-52/LP-7', 'LB-53/LI-5', 'LB-54/LP-8', 'LB-55', 'LB-56']
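The question also tried a list of approved substrings and assignment to mylist[:]. A minimal sketch of how those two ideas could be combined with str.startswith, which accepts a tuple of prefixes. The sample list mirrors the one in the question, and the approved tuple is an assumption standing in for the asker's approved list:

```python
# Sample data mirroring the list in the question.
mylist = ['LB-52/LP-7', 'LB-53/LI-5', 'LB-54/LP-8', 'LB-55', 'LB-56', 'ABC']

# str.startswith accepts a tuple of prefixes and returns True if any of them matches.
# It does not accept a list, so convert with tuple(...) if needed.
approved = ('LB',)

# Slice assignment replaces the contents of the existing list object in place,
# which covers the "remove elements from the list" variant of the question.
mylist[:] = [value for value in mylist if value.startswith(approved)]

print(mylist)
```

Assigning to mylist[:] keeps the same list object, so any other variable that refers to it sees the filtered contents too.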
You can also remove the str() call from newlist = [str(x for x in mylist if "LB" in x)], but the "LB" in x test will still keep values such as xxxLBxxx, where LB is inside the string rather than at the start.
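To see why the original attempt printed a generator object, here is a small sketch. The extra 'xxxLBxxx' element is a made-up value added only to show the difference between a substring test and a prefix test:

```python
# Sample data from the question plus one hypothetical element, 'xxxLBxxx'.
mylist = ['LB-52/LP-7', 'LB-55', 'ABC', 'xxxLBxxx']

# The expression inside str(...) is a generator expression, so str() is called
# once on the whole generator and returns its repr instead of the elements.
broken = [str(x for x in mylist if "LB" in x)]
print(len(broken))                                # one element only
print(broken[0].startswith('<generator object'))  # it is the generator's repr

# Without str(), the brackets form a list comprehension. The "in" test still
# matches LB anywhere in the string.
contains_lb = [x for x in mylist if "LB" in x]

# startswith only matches LB at the beginning of the string.
starts_with_lb = [x for x in mylist if x.startswith("LB")]

print(contains_lb)
print(starts_with_lb)
```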
You can also use slicing to compare the first two characters of each element with the prefix:
newlist = [x for x in mylist if x[0:2] == "LB"]
In general, pandas' vectorized operations tend to be faster than an equivalent loop in plain Python, especially on large columns. It is therefore usually worth doing most of the computation or filtering in pandas before converting the result to a Python list.
mylist = df.loc[df['Column A'].str.startswith("LB"), 'Column A'].tolist()
print(mylist)
# ['LB-52/LP-7', 'LB-53/LI-5', 'LB-54/LP-8', 'LB-55', 'LB-56']
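One caveat with the pandas approach: if the column contains empty cells, the string test yields missing values for those rows, and the mask can no longer be used directly for selection. A hedged sketch, using a small hypothetical DataFrame in place of the Excel sheet and the na parameter of Series.str.startswith:

```python
import pandas as pd

# Hypothetical stand-in for the Excel sheet; None represents a blank cell.
df = pd.DataFrame({'Column A': ['LB-52/LP-7', 'LB-55', None, 'ABC']})

# na=False makes rows with missing values count as "no match",
# so the mask is purely boolean and safe to use for row selection.
mask = df['Column A'].str.startswith('LB', na=False)

filtered = df.loc[mask, 'Column A'].tolist()
print(filtered)
```

If the column is known to contain only strings, the na argument is not needed.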
mylist = ['LB-52/LP-7', 'LB-53/LI-5', 'LB-54/LP-8', 'LB-55', 'LB-56', 'ABC']
new_list = [*filter(lambda x: x.startswith('LB'), mylist)]
# ['LB-52/LP-7', 'LB-53/LI-5', 'LB-54/LP-8', 'LB-55', 'LB-56']
This code also works:
mylist = ['LB-52/LP-7', 'LB-53/LI-5', 'LB-54/LP-8', 'LB-55', 'LB-56', 'ABC']
newlist = [ch for ch in mylist if ch[0:2] == "LB"]