Removing items from a list that match a substring
Question:
How do I remove an element from a list if it matches a substring?
I have tried removing elements using pop() while looping with enumerate(), but it seems to miss some of the contiguous items that need to be removed:
sents = ['@$tthis sentences needs to be removed', 'this doesnt',
         '@$tthis sentences also needs to be removed',
         '@$tthis sentences must be removed', 'this shouldnt',
         '# this needs to be removed', 'this isnt',
         '# this must', 'this musnt']

for i, j in enumerate(sents):
    if j[0:3] == "@$t":
        sents.pop(i)
        continue
    if j[0] == "#":
        sents.pop(i)

for i in sents:
    print(i)
Output:
this doesnt
@$ this sentences must be removed
this shouldnt
this isnt
#this should
this musnt
Desired output:
this doesnt
this shouldnt
this isnt
this musnt
Answers:
How about something simple like:
>>> [x for x in sents if not x.startswith('@$t') and not x.startswith('#')]
['this doesnt', 'this shouldnt', 'this isnt', 'this musnt']
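The two startswith checks above can also be collapsed into one, since str.startswith accepts a tuple of prefixes. A minimal sketch, assuming the same sents list as in the question; the PREFIXES name is only chosen here for illustration:

```python
# Prefixes that mark a sentence for removal (the name is illustrative only).
PREFIXES = ('@$t', '#')

sents = ['@$tthis sentences needs to be removed', 'this doesnt',
         '@$tthis sentences also needs to be removed',
         '@$tthis sentences must be removed', 'this shouldnt',
         '# this needs to be removed', 'this isnt',
         '# this must', 'this musnt']

# str.startswith returns True if the string begins with any prefix in the tuple.
kept = [s for s in sents if not s.startswith(PREFIXES)]
print(kept)
# ['this doesnt', 'this shouldnt', 'this isnt', 'this musnt']
```

This builds a new list and leaves sents untouched, so there is no index shifting to worry about.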
This should work:
[i for i in sents if not ('@$t' in i or '#' in i)]
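If you only want to match strings that begin with one of those markers, use the str.startswith(stringOfInterest) method. For example, this selects the items that start with '#'; negate the test with not to drop them instead: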
[i for i in sents if i.startswith('#')]
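The difference between the substring test and the prefix test only shows up when a marker appears in the middle of a string. A small sketch with made-up sample strings (they are not from the question):

```python
# Hypothetical sample data: one string has '#' in the middle, not at the start.
samples = ['# a comment line', 'issue #42 is still open', 'plain text']

# Substring test: drops anything containing '#', wherever it appears.
no_hash_anywhere = [s for s in samples if '#' not in s]

# Prefix test: drops only the strings that begin with '#'.
no_hash_prefix = [s for s in samples if not s.startswith('#')]

print(no_hash_anywhere)  # ['plain text']
print(no_hash_prefix)    # ['issue #42 is still open', 'plain text']
```

For the list in the question both tests give the same result, because the markers only ever occur at the start of a string.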
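Another technique, using filter (wrapped in list() because filter returns an iterator on Python 3; the s[0:1] slice avoids an IndexError on empty strings):
list(filter(lambda s: not (s[0:3] == "@$t" or s[0:1] == "#"), sents))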
The problem with your original approach is that when you are on list item i and determine it should be deleted, you remove it from the list, which slides the i+1 item into position i. On the next iteration of the loop you are at index i+1, but the item there is the one that was originally at i+2, so the item originally at i+1 is never checked.
Make sense?
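To see the skipping in action, the sketch below records which items the loop actually visits, then shows one way to remove in place safely by walking the indices backwards. The function names and PREFIXES are made up for this illustration, and the backwards loop is just one possible fix, not the only one:

```python
PREFIXES = ('@$t', '#')  # illustrative name, not from the question

def buggy_remove(items):
    """Pop while enumerating; return the items the loop actually looked at."""
    visited = []
    for i, item in enumerate(items):
        visited.append(item)
        if item.startswith(PREFIXES):
            items.pop(i)  # every later item shifts one slot to the left
    return visited

def remove_in_place(items):
    """Walk indices from the end, so a pop only shifts already-checked items."""
    for i in range(len(items) - 1, -1, -1):
        if items[i].startswith(PREFIXES):
            items.pop(i)

original = ['@$tthis sentences needs to be removed', 'this doesnt',
            '@$tthis sentences also needs to be removed',
            '@$tthis sentences must be removed', 'this shouldnt',
            '# this needs to be removed', 'this isnt',
            '# this must', 'this musnt']

a = list(original)
visited = buggy_remove(a)
print(len(visited))  # 5 of the 9 items were examined
print(a)             # still contains '@$tthis sentences must be removed'

b = list(original)
remove_in_place(b)
print(b)             # ['this doesnt', 'this shouldnt', 'this isnt', 'this musnt']
```

If mutating the original list is not required, the list comprehensions in the other answers are simpler.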