Splitting a list of strings based on substring with variable character

Question:

I have the following list of strings:

my_list = ['2022-09-18 1234 name O0A raw.txt',
'2022-09-18 1234 name O0P raw.txt',
'2022-09-18 1234 name O1A raw.txt',
'2022-09-18 1234 name O1P raw.txt',
'2022-09-18 1234 name O2A raw.txt',
'2022-09-18 1234 name O2P raw.txt',
'2022-09-18 1234 name O3A raw.txt',
'2022-09-18 1234 name O3P raw.txt',
'2022-09-18 1234 name O4A raw.txt',
'2022-09-18 1234 name O4P raw.txt',
'2022-09-18 1234 name O5A raw.txt',
'2022-09-18 1234 name O5P raw.txt',
'2022-09-18 1234 name M0A raw.txt',
'2022-09-18 1234 name M0P raw.txt',
...
'2022-09-18 1234 name M5P raw.txt']

I want to filter this into a new list containing, say, all the "O?A" entries, where "?" can be any character:

my_list_split = ['2022-09-18 1234 name O0A raw.txt',
'2022-09-18 1234 name O1A raw.txt',
'2022-09-18 1234 name O2A raw.txt',
'2022-09-18 1234 name O3A raw.txt',
'2022-09-18 1234 name O4A raw.txt',
'2022-09-18 1234 name O5A raw.txt',]

Based on previous posts on string list substring splitting, it seems the fastest way to do this is by

[s for s in my_list if ' O?A raw' in s]

but this returns an empty list. I guess there is some syntax that I am missing?

Thank you.

Asked By: Woodywoodleg


Answers:

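It looks like you want a regular expression that matches ' O?A raw', where '?' stands for any single character. The `in` operator only tests for a literal substring, so it looks for an actual question mark and finds nothing. In a regular expression, '.' matches any single character: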

import re

# ... the lists ...

# re.search scans the whole string, so no leading or trailing ".+" is needed.
lst = [s for s in my_list if re.search(r" O.A raw", s)]
print(lst)
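
If the variable character is always a digit, the pattern can be tightened, and the `?` wildcard from the question also works as written with shell-style matching. The following is a minimal sketch under those assumptions, using a shortened copy of the list from the question; the names `pattern`, `by_regex` and `by_glob` are only illustrative.

```python
import re
from fnmatch import fnmatchcase

# Shortened sample of the list from the question.
my_list = [
    '2022-09-18 1234 name O0A raw.txt',
    '2022-09-18 1234 name O0P raw.txt',
    '2022-09-18 1234 name O1A raw.txt',
    '2022-09-18 1234 name O1P raw.txt',
    '2022-09-18 1234 name M0A raw.txt',
    '2022-09-18 1234 name M0P raw.txt',
]

# Regex variant: "\d" restricts the variable character to a single digit
# (assumption: the middle character is always 0-9, as in the sample data).
pattern = re.compile(r" O\dA raw")
by_regex = [s for s in my_list if pattern.search(s)]

# Shell-style variant: "?" matches exactly one character and "*" matches
# any run of characters. fnmatchcase must match the whole string, hence
# the leading and trailing "*".
by_glob = [s for s in my_list if fnmatchcase(s, "* O?A raw*")]

print(by_regex)
print(by_glob)
```

Both comprehensions should select the same "O?A" entries here. `fnmatchcase` is used rather than `fnmatch` so that the comparison stays case-sensitive on every platform.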
Answered By: Michael M.