Python script using re to put a list of words into brackets
Question:
I would like to split the following string into a list. I have tried:
import re
mystr = """
MA1-ETLP-01
MA1-ETLP-02
MA1-ETLP-03
MA1-ETLP-04
MA1-ETLP-05
"""
wordList = re.sub("[^\w]"," ",mystr).split()
print wordList
I get the output:
['MA1', 'ETLP', '01', 'MA1', 'ETLP', '02', 'MA1', 'ETLP', '03', 'MA1', 'ETLP', '04', 'MA1', 'ETLP', '05']
I want it to look more like:
['MA1-ETLP-01', 'MA1-ETLP-02', 'MA1-ETLP-03', 'MA1-ETLP-04', 'MA1-ETLP-05']
How can I achieve the second output?
Answers:
The following will do the trick:
mystr.split()
You don’t need a regular expression for that. Just call split() on the string. With no arguments, it splits on any run of whitespace, including newlines, and drops empty strings:
>>> mystr = """
...
...
... MA1-ETLP-01
... MA1-ETLP-02
... MA1-ETLP-03
... MA1-ETLP-04
... MA1-ETLP-05
...
... """
>>> mystr.split()
['MA1-ETLP-01', 'MA1-ETLP-02', 'MA1-ETLP-03', 'MA1-ETLP-04', 'MA1-ETLP-05']
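To see why the original attempt broke the identifiers apart, it may help to compare the two approaches side by side. This is a minimal sketch that assumes the pattern in the question was meant to be `[^\w]`; the shortened sample string is only for illustration.

```python
import re

mystr = """
MA1-ETLP-01
MA1-ETLP-02
"""

# [^\w] matches every character that is not a letter, digit, or underscore,
# so the hyphens are replaced with spaces along with the newlines.
broken = re.sub(r"[^\w]", " ", mystr).split()

# split() with no arguments splits only on runs of whitespace,
# so the hyphens inside each identifier are left alone.
fixed = mystr.split()

print(broken)  # ['MA1', 'ETLP', '01', 'MA1', 'ETLP', '02']
print(fixed)   # ['MA1-ETLP-01', 'MA1-ETLP-02']
```

The hyphen is not a word character, which is why the regular expression treats it as a separator.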
If the lines can contain spaces, use splitlines() instead of split() and filter out the empty lines:
mystr = """
MA1-ETLP-01
MA1-ETLP-02
MA1-ETLP-03
MA1-ETLP-04
MA1-ETLP-05
"""
print([line for line in mystr.splitlines() if line])
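The difference only shows up when a line contains a space. The sketch below uses made-up sample lines (the trailing words `primary` and `backup` are hypothetical, not from the question) to contrast the two methods.

```python
# Hypothetical input where each line has a space in it.
text = """
MA1-ETLP-01 primary

MA1-ETLP-02 backup
"""

# split() breaks on every run of whitespace, so each line is cut in two.
by_whitespace = text.split()

# splitlines() breaks only at line boundaries; the filter drops empty lines.
by_line = [line for line in text.splitlines() if line]

print(by_whitespace)  # ['MA1-ETLP-01', 'primary', 'MA1-ETLP-02', 'backup']
print(by_line)        # ['MA1-ETLP-01 primary', 'MA1-ETLP-02 backup']
```

If some lines might contain only spaces or tabs, filtering with `if line.strip()` instead of `if line` would drop those as well.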
Based on the script name, OpenFileAndFormat, it seems you are reading from a file. If so, you do not need to split anything: read the file line by line into a list, stripping surrounding whitespace (including the newline) and filtering out empty lines:
with open("your_file") as f:
    lines = [line for line in map(str.strip, f) if line]
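To try the file-based approach without an existing input file, something like the following could be used. It is a sketch only: the temporary file and its contents are stand-ins for the real file, which the question does not show.

```python
import os
import tempfile

# Write sample data to a temporary file (a stand-in for the real input file).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("\nMA1-ETLP-01\nMA1-ETLP-02\n\nMA1-ETLP-03\n")
    path = tmp.name

try:
    # Iterating over a file object yields one line at a time;
    # str.strip removes the trailing newline and any surrounding whitespace.
    with open(path) as f:
        lines = [line for line in map(str.strip, f) if line]
finally:
    os.remove(path)  # clean up the temporary file

print(lines)  # ['MA1-ETLP-01', 'MA1-ETLP-02', 'MA1-ETLP-03']
```

Because the file is read lazily, line by line, this avoids loading the whole file into one string before splitting it.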