Selectively Split Python String by Position
Question:
I have a string that I’ve split almost to the level of what I need but not completely. My string looks like this to start:
str1 =
Out[135]: 'C:\\Users\\U321103\\OneDrive - IBERDROLA S.A\\VARIABILIDAD CLIMATICA\\VORTEX\\WIND8\\349039.SPAIN.ESTE.CARRASCOSA.Power.csv'
I have used a split technique to get it to here:
str2 = str1.split('WIND8\\')[1].split('.csv')[0]
Out[132]: '349039.SPAIN.ESTE.CARRASCOSA.Power'
However, I really need this final answer:
str3 = 'SPAIN.ESTE.Power'
And, I’m not sure how to remove the string content before "SPAIN" and between "ESTE" and ".Power". The word "ESTE" will change – meaning that "ESTE" is a region of a country and will change each time the script is run. In the str1 variable, these subset strings will change each time the script is run: "349039", "SPAIN", "ESTE", "CARRASCOSA" so I think that the code needs to select by position between the periods "." in str2. Thank you for your help!
Answers:
_str = 'C:\\Users\\U321103\\OneDrive - IBERDROLA S.A\\VARIABILIDAD CLIMATICA\\VORTEX\\WIND8\\349039.SPAIN.ESTE.CARRASCOSA.Power.csv'
# split by '\\', then by '.', then slice off the leading ID and the extension.
_str = _str.split('\\')[-1].split('.')[1:-1]
Output:
['SPAIN', 'ESTE', 'CARRASCOSA', 'Power']
If you want to join:
_str = '.'.join(_str)
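Note that joining the sliced list as-is gives 'SPAIN.ESTE.CARRASCOSA.Power', which still contains the fourth field. Also, hand-written backslashes in the path are easy to get wrong: a single backslash followed by digits, as in '\349', is read by Python as an octal escape. As a rough sketch of one way around both issues, the snippet below uses pathlib.PureWindowsPath from the standard library to pull out the file name and then picks the wanted fields by position. It assumes the path is always Windows-style and that the file name always has the same number of dot-separated fields; the variable names are only illustrative.

```python
from pathlib import PureWindowsPath

# Raw string, so every backslash is taken literally.
str1 = r"C:\Users\U321103\OneDrive - IBERDROLA S.A\VARIABILIDAD CLIMATICA\VORTEX\WIND8\349039.SPAIN.ESTE.CARRASCOSA.Power.csv"

# .stem is the final path component without its last suffix (".csv").
stem = PureWindowsPath(str1).stem

# Assumption: fields 1 and 2 are country and region, the last field is the type.
parts = stem.split(".")
str3 = ".".join([parts[1], parts[2], parts[-1]])
print(str3)
```

PureWindowsPath parses Windows paths regardless of the operating system the script runs on, so this should behave the same on Linux or macOS.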
As McLovin said in a comment, you should be able to split by "." and then rejoin by index, assuming that the structure remains the same.
str2 = '349039.SPAIN.ESTE.CARRASCOSA.Power'
substrs = str2.split('.')
str3 = '.'.join([substrs[i] for i in [1,2,-1]])
str3
>> 'SPAIN.ESTE.Power'
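Since this relies on the structure staying the same, it may be worth checking that assumption before indexing. Here is a minimal sketch of the same split-and-rejoin idea wrapped in a function; pick_fields and its parameters are hypothetical names, and the expected count of five fields is taken from the example string, not from any known file naming rule.

```python
def pick_fields(name, indices=(1, 2, -1), expected=5, sep="."):
    """Split `name` on `sep` and rejoin the fields at the given positions."""
    parts = name.split(sep)
    # Fail loudly instead of silently picking the wrong fields.
    if len(parts) != expected:
        raise ValueError(f"expected {expected} fields, got {len(parts)}: {name!r}")
    return sep.join(parts[i] for i in indices)


print(pick_fields("349039.SPAIN.ESTE.CARRASCOSA.Power"))
```

If a file name ever arrives with an extra or missing field, this raises a ValueError rather than returning a plausible-looking but wrong string.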
For more complicated / flexible splitting and parsing, consider using regular expressions with the re module. There is a bit of a learning curve, but they’re very useful and there are lots of tutorials out there.
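For this particular string, a regular expression version might look like the sketch below. It assumes exactly five dot-separated fields with no dots inside any field, and the field labels in the comment are guesses based on the question. The pattern captures the second, third and last fields and skips the others.

```python
import re

# Assumed layout: id.country.region.site.type, with no dots inside a field.
PATTERN = re.compile(r"[^.]+\.([^.]+)\.([^.]+)\.[^.]+\.([^.]+)")

str2 = "349039.SPAIN.ESTE.CARRASCOSA.Power"

# fullmatch only succeeds if the whole string fits the pattern.
match = PATTERN.fullmatch(str2)
if match is None:
    raise ValueError(f"unexpected format: {str2!r}")

str3 = ".".join(match.groups())
print(str3)
```

Compared with splitting by index, the pattern also acts as a format check: a string with a different number of fields does not match at all.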