Regex to drop character before a character set
Question:
I need to convert a pandas DataFrame.
I have this DataFrame:
import pandas as pd
df = pd.DataFrame({'data': ['10SGD01|AA169|10SGD01|AA170']})
I need to get:
data
10SGD01AA169|10SGD01AA170
I use str.replace:
df['data'] = df['data'].str.replace('|(?=AA)', '', regex=True)
This regex does not work. Where is the mistake?
pandas version: 2.0.3
Answers:
You need to escape |
because it is a special regex character (alternation). Unescaped, the pattern matches the empty string, so nothing is removed:
df['data'] = df['data'].str.replace(r'\|(?=AA)', '', regex=True)
print(df)
data
0 10SGD01AA169|10SGD01AA170
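To see why the escape matters, here is a minimal sketch that uses Python's re module directly, on the assumption that str.replace with regex=True applies the same pattern semantics. The variable names are illustrative only and are not from the question.

```python
import re

s = '10SGD01|AA169|10SGD01|AA170'

# Unescaped: '|' is the alternation operator, so the pattern reads
# "empty string OR lookahead for AA". Both branches match zero characters,
# so replacing the matches with '' leaves the string unchanged.
unescaped = re.sub('|(?=AA)', '', s)

# Escaped: '\|' is a literal pipe, removed only when 'AA' follows it.
escaped = re.sub(r'\|(?=AA)', '', s)

# A character class is an equivalent way to write a literal pipe.
char_class = re.sub(r'[|](?=AA)', '', s)

# re.escape can build the literal part when the delimiter is held in a variable.
built = re.sub(re.escape('|') + r'(?=AA)', '', s)

print(unescaped)
print(escaped)
```

The same escaped pattern string should work when passed to df['data'].str.replace(..., regex=True).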