Regex to remove character before a character set

Question:

I have a dataframe with a column ‘code’.

import pandas as pd
df = pd.DataFrame({'code': ['10SGD35/AA501/10SGD35/AA599/10SGD36/AA501/10SGD36AA599/10SGD37/AA501/10SGD37/AA527',
                            '10SGD08/AA701/10SGD08/AA704/10SGD09/AA701/10SGD09AA708']})

How can I drop the character ‘/’ wherever it comes directly before the characters ‘AA’ in pandas? Expected output:

                                                                            code
0  10SGD35AA501/10SGD35AA599/10SGD36AA501/10SGD36AA599/10SGD37AA501/10SGD37AA527
1                            10SGD08AA701/10SGD08AA704/10SGD09AA701/10SGD09AA708

I use str.replace:

df['code'] = df['code'].str.replace('|(?=AA)', '', regex=True)

This regex does not work. Where is the mistake?

Asked By: irina_ikonn

Answers:

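In a regex, an unescaped | is the alternation operator, not a literal character, so your pattern only produces empty matches and never touches the "/". Match the slash itself and use a lookahead with str.replace: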

df['code'] = df['code'].str.replace('/(?=AA)', '', regex=True)

Output:

                                                                            code
0  10SGD35AA501/10SGD35AA599/10SGD36AA501/10SGD36AA599/10SGD37AA501/10SGD37AA527
1                            10SGD08AA701/10SGD08AA704/10SGD09AA701/10SGD09AA708
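
For anyone who wants to check this end to end, here is a minimal self-contained sketch built on the sample data from the question. The names cleaned and cleaned_literal are only illustrative. The second call is an alternative that is not part of the answer above: since the text after the slash is fixed, a plain literal replacement of "/AA" with "AA" should give the same result without any regex.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({'code': [
    '10SGD35/AA501/10SGD35/AA599/10SGD36/AA501/10SGD36AA599/10SGD37/AA501/10SGD37/AA527',
    '10SGD08/AA701/10SGD08/AA704/10SGD09/AA701/10SGD09AA708',
]})

# Remove every "/" that is immediately followed by "AA".
# The lookahead checks for "AA" without consuming it.
cleaned = df['code'].str.replace(r'/(?=AA)', '', regex=True)

# Alternative without a lookahead: replace the literal "/AA" with "AA".
cleaned_literal = df['code'].str.replace('/AA', 'AA', regex=False)

print(cleaned.tolist())
```

The slashes between codes (for example between AA501 and 10SGD35) stay in place because they are not followed by "AA".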
Regex:

/       # match "/"
(?=AA)  # only if followed by "AA"
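
To see why the pattern from the question changes nothing, the following sketch uses Python's re module on a shortened sample string (the string and variable names are made up for illustration). An unescaped | means "or", so '|(?=AA)' reads as "the empty string, or a position before AA". Every match is zero-width, and replacing it with '' leaves the text as it was.

```python
import re

s = '10SGD35/AA501/10SGD36AA599'

# Unescaped "|" is alternation: all matches are empty,
# so substituting '' for them returns the input unchanged.
unchanged = re.sub(r'|(?=AA)', '', s)

# "/" is a literal slash; the lookahead keeps "AA" out of the match.
fixed = re.sub(r'/(?=AA)', '', s)

# If the separator really were a pipe, it would have to be escaped.
pipe_fixed = re.sub(r'\|(?=AA)', '', '10SGD35|AA501')

print(unchanged)
print(fixed)
print(pipe_fixed)
```

pandas str.replace with regex=True uses the same regex syntax, so the same reasoning should carry over to the dataframe column.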

Answered By: mozway