Regular expression to collapse a character repeated more than once
Question:
I have a DataFrame with a column ‘code’:
import pandas as pd
df = pd.DataFrame({'code': ['10SGD01AA103||||||10SGD01AA105||||||10SGD01AA111']})
How can I collapse each run of the repeated character ‘|’ so that only one remains? Expected result:
10SGD01AA103|10SGD01AA105|10SGD01AA111
I tried str.replace:
df['code'] = df['code'].str.replace('|(?=|1+)', '', regex=True)
or
df['code'] = df['code'].str.replace('|(?=|)', '', regex=True)
But the repeated characters are not removed.
Answers:
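Escape the pipe, because an unescaped | means alternation in a regular expression, and pass regex=True explicitly:
df["code"] = df["code"].str.replace(r'\|+', '|', regex=True)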
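To show why the attempts in the question leave the string unchanged, here is a minimal sketch that reuses the question's data. The variable names `collapsed`, `collapsed_alt` and `unescaped` are only illustrative, and the short input 'a||b' is made up for the demonstration. An unescaped | splits the pattern into alternatives, so '|(?=|)' can only match the empty string and replacing that with '' changes nothing.

```python
import re
import pandas as pd

df = pd.DataFrame({'code': ['10SGD01AA103||||||10SGD01AA105||||||10SGD01AA111']})

# Escaped pipe: match a run of one or more literal '|' and replace it with one '|'.
collapsed = df['code'].str.replace(r'\|+', '|', regex=True)

# Same idea without a backslash: inside a character class '|' is a literal.
collapsed_alt = df['code'].str.replace(r'[|]+', '|', regex=True)

# The pattern from the question: every alternative is empty, so only empty
# matches are found and the input comes back unchanged.
unescaped = re.sub('|(?=|)', '', 'a||b')

print(collapsed.iloc[0])
print(collapsed_alt.iloc[0])
print(unescaped)
```

Either the backslash or the character class should work; the character class can be easier to read when the pattern is not written as a raw string.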