Python: replace a pattern in a df column with another pattern

Question:

I have a dataframe as per the below:

import pandas as pd
df = pd.DataFrame(
    columns=['deal', 'details'],
    data=[
        ['deal1', 'MH92h'],
        ['deal2', 'L97h'],
        ['deal3', '97.538'],
        ['deal4', 'LM98h'],
        ['deal5', 'TRD (97.612 cvr)'],
    ]
)

I would like to replace any value in details that matches MH[0-9]h with [0-9].75, keeping the original digits.
For example, the output would look as follows:

df =
['deal1', 'MH92h', '92.75']
['deal2', 'L97h', 'L97h'],
['deal3', '97.538', '97.538'],
['deal4', 'MH98h', '98.75'],
['deal5', 'TRD 97.61', 'TRD 97.61']

I’ve tried the below, but it doesn’t work:

df = df.assign(test_col=df.details.str.replace("\d+",r'\d+'+'75'), regex=True)
Asked By: Mike

Answers:

You could match MH([0-9]+)h and replace it with capture group 1 followed by .75.

Note that in your data deal4 has LM98h and not MH98h, so that row does not match and is left unchanged.
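
To see what the capture group does in isolation, here is a minimal sketch that uses Python's standard re module instead of pandas. The sample strings are the ones from the question; the names pattern, samples and results are only illustrative.

```python
import re

# "MH", then one or more digits captured as group 1, then "h"
pattern = re.compile(r"MH([0-9]+)h")

samples = ["MH92h", "L97h", "97.538", "LM98h", "TRD (97.612 cvr)"]

# \g<1> inserts whatever group 1 captured;
# strings with no match are returned unchanged
results = [pattern.sub(r"\g<1>.75", s) for s in samples]
print(results)
```

The \g<1> form is used rather than \1 because it stays unambiguous even if the text that follows the group reference begins with a digit.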

import pandas as pd

df = pd.DataFrame(
    columns=['deal','details'],
    data=[
        ['deal1', 'MH92h'],
        ['deal2', 'L97h'],
        ['deal3', '97.538'],
        ['deal4', 'LM98h'],
        ['deal5', 'TRD (97.612 cvr)'],
    ]
)

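# \g<1> refers to capture group 1 (the digits).
# regex=True is required in pandas 2.0+, where str.replace defaults to literal matching.
df = df.assign(test_col=df.details.str.replace(r"MH([0-9]+)h", r"\g<1>.75", regex=True))
print(df)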

Output

    deal           details          test_col
0  deal1             MH92h             92.75
1  deal2              L97h              L97h
2  deal3            97.538            97.538
3  deal4             LM98h             LM98h
4  deal5  TRD (97.612 cvr)  TRD (97.612 cvr)
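
One thing to be aware of: the pattern MH([0-9]+)h is not anchored, so it would also rewrite a match that sits inside a longer string. If only cells that consist entirely of MH, digits and h should change, one possible variant anchors the pattern with ^ and $. This is a sketch that assumes that is the behavior you want; the extra sample value xMH92hy and the names loose and strict are made up for illustration.

```python
import pandas as pd

s = pd.Series(["MH92h", "xMH92hy", "LM98h"])

# Unanchored: replaces the match wherever it occurs in the string
loose = s.str.replace(r"MH([0-9]+)h", r"\g<1>.75", regex=True)

# Anchored: only rewrites cells that are exactly MH<digits>h
strict = s.str.replace(r"^MH([0-9]+)h$", r"\g<1>.75", regex=True)

print(loose.tolist())   # the embedded match in "xMH92hy" is rewritten
print(strict.tolist())  # "xMH92hy" is left as it is
```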
Answered By: The fourth bird