How do I update column based on value in pandas df?

Question

I have a column of values where some rows have a range of values. I want to keep the second value of the range.

[10%,15%,30%,16.1% - 22%,16.1% - 22%,16.189% - 20.67%,16.189% - 20.67%]

The code I use to loop through the column and act on the condition is:

for i in df.column:
    if ' - ' in str(i):
        split = i.replace(' ', '').split('-')[1]
    
        print(split)

When I print split, I get the value that I want. However, I can’t figure out how to update the value in the column to keep the value I want.

When I do this:

for i in df.column:
    if ' - ' in str(i):
        split = i.replace(' ', '').split('-')[1]
        df['column'][i] =split

I get:
C:UserslukemAppDataLocalTempipykernel_15752833050674.py:4: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df['column'][i] =split
C:UserslukemAppDataLocalTempipykernel_15752833050674.py:4: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

Any advice is appreciated.

Asked By: Dookie9636

Answer 1

You can use str.extract for this:

import pandas as pd

data = {'column': ['10%','15%','30%','16.1% - 22%','16.1% - 22%',
                   '16.189% - 20.67%','16.189% - 20.67%']}
df = pd.DataFrame(data)

df['column'] = df.column.str.extract(r'(\d+\.?\d+%$)')

print(df)

   column
0     10%
1     15%
2     30%
3     22%
4     22%
5  20.67%
6  20.67%

Or you could use str.split with expand=True. Rows without a range get a missing value in the second column, so apply ffill on axis=1 to carry the single value across, then select only the last column:

df['column'] = df.column.str.split(' - ', expand=True).ffill(axis=1).iloc[:,-1]
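
To see what the ffill step does, here is a small sketch that breaks the one-liner into its intermediate frames. It uses a shortened version of the sample data, and the names parts, filled and result are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'column': ['10%', '16.1% - 22%', '16.189% - 20.67%']})

# Splitting on ' - ' yields two columns; single values leave column 1 empty.
parts = df['column'].str.split(' - ', expand=True)

# Forward-fill along each row so an empty cell inherits the value to its left.
filled = parts.ffill(axis=1)

# The last column now always holds the value to keep.
result = filled.iloc[:, -1]
print(result.tolist())
```

Printing parts before the fill shows the missing value in the first row, which is why selecting the last column alone would not be enough.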

Answered By: ouroboros1

Answer 2

Instead of the loop in your first snippet, you can use the apply method on the column, passing it a function that returns the value to keep:

df['column'] = df['column'].apply(function_name)
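
The answer does not show the function itself. A minimal sketch of what it could look like, assuming the column name and sample values from the question; keep_upper_bound is a hypothetical name chosen for illustration:

```python
import pandas as pd

def keep_upper_bound(value):
    # Hypothetical helper: return the part after ' - ' when the value
    # is a range, otherwise return the value unchanged.
    text = str(value)
    if ' - ' in text:
        return text.split(' - ')[1]
    return value

df = pd.DataFrame({'column': ['10%', '15%', '16.1% - 22%', '16.189% - 20.67%']})

# apply calls the function once per element and returns a new Series,
# which is assigned back to the column in a single step.
df['column'] = df['column'].apply(keep_upper_bound)
print(df['column'].tolist())
```

Because the whole column is reassigned at once, this avoids the element-by-element chained assignment that triggered the warning in the question.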

Answered By: anu agarwal

Answer 3

You can use apply with a lambda to process the whole column at once:

import pandas as pd

df = pd.DataFrame(['10%', '15%', '30%', '16.1% - 22%', '16.1% - 22%',
                   '16.189% - 20.67%', '16.189% - 20.67%'], columns=['column'])

# Split on ' - ' (including the spaces) so the result has no leading whitespace
df.column = df.column.apply(lambda x: x.split(' - ')[1] if ' - ' in x else x)
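
The same condition can also be written as a boolean mask with .loc, which is a different technique from the lambda above but stays closer to the question's intent of updating only the rows that contain a range. This is a sketch under the question's sample data; is_range is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({'column': ['10%', '15%', '16.1% - 22%', '16.189% - 20.67%']})

# Boolean mask: True for rows whose text contains the ' - ' separator.
is_range = df['column'].str.contains(' - ', regex=False)

# One .loc call selects rows and column together, so pandas writes into
# df itself instead of into a temporary copy (no chained indexing).
df.loc[is_range, 'column'] = df.loc[is_range, 'column'].str.split(' - ').str[1]
print(df['column'].tolist())
```

Note that the loop in the question indexed with the cell value i rather than a row label, which is a separate problem from the chained assignment; the mask sidesteps both.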

Answered By: Ran A
