(Pandas) How to replace certain values of a column from a different dataset but leave other values in the dataset untouched?

Question:

Let’s say I have the dataset:

df1 = pd.DataFrame()
df1['number'] = [0,0,0,0,0]
df1["decade"] = ["1970", "1980", "1990", "2000", "2010"]`

print(df1)

#output:
   number decade
0       0   1970
1       0   1980
2       0   1990
3       0   2000
4       0   2010

and I want to merge it with another dataset:

df2 = pd.DataFrame()
df2['number'] = [1,1]
df2["decade"] = ["1990", "2010"]

print(df2)

#output:
   number decade
0       1   1990
1       1   2010

such that it get’s values only from the decades from df2 that have values in them and leaves the others untouched, yielding:

   number decade
0       0   1970
1       0   1980
2       1   1990
3       0   2000
4       1   2010

how must one go about doing that in pandas? I’ve tried stuff like join, merge, and concat but they all seem to either not give the desired result or not work because of the different dimensions of the 2 datasets. Any suggestions regarding which function I should be looking at?

Thank you so much!

Asked By: BoyAcc

||

Answers:

What about using apply?

First you create a function

def validation(previous,latest):
    if pd.isna(latest):
        return previous
    else:
        return latest

Then you can use the function dataframe.apply to compare the data in df1 to df2

df1['number'] = df1.apply(lambda row: validation(row['number'],df2.loc[df2['decade'] == row.decade].number.max()),axis = 1)

Your result:

   number decade
0       0   1970
1       0   1980
2       1   1990
3       0   2000
4       1   2010
Answered By: Joshua

You can use pandas.DataFrame.merge with pandas.Series.fillna :

out = (
        df1[["decade"]]
            .merge(df2, on="decade", how="left")
            .fillna({"number": df1["number"]}, downcast="infer")
      )

# Output :

print(out)

  decade  number
0   1970       0
1   1980       0
2   1990       1
3   2000       0
4   2010       1
Answered By: abokey
Categories: questions Tags: , , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.