Pandas replace specific cells with corresponding values from another series
Question:
Let’s say I have the following pd.DataFrame:
INDEX | a | b | c |
---|---|---|---|
A | 5 | 7 | 2 |
B | 3 | 2 | 1 |
C | 9 | 6 | 3 |
And also the following pd.Series:
a | b | c |
---|---|---|
-1 | -4 | -5 |
I would like to replace the values in the DataFrame that are greater than or equal to 6 with the corresponding values from the Series, matched by column name.
For example, I would like to replace cell Ab (7 >= 6) with -4 (since cell Ab is in column b, and the Series has -4 at that index).
In the above example, the DataFrame will look like:
~ | a | b | c |
---|---|---|---|
A | 5 | -4 | 2 |
B | 3 | 2 | 1 |
C | -1 | -4 | 3 |
I know how to identify the required cells using df[df>=6], but when I try to assign the series (df[df>=6]=series) I get an error.
Thanks 🙂
Answers:
Let's use mask along axis=1:
df.mask(df >= 6, series, axis=1)
a b c
INDEX
A 5 -4 2
B 3 2 1
C -1 -4 3
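For anyone who wants to run this end to end, here is a minimal self-contained sketch. The construction of df and series is a reconstruction from the tables in the question, and the index name INDEX is assumed from the printed output. Passing axis=1 tells mask to align the Series index with the column labels.

```python
import pandas as pd

# Data reconstructed from the tables in the question (an assumption about how they were built).
df = pd.DataFrame(
    {"a": [5, 3, 9], "b": [7, 2, 6], "c": [2, 1, 3]},
    index=pd.Index(["A", "B", "C"], name="INDEX"),
)
series = pd.Series([-1, -4, -5], index=["a", "b", "c"])

# Where the condition is True, take the replacement from `series`;
# axis=1 aligns the Series index with the column labels.
result = df.mask(df >= 6, series, axis=1)
print(result)
```

mask returns a new DataFrame, so assign the result back to df if you want to keep the change.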
You can combine mask and fillna (s is the Series from the question):
out = df.mask(df.ge(6)).fillna(s, downcast='infer')
output:
a b c
INDEX
A 5 -4 2
B 3 2 1
C -1 -4 3
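Looking at the two steps separately makes it clearer why this works: mask without a replacement leaves NaN in the selected cells, and fillna with a Series fills each column from the entry with the matching label. The sketch below rebuilds the example data (a reconstruction from the question) and casts back to integers with astype rather than downcast, since that keyword appears to be deprecated in recent pandas releases. Treat that as an assumption to check against your version.

```python
import pandas as pd

# Data reconstructed from the tables in the question (assumed construction).
df = pd.DataFrame(
    {"a": [5, 3, 9], "b": [7, 2, 6], "c": [2, 1, 3]},
    index=pd.Index(["A", "B", "C"], name="INDEX"),
)
s = pd.Series([-1, -4, -5], index=["a", "b", "c"])

# Step 1: cells >= 6 become NaN.
masked = df.mask(df.ge(6))

# Step 2: each column's NaN cells are filled with the value of `s` under that column's label.
filled = masked.fillna(s)

# Step 3: the NaN round trip turned some columns into floats; cast back explicitly.
out = filled.astype("int64")
print(out)
```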
With boolean indexing and fillna:
s = pd.Series([-1,-4,-5],['a','b','c'])
df[df.lt(6)].fillna(s)
a b c
INDEX
A 5.0 -4.0 2
B 3.0 2.0 1
C -1.0 -4.0 3
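The floats in columns a and b appear because the replaced cells pass through NaN, which a plain integer column cannot hold. Here is a small sketch, again on data reconstructed from the question, that checks the dtype and shows the same selection written with where, followed by an explicit cast back to integers. As far as I know, df[df.lt(6)] and df.where(df.lt(6)) select the same cells.

```python
import pandas as pd

# Data reconstructed from the tables in the question (assumed construction).
df = pd.DataFrame(
    {"a": [5, 3, 9], "b": [7, 2, 6], "c": [2, 1, 3]},
    index=pd.Index(["A", "B", "C"], name="INDEX"),
)
s = pd.Series([-1, -4, -5], index=["a", "b", "c"])

# Keep cells below 6; everything else becomes NaN and is then filled per column from `s`.
kept = df[df.lt(6)]
out = kept.fillna(s)
print(out.dtypes)  # columns that held a NaN come back as float

# Same selection via where, with an explicit cast back to integers.
same = df.where(df.lt(6)).fillna(s).astype("int64")
print(same)
```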