Assigning values with both boolean masking and indexing

Question:

Consider the following toy example:

import pandas as pd

df = pd.DataFrame([0,1,2,3,4,5,6], columns=['Value'])
df_subset = df.loc[[3,4,5]]

df.loc[df.Value % 2 == 0, 'Value'] = df_subset.Value * 10

df before assignment:

   Value
0      0
1      1
2      2
3      3
4      4
5      5
6      6

df after assignment:

   Value
0    NaN
1      1
2    NaN
3      3
4     40
5      5
6    NaN

This happens for the following reasons:

  • Only items for which the mask / boolean index is true are modified, i.e. only the even elements. This is why idx=1 is not set to NaN.
  • The right-hand side is aligned on the index, so any selected index that does not appear in the index of the right-hand side is set to NaN.
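
The alignment described above can be made visible by reindexing the right-hand side by hand. This is only an illustrative sketch of the effect, not a description of pandas internals; the names `mask`, `rhs` and `aligned` are introduced here for the example.

```python
import pandas as pd

df = pd.DataFrame([0, 1, 2, 3, 4, 5, 6], columns=["Value"])
df_subset = df.loc[[3, 4, 5]]

mask = df["Value"] % 2 == 0      # rows 0, 2, 4, 6 are selected
rhs = df_subset["Value"] * 10    # only has index labels 3, 4, 5

# Align the right-hand side to the selected rows: labels that are
# missing from rhs (0, 2, 6) have no value and become NaN.
aligned = rhs.reindex(df.index[mask.to_numpy()])
print(aligned)
```

Only label 4 is both selected by the mask and present in `rhs`, so it is the only row that receives a real value.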

What I want to achieve, however, is the same behaviour without setting missing index entries to NaN, i.e.:

  • Modify only elements for which the mask is true
  • For those elements: replace the value in df with the one in df_subset if that index is part of df_subset.index

desired output:

   Value
0      0
1      1
2      2
3      3
4     40
5      5
6      6
Asked By: Sebastian Hoffmann


Answers:

The first idea is to chain both masks with & (bitwise AND); Index.isin is used to test the index:

import pandas as pd

df = pd.DataFrame([0,1,2,3,4,5,6], columns=['Value'])
df_subset = df.loc[[3,4,5]]

mask = (df.Value % 2 == 0) & (df.index.isin([3,4,5]))
df.loc[mask, 'Value'] = df_subset.Value * 10

print (df)
   Value
0      0
1      1
2      2
3      3
4     40
5      5
6      6

Or:

df = pd.DataFrame([0,1,2,3,4,5,6], columns=['Value'])
mask = (df.Value % 2 == 0) & (df.index.isin([3,4,5]))
df.loc[mask, 'Value'] *= 10

print (df)
   Value
0      0
1      1
2      2
3      3
4     40
5      5
6      6
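
The same idea can be sketched without hard-coding the labels [3,4,5], by testing against df_subset.index instead. This is a small variation on the code above, assuming df_subset is the only source of replacement values.

```python
import pandas as pd

df = pd.DataFrame([0, 1, 2, 3, 4, 5, 6], columns=["Value"])
df_subset = df.loc[[3, 4, 5]]

# Keep only rows that are even AND have a replacement value in df_subset.
mask = (df["Value"] % 2 == 0) & df.index.isin(df_subset.index)

# Every selected label now exists in the right-hand side,
# so index alignment cannot introduce NaN.
df.loc[mask, "Value"] = df_subset["Value"] * 10
print(df)
```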

Another idea is to filter the subset by the original mask and use DataFrame.update (note that in the output below the column is converted to float):

df = pd.DataFrame([0,1,2,3,4,5,6], columns=['Value'])
df_subset = df.loc[[3,4,5]]

df.update(df_subset.loc[df.Value % 2 == 0, 'Value'] * 10)

# alternative: build the mask from df_subset itself
#df.update(df_subset.loc[df_subset.Value % 2 == 0, 'Value'] * 10)

print (df)
   Value
0    0.0
1    1.0
2    2.0
3    3.0
4   40.0
5    5.0
6    6.0
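
If the float conversion shown above is unwanted, one possible alternative is to assign only at the labels that are both selected by the mask and present in the subset. This is a sketch, not part of the original answer; `new_vals` and `idx` are names made up for the example.

```python
import pandas as pd

df = pd.DataFrame([0, 1, 2, 3, 4, 5, 6], columns=["Value"])
df_subset = df.loc[[3, 4, 5]]

new_vals = df_subset["Value"] * 10
mask = df["Value"] % 2 == 0

# Labels selected by the mask that also exist in the replacement values.
idx = df.index[mask.to_numpy()].intersection(new_vals.index)

# Assign only at those labels, so no NaN (and no float upcast) is needed.
df.loc[idx, "Value"] = new_vals.loc[idx]
print(df)
```

Because no missing values are ever introduced, the column should keep its integer dtype here.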
Answered By: jezrael