Pandas select rows with smaller index

Question:

I think this is a simple question, but I can’t find an answer.

I’m trying to set a few rows of a dataframe using a mask of a mask, but I get an "Unalignable boolean Series provided as indexer" error.

Here is a small example:

import pandas as pd

data = {'c1':[1,2,3,4], 
        'c2':[2,4,6,8]}

df = pd.DataFrame(data)


mask = df['c1'] >= 3
mask2 = df.loc[mask, 'c2'] <= 6

df[mask2]
df.loc[mask2, 'c2'] = -1

Here mask is:

0    False
1    False
2     True
3     True
Name: c1, dtype: bool

And mask2 is:

2     True
3    False
Name: c2, dtype: bool

But now df[mask2] yields:

IndexingError: Unalignable boolean Series provided as indexer (index of the boolean Series and of the indexed object do not match).

What I expect it to return is the row with index 2:

   c1  c2
2   3   6

I realize that for this example df[(df['c1'] >= 3) & (df['c2'] <= 6)] would give me the expected result, but my program flow requires a mask of a mask rather than the intersection of two masks.

Asked By: MartijnK


Answers:

One solution is to take the logical AND (&) of both predicates:

mask = (df['c1'] >= 3) & (df['c2'] <= 6)
res = df[mask]
print(res)

Output

   c1  c2
2   3   6

Use the mask to change the values as follows:

df.loc[mask, 'c2'] = -1
print(df)

Output (after changing df)

   c1  c2
0   1   2
1   2   4
2   3  -1
3   4   8
Answered By: Dani Mesejo

Your approach failed because boolean indexing needs a mask that covers every index label of df. mask2 was computed on the subset df.loc[mask, 'c2'], so it only carries labels 2 and 3 and is no longer aligned with df.

If you want to avoid computing mask2 on all rows (say you subselected only a fraction of the input and the operation is costly), you can reindex the mask:
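To make the misalignment concrete, here is a small sketch that rebuilds the example from the question and inspects which labels mask2 is missing. The variable names missing and aligned are introduced here only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'c1': [1, 2, 3, 4], 'c2': [2, 4, 6, 8]})

mask = df['c1'] >= 3                 # index 0..3, same as df
mask2 = df.loc[mask, 'c2'] <= 6      # index 2..3 only

# Labels that exist in df but not in mask2
missing = df.index.difference(mask2.index)
print(missing.tolist())              # [0, 1]

# Aligning mask2 to df leaves NaN for those labels, so pandas cannot
# tell whether rows 0 and 1 should be selected and raises instead.
aligned = mask2.reindex(df.index)
print(aligned.isna().tolist())       # [True, True, False, False]
```

The NaN entries are the reason the indexer is reported as unalignable: a boolean mask has to say True or False for every row.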

df.loc[mask2.reindex_like(mask).fillna(False), 'c2_modified'] = -1

Output (written to a new column for clarity):

   c1  c2  c2_modified
0   1   2          NaN
1   2   4          NaN
2   3   6         -1.0
3   4   8          NaN
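As a possible variation on the reindexing idea, the sketch below fills the missing labels with False directly during the reindex, and also shows selecting by the labels where mask2 is True instead of by a mask. The names full_mask, labels and result are assumptions made for this example, and the astype(bool) call is only a defensive cast.

```python
import pandas as pd

df = pd.DataFrame({'c1': [1, 2, 3, 4], 'c2': [2, 4, 6, 8]})

mask = df['c1'] >= 3
mask2 = df.loc[mask, 'c2'] <= 6      # computed on the subset only

# Option 1: expand mask2 to df's full index, treating absent rows as False
full_mask = mask2.reindex(df.index, fill_value=False).astype(bool)
print(df[full_mask])

# Option 2: skip the mask and select by the labels where mask2 is True
labels = mask2[mask2].index

result = df.copy()                   # work on a copy so df stays untouched
result.loc[labels, 'c2'] = -1
print(result)
```

Both options keep the "mask of a mask" flow: mask2 is still evaluated only on the rows selected by mask, and only the final selection step changes.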
Answered By: mozway