How to filter rows based on the value in two columns and a different value in a third

Question:

The following code generates a dataframe to illustrate my question:

import pandas as pd
data = [[1152, '1', '10'], [1154, '1', '4'],
        [1152, '1', '10'], [1155, '2', '10'],
        [1152, '1', '4'], [1155, '2', '10']]

df = pd.DataFrame(data, columns=['Cow', 'Lact', 'Procedure'])

This generates the following:

    Cow Lact Procedure
0  1152    1        10
1  1154    1         4
2  1152    1        10
3  1155    2        10
4  1152    1         4
5  1155    2        10

I want to identify the rows where Cow and Lact are the same but Procedure differs within that group. The output I am looking for is:

    Cow Lact Procedure
0  1152    1        10
1  1152    1        10
2  1152    1         4

I figure it will require groupby and a filter function, but I am not sure how to put them together.
Thanks

Asked By: JohnH

||

Answers:

Use groupby.transform('nunique') and boolean indexing:

df[df.groupby(['Cow', 'Lact'])['Procedure'].transform('nunique').gt(1)]

Output:

    Cow Lact Procedure
0  1152    1        10
2  1152    1        10
4  1152    1         4
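
To see why this works, the one-liner can be broken into steps. The sketch below rebuilds the dataframe from the question and uses illustrative variable names (n_distinct, mask, result, result_filter) that are not part of the original answer. It also shows groupby.filter, which the question mentions, as one possible alternative; on this data it should select the same rows, though it is typically slower than transform on large frames because it calls a Python function per group.

```python
import pandas as pd

data = [[1152, '1', '10'], [1154, '1', '4'],
        [1152, '1', '10'], [1155, '2', '10'],
        [1152, '1', '4'], [1155, '2', '10']]
df = pd.DataFrame(data, columns=['Cow', 'Lact', 'Procedure'])

# Step 1: count the distinct Procedure values in each (Cow, Lact) group.
# transform broadcasts the per-group count back onto every row,
# so the result has the same index as df.
n_distinct = df.groupby(['Cow', 'Lact'])['Procedure'].transform('nunique')

# Step 2: keep rows whose group has more than one distinct Procedure.
mask = n_distinct.gt(1)
result = df[mask]

# Alternative (assumption: readability matters more than speed):
# groupby.filter keeps whole groups for which the function returns True.
result_filter = df.groupby(['Cow', 'Lact']).filter(
    lambda g: g['Procedure'].nunique() > 1
)

print(result)
```

Both approaches keep the original index labels, so call reset_index(drop=True) on the result if a fresh 0..n index is wanted, as in the desired output shown in the question.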
Answered By: mozway