How to locate and add text to rows that don't fit the conditions in Python

Question:

I have an employment-history dataset that I work with in numpy and pandas. In my code I look for employees who report to a vacant manager position that another manager is covering in the meantime. Right now it partly works but needs to be refined. Here is the current data and code.

Code:

m = df.groupby(['ID', 'Reporting_Manager_ID'])['Manager_Name'].transform('first').ne(df['Manager_Name'])
df.loc[m,'Manager_Name'] += ' (Vacant)' 

The output for this is:

Emp_ID     Reporting_Manager_ID     Manager_Name
   1             4012                John Wick 
   1             4012                John Wick 
   2             2812                Sarah Smith 
   2             2812                Sarah Smith 
   2             2812                John Wick (Vacant) 
   3             9236                Peter Doe 
   3             9236                John Wick (Vacant)
   3             9236                John Wick 
   4             1293                John Wick 
   4             1293                John Wick  

John Wick's own Manager ID is 4012, and those rows should stay as they are. For the other Manager IDs that he has taken over [2812, 9236, 1293], however, every row with his name should show (Vacant).

Desired Output:

Emp_ID     Reporting_Manager_ID    Manager_Name 
   1              4012              John Wick
   1              4012              John Wick 
   2              2812              Sarah Smith
   2              2812              Sarah Smith
   2              2812              John Wick (Vacant)
   3              9236              Peter Doe
   3              9236              John Wick (Vacant)
   3              9236              John Wick (Vacant)
   4              1293              John Wick (Vacant)
   4              1293              John Wick (Vacant)

The dataset has 300+ Reporting Manager IDs and this happens multiple times. Any suggestions on how to fix this?

Asked By: Coding_Nubie

||

Answers:

It is unclear how you decide which reporting ID is a manager's own, but it looks like you take the first one that appears for each manager:

report_id = df.groupby('Manager_Name')['Reporting_Manager_ID'].transform('first')
m = ~df['Reporting_Manager_ID'].eq(report_id)
df.loc[m, 'Manager_Name'] += ' (Vacant)'
print(df)

# Output
   Emp_ID  Reporting_Manager_ID        Manager_Name
0       1                  4012           John Wick
1       1                  4012           John Wick
2       2                  2812         Sarah Smith
3       2                  2812         Sarah Smith
4       2                  2812  John Wick (Vacant)
5       3                  9236           Peter Doe
6       3                  9236  John Wick (Vacant)
7       3                  9236  John Wick (Vacant)
8       4                  1293  John Wick (Vacant)
9       4                  1293  John Wick (Vacant)
Answered By: Corralien
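
For anyone who wants to run the approach above end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question and wraps the logic in a helper. The function name tag_vacant and the optional home_ids mapping are hypothetical additions for illustration, not part of the original answer. The sketch assumes that, when no mapping is given, the first Reporting_Manager_ID seen for a name is that manager's own, so the result depends on row order.

```python
import pandas as pd

# Sample data rebuilt from the question, before any "(Vacant)" tagging.
df = pd.DataFrame({
    "Emp_ID": [1, 1, 2, 2, 2, 3, 3, 3, 4, 4],
    "Reporting_Manager_ID": [4012, 4012, 2812, 2812, 2812,
                             9236, 9236, 9236, 1293, 1293],
    "Manager_Name": [
        "John Wick", "John Wick",
        "Sarah Smith", "Sarah Smith", "John Wick",
        "Peter Doe", "John Wick", "John Wick",
        "John Wick", "John Wick",
    ],
})


def tag_vacant(frame, home_ids=None):
    """Return a copy with ' (Vacant)' appended where a manager covers another ID.

    home_ids is a hypothetical optional dict {manager name: own manager ID}.
    Names missing from it fall back to the first ID seen for that name.
    """
    out = frame.copy()  # leave the caller's frame untouched
    # Assumption: the first ID listed for a name is that manager's own ID.
    own_id = out.groupby("Manager_Name")["Reporting_Manager_ID"].transform("first")
    if home_ids is not None:
        # An explicit mapping takes priority over the row-order guess.
        own_id = out["Manager_Name"].map(home_ids).fillna(own_id)
    covering = out["Reporting_Manager_ID"].ne(own_id)
    out.loc[covering, "Manager_Name"] += " (Vacant)"
    return out


result = tag_vacant(df)
print(result)
```

Because the helper works on a copy, calling it again on the original df does not stack a second " (Vacant)" suffix, which the in-place += would do if rerun. If the row order cannot be trusted, passing something like home_ids={"John Wick": 4012} avoids relying on 'first'.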