How to modify rows with conditions in Python?
Question:
I have a dataset of employee history containing information on job, manager, etc. I am trying to see whether a manager has taken over for another in their absence. If that happens, I want to add a (Sub) next to the name of the manager who is filling in.
This is the output I have:
Emp_ID  Job_Title  Manager_pos  Manager Name  Mgr_ID
1       Sales      627          John Doe      12
1       Sales      627          John Doe      12
1       Sales      627          David Stern   4
2       Tech       324          Mark Smith    7
2       Tech       324          Henry Ford    13
2       Tech       324          Henry Ford    13
This is the output I want:
Emp_ID  Job_Title  Manager_pos  Manager Name      Mgr_ID
1       Sales      627          John Doe          12
1       Sales      627          John Doe          12
1       Sales      627          David Stern(Sub)  4
2       Tech       324          Mark Smith        7
2       Tech       324          Henry Ford(Sub)   13
2       Tech       324          Henry Ford(Sub)   13
I have tried using:
np.where((df['Manager_pos'].head(1) == df['Manager_pos']) & (df['Manager Name'].head(1) != df['Manager Name'].tail(1)), df['Manager Name'] + 'Sub', df['Manager Name'])
This code ends up throwing an error. Any suggestions?
Answers:
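Use a boolean mask: rank the manager names within each group and append ' (Sub)' to the Manager Name column wherever the rank is greater than one. The rank is alphabetical (descending), so this assumes the original manager's name sorts last in each group, as it does in the sample data:
cols = ['Emp_ID', 'Manager_pos']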
m = df.groupby(cols)['Manager Name'].rank(method='dense', ascending=False).gt(1)
df.loc[m, 'Manager Name'] += ' (Sub)'
Output:
>>> df
   Emp_ID  Job_Title  Manager_pos       Manager Name  Mgr_ID
0       1      Sales          627           John Doe      12
1       1      Sales          627           John Doe      12
2       1      Sales          627  David Stern (Sub)       4
3       2       Tech          324         Mark Smith       7
4       2       Tech          324   Henry Ford (Sub)      13
5       2       Tech          324   Henry Ford (Sub)      13
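The rank-based mask above orders the names alphabetically, so it only marks the right rows when the original manager happens to sort last. As a minimal sketch of an alternative (not the answerer's code), the managers in each group can be numbered by order of first appearance with pd.factorize. The sample frame below is rebuilt from the question, and the variable name order is just an illustrative choice:

```python
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    "Emp_ID": [1, 1, 1, 2, 2, 2],
    "Job_Title": ["Sales"] * 3 + ["Tech"] * 3,
    "Manager_pos": [627, 627, 627, 324, 324, 324],
    "Manager Name": ["John Doe", "John Doe", "David Stern",
                     "Mark Smith", "Henry Ford", "Henry Ford"],
    "Mgr_ID": [12, 12, 4, 7, 13, 13],
})

cols = ["Emp_ID", "Manager_pos"]

# Number the managers in each group by first appearance:
# 0 for the first name seen, 1 for the next new name, and so on.
order = df.groupby(cols)["Manager Name"].transform(lambda s: pd.factorize(s)[0])

# Any manager after the first one in the group is treated as a substitute.
m = order.gt(0)
df.loc[m, "Manager Name"] += " (Sub)"
print(df)
```

This assumes the rows are already in chronological order within each group, since "first appearance" is taken from row order.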
Assuming you want to append '(sub)' whenever the manager differs from the first one within a group, use groupby.transform to identify the first name, then boolean indexing:
m = (df.groupby(['Emp_ID', 'Manager_pos'])   # for each group
       ['Manager Name'].transform('first')   # get first name
       .ne(df['Manager Name'])               # check if current row is different
    )
df.loc[m, 'Manager Name'] += '(sub)'
Output:
   Emp_ID  Job_Title  Manager_pos      Manager Name  Mgr_ID
0       1      Sales          627          John Doe      12
1       1      Sales          627          John Doe      12
2       1      Sales          627  David Stern(sub)       4
3       2       Tech          324        Mark Smith       7
4       2       Tech          324   Henry Ford(sub)      13
5       2       Tech          324   Henry Ford(sub)      13
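For anyone who wants to run this end to end, here is a self-contained sketch of the same transform('first') idea wrapped in a helper. The function name mark_substitutes and its parameters are hypothetical, and the sample frame is rebuilt from the question. It works on a copy, so running it twice on the original data does not append the suffix twice:

```python
import pandas as pd


def mark_substitutes(df, group_cols, name_col="Manager Name", suffix="(sub)"):
    """Return a copy of df with `suffix` appended to every name in
    `name_col` that differs from the first name in its group."""
    out = df.copy()
    # First manager name per group, broadcast back to every row of the group.
    first = out.groupby(group_cols)[name_col].transform("first")
    # True where the current row's manager is not the group's first manager.
    changed = out[name_col].ne(first)
    out.loc[changed, name_col] += suffix
    return out


# Sample data reconstructed from the question.
history = pd.DataFrame({
    "Emp_ID": [1, 1, 1, 2, 2, 2],
    "Job_Title": ["Sales"] * 3 + ["Tech"] * 3,
    "Manager_pos": [627, 627, 627, 324, 324, 324],
    "Manager Name": ["John Doe", "John Doe", "David Stern",
                     "Mark Smith", "Henry Ford", "Henry Ford"],
    "Mgr_ID": [12, 12, 4, 7, 13, 13],
})

result = mark_substitutes(history, ["Emp_ID", "Manager_pos"])
print(result)
```

Note that if the original manager comes back later in the same group, their rows stay unmarked, because each row is compared only against the group's first name.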