How to Get Proper Count of Values in Python

Question:

I tried asking this question earlier and miscommunicated what I am having trouble with. I have a dataset in python that I am using numpy and pandas in and I am trying to get a count of reports by job type. There are are 100+ titles so I will shorten it for an example:

 ID       Job_Title        reports  
 1        Sales Manager       2
 1        Sales Manager       2
 2        Tech Support        0
 3        Tech Support        1
 3        Tech Support        1
 4        Sales Lead          4
 4        Sales Lead          4
 5        Sales Manager       5
 6        Tech Support        2

I would like to get a accurate count of the reports by position. Something Like this:

 Job_Title      reports    
Sales Manager     7
Sales Lead        4
Tech Support      3

So far what I have is this:

 df.groupby('Job_Title')['reports'].count().sort_values(ascending = False) 

And this is what I am getting:

Job_Title         reports
Tech Support         4
Sales Manager        3
Sales Lead           2
Asked By: That_non_coder

||

Answers:

To get the expected result, you have to drop duplicates by ID and Job_Title before grouping by Job_Title and sum reports values:

>>> (df.drop_duplicates(['ID', 'Job_Title'])
       .groupby('Job_Title', as_index=False)['reports'].sum()
       .sort_values(by='reports', ascending=False, ignore_index=True))

       Job_Title  reports
0  Sales Manager        7
1     Sales Lead        4
2   Tech Support        3
Answered By: Corralien

Try this:

df_new = (df.groupby('Job_Title', as_index=False)['reports'].apply(lambda g: sum(set(g))).sort_values('reports', ascending = False, ignore_index=True))
print(df_new)

Output:

       Job_Title  reports
0  Sales Manager        7
1     Sales Lead        4
2   Tech Support        3
Answered By: I'mahdi
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.