Mean of a column after groupby returns NaN


I have this:

df = name   year   salary   d
     a      1990   3        5
     b      1992   90       1
     c      1990   234      3

I am trying to group my data frame by year and get the average salary for each year. My goal is then to assign that average to a new column. This is what I do:

df['averageSalaryPerYear'] = df.groupby('year')['salary'].mean()

I do get the correct results from df.groupby('year')['salary'].mean(): when I print them, I get a column of numbers in scientific notation. However, when I assign them to df['averageSalaryPerYear'], they all turn into NaN. I am not sure why this is happening, as the printed values seem fine, although they are in scientific notation like this:

1990    1.707235e+07
1991    2.357879e+07
1992    3.098244e+07

The first column is the year and the second is the average salary.

Why is this happening? I want my new column to show the correct results of averages.


Asked By: Amin



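After groupby, the result has one row per year and is indexed by year, so it no longer lines up with the rows of the original data frame. When you assign it as a new column, pandas aligns on the index labels, finds no matching labels, and fills every row with NaN.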
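To see the mismatch concretely, here is a minimal sketch. The small frame is an assumption modeled on the table in the question, not the asker's real data.

```python
import pandas as pd

# Toy frame modeled on the question's example (values are illustrative).
df = pd.DataFrame({
    "name": ["a", "b", "c"],
    "year": [1990, 1992, 1990],
    "salary": [3.0, 90.0, 234.0],
})

means = df.groupby("year")["salary"].mean()
print(means.index.tolist())  # year labels: 1990 and 1992
print(df.index.tolist())     # row labels: 0, 1, 2

# Assignment aligns on index labels. None of 0, 1, 2 appear among the
# year labels, so every row of the new column ends up as NaN.
df["averageSalaryPerYear"] = means
print(df["averageSalaryPerYear"].isna().all())
```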

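Try transform instead. It returns a result with the same index as the original data frame, with each group's mean repeated on every row of that group.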

df['averageSalaryPerYear'] = df.groupby('year')['salary'].transform('mean')
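A small runnable sketch of this fix, using a toy frame assumed from the question's table. It also shows one possible alternative, mapping the year column through the grouped means; the column name viaMap is made up for illustration.

```python
import pandas as pd

# Toy frame modeled on the question's example (values are illustrative).
df = pd.DataFrame({
    "name": ["a", "b", "c"],
    "year": [1990, 1992, 1990],
    "salary": [3.0, 90.0, 234.0],
})

# transform keeps the original row index, so the assignment lines up.
df["averageSalaryPerYear"] = df.groupby("year")["salary"].transform("mean")

# Alternative: compute the per-year means once, then look each row's
# year up in that Series. "viaMap" is a hypothetical column name.
means = df.groupby("year")["salary"].mean()
df["viaMap"] = df["year"].map(means)

print(df)
```

Both columns should hold the same values; transform is the more direct choice when the grouped result is only needed as a column.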
Answered By: Shuo