Creating a function that calculates the mean without using a built-in function in Python

Question:

Hi, I am trying to create a function that calculates the mean of a column in a DataFrame without using Python's built-in functions.
This is how I did it initially:

    A       B
0   180.0   70
1   170.0   65
2   190.5   80
3   175.0   75
4   190.0   90
5   190.0   90
6   195.0   95
7   200.0   100
8   205.0   105
9   210.0   110

n = len(df2["B"])
total = sum(df2["B"])
mean = total / n

Now I wanted to wrap this in a function of my own. This was my attempt, but it is giving me an error. Please help me see where I went wrong. (Disclaimer: this is a tutorial question.)

def summary_statistics(df, column_name):
    n =len(df2[column_name])
    total=sum(df2[column_name])
    mean = total/n
    return mean
    
summary_statistics(df2,["B"])
Asked By: thole

||

Answers:

I suggest using the pandas methods Series.size and Series.sum. Also pass the column name to the function without [] (as a string, not a list), and change df2 to df inside the function so that it uses its own argument:

def summary_statistics(df, column_name):
    n = df[column_name].size
    total = df[column_name].sum()
    mean = total/n
    return mean
    
out = summary_statistics(df2,"B")
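
To see why the original call failed, here is a minimal sketch that rebuilds the sample frame from the question and compares the two ways of passing the column. The values are copied from the question, but constructing df2 with pd.DataFrame is an assumption, since the question does not show how it was created. Passing ["B"] selects a one-column DataFrame rather than a Series, and iterating a DataFrame yields its column labels, so the built-in sum tries to add the string "B" to 0.

```python
import pandas as pd

# Sample data copied from the question; building it this way is an assumption.
df2 = pd.DataFrame({
    "A": [180.0, 170.0, 190.5, 175.0, 190.0, 190.0, 195.0, 200.0, 205.0, 210.0],
    "B": [70, 65, 80, 75, 90, 90, 95, 100, 105, 110],
})

def summary_statistics(df, column_name):
    # Work on the df argument, not on the global df2.
    n = df[column_name].size
    total = df[column_name].sum()
    return total / n

# A plain string selects a Series, so the mean is a single number.
mean_b = summary_statistics(df2, "B")
print(mean_b)

# A list of names selects a DataFrame instead of a Series.
selected = df2[["B"]]
print(type(selected).__name__)

# Iterating a DataFrame yields column labels, so the built-in sum fails.
try:
    sum(selected)
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__
print(error_name)
```

With the string "B" the function returns one number. The list form is what triggered the error in the question's version, which used the built-in sum.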

Since every column of a DataFrame has the same length, it is also possible to use len(df):

def summary_statistics(df, column_name):
    n = len(df)
    total = df[column_name].sum()
    mean = total/n
    return mean
Answered By: jezrael

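You need to use the df parameter instead of df2 inside the function, and pass the column name as a string rather than a list: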

def summary_statistics(df, column_name):
    n =len(df[column_name])
    total=sum(df[column_name])
    mean = total/n
    return mean
    
summary_statistics(df2, "B")
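
If "without built-in functions" is meant strictly, so that sum and len are also off limits, the total and the count can be accumulated in a plain loop. This is only a sketch under that reading of the question. The name mean_without_builtins is hypothetical, and the frame below contains just column B from the question.

```python
import pandas as pd

def mean_without_builtins(df, column_name):
    # Hypothetical helper: accumulate the total and the count by hand
    # instead of calling sum() and len().
    total = 0
    count = 0
    for value in df[column_name]:
        total += value
        count += 1
    if count == 0:
        # Avoid dividing by zero on an empty column.
        raise ValueError("column is empty")
    return total / count

# Column B from the question; constructing the frame this way is an assumption.
df2 = pd.DataFrame({"B": [70, 65, 80, 75, 90, 90, 95, 100, 105, 110]})
result = mean_without_builtins(df2, "B")
print(result)
```

The loop is slower than the pandas methods on large frames, but it makes each step of the calculation explicit, which may be the point of a tutorial exercise.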