Creating a function that calculates the mean without using a built-in function in Python
Question:
Hi, I am trying to create a function that calculates the mean of a column in a DataFrame without using Python's built-in functions.
This is how I did it initially:
       A    B
0  180.0   70
1  170.0   65
2  190.5   80
3  175.0   75
4  190.0   90
5  190.0   90
6  195.0   95
7  200.0  100
8  205.0  105
9  210.0  110
n = len(df2["B"])
total = sum(df2["B"])
mean = total / n
Now I want to wrap this in a function of my own. This was my attempt, but it gives me an error. Please help me see where I went wrong. (Disclaimer: this is a tutorial question.)
def summary_statistics(df, column_name):
    n = len(df2[column_name])
    total = sum(df2[column_name])
    mean = total / n
    return mean

summary_statistics(df2, ["B"])
Answers:
I suggest using the pandas attribute Series.size and the method Series.sum, passing the column name to the function without [], and changing df2 to df inside the function:
def summary_statistics(df, column_name):
    n = df[column_name].size
    total = df[column_name].sum()
    mean = total / n
    return mean

out = summary_statistics(df2, "B")
Because every column of a DataFrame has the same length, it is also possible to use len(df):
def summary_statistics(df, column_name):
    n = len(df)
    total = df[column_name].sum()
    mean = total / n
    return mean
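To see why passing the column name without [] matters, here is a small sketch. The three-row frame is made up for illustration and is not the data from the question. Indexing with a list returns a one-column DataFrame rather than a Series, and iterating over a DataFrame yields its column labels, so the built-in sum ends up trying to add the string "B" to 0.

```python
import pandas as pd

# Illustrative data only (an assumption, not the frame from the question).
df2 = pd.DataFrame({"B": [70, 65, 80]})

series = df2["B"]    # string key  -> Series holding the values
frame = df2[["B"]]   # list key    -> one-column DataFrame

# Iterating a DataFrame walks over its column labels, not its values.
labels = list(frame)
print(labels)  # ['B']

# So the built-in sum() tries 0 + "B" and raises TypeError.
try:
    sum(frame)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

With the string key "B", the same sum call iterates over the numbers in the column and works as expected.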
You need to do this:
def summary_statistics(df, column_name):
    n = len(df[column_name])
    total = sum(df[column_name])
    mean = total / n
    return mean

summary_statistics(df2, "B")
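For completeness, here is a runnable sketch that rebuilds df2 from the table in the question and calls the corrected function. Since the question asks to avoid built-in functions, it also includes a plain-loop variant that uses neither len nor sum; the name manual_mean is a hypothetical helper added here for illustration, not something from the original post.

```python
import pandas as pd

# df2 rebuilt from the table shown in the question.
df2 = pd.DataFrame({
    "A": [180.0, 170.0, 190.5, 175.0, 190.0, 190.0, 195.0, 200.0, 205.0, 210.0],
    "B": [70, 65, 80, 75, 90, 90, 95, 100, 105, 110],
})


def summary_statistics(df, column_name):
    # Use the df argument (not the global df2) and a plain string key.
    n = len(df[column_name])
    total = sum(df[column_name])
    return total / n


def manual_mean(df, column_name):
    # Hypothetical variant: accumulate count and total in a loop,
    # without calling len() or sum().
    count = 0
    total = 0
    for value in df[column_name]:
        count += 1
        total += value
    return total / count


mean_b = summary_statistics(df2, "B")
manual_b = manual_mean(df2, "B")
print(mean_b)    # 88.0
print(manual_b)  # 88.0
```

Both versions assume the column is non-empty and contains no missing values; an empty column would raise ZeroDivisionError.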