Find minimum of the entire dataframe?

Question:

I have a DataFrame like:

    a  b
0   4  7
1   3  2
2   1  9
3   3  4
4   2  NaN

I need to calculate min, mean, std and sum over the whole DataFrame, treating all of its values as a single list of numbers (e.g. the minimum here is 1).

EDIT: The data may contain NaNs, or the columns may have different lengths.

df.to_numpy().mean()

This produces NaN, because the arrays contain NaNs and have different lengths.

How can I calculate the usual statistics over all of these numbers?

Asked By: gotiredofcoding


Answers:

A pandas solution is to reshape with DataFrame.stack and then aggregate with Series.agg:

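# Population standard deviation (ddof=0), matching NumPy's default;
# pandas' own default for std is ddof=1.
def std_ddof0(x):
    return x.std(ddof=0)

# stack() flattens all columns into one Series; the aggregations skip NaN
out = df.stack().agg(['mean', 'sum', std_ddof0, 'min'])
print(out)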
mean          3.888889
sum          35.000000
std_ddof0     2.424158
min           1.000000
dtype: float64
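
The custom std_ddof0 function matters because Series.std defaults to ddof=1 (sample standard deviation), while np.std and np.nanstd default to ddof=0. A minimal sketch of the difference, assuming the example frame from the question is rebuilt by hand. The explicit dropna() is an addition here, so the result does not depend on whether stack() drops NaN in the installed pandas version:

```python
import numpy as np
import pandas as pd

# Example data from the question, rebuilt by hand
df = pd.DataFrame({"a": [4, 3, 1, 3, 2], "b": [7, 2, 9, 4, np.nan]})

# Flatten every column into one Series and drop the missing value explicitly
flat = df.stack().dropna()

std_sample = flat.std()            # pandas default: ddof=1
std_population = flat.std(ddof=0)  # same convention as np.std / np.nanstd

print(std_sample, std_population)
```

If the ddof=1 value is the one you want, the string 'std' can be passed to agg directly instead of the custom function.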

A NumPy solution uses np.nanmean, np.nansum, np.nanstd and np.nanmin:

import numpy as np

# Flatten the 2-D array to 1-D
totalp = df.to_numpy().reshape(-1)

# The nan* variants ignore NaN values
out = np.nanmean(totalp), np.nansum(totalp), np.nanstd(totalp), np.nanmin(totalp)
print(out)
(3.888888888888889, 35.0, 2.4241582476968255, 1.0)
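
As a side note, the nan-aware functions reduce over all axes by default, so the reshape is optional. A small sketch under the assumption that every column is numeric (the dtype=float argument and the stats dictionary are illustrative choices, not part of the approach above):

```python
import numpy as np
import pandas as pd

# Example data from the question, rebuilt by hand
df = pd.DataFrame({"a": [4, 3, 1, 3, 2], "b": [7, 2, 9, 4, np.nan]})

# 2-D float array of shape (5, 2); no flattening needed
arr = df.to_numpy(dtype=float)

# With axis left at its default (None), each function reduces over the whole array
stats = {
    "mean": np.nanmean(arr),
    "sum": np.nansum(arr),
    "std": np.nanstd(arr),
    "min": np.nanmin(arr),
}
print(stats)

# The plain reduction propagates the NaN, which is the problem in the question
print(arr.mean())
```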

Another idea is to remove the missing values first:

totalp = df.to_numpy().reshape(-1)
# Keep only the entries that are not NaN
totalp = totalp[~np.isnan(totalp)]
print(totalp)
[4. 7. 3. 2. 1. 9. 3. 4. 2.]

out = np.mean(totalp), np.sum(totalp), np.std(totalp), np.min(totalp)
print(out)
(3.888888888888889, 35.0, 2.4241582476968255, 1.0)
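
The same filtering idea also covers the "different size columns" case from the question. A hedged sketch, assuming the raw data arrives as plain lists of unequal length (the cols dictionary is hypothetical): wrapping each list in a Series lets pandas pad the shorter column with NaN, after which the mask removes the padding again.

```python
import numpy as np
import pandas as pd

# Hypothetical raw inputs with unequal lengths
cols = {"a": [4, 3, 1, 3, 2], "b": [7, 2, 9, 4]}

# Building from Series aligns on the index and pads the short column with NaN
df = pd.DataFrame({name: pd.Series(vals) for name, vals in cols.items()})

# Flatten, then keep only the real values
values = df.to_numpy(dtype=float).ravel()
values = values[~np.isnan(values)]

# min, mean, std (ddof=0) and sum as a single list of numbers
out = [values.min(), values.mean(), values.std(), values.sum()]
print(out)
```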
Answered By: jezrael