pandas pivot_table percentile / quantile

Question:

Is it possible to use percentile or quantile as the aggfunc in a pandas pivot table? I’ve tried both numpy.percentile and pandas quantile without success.

Asked By: Chris


Answers:

Dummy data (assuming numpy is imported as np and pandas as pd):

In [135]: df = pd.DataFrame([['a',2,3],
                             ['a',5,6],
                             ['a',7,8], 
                             ['b',9,10], 
                             ['b',11,12], 
                             ['b',13,14]], columns=list('abc'))

np.percentile seems to work just fine?

In [140]: df.pivot_table(columns='a', aggfunc=lambda x: np.percentile(x, 50))
Out[140]: 
a  a   b
b  5  11
c  6  12
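
The same aggregation can also be written with pandas' own Series.quantile, which takes a fraction (0.5) rather than a percentage (50). A minimal sketch, assuming the dummy data above; the variable names by_percentile and by_quantile are illustrative only:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([['a', 2, 3], ['a', 5, 6], ['a', 7, 8],
                   ['b', 9, 10], ['b', 11, 12], ['b', 13, 14]],
                  columns=list('abc'))

# np.percentile takes a percentage in [0, 100] ...
by_percentile = df.pivot_table(columns='a',
                               aggfunc=lambda x: np.percentile(x, 50))

# ... while Series.quantile takes a fraction in [0, 1].
by_quantile = df.pivot_table(columns='a',
                             aggfunc=lambda x: x.quantile(0.5))

# Both interpolate linearly by default, so the two tables should agree.
print(np.allclose(by_percentile, by_quantile))
```
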
Answered By: chrisb

The lambda solution works, but it produces column names such as "<lambda_0>", which need to be renamed later.

Instead of a lambda (i.e. an unnamed function), we can define our own named functions. Each should accept a Series of values and return a single value.

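import numpy as np
import pandas as pd

df = pd.DataFrame([['a',2,3],
                   ['a',5,6],
                   ['a',7,8],
                   ['b',9,10],
                   ['b',11,12],
                   ['b',13,14]], columns=list('abc'))

# Named aggregation functions: each receives a Series and returns one value.
def quantile_25(growth_vals: pd.Series):
    return growth_vals.quantile(.25)

def quantile_75(growth_vals: pd.Series):
    return growth_vals.quantile(.75)

# Passing a list of functions gives one group of columns per function.
df.pivot_table(columns='a', aggfunc=[quantile_25, np.median, quantile_75])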

The resulting column names will correspond to the function names (quantile_25, median, quantile_75).
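
If many quantiles are needed, the named functions could be generated rather than written by hand. A minimal sketch, assuming the same dummy data; make_quantile is a hypothetical helper, not part of pandas, and it sets __name__ on the assumption that this is the attribute the column labels are taken from:

```python
import pandas as pd

def make_quantile(q):
    """Return a function that computes the q-th quantile of a Series.

    Hypothetical helper: the returned function is named after q
    (e.g. quantile_25 for q=0.25) so the pivot table columns are readable.
    """
    def _quantile(values: pd.Series):
        return values.quantile(q)
    _quantile.__name__ = f"quantile_{round(q * 100)}"
    return _quantile

df = pd.DataFrame([['a', 2, 3], ['a', 5, 6], ['a', 7, 8],
                   ['b', 9, 10], ['b', 11, 12], ['b', 13, 14]],
                  columns=list('abc'))

# One generated function per requested quantile.
table = df.pivot_table(columns='a',
                       aggfunc=[make_quantile(0.25), make_quantile(0.75)])
print(table)
```
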

Answered By: s2t2