groupby DataFrame by N columns or N rows

Question:

I’d like to find a general way to group a DataFrame by a specified number of rows or columns. Example DataFrame:

import pandas as pd
df = pd.DataFrame(0, index=['a', 'b', 'c', 'd', 'e', 'f'], columns=['c1', 'c2', 'c3', 'c4', 'c5', 'c6', 'c7'])

   c1  c2  c3  c4  c5  c6  c7
a   0   0   0   0   0   0   0
b   0   0   0   0   0   0   0
c   0   0   0   0   0   0   0
d   0   0   0   0   0   0   0
e   0   0   0   0   0   0   0
f   0   0   0   0   0   0   0

For example, I’d like to group by 2 rows at a time and apply a function such as mean. I’d also like to know how to group by N columns at a time and apply a function.

Expected output when grouping by 2 rows at a time:

   c1  c2  c3  c4  c5  c6  c7
0   0   0   0   0   0   0   0
1   0   0   0   0   0   0   0
2   0   0   0   0   0   0   0

Expected output when grouping by 2 columns at a time:

   0  1  2  3
a  0  0  0  0
b  0  0  0  0
c  0  0  0  0
d  0  0  0  0
e  0  0  0  0
f  0  0  0  0
Asked By: luca


Answers:

This groups by N rows:

>>> import numpy as np
>>> N = 2
>>> df.groupby(np.arange(len(df.index)) // N, axis=0).mean()
   c1  c2  c3  c4  c5  c6  c7
0   0   0   0   0   0   0   0
1   0   0   0   0   0   0   0
2   0   0   0   0   0   0   0
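
To see why this works, it may help to look at the grouping key on its own. Integer division of the positions by N gives each consecutive run of N positions the same label, and groupby then aggregates rows (or columns) that share a label. A small sketch, assuming the 6-row, 7-column shape of the example:

```python
import numpy as np

N = 2

# Row positions 0..5 floor-divided by N: consecutive pairs share a label.
keys = np.arange(6) // N
print(keys.tolist())  # [0, 0, 1, 1, 2, 2]

# With 7 columns the count is not a multiple of N,
# so the last group holds a single column.
col_keys = np.arange(7) // N
print(col_keys.tolist())  # [0, 0, 1, 1, 2, 2, 3]
```

When the length is not a multiple of N, the final group is simply shorter; nothing is dropped.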

This groups by N columns:

>>> df.groupby(np.arange(len(df.columns))//N, axis=1).mean()
   0  1  2  3
a  0  0  0  0
b  0  0  0  0
c  0  0  0  0
d  0  0  0  0
e  0  0  0  0
f  0  0  0  0
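
Note that the axis keyword of DataFrame.groupby is deprecated in recent pandas releases (since 2.1), so the column-wise call above may emit a warning or fail depending on the installed version. A minimal sketch of one way around that is to transpose, group the rows, and transpose back. The helper name group_every_n and its parameters are hypothetical, not part of pandas, and the sample values are made up so that the means are visibly non-zero:

```python
import numpy as np
import pandas as pd


def group_every_n(df, n, axis=0, func="mean"):
    """Aggregate every n consecutive rows (axis=0) or columns (axis=1).

    Hypothetical helper for illustration; not a pandas API.
    """
    if axis == 0:
        keys = np.arange(len(df.index)) // n
        return df.groupby(keys).agg(func)
    # Column-wise: transpose, group the rows, then transpose back.
    keys = np.arange(len(df.columns)) // n
    return df.T.groupby(keys).agg(func).T


# Same shape and labels as the example, filled with 0..41 (assumed data).
df = pd.DataFrame(
    np.arange(42).reshape(6, 7),
    index=list("abcdef"),
    columns=[f"c{i}" for i in range(1, 8)],
)

by_rows = group_every_n(df, 2, axis=0)  # 3 rows x 7 columns
by_cols = group_every_n(df, 2, axis=1)  # 6 rows x 4 columns
print(by_rows.shape, by_cols.shape)
```

Transposing copies the data and can change dtypes when the columns have mixed types, so this sketch is best suited to frames with a single numeric dtype.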
Answered By: luca