Pandas: How can I select every nth column including groups of columns that are less than n?
Question:
I want to select columns in groups of 5, including a final group that has fewer than 5 columns, for example:
Column 1 | Column 2 | Column 3 | Column 4 | Column 5 | Column 6 | Column 7 |
---|---|---|---|---|---|---|
Cell 1 | Cell 2 | Cell 3 | Cell 4 | Cell 5 | Cell 6 | Cell 7 |
Cell 8 | Cell 9 | Cell 10 | Cell 11 | Cell 12 | Cell 13 | Cell 14 |
New columns keep being added to the dataframe, which is why I want to compute the sums in a loop rather than hard-coding them. To explain further, I want to take the sum of Column 1 through Column 5 and store it in a new column called "Column Sum 1", and the sum of Column 6 and Column 7 in "Column Sum 2".
I tried working with
loc[1:5].apply(np.sum, axis=1)
and it works if the column group has exactly 5 columns. However, if the group has fewer than 5 columns, it returns NaN instead of the sum of the last few columns.
Answers:
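Create a custom range grouper that gives each column an integer group label (column position floor-divided by 5), group the dataframe along the column axis with it, then aggregate with sum. A trailing group with fewer than 5 columns is summed like any other group: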
df.groupby(np.arange(df.shape[1]) // 5, axis=1).sum()
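As a rough end-to-end sketch of the same idea: the sample values, the variable names (`n`, `groups`, `sums`, `result`) and the final `join` step are assumptions for illustration, not part of the original answer. Because `axis=1` in `groupby` is deprecated in recent pandas releases (2.1 onward, as far as I know), this version transposes the frame, groups the rows, and transposes back, which should behave the same on older and newer versions.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: 2 rows, 7 numeric columns (values are made up).
df = pd.DataFrame(
    np.arange(1, 15).reshape(2, 7),
    columns=[f"Column {i}" for i in range(1, 8)],
)

n = 5  # width of each column group

# One integer label per column: 0 for the first five columns, 1 for the rest.
groups = np.arange(df.shape[1]) // n

# Transpose so columns become rows, group by label, sum, then transpose back.
sums = df.T.groupby(groups).sum().T

# Rename the group labels 0, 1, ... to "Column Sum 1", "Column Sum 2", ...
sums.columns = [f"Column Sum {i + 1}" for i in sums.columns]

# Attach the sums to the original columns.
result = df.join(sums)
print(result)
```

Since the labels are derived from `df.shape[1]`, the same code keeps working as columns are added; a sixth group, for instance, appears automatically once there are more than 25 columns.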