Pandas sort multilevel columns with year and month

Question:

I have created a pivot table in pandas with multilevel columns, but the columns are not in sorted order:

Year  2022      2021      2023
Month  Jan  Feb  Mar  Jan  Dec  Jun

What I want:

Year  2021      2022      2023
Month  Jan  Mar  Jan  Feb  Jun  Dec

How can I get the above order?

Asked By: Atharva Katre

||

Answers:

Probably the best strategy is to cast your Month column to an ordered CategoricalDtype with astype before pivoting, so that the months sort chronologically instead of alphabetically:

import numpy as np
import pandas as pd

# Month abbreviations in calendar order, turned into an ordered categorical dtype
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
          'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
months = pd.CategoricalDtype(months, ordered=True)

# Random sample data for demonstration
rng = np.random.default_rng(2023)
df = pd.DataFrame({'ID': rng.integers(1, 3, 20),
                   'Year': rng.integers(2021, 2024, 20),
                   'Month': rng.choice(months.categories, 20),
                   'Value': rng.integers(1, 10, 20)})

# Cast Month to the ordered dtype first, then pivot
out = (df.astype({'Month': months})
        .pivot_table(index='ID', columns=['Year', 'Month'], values='Value',
                     aggfunc='mean', fill_value=0))

Output:

>>> out
Year  2021                   2022              2023            
Month  Feb Mar  Sep  Oct Dec  Jan Jun Aug  Oct  Jun Sep Nov Dec
ID                                                             
1        0   8  1.5  6.5   6    8   8   2  7.0    9   9   3   0
2        4   4  0.0  0.0   0    0   0   2  8.5    0   0   0   3

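The columns already come out in chronological order. If they get reordered later, you can use out.sort_index(axis=1) to restore that order, since the Month level is an ordered categorical.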
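
As a minimal sketch of the sort_index step, the snippet below rebuilds the layout from the question with made-up values (the data and the names month_dtype, scrambled and restored are illustrative assumptions, not from the original post), deliberately reverses the columns, and then sorts them back:

```python
import pandas as pd

months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
          'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
month_dtype = pd.CategoricalDtype(months, ordered=True)

# Hypothetical long-format data mirroring the columns shown in the question
df = pd.DataFrame({'ID': [1, 1, 1, 1, 1, 1],
                   'Year': [2022, 2022, 2021, 2021, 2023, 2023],
                   'Month': ['Jan', 'Feb', 'Mar', 'Jan', 'Dec', 'Jun'],
                   'Value': [1, 2, 3, 4, 5, 6]})

# Cast Month to the ordered categorical dtype, then pivot.
# observed=True keeps only the Year/Month pairs present in the data.
out = (df.astype({'Month': month_dtype})
         .pivot_table(index='ID', columns=['Year', 'Month'], values='Value',
                      aggfunc='mean', observed=True))

# Scramble the columns on purpose, then sort along the column axis.
scrambled = out.iloc[:, ::-1]
restored = scrambled.sort_index(axis=1)

print(list(restored.columns))
```

Because Month is an ordered categorical, sort_index(axis=1) should order the columns by Year and then by calendar month (Jan before Mar, Jun before Dec) rather than alphabetically. With plain string months, the same call would sort the month names alphabetically instead.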

Answered By: Corralien