Ordering multi-indexed pandas dataframe on two levels, with different criteria for each level

Question:

Consider the dataframe df_counts, constructed as follows:

df2 = pd.DataFrame({
    "word" : ["AA", "AC", "AC", "BA", "BB", "BB", "BB"],
    "letter1": ["A", "A", "A", "B", "B", "B", "B"],
    "letter2": ["A", "C", "C", "A", "B", "B", "B"]
})
df_counts = df2[["word", "letter1", "letter2"]].groupby(["letter1", "letter2"]).count()

Output:

enter image description here

What I would like to do from here, is to order first by letter1 totals, so the rows for letter1 == "B" appear first (there are four words starting with B, vs only three with A), and then ordered within each grouping of letter1 by the values in the word column.

So the final output should be:

                 word
letter1 letter2 
      B       B     3
              A     1
      A       C     2
              A     1

Is this possible to do?

Asked By: butterflyknife

||

Answers:

When you have a complex sorting order, it’s always easy to use numpy.lexsort:

# minor sorting order first, major one last
# - to inverse the order
order = np.lexsort([-df_counts['word'],
                    -df_counts.groupby('letter1')['word'].transform('sum')])

out = df_counts.iloc[order]

The pandas equivalent would be:

(df_counts
 .assign(total=df_counts.groupby('letter1')['word'].transform('sum'))
 .sort_values(by=['total', 'word'], ascending=False)
 .drop(columns='total')
)

Output:

                 word
letter1 letter2      
B       B           3
        A           1
A       C           2
        A           1
Answered By: mozway

use sort_index with index name and ascending pair as True or False

df2 = pd.DataFrame({
    "word" : ["AA", "AC", "AC", "BA", "BB", "BB", "BB"],
    "letter1": ["A", "A", "A", "B", "B", "B", "B"],
    "letter2": ["A", "C", "C", "A", "B", "B", "B"]
})

df_counts = df2[["word", "letter1", "letter2"]].groupby(["letter1", "letter2"]).count()

print(df_counts.index)
print(df_counts.sort_index(level=["letter1","letter2"],ascending=[False,False]))

output:

                    word
letter1 letter2      
B       B           3
        A           1
A       C           2
        A           1
Answered By: Golden Lion
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.