Pandas DataFrame sort ignoring the case

Question:

I have a Pandas DataFrame in Python. The contents of the DataFrame are from here. I slightly modified the case of the first letter of some entries in the “Single” column. Here is what I have:

import pandas as pd
df = pd.read_csv('test.csv')
print(df)

Position                       Artist                  Single               Year     Weeks
       1                Frankie Laine               I Believe               1953  18 weeks
       2                  Bryan Adams         I Do It for You               1991  16 weeks
       3                  Wet Wet Wet      love Is All Around               1994  15 weeks
       4  Drake (feat. Wizkid & Kyla)               One Dance               2016  15 weeks
       5                        Queen       bohemian Rhapsody  1975/76 & 1991/92  14 weeks
       6                 Slim Whitman              Rose Marie               1955  11 weeks
       7              Whitney Houston  i Will Always Love You               1992  10 weeks

I would like to sort by the Single column in ascending order (a to z). When I run

df.sort_values(by='Single',inplace=True)

it seems that the sort is not able to combine upper and lowercase. Here is what I get:

Position                       Artist                  Single               Year     Weeks
       1                Frankie Laine               I Believe               1953  18 weeks
       2                  Bryan Adams         I Do It for You               1991  16 weeks
       4  Drake (feat. Wizkid & Kyla)               One Dance               2016  15 weeks
       6                 Slim Whitman              Rose Marie               1955  11 weeks
       5                        Queen       bohemian Rhapsody  1975/76 & 1991/92  14 weeks
       7              Whitney Houston  i Will Always Love You               1992  10 weeks
       3                  Wet Wet Wet      love Is All Around               1994  15 weeks

So, it sorts the entries that start with an uppercase letter first and then separately sorts the ones that start with a lowercase letter. I want a combined sort, regardless of the case of the first letter in the Single column. The row with “bohemian Rhapsody” is in the wrong location after sorting: it should be first, but instead it appears as the 5th row.

Is there a way to sort a Pandas DataFrame while ignoring the case of the text in the Single column?

Asked By: edesz

Answers:

Create a copy of Single in all upper case letters and sort by that column:

df["Single.Upper"] = df["Single"].str.upper()
df.sort_values(by="Single.Upper", inplace=True)

You can delete the column later:

del df["Single.Upper"] 
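
A non-mutating variant of the same idea can be sketched as follows. It assumes a small stand-in for the question's data (only two columns are kept), and the temporary column name `_sort_key` is a hypothetical choice for this example, not something from the original answer.

```python
import pandas as pd

# Stand-in for the question's data, reduced to two columns for brevity.
df = pd.DataFrame({
    "Position": [1, 2, 3, 4, 5, 6, 7],
    "Single": [
        "I Believe", "I Do It for You", "love Is All Around", "One Dance",
        "bohemian Rhapsody", "Rose Marie", "i Will Always Love You",
    ],
})

# "_sort_key" is a hypothetical temporary column holding the uppercased titles.
sorted_df = (
    df.assign(_sort_key=df["Single"].str.upper())  # add the helper column to a copy
      .sort_values(by="_sort_key")                 # sort by the case-normalized text
      .drop(columns="_sort_key")                   # remove the helper column again
)

print(sorted_df)
```

Because `assign` returns a new DataFrame, the original `df` keeps its row order and never gains the extra column.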
Answered By: DYZ

You can convert all strings to upper/lower case and then call argsort() which gives the index value to reorder the data frame by Single ignoring the case:

df.iloc[df.Single.str.lower().argsort()]
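
Spelled out step by step, the argsort approach might look like the sketch below. The sample frame is a reduced stand-in for the question's data, and the `reset_index(drop=True)` call is an optional extra that is not part of the one-liner above.

```python
import pandas as pd

# Stand-in for the question's data, reduced to two columns for brevity.
df = pd.DataFrame({
    "Position": [1, 2, 3, 4, 5, 6, 7],
    "Single": [
        "I Believe", "I Do It for You", "love Is All Around", "One Dance",
        "bohemian Rhapsody", "Rose Marie", "i Will Always Love You",
    ],
})

# Integer positions that would put the lowercased titles in ascending order.
order = df["Single"].str.lower().argsort().to_numpy()

# Reorder the rows by position; reset_index(drop=True) only renumbers them 0..n-1.
sorted_df = df.iloc[order].reset_index(drop=True)

print(sorted_df)
```

Like the one-liner, this returns a new DataFrame and leaves `df` itself unsorted.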

Answered By: Psidom

Make a new lowercase column, use it for sorting, and delete it afterward.

df["Single.Lower"] = df["Single"].str.lower()
df.sort_values(['Single.Lower'], axis=0, ascending=True, inplace=True)
del df["Single.Lower"]
Answered By: Sujata Khedkar

Pandas 1.1.0 introduced the key argument as a more intuitive way to achieve this:

df.sort_values(by='Single', inplace=True, key=lambda col: col.str.lower())
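
As a self-contained illustration, here is a minimal sketch of the key argument on a reduced stand-in for the question's data. It assumes pandas 1.1.0 or later, and it returns a new DataFrame instead of sorting in place; `ignore_index=True` is an optional extra that renumbers the rows.

```python
import pandas as pd

# Stand-in for the question's data, reduced to two columns for brevity.
df = pd.DataFrame({
    "Position": [1, 2, 3, 4, 5, 6, 7],
    "Single": [
        "I Believe", "I Do It for You", "love Is All Around", "One Dance",
        "bohemian Rhapsody", "Rose Marie", "i Will Always Love You",
    ],
})

# The key function receives the whole "Single" column as a Series and must
# return a Series of the same length; the sort compares the returned values.
sorted_df = df.sort_values(
    by="Single",
    key=lambda col: col.str.lower(),
    ignore_index=True,  # renumber rows 0..n-1 in the result
)

print(sorted_df)
```

For text beyond plain ASCII, `col.str.casefold()` could be swapped in for `col.str.lower()` as a more aggressive case normalization.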
Answered By: RafG