changing sort in value_counts

Question:

If I do

mt = mobile.PattLen.value_counts()   # sort True by default

I get

4    2831
3    2555 
5    1561
[...]

If I do

mt = mobile.PattLen.value_counts(sort=False) 

I get

8    225
9    120
2   1234 
[...]

What I am trying to do is get the output in ascending order of the left numeric column (2, 3, 4, ...). Can I change value_counts somehow, or do I need to use a different function?

Asked By: Mark Ginsburg

||

Answers:

I think you need sort_index, because the left column is the index of the Series that value_counts returns. The full command would be mt = mobile.PattLen.value_counts().sort_index(). For example:

mobile = pd.DataFrame({'PattLen':[1,1,2,6,6,7,7,7,7,8]})
print (mobile)
   PattLen
0        1
1        1
2        2
3        6
4        6
5        7
6        7
7        7
8        7
9        8

print (mobile.PattLen.value_counts())
7    4
6    2
1    2
8    1
2    1
Name: PattLen, dtype: int64


mt = mobile.PattLen.value_counts().sort_index()
print (mt)
1    2
2    1
6    2
7    4
8    1
Name: PattLen, dtype: int64
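
A minimal sketch of the same idea, assuming the example frame above: since the result is re-sorted by index anyway, the frequency sort can be skipped with sort=False, and sort_index(ascending=False) gives the reverse label order. The variable names are only illustrative.

```python
import pandas as pd

mobile = pd.DataFrame({'PattLen': [1, 1, 2, 6, 6, 7, 7, 7, 7, 8]})

# Skip the frequency sort, then order the counts by their index labels.
counts = mobile['PattLen'].value_counts(sort=False).sort_index()
ascending_labels = counts.index.tolist()

# Same counts, with the labels in descending order instead.
descending_labels = counts.sort_index(ascending=False).index.tolist()

print(ascending_labels)
print(descending_labels)
```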
Answered By: jezrael

Use sort_values if you want horizontal bars ordered from largest (top) to smallest (bottom):

 df['education'].value_counts().sort_values().plot.barh()
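
A small sketch of why the ascending sort works here, using made-up 'education' values (the real column is not shown in this thread): barh draws rows from the bottom up, so the last row of the sorted Series, the largest count, ends up as the top bar.

```python
import pandas as pd

# Hypothetical stand-in for the 'education' column.
df = pd.DataFrame({'education': ['BSc', 'MSc', 'BSc', 'PhD', 'BSc', 'MSc']})

# sort_values() sorts the counts in ascending order by default.
counts = df['education'].value_counts().sort_values()
order = counts.index.tolist()
print(order)

# counts.plot.barh() would then place the first row at the bottom
# and the last (largest) row at the top of the chart.
```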
Answered By: Golden Lion

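As hinted by normanius’ comment under jezrael’s answer, indexing the counts with df.a.unique() returns them in order of first appearance (which happens to be ascending for this data):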

>>> df = pd.DataFrame({"a":[1,1,2,6,6,7,7,7,7,8]})
>>> df.a.value_counts()[df.a.unique()]
1    2
2    1
6    2
7    4
8    1
Name: a, dtype: int64

One can sort in any order by providing a custom index explicitly:

>>> df.a.value_counts()[[8,7,6,2,1]]
8    1
7    4
6    2
2    1
1    2
Name: a, dtype: int64
>>> df.a.value_counts()[[1,8,6,2,7]]
1    2
8    1
6    2
2    1
7    4
Name: a, dtype: int64

This is of particular interest for plotting categorical data (assuming here that the column holds these labels rather than the integers above):

>>> df.a.value_counts()[['hourly','daily','weekly','monthly']].plot(kind="bar")

Incidentally, it can also be used to drop some entries or to make others appear several times:

>>> df.a.value_counts()[[1,1,1,8]]
1    2
1    2
1    2
8    1
Name: a, dtype: int64
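
One caveat with the list-indexing approach: it raises a KeyError if a requested label never occurs in the data. A hedged alternative sketch, using reindex with fill_value=0 so that absent labels show up with a count of zero (the wanted list is just an example):

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 1, 2, 6, 6, 7, 7, 7, 7, 8]})
counts = df["a"].value_counts()

# Desired display order; note that 3 does not occur in the data.
wanted = [8, 3, 1]

# reindex keeps the requested order and fills missing labels with 0.
ordered = counts.reindex(wanted, fill_value=0)
print(ordered.index.tolist())
print(ordered.tolist())
```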

import pandas as pd
import seaborn as sns
# load the 'tips' dataset from seaborn
tips_data = sns.load_dataset('tips')
# count party sizes, then order the result by size rather than by frequency
tips_data['size'].value_counts().sort_index()
Answered By: Rajesh Srinivasan