changing sort in value_counts
Question:
If I do
mt = mobile.PattLen.value_counts() # sort True by default
I get
4 2831
3 2555
5 1561
[...]
If I do
mt = mobile.PattLen.value_counts(sort=False)
I get
8 225
9 120
2 1234
[...]
What I am trying to do is get the output in 2, 3, 4 ascending order (the left numeric column). Can I change value_counts somehow, or do I need to use a different function?
Answers:
I think you need sort_index, because the left column is called the index. The full command would be mt = mobile.PattLen.value_counts().sort_index(). For example:
mobile = pd.DataFrame({'PattLen':[1,1,2,6,6,7,7,7,7,8]})
print (mobile)
PattLen
0 1
1 1
2 2
3 6
4 6
5 7
6 7
7 7
8 7
9 8
print (mobile.PattLen.value_counts())
7 4
6 2
1 2
8 1
2 1
Name: PattLen, dtype: int64
mt = mobile.PattLen.value_counts().sort_index()
print (mt)
1 2
2 1
6 2
7 4
8 1
Name: PattLen, dtype: int64
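If descending order of the left column is wanted instead, sort_index also accepts ascending=False. A minimal sketch, reusing the sample frame from above (the variable names ascending and descending are only for illustration):

```python
import pandas as pd

mobile = pd.DataFrame({'PattLen': [1, 1, 2, 6, 6, 7, 7, 7, 7, 8]})

# Count first, then order by the index labels (the PattLen values)
# rather than by the counts themselves.
ascending = mobile['PattLen'].value_counts().sort_index()
descending = mobile['PattLen'].value_counts().sort_index(ascending=False)

print(ascending.index.tolist())   # [1, 2, 6, 7, 8]
print(ascending.tolist())         # [2, 1, 2, 4, 1]
print(descending.index.tolist())  # [8, 7, 6, 2, 1]
```

Because sort_index reorders by label, the result does not depend on whether value_counts was called with sort=True or sort=False.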
Use sort_values if you want the horizontal bars ordered from largest to smallest:
df['education'].value_counts().sort_values().plot.barh()
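To see why the ascending sort gives that bar order, here is a small sketch with made-up data (the values in the 'education' column are hypothetical, and the plot call is left as a comment so the snippet runs without matplotlib):

```python
import pandas as pd

# Hypothetical data; only the column name follows the answer above.
df = pd.DataFrame({'education': ['BSc', 'MSc', 'BSc', 'PhD', 'BSc', 'MSc']})

# sort_values() sorts the counts in ascending order by default.
counts = df['education'].value_counts().sort_values()

print(counts.index.tolist())  # ['PhD', 'MSc', 'BSc']
print(counts.tolist())        # [1, 2, 3]

# counts.plot.barh() draws the first row at the bottom of the axis,
# so the largest count ends up as the top bar.
```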
As hinted by normanius’ comment under jezrael’s answer, indexing the counts with df.a.unique() returns them in order of first appearance:
>>> df = pd.DataFrame({"a":[1,1,2,6,6,7,7,7,7,8]})
>>> df.a.value_counts()[df.a.unique()]
1 2
2 1
6 2
7 4
8 1
Name: a, dtype: int64
One can sort in any order by providing a custom index explicitly:
>>> df.a.value_counts()[[8,7,6,2,1]]
8 1
7 4
6 2
2 1
1 2
Name: a, dtype: int64
>>> df.a.value_counts()[[1,8,6,2,7]]
1 2
8 1
6 2
2 1
7 4
Name: a, dtype: int64
This is of particular interest for plotting categorical data:
>>> df.a.value_counts()[['hourly','daily','weekly','monthly']].plot(kind="bar")
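One caveat with list indexing: in recent pandas versions it raises a KeyError if any requested label is missing from the counts. A possible alternative, sketched here with a hypothetical 'freq' column, is reindex with fill_value=0, which keeps the requested order and reports absent categories as zero:

```python
import pandas as pd

# Hypothetical column of frequency labels, used only for illustration.
df = pd.DataFrame({'freq': ['daily', 'weekly', 'daily', 'hourly', 'daily', 'weekly']})
order = ['hourly', 'daily', 'weekly', 'monthly']

# reindex returns the counts in the requested order; 'monthly' never
# occurs in the data, so it is filled with 0 instead of raising KeyError.
counts = df['freq'].value_counts().reindex(order, fill_value=0)

print(counts.index.tolist())  # ['hourly', 'daily', 'weekly', 'monthly']
print(counts.tolist())        # [1, 3, 2, 0]
```

The resulting Series can then be plotted with .plot(kind="bar") in the same way.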
Incidentally, it can also be used to drop some entries or to make others appear several times:
>>> df.a.value_counts()[[1,1,1,8]]
1 2
1 2
1 2
8 1
Name: a, dtype: int64
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# load the 'tips' dataset from seaborn
tips_data = sns.load_dataset('tips')
tips_data['size'].value_counts().sort_index()