Sorting a defaultdict by value in python
Question:
I have a data structure that looks something like this.
The populations of three cities in different years are as follows:
Name 1990 2000 2010
A 10 20 30
B 20 30 10
C 30 10 20
I am using a defaultdict to store the data.
from collections import defaultdict
cityPopulation=defaultdict(list)
cityPopulation['A']=[10,20,30]
cityPopulation['B']=[20,30,10]
cityPopulation['C']=[30,10,20]
I want to sort the defaultdict based on a particular column of the list (the year).
Say, sorting for 1990 should give C, B, A, while sorting for 2010 should give A, C, B.
Also, is this the best way to store the data? As I am changing the population values, I want it to be mutable.
Answers:
A defaultdict does not keep its items in sorted order (since Python 3.7 it only remembers insertion order). You might need to use an OrderedDict, or sort the keys each time as a list.
E.g.:
from collections import OrderedDict
# sorted() on the items orders them by key, i.e. by city name
sorted_city_pop = OrderedDict(sorted(cityPopulation.items()))
Edit: If you just want to print the items in order, simply use the sorted builtin:
for key, value in sorted(cityPopulation.items()):
    print(key, value)
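To make the "sort the keys each time" option concrete, here is a minimal sketch using the data from the question. The variable name cities_1990 is made up for illustration, and column 0 is assumed to be 1990, as in the table. Because the lists stay mutable, the order has to be recomputed after a population changes:

```python
from collections import defaultdict

cityPopulation = defaultdict(list)
cityPopulation['A'] = [10, 20, 30]
cityPopulation['B'] = [20, 30, 10]
cityPopulation['C'] = [30, 10, 20]

# Sort only the keys by the 1990 column (index 0), largest first.
cities_1990 = sorted(cityPopulation, key=lambda city: cityPopulation[city][0], reverse=True)
print(cities_1990)  # ['C', 'B', 'A']

# The stored lists are still mutable; after an update, sort again.
cityPopulation['A'][0] = 40
cities_1990 = sorted(cityPopulation, key=lambda city: cityPopulation[city][0], reverse=True)
print(cities_1990)  # ['A', 'C', 'B']
```

This keeps the defaultdict as the single source of truth and treats the sorted list as a throwaway view.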
>>> sorted(cityPopulation.iteritems(),key=lambda (k,v): v[0],reverse=True) #1990
[('C', [30, 10, 20]), ('B', [20, 30, 10]), ('A', [10, 20, 30])]
>>> sorted(cityPopulation.iteritems(),key=lambda (k,v): v[2],reverse=True) #2010
[('A', [10, 20, 30]), ('C', [30, 10, 20]), ('B', [20, 30, 10])]
Note that the two calls above are Python 2 code. In Python 3, lambdas can no longer unpack tuple arguments automatically and iteritems() no longer exists, so you would have to change the code to:
sorted(cityPopulation.items(), key=lambda k_v: k_v[1][2], reverse=True) #2010
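If you would rather pass the year than remember its list index, something along these lines should work in Python 3. This is only a sketch: YEAR_COLUMN and sort_by_year are hypothetical names, not part of the question, and the mapping assumes the columns are 1990, 2000, 2010 in that order.

```python
from collections import defaultdict

# Hypothetical mapping from year to its position in each city's list.
YEAR_COLUMN = {1990: 0, 2000: 1, 2010: 2}

cityPopulation = defaultdict(list)
cityPopulation['A'] = [10, 20, 30]
cityPopulation['B'] = [20, 30, 10]
cityPopulation['C'] = [30, 10, 20]

def sort_by_year(populations, year):
    """Return (city, populations) pairs, largest population in `year` first."""
    column = YEAR_COLUMN[year]
    return sorted(populations.items(), key=lambda kv: kv[1][column], reverse=True)

print([city for city, _ in sort_by_year(cityPopulation, 1990)])  # ['C', 'B', 'A']
print([city for city, _ in sort_by_year(cityPopulation, 2010)])  # ['A', 'C', 'B']
```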
If you want to sort based on the values, not on the keys, use data.items() and set the key with lambda kv: kv[1] so that it picks the value.
See an example with this defaultdict:
>>> from collections import defaultdict
>>> data = defaultdict(int)
>>> data['ciao'] = 17
>>> data['bye'] = 14
>>> data['hello'] = 23
>>> data
defaultdict(<type 'int'>, {'ciao': 17, 'bye': 14, 'hello': 23})
Now, let’s sort by value:
>>> sorted(data.items(), key=lambda kv: kv[1])
[('bye', 14), ('ciao', 17), ('hello', 23)]
Finally, use reverse=True if you want the bigger numbers to come first:
>>> sorted(data.items(), key=lambda kv: kv[1], reverse=True)
[('hello', 23), ('ciao', 17), ('bye', 14)]
Note that key=lambda (k,v): v is a clearer (to me) way to say key=lambda kv: kv[1], only that the latter is the only form Python 3 allows, since automatic tuple unpacking in lambda arguments is not available there.
In Python 2 you could say:
>>> sorted(data.items(), key=lambda (k,v): v)
[('bye', 14), ('ciao', 17), ('hello', 23)]
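As an aside, operator.itemgetter avoids the lambda question entirely and behaves the same in Python 2 and Python 3. A small sketch with the same data (by_value and by_value_desc are just illustrative names):

```python
from collections import defaultdict
from operator import itemgetter

data = defaultdict(int)
data['ciao'] = 17
data['bye'] = 14
data['hello'] = 23

# itemgetter(1) picks the value out of each (key, value) pair.
by_value = sorted(data.items(), key=itemgetter(1))
print(by_value)  # [('bye', 14), ('ciao', 17), ('hello', 23)]

by_value_desc = sorted(data.items(), key=itemgetter(1), reverse=True)
print(by_value_desc)  # [('hello', 23), ('ciao', 17), ('bye', 14)]
```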
Late answer, and not a direct answer to the question, but if you end up here from a "Sorting a defaultdict by value in python" Google search, this is how I sort a dictionary by its values (Python dictionaries cannot be sorted in place, but their items can be sorted and printed in order; the same code works for a defaultdict):
orders = {
    'cappuccino': 54,
    'latte': 56,
    'espresso': 72,
    'americano': 48,
    'cortado': 41
}
sort_orders = sorted(orders.items(), key=lambda x: x[1], reverse=True)
for i in sort_orders:
    print(i[0], i[1])
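If you need a dictionary back rather than a list of pairs, one option on Python 3.7 or newer (where plain dicts preserve insertion order) is to rebuild a dict from the sorted items. A rough sketch; sorted_orders is an illustrative name:

```python
orders = {
    'cappuccino': 54,
    'latte': 56,
    'espresso': 72,
    'americano': 48,
    'cortado': 41
}

# Rebuild a dict from the pairs sorted by value, largest first.
# Relies on dicts keeping insertion order (guaranteed from Python 3.7).
sorted_orders = dict(sorted(orders.items(), key=lambda item: item[1], reverse=True))
print(list(sorted_orders))  # ['espresso', 'latte', 'cappuccino', 'americano', 'cortado']
```

The new dict is a snapshot: entries added later go to the end, so it has to be rebuilt whenever the counts change.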