Grouping items in lists based on keys
Question:
Given an iterable of (key, value) pairs, return a dict that maps each key to a list of all values for that key, including duplicates.
Example:
Input: [
('germany', 'john'),
('finland', 'olavi'),
('france', 'alice'),
('germany', 'gerd'),
('germany', 'john')
]
Output: {
'germany': ['john', 'gerd', 'john'],
'finland': ['olavi'],
'france': ['alice']
}
I am looking for some elegant solutions. I also posted what I had in mind.
Answers:
This is just one of the many possible solutions.
input_data = [
('germany', 'john'),
('finland', 'olavi'),
('france', 'alice'),
('germany', 'gerd'),
('germany', 'john')
]
output_data = {}
for k, v in input_data:
    output_data[k] = output_data.get(k, []) + [v]
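A side note on this variant: `output_data.get(k, []) + [v]` builds a new list for every pair instead of extending the one already stored. That is fine for small inputs but copies more as the groups grow. Here is a rough sketch comparing it with an in-place append; the helper names `group_with_concat` and `group_with_append` are made up for illustration:

```python
def group_with_concat(pairs):
    """Group values by key, building a new list for every pair."""
    out = {}
    for k, v in pairs:
        out[k] = out.get(k, []) + [v]  # allocates a fresh list each time
    return out


def group_with_append(pairs):
    """Group values by key, appending to the stored list in place."""
    out = {}
    for k, v in pairs:
        if k not in out:
            out[k] = []
        out[k].append(v)  # mutates the list already stored under k
    return out


sample = [('germany', 'john'), ('finland', 'olavi'), ('germany', 'gerd')]
print(group_with_concat(sample) == group_with_append(sample))  # True
```

Both produce the same dict, so the choice is mostly about readability and input size.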
input_data=[
('germany', 'john'),
('finland', 'olavi'),
('france', 'alice'),
('germany', 'gerd'),
('germany', 'john')
]
# Create the unique keys, each mapped to its own empty list
output = {key: [] for key in dict.fromkeys([i[0] for i in input_data])}
# Fill the lists under the corresponding keys
for key, value in input_data:
    output[key].append(value)
print(output)
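It may be worth spelling out why the empty lists come from a comprehension rather than from `dict.fromkeys(keys, [])`: the latter stores the same list object under every key. A minimal sketch of the difference (the names `shared` and `separate` are only illustrative):

```python
keys = ['germany', 'finland']

# Pitfall: every key shares the one list object passed to fromkeys.
shared = dict.fromkeys(keys, [])
shared['germany'].append('john')
print(shared)  # {'germany': ['john'], 'finland': ['john']}

# The comprehension evaluates [] once per key, so each list is independent.
separate = {key: [] for key in keys}
separate['germany'].append('john')
print(separate)  # {'germany': ['john'], 'finland': []}
```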
Another variant:
given = [
('germany', 'john'),
('finland', 'olavi'),
('france', 'alice'),
('germany', 'gerd'),
('germany', 'john')
]
result = dict()
for k, v in given:
    try:
        result[k].append(v)
    except KeyError:
        result[k] = [v]
Edit: Picking up the suggestion in the comments. It’s one line shorter and perhaps the easiest to read of all the variants:
result = dict()
for k, v in given:
    if k not in result:
        result[k] = []
    result[k].append(v)
Hope this is useful.
input_data=[
('germany', 'john'),
('finland', 'olavi'),
('france', 'alice'),
('germany', 'gerd'),
('germany', 'john')
]
final_dict = {}
key = []
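for inp in input_data:  # iterate over input_data, not the built-in input()
    if inp[0] not in key:
        # first time this key is seen: remember it and start its list
        key.append(inp[0])
        final_dict[inp[0]] = [inp[1]]
    else:
        final_dict[inp[0]].append(inp[1])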
Alternatively, you can try this – using dict.setdefault:
data= [
('germany', 'john'),
('finland', 'olavi'),
('france', 'alice'),
('germany', 'gerd'),
('germany', 'john')
]
groups = {}
for country, name in data:
    groups.setdefault(country, []).append(name)
print(groups)
Output:
{'germany': ['john', 'gerd', 'john'], 'finland': ['olavi'], 'france': ['alice']}
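For anyone unfamiliar with `dict.setdefault`: it returns the value stored under the key if there is one, and otherwise stores the given default and returns that. A small sketch of that behaviour (the variable names `first` and `second` are only for illustration):

```python
groups = {}

# Key is missing: the empty list is stored under 'germany' and returned.
first = groups.setdefault('germany', [])
first.append('john')

# Key exists: the stored list is returned and the new default is discarded.
second = groups.setdefault('germany', [])
second.append('gerd')

print(first is second)  # True: both names refer to the same list
print(groups)           # {'germany': ['john', 'gerd']}
```

This is why the one-liner in the loop works: the `.append` always lands on the list held by the dict.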
A good way is to use collections.defaultdict here:
import collections
from typing import Iterable, Tuple, Dict, List
def group_data(matches: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    res = collections.defaultdict(list)
    for key, value in matches:
        res[key].append(value)
    return dict(res)
Testing
input_data = [
('germany', 'john'),
('finland', 'olavi'),
('france', 'alice'),
('germany', 'gerd'),
('germany', 'john')
]
print(group_data(input_data))
Result
{'germany': ['john', 'gerd', 'john'], 'finland': ['olavi'], 'france': ['alice']}
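One plausible reason to return `dict(res)` instead of the defaultdict itself is that merely looking up a missing key on a defaultdict inserts it. A short sketch of that behaviour, assuming callers expect plain-dict semantics (the keys 'spain' and 'italy' are just sample values):

```python
import collections

res = collections.defaultdict(list)
res['germany'].append('john')

# Indexing a missing key on a defaultdict silently creates an empty entry.
_ = res['spain']
print(dict(res))  # {'germany': ['john'], 'spain': []}

# A plain dict copy raises KeyError instead of growing.
plain = dict(res)
try:
    plain['italy']
except KeyError:
    print('italy is not in the plain dict')
```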