Grouping items in lists based on keys

Question:

Given an iterable of (key, value) pairs, return a dict that maps each key to a list of all values seen for that key, including duplicates.

Example:

Input: [
    ('germany', 'john'), 
    ('finland', 'olavi'), 
    ('france', 'alice'), 
    ('germany', 'gerd'),
    ('germany', 'john')
]

Output: {
    'germany': ['john', 'gerd', 'john'], 
    'finland': ['olavi'], 
    'france': ['alice']
}

I am looking for some elegant solutions. I also posted what I had in mind.

Asked By: Chris

Answers:

This is just one of the many possible solutions.

input_data = [
    ('germany', 'john'), 
    ('finland', 'olavi'), 
    ('france', 'alice'), 
    ('germany', 'gerd'),
    ('germany', 'john')
]

output_data = {}
for k, v in input_data:
    output_data[k] = output_data.get(k, []) + [v]
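One thing to be aware of with the `get(k, []) + [v]` line: `+` builds a new list each time instead of extending the one already stored. For small inputs this makes no practical difference. A minimal sketch of the behaviour (the names `grouped`, `first_list` and `second_list` are only for illustration):

```python
grouped = {}

# First value for the key: get() returns the default [] and + builds a new list.
grouped['germany'] = grouped.get('germany', []) + ['john']
first_list = grouped['germany']

# Second value: + builds another new list and the dict entry is rebound to it.
grouped['germany'] = grouped.get('germany', []) + ['gerd']
second_list = grouped['germany']

same_object = first_list is second_list
print(same_object)   # False: the old list was copied, not extended
print(first_list)    # ['john']
print(second_list)   # ['john', 'gerd']
```

With long value lists the repeated copying adds up, which is one reason to prefer an in-place `append`.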
Answered By: Jobo Fernandez

input_data = [
    ('germany', 'john'), 
    ('finland', 'olavi'), 
    ('france', 'alice'), 
    ('germany', 'gerd'),
    ('germany', 'john')
]
# Create the unique keys, each mapped to an empty list
output = {key: [] for key in dict.fromkeys(i[0] for i in input_data)}
# Fill the lists with the values for the corresponding keys
for key, value in input_data:
    output[key].append(value)
print(output)
Answered By: jack

Another variant:

given = [
    ('germany', 'john'), 
    ('finland', 'olavi'), 
    ('france', 'alice'), 
    ('germany', 'gerd'),
    ('germany', 'john')
]
    
result = dict()
for k, v in given:
    try:
        result[k].append(v)
    except KeyError:
        result[k] = [v]

Edit: Picking up the suggestion from the comments. It’s one line shorter and perhaps the easiest to read of all the variants:

result = dict()
for k, v in given:
    if k not in result:
        result[k] = []
    result[k].append(v)
Answered By: Robert Haas

Hope this is useful.

input_data = [
    ('germany', 'john'),
    ('finland', 'olavi'),
    ('france', 'alice'),
    ('germany', 'gerd'),
    ('germany', 'john')
]

final_dict = {}
key = []  # keys already seen
for inp in input_data:
    if inp[0] not in key:
        # First occurrence: remember the key and start its list
        key.append(inp[0])
        final_dict[inp[0]] = [inp[1]]
    else:
        final_dict[inp[0]].append(inp[1])
Answered By: Wwe Cena

input_data = [
    ('germany', 'john'), 
    ('finland', 'olavi'), 
    ('france', 'alice'), 
    ('germany', 'gerd'),
    ('germany', 'john')
]

output_data = {}
for k, v in input_data:
    output_data[k] = output_data.get(k, []) + [v]

Answered By: Syed Ibtehaj Ali

Alternatively, you can try this – using dict.setdefault:


data = [
    ('germany', 'john'), 
    ('finland', 'olavi'), 
    ('france', 'alice'), 
    ('germany', 'gerd'),
    ('germany', 'john')
]

groups = {}

for country, name in data:
    groups.setdefault(country, []).append(name)

print(groups)

Output:

{'germany': ['john', 'gerd', 'john'], 'finland': ['olavi'], 'france': ['alice']}
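In case the one-liner looks too magical, here is a minimal sketch of what `setdefault` does on a missing key versus an existing one (the names `first` and `second` are only for illustration):

```python
groups = {}

# Key missing: setdefault stores the default list under the key and returns it.
first = groups.setdefault('germany', [])
first.append('john')

# Key present: the new default is ignored and the stored list is returned.
second = groups.setdefault('germany', [])
second.append('gerd')

print(first is second)  # True: both names refer to the list stored in the dict
print(groups)           # {'germany': ['john', 'gerd']}
```

Because the returned list is the one held by the dict, appending to it updates the group in place.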
Answered By: Daniel Hao

A good way is to use collections.defaultdict here:

import collections
from typing import Iterable, Tuple, Dict, List

def group_data(matches: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    res = collections.defaultdict(list)
    for key, value in matches:
        res[key].append(value)
    return dict(res)

Testing

input_data = [
    ('germany', 'john'), 
    ('finland', 'olavi'), 
    ('france', 'alice'), 
    ('germany', 'gerd'),
    ('germany', 'john')
]

print(group_data(input_data))

Result

{'germany': ['john', 'gerd', 'john'], 'finland': ['olavi'], 'france': ['alice']}
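A note on the `dict(res)` at the end of `group_data`: a likely reason to convert is that a `defaultdict` silently creates an entry whenever a missing key is looked up, which callers may not expect. A small sketch of that difference (the keys `'spain'` and `'italy'` are made up for the demonstration):

```python
import collections

res = collections.defaultdict(list)
res['germany'].append('john')

# Merely looking up a missing key on a defaultdict creates an empty entry.
res['spain']
print(dict(res))  # {'germany': ['john'], 'spain': []}

# A plain dict copy raises KeyError instead of growing on lookup.
plain = dict(res)
try:
    plain['italy']
    raised = False
except KeyError:
    raised = True
print(raised)  # True
```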
Answered By: Chris