How to group a list of tuples/objects by similar index/attribute in Python?

Question:

Given a list

old_list = [obj_1, obj_2, obj_3, ...]

I want to create a list:

new_list = [[obj_1, obj_2], [obj_3], ...]

where obj_1.some_attr == obj_2.some_attr.

I could throw some for loops and if checks together, but this is ugly. Is there a Pythonic way to do this? By the way, the attributes of the objects are all strings.

Alternatively, a solution for a list containing tuples (of the same length) instead of objects would be appreciated, too.

Asked By: Aufwind


Answers:

defaultdict is how this is done.

While for loops are largely essential, if statements aren’t.

from collections import defaultdict


groups = defaultdict(list)

for obj in old_list:
    groups[obj.some_attr].append(obj)

new_list = list(groups.values())  # on Python 3, values() returns a view, not a list
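
The same pattern covers the tuple case from the question. A minimal sketch, assuming made-up sample records and a hypothetical key_index that names the position to group by:

```python
from collections import defaultdict

# Hypothetical sample data: (name, category) tuples.
records = [("apple", "fruit"), ("carrot", "vegetable"), ("pear", "fruit")]
key_index = 1  # assumed position of the grouping key

groups = defaultdict(list)
for record in records:
    groups[record[key_index]].append(record)

# Dicts preserve insertion order on Python 3.7+, so the groups appear
# in order of the first occurrence of each key.
grouped = list(groups.values())
print(grouped)  # [[('apple', 'fruit'), ('pear', 'fruit')], [('carrot', 'vegetable')]]
```

No sorting is needed here, and the items in each group keep their original relative order.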
Answered By: S.Lott

I think you can also use itertools.groupby. Note that the code below is just a sample and should be adapted to your needs:

data = [[1,2,3],[3,2,3],[1,1,1],[7,8,9],[7,7,9]]

from itertools import groupby

# For example, to group the data by the third element of each item:
res = [list(v) for _, v in groupby(sorted(data, key=lambda x: x[2]), key=lambda x: x[2])]
Answered By: Artsiom Rudzenka

Here are two cases. Both require the following imports:

import itertools
import operator

You’ll be using itertools.groupby and either operator.attrgetter or operator.itemgetter.

For a situation where you’re grouping by obj_1.some_attr == obj_2.some_attr:

get_attr = operator.attrgetter('some_attr')
new_list = [list(g) for k, g in itertools.groupby(sorted(old_list, key=get_attr), get_attr)]

For tuples, where you’re grouping by a[some_index] == b[some_index]:

get_item = operator.itemgetter(some_index)
new_list = [list(g) for k, g in itertools.groupby(sorted(old_list, key=get_item), get_item)]

Note that you need the sorting because itertools.groupby starts a new group every time the value of the key changes, so items with equal keys that are not adjacent would end up in separate groups.
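
To see why the sort matters, here is a small sketch with made-up tuples (the data and variable names are only for illustration) comparing groupby on unsorted and sorted input:

```python
import itertools
import operator

# Hypothetical sample data; the first element is the grouping key.
pairs = [("a", 1), ("b", 2), ("a", 3)]
get_key = operator.itemgetter(0)

# Without sorting, the two "a" items are not adjacent,
# so they land in separate groups.
unsorted_groups = [list(g) for _, g in itertools.groupby(pairs, get_key)]

# sorted() is stable, so items with equal keys keep their original order.
sorted_groups = [
    list(g) for _, g in itertools.groupby(sorted(pairs, key=get_key), get_key)
]

print(unsorted_groups)  # [[('a', 1)], [('b', 2)], [('a', 3)]]
print(sorted_groups)    # [[('a', 1), ('a', 3)], [('b', 2)]]
```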


Note that you can use this to create a dict like the one in S.Lott’s answer, without having to use collections.defaultdict. Each group has to be copied into a list, because the group iterators share the underlying iterator with groupby and are no longer usable once groupby advances to the next group.

Using a dictionary comprehension (works in Python 2.7 and Python 3):

groupdict = {k: list(g) for k, g in itertools.groupby(sorted_list, keyfunction)}

For earlier versions of Python:

groupdict = dict((k, list(g)) for k, g in itertools.groupby(sorted_list, keyfunction))
Answered By: JAB