Grouping an array of objects by key in Python

Question:

Suppose I have an array of objects.

arr = [
        {'grade': 'A', 'name': 'James'},
        {'grade': 'B', 'name': 'Tom'},
        {'grade': 'A', 'name': 'Zelda'}
      ]

I want this result:

{
   'A': [
            {'grade': 'A', 'name': 'James'},
            {'grade': 'A', 'name': 'Zelda'}
        ],
   'B': [ {'grade': 'B', 'name': 'Tom'} ]
}
Asked By: hendrixchord


Answers:

Using dict.setdefault we can do this:

import json
gradeList = [
    {"grade": 'A', "name": 'James'},
    {"grade": 'B', "name": 'Tom'},
    {"grade": 'A', "name": 'Zelda'}
]
gradeDict = {}
for d in gradeList:
    gradeDict.setdefault(d["grade"], []).append(d)

print(json.dumps(gradeDict, indent=4))

Output:

{
    "A": [
        {
            "grade": "A",
            "name": "James"
        },
        {
            "grade": "A",
            "name": "Zelda"
        }
    ],
    "B": [
        {
            "grade": "B",
            "name": "Tom"
        }
    ]
}
Answered By: Hampus Larsson

Use a dict and setdefault:

setdefault(key[, default])

If key is in the dictionary, return its value. If not, insert key with a value of default and return default. default defaults to None.

arr2 = {}
for d in arr:
    t = arr2.setdefault(d['grade'], [])
    t.append(d)
>>> arr2
{'A': [{'grade': 'A', 'name': 'James'}, {'grade': 'A', 'name': 'Zelda'}],
 'B': [{'grade': 'B', 'name': 'Tom'}]}
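
To make the quoted behaviour concrete, here is a minimal sketch showing that setdefault returns the list stored in the dictionary, so appending to the return value updates the dictionary in place. The names groups, first and second are illustrative and not part of the answer above.

```python
groups = {}

# Key is missing: the empty list is inserted under 'A' and returned.
first = groups.setdefault('A', [])
first.append('James')

# Key is present: the stored list is returned and the fresh [] is discarded.
second = groups.setdefault('A', [])
second.append('Zelda')

print(groups)           # {'A': ['James', 'Zelda']}
print(first is second)  # True
```

Note that the default argument is evaluated on every call, so a new empty list is built even when the key already exists. For small inputs like this one that cost is negligible.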
Answered By: Corralien

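I would use a pd.DataFrame and do it like this:

import pandas as pd
df = pd.DataFrame(arr)  # one row per dict, with columns 'grade' and 'name'
for grade, group in df.groupby('grade'):
    # group is a DataFrame holding the rows that share this grade
    print(group)

Instead of print(group), you can write each group to whatever structure you need; it does not have to be a dict like the one you described. For example, group.to_dict('records') turns a group back into a list of dicts.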

Answered By: qkfsbxjayiedbe

I would do a simple loop like this:

arr = [{'grade': 'A', 'name': 'James'}, {'grade': 'B', 'name': 'Tom'}, {'grade': 'A', 'name': 'Zelda'}]

grouped_grades = {}

for item in arr:
    if item['grade'] not in grouped_grades:
        grouped_grades[item['grade']] = []
        
    grouped_grades[item['grade']].append(item)

print(grouped_grades)

Output:

{'A': [{'grade': 'A', 'name': 'James'}, {'grade': 'A', 'name': 'Zelda'}], 'B': [{'grade': 'B', 'name': 'Tom'}]}
Answered By: sjjk001

I think the easiest way is to use defaultdict. If you need an ordinary dict afterwards, pass the result to the dict constructor: dict(output).

from collections import defaultdict
output = defaultdict(list)  # missing keys start out as an empty list

for item in arr:
    output[item['grade']].append(item)
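
As a sketch of the conversion mentioned above, the snippet below repeats arr from the question so that it runs on its own, then builds a plain dict from the defaultdict. The name result is illustrative, and defaultdict(list) is used as the factory, which behaves the same as a lambda returning [].

```python
from collections import defaultdict

# Input list copied from the question.
arr = [
    {'grade': 'A', 'name': 'James'},
    {'grade': 'B', 'name': 'Tom'},
    {'grade': 'A', 'name': 'Zelda'},
]

output = defaultdict(list)
for item in arr:
    # Accessing a missing key creates an empty list for it automatically.
    output[item['grade']].append(item)

# A plain dict no longer creates entries on lookup of a missing key.
result = dict(output)
print(result)
# {'A': [{'grade': 'A', 'name': 'James'}, {'grade': 'A', 'name': 'Zelda'}], 'B': [{'grade': 'B', 'name': 'Tom'}]}
```

The conversion matters mainly when later code relies on a KeyError for unknown grades: reading output['C'] would silently insert an empty list, while result['C'] raises.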
Answered By: emarcus

You can use itertools.groupby

>>> import itertools
>>> keyfunc = lambda item: item['grade']
>>> {k: list(v) for k, v in itertools.groupby(sorted(arr, key=keyfunc), keyfunc)}
{'A': [{'grade': 'A', 'name': 'James'}, {'grade': 'A', 'name': 'Zelda'}], 'B': [{'grade': 'B', 'name': 'Tom'}]}
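
The sorted call in that expression is not optional: itertools.groupby only merges consecutive items with equal keys. The sketch below, which repeats arr from the question and uses the illustrative names unsorted_keys and grouped, shows what happens with and without sorting.

```python
import itertools

# Input list copied from the question.
arr = [
    {'grade': 'A', 'name': 'James'},
    {'grade': 'B', 'name': 'Tom'},
    {'grade': 'A', 'name': 'Zelda'},
]
keyfunc = lambda item: item['grade']

# Without sorting, 'A' appears as two separate runs.
unsorted_keys = [k for k, _ in itertools.groupby(arr, keyfunc)]
print(unsorted_keys)  # ['A', 'B', 'A']

# Sorting by the same key first makes equal keys adjacent.
grouped = {
    k: list(v)
    for k, v in itertools.groupby(sorted(arr, key=keyfunc), keyfunc)
}
print(grouped)
# {'A': [{'grade': 'A', 'name': 'James'}, {'grade': 'A', 'name': 'Zelda'}], 'B': [{'grade': 'B', 'name': 'Tom'}]}
```

Because sorted is stable, items keep their original relative order within each group. The sort does make this approach O(n log n), whereas the dictionary-based loops in the other answers run in a single pass.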
Answered By: napuzba