Python: How to sum dict values with shared key

Question:

I have JSON-style key-value pairs and need to sum the values of one key across entries that share the same other key-value pair.

For example,

obj=[{'A': 1, 'X': 5}, {'B' : 5, 'X': 2 },{'A': 1, 'X': 8}]

If the A key matches, as above, I would like to sum the X values: 5 + 8 = 13. I expect the duplicate A entry to be removed and only the X values summed, giving output like this:

obj=[{'A': 1, 'X': 13}, {'B' : 5, 'X': 2 }]

I have tried something like the code below, but it is not working.

>>> for i in range(0, len(obj)):
...   for z in range(0, len(obj)):
...     if obj[i] == obj[z]:
...          print(obj[i]['A'])
Asked By: Vasanth M.Vasanth

Answers:

Convert each key-value pair (except for "X") to a tuple, then use that tuple as the key in a new dict that accumulates the values of "X". After that, it’s just reformatting to get the answer.

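# One accumulator per (key, value) pair, ignoring "X"
d = dict.fromkeys(((k, v) for el in obj for k, v in el.items() if k != "X"), 0)
for k, v in d.keys():
    for item in obj:
        # Test membership explicitly so falsy values such as 0 still match
        if k in item and item[k] == v:
            d[(k, v)] += item["X"]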

ans = []
for k, v in d.items():
    curr = {}
    curr[k[0]] = k[1]
    curr["X"] = v
    ans.append(curr)

ans
# [{'A': 1, 'X': 13}, {'B': 5, 'X': 2}]
Answered By: d.b

Here’s what I came up with. It sorts the list by each dict’s first key-value pair, uses itertools.groupby to group on that pair, then builds a new dictionary for each group.

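import itertools

obj=[{'A': 1, 'X': 5}, {'B' : 5, 'X': 2 },{'A': 1, 'X': 8}]
# groupby only merges adjacent items, so sort on the same key first
sorted_list = sorted(obj, key=lambda x: next(iter(x.items())))
res = []
for key,group in itertools.groupby(sorted_list, key=lambda x: next(iter(x.items()))):
    # Copy the first dict of the group, then add the remaining 'X' values to it
    d = next(group).copy()
    for o in group:
        d['X'] += o['X']
    res.append(d)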
Answered By: Johnny Mopp

If it’s a large(ish) dataset, pandas might provide some efficiency gains and avoid some of the nested iteration.

For example:

  • Read the obj list into a DataFrame
  • Iterate over the columns only, skipping 'X'
  • For each column, select the rows where its value is non-null
  • Append a dict containing the column value and the summed 'X' values

import pandas as pd

l = []
df = pd.DataFrame(obj, dtype=object)

for col in df:
    if col == 'X': continue
    # Rows where this column is present, restricted to the column and 'X'
    tmp = df.loc[df[col].notna(), [col, 'X']]
    l.append({col: tmp[col].iloc[0],
              'X': tmp['X'].sum()})

Output:

[{'A': 1, 'X': 13}, {'B': 5, 'X': 2}]
Answered By: S3DEV
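
For comparison, the tuple-as-key idea from the first answer can also be done in a single pass over the list. This is only a minimal sketch and not taken from any of the answers above: the function name sum_x_by_shared_pair and its value_key parameter are hypothetical, and it assumes every dict contains the value key and that all other values are hashable.

```python
from collections import defaultdict


def sum_x_by_shared_pair(records, value_key="X"):
    """Sum value_key across dicts that share the same remaining key-value pairs."""
    totals = defaultdict(int)
    for record in records:
        # Everything except value_key forms the grouping key (must be hashable).
        group = tuple((k, v) for k, v in record.items() if k != value_key)
        totals[group] += record[value_key]
    # Rebuild one dict per group, in order of first appearance.
    return [{**dict(group), value_key: total} for group, total in totals.items()]


obj = [{'A': 1, 'X': 5}, {'B': 5, 'X': 2}, {'A': 1, 'X': 8}]
result = sum_x_by_shared_pair(obj)
print(result)
# [{'A': 1, 'X': 13}, {'B': 5, 'X': 2}]
```

Because each dict is visited once, this avoids both the nested loop and the sort, and falsy values such as 0 are grouped like any other value.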