Create a dictionary from a list of dictionaries in the fastest and most scalable way

Question:

I have a few requirements for creating a new dictionary:

  1. keep only those dictionaries in the list whose ‘total’ value is not zero
  2. drop keys such as ‘total’ and ‘rank’ from each dictionary
  3. use the ‘name’ value as the key and collect the ‘game’ values in a
    list as the value in the new dict
  4. sort each list of values in the new dict

My code is:

# input: a list of dictionaries
data = [
    {'name': 'foo', 'rank': 3, 'game': 'football', 'total': 1},
    {'name': 'bar', 'rank': 5, 'game': 'hockey', 'total': 0},
    {'name': 'foo', 'rank': 7, 'game': 'tennis', 'total': 0},
    {'name': 'foo', 'rank': 2, 'game': 'cricket', 'total': 2},
    {'name': 'bar', 'rank': 1, 'game': 'cricket', 'total': 8},
]

result_list = []
merged_data = {}
result_data = {}

# Get the list of dict if key 'total' value is not zero
dict_without_total = [
    den for den in data if den.get('total')
]

for my_dict in dict_without_total:

    # delete keys 'rank' and 'total' from the dict
    del my_dict['rank']
    del my_dict['total']

    result_data.update({
        my_dict.get('name'): (my_dict.get('game'))
    })
    result_list.append(result_data)

# store all values of same keys in list and sort the values list
for result in result_list:
    for keys, values in result.items():
        if keys not in merged_data:
            merged_data[keys] = []

        merged_data[keys].append(values)
        merged_data[keys].sort()

print(merged_data)

Output of my code:

{
    'bar': ['cricket', 'cricket', 'cricket'],
    'foo': ['cricket', 'cricket', 'cricket']
}

Expected result:

{
   'foo': ['cricket', 'football'],
   'bar': ['cricket']
}

Is there a faster way to get the result, or can I use some Python built-in function to handle this scenario?

Asked By: Zaheer Jan


Answers:

You can try:

data = [
    {'name': 'foo', 'rank': 3, 'game': 'football', 'total': 1},
    {'name': 'bar', 'rank': 5, 'game': 'hockey', 'total': 0},
    {'name': 'foo', 'rank': 7, 'game': 'tennis', 'total': 0},
    {'name': 'foo', 'rank': 2, 'game': 'cricket', 'total': 2},
    {'name': 'bar', 'rank': 1, 'game': 'cricket', 'total': 8},
]

final_dict = {}
for single_data in data:
    # Keep only records whose 'total' is positive
    if single_data['total'] > 0:
        if single_data['name'] in final_dict:
            final_dict[single_data['name']].append(single_data['game'])
        else:
            final_dict[single_data['name']] = [single_data['game']]

print(final_dict)

Output:

{'foo': ['football', 'cricket'], 'bar': ['cricket']}

Answered By: Harsha Biyani

You could really simplify this as there is no need to modify the existing dictionary. It’s usually a lot cleaner to leave the original data structure alone and build a new one as the result.

data = [
    {'name': 'foo', 'rank': 3, 'game': 'football', 'total': 1},
    {'name': 'bar', 'rank': 5, 'game': 'hockey', 'total': 0},
    {'name': 'foo', 'rank': 7, 'game': 'tennis', 'total': 0},
    {'name': 'foo', 'rank': 2, 'game': 'cricket', 'total': 2},
    {'name': 'bar', 'rank': 1, 'game': 'cricket', 'total': 8},
]

result = {}

for e in data:
    if e["total"]:
        name = e["name"]
        if name not in result:
            result[name] = []
        result[name].append(e["game"])

print(result)

The result is {'foo': ['football', 'cricket'], 'bar': ['cricket']}, which groups the games the way you’re looking for; the lists are in input order rather than sorted.
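
To also cover the sorting requirement from the question, the same single-pass loop can be followed by a sorting pass. This is only a minimal sketch, assuming Python 3.7 or later; the helper name games_by_name is hypothetical and not part of the answer above, and dict.setdefault stands in for the explicit membership check:

```python
def games_by_name(records):
    """Group 'game' values by 'name' for records whose 'total' is non-zero."""
    result = {}
    for record in records:
        if record["total"]:
            # setdefault returns the existing list for this name,
            # or stores a new empty list and returns it
            result.setdefault(record["name"], []).append(record["game"])
    # Sort each list once after grouping instead of after every append
    for games in result.values():
        games.sort()
    return result


data = [
    {'name': 'foo', 'rank': 3, 'game': 'football', 'total': 1},
    {'name': 'bar', 'rank': 5, 'game': 'hockey', 'total': 0},
    {'name': 'foo', 'rank': 7, 'game': 'tennis', 'total': 0},
    {'name': 'foo', 'rank': 2, 'game': 'cricket', 'total': 2},
    {'name': 'bar', 'rank': 1, 'game': 'cricket', 'total': 8},
]

print(games_by_name(data))
# {'foo': ['cricket', 'football'], 'bar': ['cricket']}
```

Because nothing is deleted from the input records, data is left unchanged and can be reused afterwards.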

Answered By: CadentOrange

If I understand your requirements well, this should do it:

names = set(x['name'] for x in data)
result = {name: sorted(set(x['game'] for x in data
                           if x['total'] > 0 and x['name'] == name))
          for name in names}

Answered By: Julien

In addition to the other answers: if you reinitialize result_data = {} inside the for my_dict in dict_without_total: loop, your original code works fine.

for my_dict in dict_without_total:
    result_data = {}
    # ....rest of the code...

The issue is that result_data is never reinitialized, so every entry appended to result_list refers to the same dictionary.
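
The effect of reusing one dictionary across iterations can be reproduced in isolation. A minimal sketch, assuming Python 3.7 or later; the names pairs, shared, aliased and fresh_list are hypothetical and not taken from the question's code:

```python
pairs = [("foo", "football"), ("bar", "cricket")]

# One dict created outside the loop: every append stores a reference
# to the same object, so later updates show up in every list entry.
shared = {}
aliased = []
for name, game in pairs:
    shared.update({name: game})
    aliased.append(shared)

# A new dict created inside the loop: each list entry is independent.
fresh_list = []
for name, game in pairs:
    fresh = {}
    fresh.update({name: game})
    fresh_list.append(fresh)

print(aliased[0] is aliased[1])
# True
print(aliased)
# [{'foo': 'football', 'bar': 'cricket'}, {'foo': 'football', 'bar': 'cricket'}]
print(fresh_list)
# [{'foo': 'football'}, {'bar': 'cricket'}]
```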

Answered By: ham

Another solution:

To create the dictionary you want:

from collections import defaultdict

d2 = defaultdict(set)
# Collect the unique games per name, skipping records with a zero total
for d in data:
    if d["total"] > 0:
        d2[d["name"]].add(d["game"])

To sort the values:

for key in d2:
    d2[key] = sorted(d2[key])

Answered By: Javier

You can also go for pandas (alternative approach):

import pandas as pd

df = pd.DataFrame([i for i in data if i['total']])

{k: g['game'].tolist() for k, g in df.groupby('name')}
# Output: {'bar': ['cricket'], 'foo': ['football', 'cricket']}

Answered By: Colonel Beauvel

It has been a while since the last response here, but I would like to add mine too.

One important thing to take into account when solving problems (in any language) is the concepts and abstractions involved in your problem. Then look for constructs or statements in your language of choice (Python in this case) that match those concepts.

Having said that, you’re trying to filter the records by their total value and then build a collection of unique games for each name.

list of unique elements == set

So we are talking about filtering and the set data structure:

from collections import defaultdict

# defaultdict comes in handy here: a missing key starts as an empty set
result = defaultdict(set)

# Iterate over the records that pass the filter.
for record in filter(lambda x: x["total"] > 0, data):
    # Add the game to the set; if the game is already there,
    # nothing happens.
    result[record["name"]].add(record["game"])
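
The loop above leaves each value as a set, which has no defined order. One way to arrive at the sorted lists shown in the question's expected result is sketched below, assuming Python 3.7 or later; the names grouped and final are illustrative only:

```python
from collections import defaultdict

data = [
    {'name': 'foo', 'rank': 3, 'game': 'football', 'total': 1},
    {'name': 'bar', 'rank': 5, 'game': 'hockey', 'total': 0},
    {'name': 'foo', 'rank': 7, 'game': 'tennis', 'total': 0},
    {'name': 'foo', 'rank': 2, 'game': 'cricket', 'total': 2},
    {'name': 'bar', 'rank': 1, 'game': 'cricket', 'total': 8},
]

# Group unique games per name, keeping only records with a positive total
grouped = defaultdict(set)
for record in data:
    if record["total"] > 0:
        grouped[record["name"]].add(record["game"])

# sorted() accepts any iterable and returns a list, so no list() call is
# needed; the comprehension also yields a plain dict, not a defaultdict
final = {name: sorted(games) for name, games in grouped.items()}

print(final)
# {'foo': ['cricket', 'football'], 'bar': ['cricket']}
```

Converting to a plain dict at the end also means that looking up a missing name raises KeyError instead of silently creating an empty entry.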
Answered By: Raydel Miranda