Create new dictionaries based on names of keys of another dictionary

Question:

I have a dictionary "A":

A = {
    "Industry1": 1,
    "Industry2": 1,
    "Industry3": 1,
    "Customer1": 1,
    "Customer2": 1,
    "LocalShop1": 1,
    "LocalShop2": 1,
}

I want to group the entries by key name and create a new dictionary for each "category". The dictionary names should be generated automatically.

Expected Output:

Industry = {
    "Industry1": 1,
    "Industry2": 1,
    "Industry3": 1,
}
Customer = {
    "Customer1": 1,
    "Customer2": 1,
}
LocalShop = {
    "LocalShop1": 1,
    "LocalShop2": 1,
}

Can you guys give me a hint to achieve this output, please?

Asked By: Hapzek

Answers:

For the purposes of this answer I’ve assumed that every key in the dictionary A contains the word "Industry", "Customer", or "Shop". This lets us detect which category each entry belongs to by checking whether the key contains a certain substring (e.g. "Industry"). If this assumption doesn’t hold for your specific circumstances, you’ll have to write the if/elif statements in the solutions below in a way that fits your situation better.

Here’s one way to do it. You make a new dictionary for each category and check if "Industry", "Customer", or "Shop" is in each key.

industries = {}
customers = {}
shops = {}

for key, value in A.items():
    if "Industry" in key:
        industries[key] = value
    elif "Customer" in key:
        customers[key] = value
    elif "Shop" in key:
        shops[key] = value

A cleaner version uses a nested dictionary that stores all of your categories, with each category holding its own dictionary inside the main one. This helps if you need to add more categories later: you only add them in one place (the dictionary definition) and the loop adjusts automatically.

categories = {
    "Industry": {},
    "Customer": {},
    "Shop": {},
}

for key, value in A.items():
    for category_name, category_dict in categories.items():
        if category_name in key:
            category_dict[key] = value

If you can’t detect the category from the string of an entry, then you may have to store that categorical information in the key or the value of each entry in A, so that you can detect the category when trying to filter everything.
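As a rough sketch of that last idea, each value could carry its category explicitly. The (category, number) tuple layout, the name A_tagged, and the "CornerStore" key are all made up for illustration and are not part of the original question:

```python
from collections import defaultdict

# Hypothetical layout: each value is a (category, number) tuple
# instead of a bare number, so the key no longer has to reveal the category.
A_tagged = {
    "Industry1": ("Industry", 1),
    "Customer1": ("Customer", 1),
    "CornerStore": ("Shop", 1),  # the key alone would not match "Shop"
}

categories = defaultdict(dict)
for key, (category, value) in A_tagged.items():
    # File each entry under the category stored alongside its value.
    categories[category][key] = value

categories = dict(categories)
print(categories)
```

With this layout no substring checks are needed at all, at the cost of changing how A is built.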

Answered By: Raddude

You can use itertools.groupby with a key function that extracts only the word without the number. I wouldn’t recommend creating a separate variable for each group, as this doesn’t scale if there are more than 3 categories. Just put them in a new dictionary or in a list.

import itertools
import re

A = {'Industry1': 1,
 'Industry2': 1,
 'Industry3': 1,
 'Customer1': 1,
 'Customer2': 1,
 'LocalShop1': 1,
 'LocalShop2': 1}

grouped = [dict(val) for k, val in itertools.groupby(A.items(), lambda x: re.match(r'(.+)\d{1,}', x[0]).group(1))]

Output grouped:

[
    {'Industry1': 1, 'Industry2': 1, 'Industry3': 1}, 
    {'Customer1': 1, 'Customer2': 1}, 
    {'LocalShop1': 1, 'LocalShop2': 1}
]
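One thing to keep in mind: itertools.groupby starts a new group every time the key changes, so it only works as shown because equal prefixes sit next to each other in A. If that is not guaranteed for your data, a minimal sketch is to sort by the same key function first. The helper name prefix and the shuffled sample data are made up for illustration, and the lazy pattern used here is my own variation that also tolerates multi-digit or missing number suffixes:

```python
import itertools
import re

def prefix(item):
    """Return the key with its trailing digits removed, e.g. 'Industry1' -> 'Industry'."""
    key, _value = item
    return re.match(r"(.+?)\d*$", key).group(1)

# Hypothetical input where entries of the same category are NOT adjacent.
shuffled = {
    "Industry1": 1,
    "Customer1": 1,
    "Industry2": 1,
    "LocalShop1": 1,
    "Customer2": 1,
}

# Sorting by the same key function puts equal prefixes next to each other,
# which is what groupby needs.
ordered = sorted(shuffled.items(), key=prefix)
grouped_sorted = {name: dict(items) for name, items in itertools.groupby(ordered, key=prefix)}
print(grouped_sorted)
```

Without the sort, a dictionary comprehension over this input would see the "Industry" prefix twice, and the later group would overwrite the earlier one.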

If you are sure that there are exactly 3 elements in that list and you really want them as variables, you can use tuple unpacking:

Industry, Customer, LocalShop = [dict(val) for k, val in itertools.groupby(A.items(), lambda x: re.match(r'(.+)\d{1,}', x[0]).group(1))]

I think I would save the results in a new dictionary, with the group name as the key and the grouped dictionary as the value:

grouped_dict = {k: dict(val) for k, val in itertools.groupby(A.items(), lambda x: re.match(r'(.+)\d{1,}', x[0]).group(1))}

Output grouped_dict:

{'Industry': {'Industry1': 1, 'Industry2': 1, 'Industry3': 1},
 'Customer': {'Customer1': 1, 'Customer2': 1},
 'LocalShop': {'LocalShop1': 1, 'LocalShop2': 1}}

Answered By: Rabinzel

Assuming your keys are in the form (KEYNAME)(NUM), you can do the following:

import re
from collections import defaultdict
from pprint import pprint

A = {
    "Industry1": 1,
    "Industry2": 1,
    "Industry3": 1,
    "Customer1": 1,
    "Customer2": 1,
    "LocalShop1": 1,
    "LocalShop2": 1,
}

key_pattern = re.compile(r"[a-zA-Z]+")

result = defaultdict(dict)
for k, v in A.items():
    key = key_pattern.search(k).group()
    result[key][k] = v

pprint(dict(result))

output:

{'Customer': {'Customer1': 1, 'Customer2': 1},
 'Industry': {'Industry1': 1, 'Industry2': 1, 'Industry3': 1},
 'LocalShop': {'LocalShop1': 1, 'LocalShop2': 1}}

I created a dictionary of dictionaries instead of having individual variables for each dictionary. It’s easier to manage and it doesn’t pollute the global namespace.

Basically, you iterate through the key-value pairs and use the r"[a-zA-Z]+" pattern to grab the part of each key without the number. That part is used as the key in the outer dictionary.
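Note that key_pattern.search returns None for a key that contains no letters at all, so calling .group() on it would raise an AttributeError. If your data might contain such keys, one possible sketch is to wrap the loop in a function with a fallback bucket. The function name group_by_prefix, the fallback name "Other", and the sample data are hypothetical choices for illustration:

```python
import re
from collections import defaultdict

key_pattern = re.compile(r"[a-zA-Z]+")

def group_by_prefix(data, fallback="Other"):
    """Group entries by the first run of letters in each key.

    `fallback` is a hypothetical bucket name for keys without any letters.
    """
    result = defaultdict(dict)
    for k, v in data.items():
        match = key_pattern.search(k)
        # search() returns None when the key has no letters at all.
        name = match.group() if match else fallback
        result[name][k] = v
    return dict(result)

sample = {"Industry1": 1, "Industry2": 1, "Customer1": 1, "42": 1}
grouped_sample = group_by_prefix(sample)
print(grouped_sample)
```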

Answered By: S.B