Convert 1D list into dictionary

Question:

I have a list with categories followed by some elements. Given that I know all the category names, is there a way to turn this into a dictionary of lists, i.e. convert:

l1 = ['cat1', 'a', 'b', 'c', 'cat2', 1, 2, 3, 'cat3', 4, 5, 6, 7, 8]

into:

l1_dic = {'cat1': ['a', 'b', 'c'], 'cat2': [1, 2, 3], 'cat3': [4, 5, 6, 7, 8]}

Edit: It is possible that the categories do NOT have a common string e.g. ‘cat1’ could be replaced by ‘Name’ while ‘cat2’ could be ‘Address’.

As I said in my original post, we do know the category names, i.e. we have a list l2 such that:

l2 = ['cat1', 'cat2', 'cat3'] 

Once again, the category names need not necessarily have a common string.

Asked By: R Walser


Answers:

You can do this,

d = {}
keys = ['cat1', 'cat2', 'cat3']
for i in l1:
    if i in keys:
        key = i
        d.setdefault(i, [])
    else:
        d[key].append(i)

# Output 
{'cat1': ['a', 'b', 'c'], 'cat2': [1, 2, 3], 'cat3': [4, 5, 6, 7, 8]}

You can iterate through l1: whenever an item is one of the known keys, it becomes the current key, and every other item is appended to the list of the current key.

Edit:

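There has to be some condition to distinguish between keys and values. If the category names share a common substring, you can replace the membership test with, for example, if 'cat' in str(i).
If they don't, test against the set of known names instead: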

values = {'address_1', 'location_1', 'name_1'}
...
if i in values:
..
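
The condition that separates keys from values can also be passed in as a function, so the loop itself never changes. A minimal sketch, assuming the first item of the list is a key; the helper name group_by_marker and the is_key parameter are made up for illustration and are not part of the answer above:

```python
def group_by_marker(items, is_key):
    """Group items under the most recent item for which is_key(item) is true.

    Hypothetical helper for illustration; assumes the first item is a key.
    """
    d = {}
    key = None
    for item in items:
        if is_key(item):
            key = item
            d.setdefault(key, [])
        else:
            d[key].append(item)
    return d


l1 = ['cat1', 'a', 'b', 'c', 'cat2', 1, 2, 3, 'cat3', 4, 5, 6, 7, 8]

# Membership test against the known category names
by_names = group_by_marker(l1, lambda x: x in {'cat1', 'cat2', 'cat3'})

# Substring test, converting to str because l1 also holds integers
by_substring = group_by_marker(l1, lambda x: 'cat' in str(x))
```

Both calls should give the same dictionary for this input. If the list starts with a value instead of a key, d[key] raises a KeyError.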
Answered By: Rahul K P

This can be done with a while loop and a regular expression. I am assuming all the keys match the same pattern.

import re
from collections import defaultdict

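# l1 is your list
pat = r"pattern_string"  # placeholder: a regex that matches the category names
i = 0
output = defaultdict(list)

while i < len(l1):
    # str() is needed because l1 may contain non-string items such as integers
    if re.match(pat, str(l1[i])):
        key = l1[i]
        i += 1
    # check the bound first, otherwise l1[i] raises IndexError at the end of the list
    while i < len(l1) and not re.match(pat, str(l1[i])):
        output[key].append(l1[i])
        i += 1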
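
For a concrete run, here is a sketch of the same idea with a plain for loop. The pattern cat\d+ is only an assumption made for this example (it fits the sample list, not necessarily your real category names), and the isinstance check is there because the list mixes strings and integers:

```python
import re
from collections import defaultdict

l1 = ['cat1', 'a', 'b', 'c', 'cat2', 1, 2, 3, 'cat3', 4, 5, 6, 7, 8]

# Assumed pattern for this example only: 'cat' followed by digits.
pat = re.compile(r"cat\d+")

output = defaultdict(list)
key = None
for item in l1:
    # Only strings can be category names; fullmatch avoids matching mere prefixes.
    if isinstance(item, str) and pat.fullmatch(item):
        key = item
        output[key]  # touch the key so categories without values still appear
    else:
        output[key].append(item)

result = dict(output)
```

Note that any items appearing before the first category would end up under the key None here.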
Answered By: Mudassir

As you know the categories, a simple loop with tracking of the last key should work:

categories = {'cat1', 'cat2', 'cat3'}

out = {}
key = None
for item in l1:
    if item in categories:
        out[item] = []
        key = item
    else:
        out[key].append(item)

output:

{'cat1': ['a', 'b', 'c'],
 'cat2': [1, 2, 3],
 'cat3': [4, 5, 6, 7, 8]}
Answered By: mozway

Just for fun, a functional approach to this using functools.reduce.

from functools import reduce

categories = {'cat1', 'cat2', 'cat3'}

reduce(lambda acc, x: (x, {x: [], **acc[1]}) if x in categories else 
                      (k:=acc[0], {**(d:=acc[1]), k: d[k] + [x]}), 
       l1, (None, dict()))[1]
# {'cat3': [4, 5, 6, 7, 8], 'cat2': [1, 2, 3], 'cat1': ['a', 'b', 'c']}

We need a tuple to track two pieces of information as we iterate: the last "key" and a dictionary storing the data parsed so far. If the current item is a key, we replace the key in the tuple with the new key and add an empty list to the dictionary under that key.

If the current item is not a key, we don't need to change the first element of the tuple, but we do rebuild the dictionary with the extended list for that key.
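
The same accumulator logic may be easier to follow with the lambda written out as a named function. This is only a sketch of the idea; the name step is made up, and unlike the one-liner it places new keys last, so the insertion order follows the input:

```python
from functools import reduce

categories = {'cat1', 'cat2', 'cat3'}
l1 = ['cat1', 'a', 'b', 'c', 'cat2', 1, 2, 3, 'cat3', 4, 5, 6, 7, 8]


def step(acc, x):
    """One reduction step; acc is a (last_key, parsed_dict) tuple."""
    last_key, parsed = acc
    if x in categories:
        # New key: it becomes the tracked key and gets an empty list.
        return x, {**parsed, x: []}
    # Value: keep the tracked key, rebuild the dict with the extended list.
    return last_key, {**parsed, last_key: parsed[last_key] + [x]}


_, result = reduce(step, l1, (None, {}))
```

Every step builds a new dictionary, which is what makes this "just for fun" rather than something to use on long lists.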

Answered By: Chris

This is not the most efficient solution, but I saw in a comment that you wanted a one-liner.

Here is a two-liner:

l1 = ['cat1', 'a', 'b', 'c', 'cat2', 1,2,3, 'cat3',4,5,6,7,8]
l2 = ['cat1','cat2','cat3']

dct = { l2[i] : l1[l1.index(l2[i]) + 1:l1.index(l2[i+1])] for i in range(len(l2) - 1) }
dct[l2[-1]] = l1[l1.index(l2[-1]) + 1:]

print(dct)

Output:

{'cat1': ['a', 'b', 'c'], 'cat2': [1, 2, 3], 'cat3': [4, 5, 6, 7, 8]}

Basically, this code goes through every element in l2, initializes it as a key of dct, and then finds the sublist of l1 between every key and makes that the corresponding list.
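
The slicing idea can also be written so that the last category needs no special case, by appending len(l1) as the final boundary. A sketch, assuming every name in l2 appears exactly once in l1 and that l2 lists the names in the order they occur; the names starts and bounds are just for illustration:

```python
l1 = ['cat1', 'a', 'b', 'c', 'cat2', 1, 2, 3, 'cat3', 4, 5, 6, 7, 8]
l2 = ['cat1', 'cat2', 'cat3']

# Position of each category in l1, plus len(l1) as the end of the last slice.
starts = [l1.index(name) for name in l2]
bounds = starts + [len(l1)]

# Each category owns the items strictly between its position and the next boundary.
dct = {name: l1[start + 1:end]
       for name, start, end in zip(l2, bounds, bounds[1:])}
```

Each l1.index call scans the list from the beginning, so this still does more work than a single pass over l1.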

I hope this helps! Please let me know if you have any further questions/clarifications 🙂

Answered By: Aniketh Malyala

itertools.groupby gives us an elegant way to split the list into chunks of keys and chunks of the values that follow them, which we can then iterate over to create the desired result:

from itertools import groupby

def make_dict(data, key_names):
    result = {}
    for is_key, elements in groupby(data, lambda d: d in key_names):
        if is_key:
            for key in elements:
                result[key] = []
        else:
            result[key] = list(elements)
    return result

Let’s test it:

>>> make_dict(['cat1', 'a', 'b', 'c', 'cat2', 1, 2, 3, 'cat3', 4, 5, 6, 7, 8],
...           ['cat1', 'cat2', 'cat3'])
{'cat1': ['a', 'b', 'c'], 'cat2': [1, 2, 3], 'cat3': [4, 5, 6, 7, 8]}
>>> make_dict(['a', 'b', 'c', 'd'], ['a', 'b', 'c', 'd'])
{'a': [], 'b': [], 'c': [], 'd': []}
>>> make_dict(['a', 'b', 'c', 'd'], ['a', 'b', 'c'])
{'a': [], 'b': [], 'c': ['d']}
>>> make_dict(['a', 'b', 'c', 'd'], ['a', 'c', 'd'])
{'a': ['b'], 'c': [], 'd': []}
>>> make_dict(['a', 'b', 'c', 'd'], ['a', 'b'])
{'a': [], 'b': ['c', 'd']}

Each of the chunks that groupby yields is either a run of keys or a run of values (is_key is the result of the lambda, which tells us which kind of chunk we have). Iterating with result[key] = [] covers the case of consecutive keys in the data: since there are no intervening values, every key in that group except possibly the last must have an empty list of values. When a group of values is found, it is assigned to the most recent key, exploiting the fact that for loops don't create a scope for the iteration variable, so key is still bound after the inner loop ends. This relies on the data starting with a key; otherwise key is not yet defined when the first group of values arrives.
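
To see the two facts this relies on, here is a small sketch with made-up sample data (not from the question): it materializes the chunks that groupby produces, and shows that a loop variable stays bound after its for loop finishes.

```python
from itertools import groupby

data = ['cat1', 'cat2', 'a', 'b', 'cat3', 4]
key_names = {'cat1', 'cat2', 'cat3'}

# Materialize each group to see how groupby splits the data.
chunks = [(is_key, list(group))
          for is_key, group in groupby(data, lambda d: d in key_names)]

# A for loop leaves its variable bound to the last item it saw.
for key in chunks[0][1]:
    pass
last_key = key
```

Here chunks is [(True, ['cat1', 'cat2']), (False, ['a', 'b']), (True, ['cat3']), (False, [4])] and last_key is 'cat2', which is the key that make_dict would attach ['a', 'b'] to.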

Answered By: Karl Knechtel