Create a dictionary based on a function and a list of strings

Question:

I have a list of strings, lostrings, and a function, splitter, which splits a string.

lostrings = [
    '308 921 q53 C13 0000000200',
    '300 920 q51 C13 0000000199',
    '318 921 q53 C12 0000000199']

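def splitter(s: str) -> list:
    # Fixed-width slices; offsets assume single spaces between fields.
    value1 = s[:3]
    value2 = s[4:7]
    value3 = s[8:11]
    value4 = s[12:15]
    value5 = s[23:]  # last three digits of the 10-digit field
    return [value1, value2, value3, value4, value5]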

Example: splitter(lostrings[0]) will output ['308', '921', 'q53', 'C13', '200'].

What I am trying to do is create a dictionary whose keys are 'value1', 'value2', 'value3', 'value4', 'value5' and whose values are lists. The desired output is as follows:

{'value1': ['308', '300', '318'],
 'value2': ['921', '920', '921'],
 'value3': ['q53', 'q51', 'q53'],
 'value4': ['C13', 'C13', 'C12'],
 'value5': ['200', '199', '199']}

I tried the following:
1.

dict(zip(['value1', 'value2', 'value3', 'value4', 'value5'], [splitter(lostrings[row]) for row in range(len(lostrings))]))

This does not give the correct output. I am not sure how to create a dictionary of 'str':list out of a list of strings based on a function.

Asked By: Joe

||

Answers:

You can use lostrings[0].split(' ') instead of using your splitter function.

lostrings = ['308 921 q53 C13 0000000200', '300 920 q51 C13 0000000199', '318 921 q53 C12 0000000199']

mydict = {'value1': [], 'value2': [], 'value3': [], 'value4': [], 'value5': []}

for targetstr in lostrings:
    for idx, targetsubstr in enumerate(targetstr.split(' ')):
        mydict['value%d' % (idx+1)].append(targetsubstr)

The resulting mydict is:
{'value1': ['308', '300', '318'],
 'value2': ['921', '920', '921'],
 'value3': ['q53', 'q51', 'q53'],
 'value4': ['C13', 'C13', 'C12'],
 'value5': ['0000000200', '0000000199', '0000000199']}
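
Note that splitting on spaces keeps the zero padding in the last field, while the question's desired output shows '200' and '199'. A minimal sketch of one way to close that gap, assuming the padding is always leading zeros and the field is never all zeros (the keys list and the lstrip step are additions for illustration, not part of the answer above):

```python
lostrings = ['308 921 q53 C13 0000000200',
             '300 920 q51 C13 0000000199',
             '318 921 q53 C12 0000000199']

keys = ['value1', 'value2', 'value3', 'value4', 'value5']
mydict = {key: [] for key in keys}

for targetstr in lostrings:
    fields = targetstr.split()
    # Assumption: the last field is zero-padded and only the unpadded digits are wanted.
    fields[-1] = fields[-1].lstrip('0')
    for key, field in zip(keys, fields):
        mydict[key].append(field)

print(mydict['value5'])  # ['200', '199', '199']
```

A field consisting only of zeros would become an empty string with lstrip('0'), so a fixed slice such as fields[-1][-3:] may be a safer choice if that can occur.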
Answered By: J. Choi

Here is a solution that uses defaultdict to initialize the lists and enumerate to track the index.

from collections import defaultdict

lo_strings = [
    '308 921 q53 C13 0000000200',
    '300 920 q51 C13 0000000199',
    '318 921 q53 C12 0000000199'
]

collect_ = defaultdict(list)

for i in lo_strings:
    for k, v in enumerate(i.split(), 1):
        collect_[f'value{k}'].append(v)

print(collect_)

defaultdict(<class 'list'>, {'value1': ['308', '300', '318'], 'value2': ['921', '920', '921'], 'value3': ['q53', 'q51', 'q53'], 'value4': ['C13', 'C13', 'C12'], 'value5': ['0000000200', '0000000199', '0000000199']})
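
If the defaultdict(...) wrapper in the printed output is unwanted, the result can probably just be converted back to a regular dict. A small sketch on shortened input strings (the name plain is hypothetical):

```python
from collections import defaultdict

collect_ = defaultdict(list)

# Shortened two-field strings, just to keep the example small.
for i in ['308 921', '300 920']:
    for k, v in enumerate(i.split(), 1):
        collect_[f'value{k}'].append(v)

# dict() makes a shallow copy as a regular dict; the lists themselves are shared.
plain = dict(collect_)
print(plain)  # {'value1': ['308', '300'], 'value2': ['921', '920']}
```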
Answered By: sushanth

We can split each string in lostrings on spaces to get its substrings, then collect them in a dictionary keyed by their position.

defaultdict is a dict subclass. When you look up a key that doesn’t exist, it calls the factory you passed in, stores the result under that key, and returns it. Here the factory is list, so each missing key starts out as an empty list.

from collections import defaultdict


lostrings = [
    '308 921 q53 C13 0000000200',
    '300 920 q51 C13 0000000199',
    '318 921 q53 C12 0000000199'
]

values = defaultdict(list)

for lostring in lostrings:
    substrings = lostring.split(" ")
    
    for index, substring in enumerate(substrings, start=1):
        values[f"value{index}"].append(substring)

print(values)
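
To see the missing-key behavior in isolation, here is a minimal sketch, separate from the solution above (the name first is only for illustration):

```python
from collections import defaultdict

values = defaultdict(list)

# Looking up a missing key calls list(), stores the new empty list, and returns it.
first = values["value1"]
first.append("308")

# The key now exists, so the same list is returned and extended.
values["value1"].append("300")

print(values["value1"])      # ['308', '300']
print("value2" in values)    # False: a membership test does not create the key
```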
Answered By: Preet Mishra

You can map the splitter function over the list of strings, transpose the output so that each list lines up with its key, and then zip the keys and lists together to construct a dict:

dict(
    zip(
        ['value1', 'value2', 'value3', 'value4', 'value5'],
        map(
            list,
            zip(*map(splitter, lostrings))
        )
    )
)

Demo: https://replit.com/@blhsing/OutstandingWeakGnudebugger#main.py
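
The same steps can be spelled out one at a time, which may make the transposition easier to follow. This is only a sketch: the splitter below is a stand-in whose slice offsets assume single spaces between fields, and the names keys, rows and columns are hypothetical.

```python
def splitter(s: str) -> list:
    # Stand-in for the question's function; offsets assume single-space separators.
    return [s[:3], s[4:7], s[8:11], s[12:15], s[23:]]

lostrings = ['308 921 q53 C13 0000000200',
             '300 920 q51 C13 0000000199',
             '318 921 q53 C12 0000000199']

keys = ['value1', 'value2', 'value3', 'value4', 'value5']

# One list per input string: [['308', '921', 'q53', 'C13', '200'], ...]
rows = [splitter(s) for s in lostrings]

# zip(*rows) transposes rows into columns; each column is a tuple, so convert to list.
columns = [list(col) for col in zip(*rows)]

# Pair each key with its column.
result = dict(zip(keys, columns))
print(result['value1'])  # ['308', '300', '318']
print(result['value5'])  # ['200', '199', '199']
```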

Answered By: blhsing