Convert string to dictionary while adding values together for identical keys

Question:

I have a dataset of names and activities stored as one long string. The data is divided into lines separated by the line break "\n". Each line has a name and an activity separated by a colon. The last line has no trailing line break.

Example: "Jack:travel\nPeter:cycling\nJack:fishing\nPeter:running"

The goal is to create a dictionary from this string. If a name appears more than once, its activities should be collected into a single list under that name.

In the current example the output should be:

{"Jack": ["travel", "fishing"], "Peter": ["cycling", "running"]}

How can I do that?

Answers:

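You can use str.split() to loop over every line, split each line into a name and an activity, and then either start a new list for the name or append to the existing one, depending on whether the name is already in the dictionary. Note that this version also skips an activity that is already listed for that name. Like this: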

data = 'Jack:travel\nPeter:cycling\nJack:fishing\nPeter:running\nJack:fishing'

dic = {}
for line in data.split('\n'):
    name, activity = line.split(':')
    if name not in dic:
        # first time this name is seen: start a new list
        dic[name] = [activity]
    elif activity not in dic[name]:
        # known name: append only activities not already listed
        dic[name].append(activity)

print(dic) # => {'Jack': ['travel', 'fishing'], 'Peter': ['cycling', 'running']}
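If repeated activities should be kept rather than skipped (the question only asks to group activities by name), collections.defaultdict can shorten the loop. This is a minimal sketch, not the only way to do it; the function name group_activities is made up for illustration, and it assumes the lines are separated by "\n" and that each non-empty line contains a colon.

```python
from collections import defaultdict


def group_activities(data):
    """Group 'name:activity' lines into a dict of lists, keeping duplicates."""
    grouped = defaultdict(list)
    for line in data.split('\n'):
        if not line:
            continue  # skip empty lines, e.g. from a trailing line break
        # split on the first colon only, so an activity may itself contain ':'
        name, activity = line.split(':', 1)
        grouped[name].append(activity)
    return dict(grouped)  # convert back to a plain dict


print(group_activities('Jack:travel\nPeter:cycling\nJack:fishing\nPeter:running'))
# => {'Jack': ['travel', 'fishing'], 'Peter': ['cycling', 'running']}
```

A missing key in a defaultdict(list) is created as an empty list on first access, so no explicit "if name not in ..." check is needed.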

However, as a comment pointed out, it may be better to use a set that will automatically drop duplicates. Like this:

data = 'Jack:travel\nPeter:cycling\nJack:fishing\nPeter:running\nJack:fishing'

dic = {}
for line in data.split('\n'):
    name, activity = line.split(':')
    if name not in dic:
        # first time this name is seen: start a new set
        dic[name] = {activity}
    else:
        # adding an element that is already in the set has no effect
        dic[name].add(activity)

print(dic)
# => {'Jack': {'travel', 'fishing'}, 'Peter': {'cycling', 'running'}}
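One caveat with sets is that they are unordered, so the printed order of activities may differ from the input order, and the values are sets rather than the lists shown in the question. A possible variation, sketched here under the assumption that first-seen order should be preserved, uses dictionary keys (which keep insertion order in Python 3.7+) to drop duplicates and then converts them to lists. The name group_unique is hypothetical.

```python
def group_unique(data):
    """Group 'name:activity' lines into lists without duplicates, keeping order."""
    grouped = {}
    for line in data.split('\n'):
        if not line:
            continue  # skip empty lines
        name, activity = line.split(':', 1)
        # inner dict keys act as an ordered set; the values are unused
        grouped.setdefault(name, {})[activity] = None
    return {name: list(activities) for name, activities in grouped.items()}


data = 'Jack:travel\nPeter:cycling\nJack:fishing\nPeter:running\nJack:fishing'
print(group_unique(data))
# => {'Jack': ['travel', 'fishing'], 'Peter': ['cycling', 'running']}
```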
Answered By: Michael M.