How to split strings, convert them into a dictionary, and swap the keys and values
Question:
I have a list:
["Germany + A", "France + A", "England + B", "Germany + A" ]
- I need to convert it into a dictionary
- I need to split each item on +
- The dictionary should have the keys and values swapped
- If a value is already present, it should not be added again
Expected output is the dictionary {"A":["Germany", "France"],"B":["England"] }
My code is below. It produces dictionaries, but I still need to add a condition so that values which are already present are skipped.
l = ["Germany + A", "France + A", "England + B", "Germany + A" ]
m = []
for i in l:
    m.append(i.split('+'))
for k,v in m:
    n ={k:v}
    print({v: k for k, v in n.items()})
Answers:
l = ["Germany + A", "France + A", "England + B", "Germany + A", "Nigeria" ]
m = {}
for s in l:
    try:
        # Unpacking raises ValueError when the item has no '+' (e.g. "Nigeria")
        country, category = s.split('+')
        country = country.strip()
        category = category.strip()
        foo = m.setdefault(category, [])
        if country not in foo:
            foo.append(country)
    except ValueError:
        pass
print(m)
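If catching the exception feels too indirect, one variation is to check for the separator explicitly. The following is only a minimal sketch, assuming that entries without a + should simply be skipped; the function name group_by_category is made up for illustration.

```python
def group_by_category(items, sep="+"):
    """Group the left-hand part of each 'left + right' string under its right-hand part."""
    grouped = {}
    for item in items:
        # partition never raises: found is '' when sep does not occur in item
        country, found, category = item.partition(sep)
        if not found:
            continue  # assumption: malformed entries such as "Nigeria" are skipped
        country, category = country.strip(), category.strip()
        countries = grouped.setdefault(category, [])
        if country not in countries:
            countries.append(country)
    return grouped


data = ["Germany + A", "France + A", "England + B", "Germany + A", "Nigeria"]
print(group_by_category(data))
# {'A': ['Germany', 'France'], 'B': ['England']}
```

Note that this behaves differently from the try/except version for items containing more than one +: partition splits only at the first occurrence, whereas unpacking the result of split('+') would raise ValueError and the item would be dropped.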
my_list = ["Germany + A", "France + A", "England + B", "Germany + A" ]
result = {}
for item in my_list:
    country, key = item.split(' + ')
    if country not in result.setdefault(key, []):
        result[key].append(country)
print(result)
As a side note – use meaningful names, not cryptic one-char names.
As an alternative to using dict.setdefault(), one can use collections.defaultdict with a default factory of list or, if the order is not important, set.
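A minimal sketch of the set variant, assuming the order of countries within each key does not matter. The final sorted() step is an extra added here only to get a predictable result; it is not required.

```python
from collections import defaultdict

my_list = ["Germany + A", "France + A", "England + B", "Germany + A"]

result = defaultdict(set)
for item in my_list:
    country, key = item.split(' + ')
    result[key].add(country)  # a set ignores repeated values on its own

# Sort each set to get a plain dict of lists with a predictable order
as_lists = {key: sorted(countries) for key, countries in result.items()}
print(as_lists)
# {'A': ['France', 'Germany'], 'B': ['England']}
```

With a set there is no need for the explicit "not in" check, at the cost of losing the original order of the countries.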
EDIT: comparison between using dict.setdefault and collections.defaultdict(list)
from collections import defaultdict
from timeit import timeit

my_list = ["Germany + A", "France + A", "England + B", "Germany + A" ]

def test1(my_list):
    result = {}
    for item in my_list:
        country, key = item.split(' + ')
        if country not in result.setdefault(key, []):
            result[key].append(country)
    return result

def test2(my_list):
    result = defaultdict(list)
    for item in my_list:
        country, key = item.split(' + ')
        if country not in result[key]:
            result[key].append(country)
    return result

print(timeit('test1(my_list)', setup='from __main__ import test1, my_list', number=100000))
print(timeit('test2(my_list)', setup='from __main__ import test2, my_list', number=100000))
Output:
0.2819225169987476
0.3298255940026138
At least with this small sample, setdefault is a bit faster.
I think opting for a readable solution here is best.
Loop through the list l, and split each item on ' + '.
Then, append each country name to the appropriate key the first time it is encountered.
Notice the use of collections.defaultdict to initialize the dictionary as a dict of lists.
import collections

l = ["Germany + A", "France + A", "England + B", "Germany + A"]
d = collections.defaultdict(list)
for i in l:
    k, v = i.split(' + ')
    if k not in d[v]:
        d[v].append(k)
print(dict(d))
This gives the output:
{'A': ['Germany', 'France'], 'B': ['England']}
If you want to stick with your original approach, you could again split using ' + ', and put the result into a list using a list comprehension:
m = [i.split(' + ') for i in l]
Then, you would loop through m like this:
for k, v in m:
    if k not in d[v]:
        d[v].append(k)
This is useful if you want the intermediate m list.
l = ["Germany + A", "France + A", "England + B", "Germany + A" ]
m = []
dicct = {}
for i in l:
    # Split on ' + ' so that neither part keeps stray spaces
    m.append(i.split(' + '))
for k, v in m:
    if v in dicct:
        if k not in dicct[v]:
            dicct[v].append(k)
    else:
        dicct[v] = []
        dicct[v].append(k)
print(dicct)
Another way to achieve this
There’s a great Python library called pandas that can also do the job and gives you some flexibility to play with:
import pandas as pd

# Input
L = ["Germany + A", "France + A", "England + B", "Germany + A" ]
# Preprocessing
L = [l.split(' + ') for l in L]
df = pd.DataFrame(L, columns=['country', 'type'])  # give the columns some names
See what’s in df:
>>> df
   country type
0  Germany    A
1   France    A
2  England    B
3  Germany    A
# Then, drop duplicate records:
df.drop_duplicates(['country', 'type'], inplace=True)
# Group by type, collect the countries of each group into a list, and dump to a dict in one shot
grouped = df.groupby('type')['country'].apply(list).to_dict()
And the result:
>>> grouped
{'A': ['Germany', 'France'], 'B': ['England']}
That’s it.
This is not the most efficient solution, but the question bundles several tasks into one, and each of them can be solved in more than one way.
from itertools import groupby

s = ["Germany + A", "France + A", "England + B", "Germany + A"]
# Split on ' + ' so that neither part keeps stray spaces
m = [i.split(' + ') for i in s]
# [['Germany', 'A'], ['France', 'A'], ['England', 'B'], ['Germany', 'A']]
# Group by the letter ('A', 'B'); groupby only merges adjacent items, so sort by that key first
new = [list(g) for k, g in groupby(sorted(m, key=lambda x: x[1]), lambda x: x[1])]
# [[['Germany', 'A'], ['France', 'A'], ['Germany', 'A']], [['England', 'B']]]
# Swap the positions of letter and country
new = [item[::-1] for sublist in new for item in sublist]
# [['A', 'Germany'], ['A', 'France'], ['A', 'Germany'], ['B', 'England']]
dct = dict((key, tuple(v for (k, v) in pairs))
           for (key, pairs) in groupby(new, lambda pair: pair[0]))
# {'A': ('Germany', 'France', 'Germany'), 'B': ('England',)}
# Remove duplicates; with set, the order inside each list is not guaranteed
print({k: list(set(v)) for k, v in dct.items()})