How to iterate through a list of dictionaries, extract values, and fill in another dictionary in Python
Question:
I have a list of dictionaries as below.
[
{'name':['mallesh'],'email':['[email protected]']},
{'name':['bhavik'],'ssn':['1000011']},
{'name':['jagarini'],'email':['[email protected]'],'phone':['111111']},
{'name':['mallesh'],'email':['[email protected]'],'phone':['1234556'],'ssn':['10000012']}
]
I would like to extract the information from these dictionaries based on their keys and collect it in another dictionary:
xml_master_dict={'name':[],'email':[],'phone':[],'ssn':[]}
Here xml_master_dict should be filled in with the respective key information as below.
In the first dictionary we have this:
{'name':['mallesh'],'email':['[email protected]']}
In xml_master_dict, only the name and email keys should be updated with the current values. If a key does not exist in the dictionary, it should be filled in with None; in this case phone and ssn will be None.
Here is an expected output:
{
'name':['mallesh','bhavik','jagarini','mallesh'],
'email':['[email protected]',None,'[email protected]','[email protected]'],
'phone':[None,None,'111111','1234556'],
'ssn':[None,'1000011',None,'10000012'],
}
pd.DataFrame({
'name':['mallesh','bhavik','jagarini','mallesh'],
'email':['[email protected]',None,'[email protected]','[email protected]'],
'phone':[None,None,'111111','1234556'],
'ssn':[None,'1000011',None,'10000012'],
})
Answers:
Here is one way you could accomplish this using a nested for loop and the append
method of each list:
data = [
{'name': ['mallesh'], 'email': ['[email protected]']},
{'name': ['bhavik'], 'ssn': ['1000011']},
{'name': ['jagarini'], 'email': ['[email protected]'], 'phone': ['111111']},
{'name': ['mallesh'], 'email': ['[email protected]'], 'phone': ['1234556'], 'ssn': ['10000012']}
]
# create the xml_master_dict with empty lists for each key
xml_master_dict = {'name':[], 'email':[], 'phone':[], 'ssn':[]}
# loop through the list of dictionaries
for item in data:
    # loop through the keys in xml_master_dict
    for key in xml_master_dict.keys():
        # if the key exists in the current dictionary, append its value to the xml_master_dict
        if key in item:
            xml_master_dict[key].append(item[key])
        # if the key does not exist in the current dictionary, append None to the xml_master_dict
        else:
            xml_master_dict[key].append(None)
# print the xml_master_dict to see the resulting values
print(xml_master_dict)
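This code will produce the following output. Note that each value is appended as the one-element list it came in, so the result is nested rather than flat as in the expected output: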
{'name': [['mallesh'], ['bhavik'], ['jagarini'], ['mallesh']],
'email': [['[email protected]'], None, ['[email protected]'], ['[email protected]']],
'phone': [None, None, ['111111'], ['1234556']],
'ssn': [None, ['1000011'], None, ['10000012']]}
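The nested lists above come from appending the whole one-element list. A minimal sketch of a flat variant follows, assuming every value that is present holds exactly one element; the email strings in the sample data are placeholders, not the addresses from the question.

```python
# Sample records shaped like the question's data; the email values are placeholders.
data = [
    {'name': ['mallesh'], 'email': ['m1@example.com']},
    {'name': ['bhavik'], 'ssn': ['1000011']},
    {'name': ['jagarini'], 'email': ['j@example.com'], 'phone': ['111111']},
    {'name': ['mallesh'], 'email': ['m2@example.com'], 'phone': ['1234556'], 'ssn': ['10000012']},
]

xml_master_dict = {'name': [], 'email': [], 'phone': [], 'ssn': []}

for item in data:
    for key in xml_master_dict:
        if key in item:
            # Assumption: the value is a one-element list, so take element 0.
            xml_master_dict[key].append(item[key][0])
        else:
            # Key missing from this record: keep the row aligned with None.
            xml_master_dict[key].append(None)

print(xml_master_dict)
```

If a value could hold more than one element, indexing with [0] would silently drop the rest, so this only fits data shaped like the example.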
You can then pass this dictionary to the pd.DataFrame constructor from the
pandas library. For example:
import pandas as pd
# Create a DataFrame from the xml_master_dict
df = pd.DataFrame(xml_master_dict)
# Print the DataFrame
print(df)
This code will produce the following output:
         name                email      phone         ssn
0   [mallesh]  [[email protected]]       None        None
1    [bhavik]                 None       None   [1000011]
2  [jagarini]  [[email protected]]   [111111]        None
3   [mallesh]  [[email protected]]  [1234556]  [10000012]
You can define a function to get the first element of a dictionary value (or None
if the key doesn’t exist):
def first_elem_of_value(record: dict, key: str):
    try:
        return record[key][0]
    except KeyError:
        return None
and then build the master dict with a single comprehension:
xml_master_dict = {
    key: [
        first_elem_of_value(record, key)
        for record in data
    ]
    for key in ('name', 'email', 'phone', 'ssn')
}
>>> xml_master_dict
{'name': ['mallesh', 'bhavik', 'jagarini', 'mallesh'], 'email': ['[email protected]', None, '[email protected]', '[email protected]'], 'phone': [None, None, '111111', '1234556'], 'ssn': [None, '1000011', None, '10000012']}
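The helper above raises IndexError if a key is present but its list is empty. One possible variant, sketched here with dict.get, treats a missing key, a None value, and an empty list the same way. The name first_or_none, the placeholder email strings, and the extra record with an empty phone list are illustrative assumptions, not part of the original question.

```python
def first_or_none(record: dict, key: str):
    """Return the first element of record[key], or None if the key is
    missing, the value is None, or the list is empty (hypothetical helper)."""
    values = record.get(key) or [None]
    return values[0]


# Sample records shaped like the question's data; emails are placeholders.
data = [
    {'name': ['mallesh'], 'email': ['m1@example.com']},
    {'name': ['bhavik'], 'ssn': ['1000011']},
    {'name': ['jagarini'], 'email': ['j@example.com'], 'phone': ['111111']},
    # Extra record (an assumption) showing the empty-list case.
    {'name': ['extra'], 'phone': []},
]

keys = ('name', 'email', 'phone', 'ssn')
master = {key: [first_or_none(record, key) for record in data] for key in keys}

print(master)
```

Whether an empty list should map to None is a design choice; if an empty list signals bad input in your data, letting the IndexError surface may be preferable.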