Python: From dictionary to csv with replicating keys by the length of values
Question:
I have the following dictionary:
import pandas as pd
dict_item = {'item1': ['bag', 'phone', 'laptop'],'item2': ['sofa', 'TV', 'bed', 'door', 'window'] }
and I would like to write it to a CSV file. So far I have tried:
df = df.append(pd.DataFrame(data={'item_number':dict_item.keys(),'items':dict_item.values()}))
df
but it gives me the following:
But I would like to get the following:
In other words, I would like to repeat each key of the dictionary in the first column once for every element in its corresponding list of values.
P.S. I created the desired output manually.
Thanks, I would appreciate any help
Answers:
Just use the DataFrame constructor, then stack:
s=pd.DataFrame(list(dict_item.values()),index=dict_item.keys()).stack().reset_index(level=0)
s.columns=['item_number','items']
s
Out[609]:
  item_number   items
0       item1     bag
1       item1   phone
2       item1  laptop
0       item2    sofa
1       item2      TV
2       item2     bed
3       item2    door
4       item2  window
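The question asks for a CSV, and the step from this frame to CSV text can be sketched as follows. This is only a sketch under assumptions: the in-memory `buffer` stands in for a real file path, the row index is assumed to be unwanted in the output, and the extra `dropna()` is added defensively because newer pandas `stack` implementations may keep the NaN padding of the shorter list.

```python
import io

import pandas as pd

dict_item = {'item1': ['bag', 'phone', 'laptop'],
             'item2': ['sofa', 'TV', 'bed', 'door', 'window']}

# One row per key, padded with missing values; stack() turns it into long form.
s = (pd.DataFrame(list(dict_item.values()), index=list(dict_item.keys()))
       .stack()
       .dropna()               # no-op if stack() already dropped the padding
       .reset_index(level=0))
s.columns = ['item_number', 'items']

# Assumption: write to an in-memory buffer; pass a file path instead to
# create a file on disk.
buffer = io.StringIO()
s.to_csv(buffer, index=False)  # index=False leaves out the 0,1,2,0,1,... index
csv_text = buffer.getvalue()
print(csv_text)
```

Passing `index=False` matters here because the stacked frame keeps the per-key positions as its index, which would otherwise appear as an extra unnamed column in the file.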
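Use json_normalize + melt, together with a small helper that expands the list column:
def expand(df, col):
    # Repeat every column once per element of the list stored in `col`.
    d = {c: df[c].values.repeat(df[col].str.len(), axis=0) for c in df.columns}
    # Replace the repeated lists with their flattened elements.
    d[col] = [i for sub in df[col] for i in sub]
    return pd.DataFrame(d)
# pd.json_normalize is the top-level name in pandas >= 1.0
# (older releases: pd.io.json.json_normalize)
df = pd.json_normalize(dict_item)
expand(df.melt(), 'value')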
Outputs
  variable   value
0    item1     bag
1    item1   phone
2    item1  laptop
3    item2    sofa
4    item2      TV
5    item2     bed
6    item2    door
7    item2  window
Another option, with the pd.DataFrame constructor + melt (transpose so each key becomes a column, then drop the padding rows):
pd.DataFrame(list(dict_item.values()), index=dict_item.keys()).T.melt().dropna()
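For comparison, the same reshaping can probably be done without padding and dropping missing values by using `Series.explode`, which none of the answers above use. This is a sketch assuming pandas 0.25 or later (where `explode` is available); the column names `item_number` and `items` are taken from the question.

```python
import pandas as pd

dict_item = {'item1': ['bag', 'phone', 'laptop'],
             'item2': ['sofa', 'TV', 'bed', 'door', 'window']}

df = (pd.Series(dict_item, name='items')  # one list per key
        .rename_axis('item_number')       # name the index so it becomes a column
        .explode()                        # one row per list element, key repeated
        .reset_index())                   # move the keys into a regular column

print(df)
```

Because `explode` repeats the index label for every element, the keys are replicated by the length of their values directly, and `df.to_csv(..., index=False)` can then write the result.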