Merge a large dictionary into a single dataframe
Question:
I have a dictionary whose values are DataFrames.
import pandas as pd
d = {
'A':pd.DataFrame(
{'Age' : [5,5,5],
'Weight' : [5,5,5]}),
'B':pd.DataFrame(
{'Age' : [10,10,10],
'Weight' : [10,10,10]}),
'C':pd.DataFrame(
{'Age' : [7,7,7],
'Weight' : [10,10,100]}),
}
d
I would like to convert that to a single DataFrame, with the dictionary key stored in a Team column.
data = [
['A',5,5],
['A',5,5],
['A',5,5],
['B',10,10],
['B',10,10],
['B',10,10],
['C',7,10],
['C',7,10],
['C',7,100],
]
df = pd.DataFrame(data, columns=['Team', 'Age', 'Weight'])
df
Answers:
With pd.concat, augmenting each of the initial DataFrames with its dict key:
df = pd.concat([frame.assign(Team=k) for k, frame in d.items()],
               axis=0, ignore_index=True)
Age Weight Team
0 5 5 A
1 5 5 A
2 5 5 A
3 10 10 B
4 10 10 B
5 10 10 B
6 7 10 C
7 7 10 C
8 7 100 C
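The result above puts Team last, whereas the desired frame in the question has it first. A minimal sketch of one way to reorder the columns, assuming the same dictionary d as in the question (the names frames_with_team and result are only illustrative):

```python
import pandas as pd

d = {
    'A': pd.DataFrame({'Age': [5, 5, 5], 'Weight': [5, 5, 5]}),
    'B': pd.DataFrame({'Age': [10, 10, 10], 'Weight': [10, 10, 10]}),
    'C': pd.DataFrame({'Age': [7, 7, 7], 'Weight': [10, 10, 100]}),
}

# assign() returns a new frame, so the frames stored in d are left untouched;
# a scalar passed to assign() is broadcast to every row.
frames_with_team = [frame.assign(Team=key) for key, frame in d.items()]

result = pd.concat(frames_with_team, ignore_index=True)

# Select the columns in the order asked for in the question.
result = result[['Team', 'Age', 'Weight']]
print(result)
```

Selecting with a list of column names returns a new DataFrame, so this step can be dropped if the column order does not matter.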
import pandas as pd
d = {
'A':pd.DataFrame(
{'Age' : [5,5,5],
'Weight' : [5,5,5]}),
'B':pd.DataFrame(
{'Age' : [10,10,10],
'Weight' : [10,10,10]}),
'C':pd.DataFrame(
{'Age' : [7,7,7],
'Weight' : [10,10,100]}),
}
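# insert() adds the Team column in place and returns None, so the
# `is None` test is always true and only serves to trigger the insertion
r = (pd.concat([d[k] for k in list(d) if d[k].insert(0, "Team", k) is None])
     .reset_index(drop=True)
)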
print(r)
Team Age Weight
0 A 5 5
1 A 5 5
2 A 5 5
3 B 10 10
4 B 10 10
5 B 10 10
6 C 7 10
7 C 7 10
8 C 7 100
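One thing to be aware of with the snippet above: DataFrame.insert modifies each frame in d, so running it a second time fails because the Team column already exists. A sketch of a non-mutating variant, assuming it is acceptable to copy each frame first (with_team is a hypothetical helper, not part of pandas):

```python
import pandas as pd

d = {
    'A': pd.DataFrame({'Age': [5, 5, 5], 'Weight': [5, 5, 5]}),
    'B': pd.DataFrame({'Age': [10, 10, 10], 'Weight': [10, 10, 10]}),
    'C': pd.DataFrame({'Age': [7, 7, 7], 'Weight': [10, 10, 100]}),
}


def with_team(frame, team):
    """Return a copy of frame with a leading Team column (hypothetical helper)."""
    out = frame.copy()
    out.insert(0, 'Team', team)  # insert() works in place and returns None
    return out


# The frames in d are copied, so this line can be re-run safely.
r = pd.concat([with_team(frame, key) for key, frame in d.items()],
              ignore_index=True)
print(r)
```

The copy costs some memory for large frames; if mutating d is fine, the original one-liner avoids it.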
Try using pd.concat() with the keys and names arguments:
k,v = zip(*d.items())
pd.concat(v,keys = k,names = ['Team']).reset_index(level=0)
Output (each frame keeps its own row labels, so the index repeats 0, 1, 2):
Team Age Weight
0 A 5 5
1 A 5 5
2 A 5 5
0 B 10 10
1 B 10 10
2 B 10 10
0 C 7 10
1 C 7 10
2 C 7 100
pd.concat can handle dictionaries whose values are DataFrames quite nicely. The keys of the dictionary become the outer level of the resulting DataFrame's index.
out = pd.concat(d, names=['Team'])
print(out)
        Age  Weight
Team
A    0    5       5
     1    5       5
     2    5       5
B    0   10      10
     1   10      10
     2   10      10
C    0    7      10
     1    7      10
     2    7     100
From there we can move those keys back into a regular column via .reset_index, then drop the leftover inner index level.
out = pd.concat(d, names=['Team']).reset_index('Team').reset_index(drop=True)
print(out)
Team Age Weight
0 A 5 5
1 A 5 5
2 A 5 5
3 B 10 10
4 B 10 10
5 B 10 10
6 C 7 10
7 C 7 10
8 C 7 100
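As a quick sanity check, the result can be compared against the frame built by hand in the question. A sketch, assuming the same d and the expected rows from the question; pd.testing.assert_frame_equal raises an AssertionError if the two frames differ:

```python
import pandas as pd

d = {
    'A': pd.DataFrame({'Age': [5, 5, 5], 'Weight': [5, 5, 5]}),
    'B': pd.DataFrame({'Age': [10, 10, 10], 'Weight': [10, 10, 10]}),
    'C': pd.DataFrame({'Age': [7, 7, 7], 'Weight': [10, 10, 100]}),
}

# The target frame, written out row by row as in the question.
expected = pd.DataFrame(
    [
        ['A', 5, 5], ['A', 5, 5], ['A', 5, 5],
        ['B', 10, 10], ['B', 10, 10], ['B', 10, 10],
        ['C', 7, 10], ['C', 7, 10], ['C', 7, 100],
    ],
    columns=['Team', 'Age', 'Weight'],
)

# Dict keys become the outer index level, which is then turned into a column.
out = pd.concat(d, names=['Team']).reset_index('Team').reset_index(drop=True)

# Passes silently when values, column order, and index all match.
pd.testing.assert_frame_equal(out, expected)
```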