Pandas Dataframe from matrix-like dictionary where keys are tuples of indices
Question:
I have a dictionary whose keys are tuples of the form (i,j) and whose values are matrix entries. So if you think of a mathematical matrix $A = (a_{i,j})$, then matrix_dict[(i,j)] gives the value in row i and column j.
I would like a pandas dataframe where the values matrix_dict[(i,0)] for i in range(1, m+1) are the row names, the values matrix_dict[(0,j)] for j in range(1, n+1) are the column names, and every value whose tuple indices (i,j) are both nonzero is an entry of the df at the corresponding row and column.
The dictionary would look like this:
matrix_dict = {
    (0, 0): 'RowIndexColumnIndex',
    (0, 1): 'Column1',
    (0, 2): 'Column2',
    (1, 0): 'Row1',
    (1, 1): 1,
    (1, 2): 2,
    (2, 0): 'Row2',
    (2, 1): 3,
    (2, 2): 4
}
I thought it would be easy to convert this into a pandas dataframe, since the structure already matches in a way, but the solutions I found on here using pd.DataFrame.from_dict are for different problems, where the key tuple is supposed to become part of the dataframe or a multi-index.
Answers:
If I understood correctly, use pandas.Series and unstack:
import pandas as pd
dic = {(0, 0): 1, (0, 1): 2, (1, 0): 3, (1, 1): 4, (2, 2): 5}
df = pd.Series(dic).unstack(fill_value=0)
Output:
   0  1  2
0  1  2  0
1  3  4  0
2  0  0  5
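To see why this works, it may help to look at the intermediate Series: the tuple keys become a two-level MultiIndex, and unstack pivots the inner level into columns. A minimal sketch reusing dic from above (the name s is only for illustration):

```python
import pandas as pd

dic = {(0, 0): 1, (0, 1): 2, (1, 0): 3, (1, 1): 4, (2, 2): 5}

# Tuple keys are turned into a two-level MultiIndex (row index, column index)
s = pd.Series(dic)

# unstack moves the inner level (the column index) into the columns;
# (row, column) pairs missing from the dictionary are filled with 0
df = s.unstack(fill_value=0)
```

Printing s.index shows the MultiIndex, so any dictionary keyed by (row, column) tuples can be pivoted this way.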
You can also reindex using m and n:
m, n = 4, 5
df = (pd.Series(dic).unstack(fill_value=0)
        .reindex(index=range(m), columns=range(n), fill_value=0)
     )
Output:
   0  1  2  3  4
0  1  2  0  0  0
1  3  4  0  0  0
2  0  0  5  0  0
3  0  0  0  0  0
Updated question:
matrix_dict = {
    (0, 0): 'RowIndexColumnIndex',
    (0, 1): 'Column1',
    (0, 2): 'Column2',
    (1, 0): 'Row1',
    (1, 1): 1,
    (1, 2): 2,
    (2, 0): 'Row2',
    (2, 1): 3,
    (2, 2): 4
}
m, n = 2, 2
df = (pd.Series(matrix_dict).unstack(fill_value=0)
        .reindex(index=range(m+1), columns=range(n+1), fill_value=0)
        .set_index(0)
        .pipe(lambda d: d.set_axis(d.iloc[0], axis=1).iloc[1:])
        .rename_axis(index=None, columns=None)
     )
Output:
      Column1  Column2
Row1        1        2
Row2        3        4
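One thing to keep in mind: because the dictionary mixes strings and integers, the resulting frame most likely has object columns rather than numeric ones. A minimal sketch of one way to recover numeric dtypes, assuming every non-label entry is a number (the name typed is only for illustration):

```python
import pandas as pd

matrix_dict = {
    (0, 0): 'RowIndexColumnIndex',
    (0, 1): 'Column1',
    (0, 2): 'Column2',
    (1, 0): 'Row1',
    (1, 1): 1,
    (1, 2): 2,
    (2, 0): 'Row2',
    (2, 1): 3,
    (2, 2): 4,
}
m, n = 2, 2

# Same chain as above: pivot, use column 0 as row labels and row 0 as headers
df = (pd.Series(matrix_dict).unstack(fill_value=0)
        .reindex(index=range(m + 1), columns=range(n + 1), fill_value=0)
        .set_index(0)
        .pipe(lambda d: d.set_axis(d.iloc[0], axis=1).iloc[1:])
        .rename_axis(index=None, columns=None)
     )

# Let pandas infer a better dtype for each object column
typed = df.infer_objects()
```

If some cells could hold numeric strings instead of numbers, df.apply(pd.to_numeric) might be the better fit, since infer_objects does not parse strings.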
Bonus, assuming the corner entry matrix_dict[(0, 0)] holds both axis names separated by a backslash (for example 'RowIndex\\ColumnIndex'):
df = (pd.Series(matrix_dict).unstack(fill_value=0)
        .reindex(index=range(m+1), columns=range(n+1), fill_value=0)
        .set_index(0)
        .pipe(lambda d: d.set_axis(d.iloc[0], axis=1).iloc[1:])
        .rename_axis(**dict(zip(('index', 'columns'),
                                matrix_dict[(0, 0)].split('\\'))))
     )
Output:
ColumnIndex Column1 Column2
RowIndex
Row1              1       2
Row2              3       4
This should work:
import pandas as pd
# m, n and matrix_dict are taken from the question
df = pd.DataFrame()
for j in range(1, n + 1):
    df[matrix_dict[(0, j)]] = [matrix_dict[(i, j)] for i in range(1, m + 1)]
df.index = [matrix_dict[(i, 0)] for i in range(1, m + 1)]
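The same idea can also be written without the column-by-column loop, by collecting the labels and values first and calling the DataFrame constructor once. This is only a sketch, assuming every key (i, j) with 0 <= i <= m and 0 <= j <= n is present. Deriving m and n from the keys is my own addition, not something the question specifies:

```python
import pandas as pd

matrix_dict = {
    (0, 0): 'RowIndexColumnIndex',
    (0, 1): 'Column1',
    (0, 2): 'Column2',
    (1, 0): 'Row1',
    (1, 1): 1,
    (1, 2): 2,
    (2, 0): 'Row2',
    (2, 1): 3,
    (2, 2): 4,
}

# Assumption: the largest indices in the keys give the matrix size
m = max(i for i, _ in matrix_dict)
n = max(j for _, j in matrix_dict)

# Column 0 holds the row names, row 0 holds the column names
row_names = [matrix_dict[(i, 0)] for i in range(1, m + 1)]
col_names = [matrix_dict[(0, j)] for j in range(1, n + 1)]

# Remaining entries form the body of the matrix, row by row
values = [[matrix_dict[(i, j)] for j in range(1, n + 1)]
          for i in range(1, m + 1)]

df = pd.DataFrame(values, index=row_names, columns=col_names)
```

Since the body here contains only numbers, the columns should come out with a numeric dtype directly, and a missing key raises a KeyError instead of being silently filled.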