Pandas Dataframe from matrix-like dictionary where keys are tuples of indices

Question:

I have a dictionary whose keys are tuples of the form (i,j) and whose values are matrix entries.

So if you think of a mathematical matrix $A = (a_{i,j})$ then matrix_dict[(i,j)] would give the value of row i and column j.

I would like a pandas DataFrame in which the values matrix_dict[(i,0)] for i in range(1, m+1) are the row names, the values matrix_dict[(0,j)] for j in range(1, n+1) are the column names, and every value whose tuple indices (i,j) are both nonzero becomes an entry of the DataFrame at the corresponding row and column.

The dictionary would look like this:

matrix_dict = {
    (0, 0): 'RowIndexColumnIndex',
    (0, 1): 'Column1',
    (0, 2): 'Column2',
    (1, 0): 'Row1',
    (1, 1): 1,
    (1, 2): 2,
    (2, 0): 'Row2',
    (2, 1): 3,
    (2, 2): 4
}

I thought it would be easy to convert that into a pandas dataframe as the structure already matches in a way, but the solutions I found on here using pd.DataFrame.from_dict are for different problems where the key tuple is supposed to become part of the dataframe or multi-indices.

Asked By: work flow

||

Answers:

If I understood correctly, use pandas.Series and unstack:

import pandas as pd

dic = {(0, 0): 1, (0, 1): 2, (1, 0): 3, (1, 1): 4, (2, 2): 5}

df = pd.Series(dic).unstack(fill_value=0)

Output:

   0  1  2
0  1  2  0
1  3  4  0
2  0  0  5
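
To see why this works, it can help to look at the intermediate object. A small sketch (the name s is only illustrative): a Series built from a dict with tuple keys gets a two-level MultiIndex, and unstack moves the second level into the columns.

```python
import pandas as pd

dic = {(0, 0): 1, (0, 1): 2, (1, 0): 3, (1, 1): 4, (2, 2): 5}

# Tuple keys become a two-level MultiIndex: level 0 = row i, level 1 = column j.
s = pd.Series(dic)
print(s.index.nlevels)  # 2

# unstack() pivots the innermost level (j) into columns; (i, j) pairs that
# are missing from the dict are filled with fill_value.
df = s.unstack(fill_value=0)
print(df.loc[2, 2])  # 5 (present in the dict)
print(df.loc[0, 2])  # 0 (missing, so filled)
```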

You can also reindex using m and n:

m, n = 4, 5

df = (pd.Series(dic).unstack(fill_value=0)
        .reindex(index=range(m), columns=range(n), fill_value=0)
     )

Output:

   0  1  2  3  4
0  1  2  0  0  0
1  3  4  0  0  0
2  0  0  5  0  0
3  0  0  0  0  0

For the updated question:

matrix_dict = {
    (0, 0): 'RowIndexColumnIndex',
    (0, 1): 'Column1',
    (0, 2): 'Column2',
    (1, 0): 'Row1',
    (1, 1): 1,
    (1, 2): 2,
    (2, 0): 'Row2',
    (2, 1): 3,
    (2, 2): 4
}

m, n = 2, 2

df = (pd.Series(matrix_dict).unstack(fill_value=0)
        .reindex(index=range(m+1), columns=range(n+1), fill_value=0)
        .set_index(0)
        .pipe(lambda d: d.set_axis(d.iloc[0], axis=1).iloc[1:])
        .rename_axis(index=None, columns=None)
     )

Output:

     Column1 Column2
Row1       1       2
Row2       3       4
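
One thing to watch for: because the Series mixes label strings and numbers, the reshaped values will most likely come out with object dtype. Below is a step-by-step sketch of the same reshaping that ends with infer_objects() to recover numeric columns. The names wide, rows and cols are illustrative, and it assumes every (i, j) pair is present in the dict.

```python
import pandas as pd

matrix_dict = {
    (0, 0): 'RowIndexColumnIndex',
    (0, 1): 'Column1',
    (0, 2): 'Column2',
    (1, 0): 'Row1',
    (1, 1): 1,
    (1, 2): 2,
    (2, 0): 'Row2',
    (2, 1): 3,
    (2, 2): 4,
}

# Full grid, including the header row (i = 0) and header column (j = 0).
wide = pd.Series(matrix_dict).unstack()

rows = wide.iloc[1:, 0].tolist()   # values at (i, 0), i >= 1
cols = wide.iloc[0, 1:].tolist()   # values at (0, j), j >= 1

df = (wide.iloc[1:, 1:]            # keep only the numeric body
          .set_axis(rows, axis=0)
          .set_axis(cols, axis=1)
          .infer_objects())        # object columns -> numeric where possible

print(df)
```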

Bonus, naming the index and column axes from the (0, 0) entry:

df = (pd.Series(matrix_dict).unstack(fill_value=0)
        .reindex(index=range(m+1), columns=range(n+1), fill_value=0)
        .set_index(0)
        .pipe(lambda d: d.set_axis(d.iloc[0], axis=1).iloc[1:])
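        # assumes the (0, 0) entry holds both axis names separated by a
        # backslash (row-axis name first, then column-axis name)
        .rename_axis(**dict(zip(('index', 'columns'),
                                matrix_dict[(0, 0)].split('\\'))))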
     )

Output:

ColumnIndex Column1 Column2
RowIndex                   
Row1              1       2
Row2              3       4

Answered By: mozway

This should work:

import pandas as pd

m, n = 2, 2  # matrix dimensions, from the question
# matrix_dict is the dictionary from the question

df = pd.DataFrame()

for j in range(1,n+1):
    df[matrix_dict[(0,j)]] = [matrix_dict[(i,j)] for i in range(1,m+1)]

df.index = [matrix_dict[(i,0)] for i in range(1,m+1)]
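
A variant of the same idea, as a sketch rather than a definitive version: build the frame in a single constructor call and read m and n off the keys instead of passing them in. It assumes every (i, j) pair from (0, 0) to (m, n) is present in the dict.

```python
import pandas as pd

matrix_dict = {
    (0, 0): 'RowIndexColumnIndex',
    (0, 1): 'Column1',
    (0, 2): 'Column2',
    (1, 0): 'Row1',
    (1, 1): 1,
    (1, 2): 2,
    (2, 0): 'Row2',
    (2, 1): 3,
    (2, 2): 4,
}

# Infer the dimensions from the largest row and column index in the keys.
m = max(i for i, _ in matrix_dict)
n = max(j for _, j in matrix_dict)

df = pd.DataFrame(
    # body: one inner list per matrix row
    [[matrix_dict[(i, j)] for j in range(1, n + 1)] for i in range(1, m + 1)],
    index=[matrix_dict[(i, 0)] for i in range(1, m + 1)],    # row names
    columns=[matrix_dict[(0, j)] for j in range(1, n + 1)],  # column names
)

print(df)
```

Since only the numeric body goes into the data argument here, the columns get a numeric dtype directly.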

Answered By: Pepe