Convert a Pandas DataFrame to a NumPy 2d Array based on 2 columns as dimensions and 1 as value

Question:

I have this dataframe:

    v  t  value
0   1  1  21662
1   1  2  18338
2   1  3  17400
3   2  1  21925
4   2  2  18328
5   2  3  25106
6   3  1  22017
7   3  2  18526
8   3  3  15896
9   4  1  16300
10  4  2  16826
11  4  3  21097
12  5  1  22497
13  5  2  14131
14  5  3  14302

and I need to convert it into an array equal to this:

[
    [21662, 18338, 17400],
    [21925, 18328, 25106],
    [22017, 18526, 15896],
    [16300, 16826, 21097],
    [22497, 14131, 14302]
]

The idea is to create one sub-array (row) per value of the "v" column, with the entries inside each row ordered by the "t" column.

Asked By: SoleeL

||

Answers:

You need a pivot, with "v" as the index (rows) and "t" as the columns:

df.pivot(index='v', columns='t', values='value').to_numpy()
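
For anyone who wants to try this without retyping the table, here is a minimal self-contained sketch. It rebuilds the DataFrame from the question by hand and applies the same pivot; the variable name arr is just illustrative.

```python
import pandas as pd

# Rebuild the DataFrame from the question.
df = pd.DataFrame({
    "v": [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5],
    "t": [1, 2, 3] * 5,
    "value": [21662, 18338, 17400,
              21925, 18328, 25106,
              22017, 18526, 15896,
              16300, 16826, 21097,
              22497, 14131, 14302],
})

# Rows come from "v", columns from "t", cells from "value".
arr = df.pivot(index="v", columns="t", values="value").to_numpy()
print(arr.shape)  # (5, 3)
```

Note that pivot raises a ValueError if the same (v, t) pair appears more than once, so this assumes each pair is unique, as it is in the question's data.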

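If some (v, t) combinations can be missing, reindex over the full ranges so that the array keeps its shape. The missing cells become NaN, which also makes the result a float array: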

(df.pivot(index='v', columns='t', values='value')
   .reindex(index=range(1, df['v'].max()+1),
            columns=range(1, df['t'].max()+1))
   .to_numpy()
)

Output:

array([[21662, 18338, 17400],
       [21925, 18328, 25106],
       [22017, 18526, 15896],
       [16300, 16826, 21097],
       [22497, 14131, 14302]])
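
To illustrate the missing-value case, here is a small sketch on made-up data (df_sparse and its values are hypothetical, not from the question). The row for (v=2, t=2) is absent and v=3 does not appear at all. The final fillna(0) step is an optional choice, assuming 0 is an acceptable placeholder for your data.

```python
import numpy as np
import pandas as pd

# Hypothetical sparse data: (v=2, t=2) is missing and v=3 is absent entirely.
df_sparse = pd.DataFrame({
    "v": [1, 1, 2, 4],
    "t": [1, 2, 1, 2],
    "value": [10, 20, 30, 40],
})

# Pivot, then reindex so every v in 1..max(v) and t in 1..max(t) gets a slot.
grid = (
    df_sparse.pivot(index="v", columns="t", values="value")
    .reindex(index=range(1, df_sparse["v"].max() + 1),
             columns=range(1, df_sparse["t"].max() + 1))
)

arr_sparse = grid.to_numpy()                     # float array, NaN where data is missing
filled = grid.fillna(0).astype(int).to_numpy()   # optional: replace NaN with 0

print(arr_sparse.shape)  # (4, 2)
```

Without the reindex, the row for v=3 would simply be dropped and the array would have 3 rows instead of 4.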
Answered By: mozway