Convert a Pandas DataFrame to a NumPy 2d Array based on 2 columns as dimensions and 1 as value
Question:
I have this dataframe:
    v  t  value
0   1  1  21662
1   1  2  18338
2   1  3  17400
3   2  1  21925
4   2  2  18328
5   2  3  25106
6   3  1  22017
7   3  2  18526
8   3  3  15896
9   4  1  16300
10  4  2  16826
11  4  3  21097
12  5  1  22497
13  5  2  14131
14  5  3  14302
and I need to convert it into an array equal to this:
[
[21662, 18338, 17400],
[21925, 18328, 25106],
[22017, 18526, 15896],
[16300, 16826, 21097],
[22497, 14131, 14302]
]
The idea is to create one sub-array per value of the "v" column, with the values ordered by the "t" column.
Answers:
You need a pivot:
df.pivot(index='v', columns='t', values='value').to_numpy()
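For reference, here is a self-contained sketch of the same call that rebuilds the question's dataframe first. The construction with np.repeat and np.tile and the name arr are only assumptions made to keep the example short; the pivot line itself is the one from the answer.

```python
import numpy as np
import pandas as pd

# Rebuild the dataframe from the question (construction is illustrative).
df = pd.DataFrame({
    "v": np.repeat([1, 2, 3, 4, 5], 3),   # 1,1,1,2,2,2,...
    "t": np.tile([1, 2, 3], 5),           # 1,2,3,1,2,3,...
    "value": [21662, 18338, 17400,
              21925, 18328, 25106,
              22017, 18526, 15896,
              16300, 16826, 21097,
              22497, 14131, 14302],
})

# One row per "v", one column per "t", cells taken from "value".
arr = df.pivot(index="v", columns="t", values="value").to_numpy()
print(arr.shape)  # (5, 3)
```

Note that pivot requires each (v, t) pair to appear at most once; with duplicated pairs it raises a ValueError.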
If some (v, t) combinations can be missing, reindex so that every row and column is present:
(df.pivot(index='v', columns='t', values='value')
   .reindex(index=range(1, df['v'].max()+1),
            columns=range(1, df['t'].max()+1))
   .to_numpy()
)
Output:
array([[21662, 18338, 17400],
       [21925, 18328, 25106],
       [22017, 18526, 15896],
       [16300, 16826, 21097],
       [22497, 14131, 14302]])
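To see what the reindex variant does when data really is missing, here is a minimal sketch on a reduced copy of the question's data. The reduced dataframe (all of v=2 and the pair v=3, t=2 dropped) and the choice of 0 as a fill value are assumptions for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical subset of the question's data: v=2 is absent entirely,
# and the (v=3, t=2) combination is missing.
df = pd.DataFrame({
    "v": [1, 1, 1, 3, 3],
    "t": [1, 2, 3, 1, 3],
    "value": [21662, 18338, 17400, 22017, 15896],
})

full = (df.pivot(index="v", columns="t", values="value")
          .reindex(index=range(1, df["v"].max() + 1),
                   columns=range(1, df["t"].max() + 1)))

# Missing cells become NaN, so the resulting array is float, not int.
arr = full.to_numpy()

# Assumption: 0 is an acceptable placeholder; pick whatever suits your data.
filled = full.fillna(0).astype(int).to_numpy()
```

The row for v=2 exists in both arrays even though the dataframe has no such rows: it is all NaN in arr and all zeros in filled.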