How to convert a DataFrame to a tensor

Question

I have a dataframe like this:

         ids  dim
0         1    2
1         1    0
2         1    1
3         2    1
4         2    2
5         3    0
6         3    2
7         4    1
8         4    2
9         Nan  0
10        Nan  1
11        Nan  0

I want to build a TensorFlow tensor out of it. The columns of the tensor correspond to the dim column in df: since dim has three distinct values (0, 1, 2), the tensor has three columns.

The values of the tensor are the associated ids in df, so the result should look like this:

1   1   1
Nan 2   2
3   Nan 3
Nan 4   4

What I did:

I tried converting the df to a NumPy array and then to a tensor, but the result does not look like what I want:

tf.constant(df[['ids', 'dim']].values, dtype=tf.int32)

Asked By: sariii

Answer 1

Check my code:

import numpy as np
import pandas as pd
import tensorflow as tf


df = pd.DataFrame([[1, 2],
                   [1, 0],
                   [1, 1],
                   [2, 1],
                   [2, 2],
                   [3, 0],
                   [3, 2],
                   [4, 1],
                   [4, 2],
                   [np.nan, 0],
                   [np.nan, 1],
                   [np.nan, 0]], columns=['ids', 'dim'])
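# Row order that sorts the frame by dim (0s first, then 1s, then 2s).
dim_array = np.array(df['dim'])
sort = dim_array.argsort()
# One row per dim value, then transpose so each dim becomes a column.
# The (3, 4) shape assumes every dim value occurs exactly four times.
final = np.array([df.ids[sort]]).reshape((3, 4)).T
final_result = tf.constant(final, dtype=tf.int32)  # use tf.float32 to retain NaN in the tensor
print(final_result)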

# <tf.Tensor: shape=(4, 3), dtype=int32, numpy=
# array([[          1,           1,           1],
#        [          3,           2,           2],
#        [-2147483648,           4,           3],
#        [-2147483648, -2147483648,           4]], 
# dtype=int32)>

An int32 tensor cannot represent NaN, so each NaN ends up as -2147483648 in the output above.
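
If an integer tensor is needed anyway, one option (not part of the original answer) is to replace NaN with an explicit sentinel before the cast. A minimal NumPy sketch, assuming -1 never occurs as a real id:

```python
import numpy as np

# Same layout as `final` in the answer above, kept as float so NaN survives.
final = np.array([[1, 1, 1],
                  [3, 2, 2],
                  [np.nan, 4, 3],
                  [np.nan, np.nan, 4]], dtype=np.float32)

MISSING = -1  # hypothetical sentinel; pick a value that cannot be a real id

# Remember where the gaps were, then fill them before casting to int32.
missing_mask = np.isnan(final)
filled = np.where(missing_mask, MISSING, final).astype(np.int32)
print(filled)
```

Passing filled to tf.constant(filled, dtype=tf.int32) should then give an integer tensor with a predictable placeholder, while tf.constant(final, dtype=tf.float32) keeps the NaN values instead.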

Answered By: Davinder Singh

Answer 2

You can use pd.pivot_table() for a concise solution:

import numpy as np
import pandas as pd
import tensorflow as tf

df = pd.DataFrame([[1, 2],
                   [1, 0],
                   [1, 1],
                   [2, 1],
                   [2, 2],
                   [3, 0],
                   [3, 2],
                   [4, 1],
                   [4, 2],
                   [np.nan, 0],
                   [np.nan, 1],
                   [np.nan, 0]], columns=['ids', 'dim'])

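# Mark every (ids, dim) pair that occurs, then pivot ids into rows and dim into columns.
df['val'] = 1
df = df.pivot_table(index='ids', columns='dim', values='val')
# Replace each 1 with the id of its row; cells with no pair stay NaN.
df = df.multiply(np.array(df.index), axis=0)

tensor = tf.constant(df.to_numpy())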
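
For comparison, the same layout can also be built without a pivot, by scattering each id into its (id, dim) cell of a NaN-filled array. This is only a sketch: ids_by_dim is a hypothetical helper name, and it assumes that rows with a missing id should be dropped, as in the desired output from the question.

```python
import numpy as np
import pandas as pd

def ids_by_dim(frame):
    """Hypothetical helper: one row per distinct id, one column per distinct dim."""
    valid = frame.dropna(subset=["ids"])  # assumption: rows without an id are ignored
    row_codes, row_ids = pd.factorize(valid["ids"], sort=True)
    col_codes, col_dims = pd.factorize(valid["dim"], sort=True)
    # Start from all-NaN, then write each id into its (row, column) cell.
    out = np.full((len(row_ids), len(col_dims)), np.nan, dtype=np.float32)
    out[row_codes, col_codes] = valid["ids"].to_numpy()
    return out

df = pd.DataFrame([[1, 2], [1, 0], [1, 1], [2, 1], [2, 2], [3, 0],
                   [3, 2], [4, 1], [4, 2], [np.nan, 0], [np.nan, 1],
                   [np.nan, 0]], columns=["ids", "dim"])

result = ids_by_dim(df)
print(result)
```

The float32 array can then be handed to tf.constant(result), which keeps the NaN cells.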

Answered By: thushv89

Answer 3

Try PyTorch, which converts the DataFrame's values directly:

import torch

item = torch.tensor(df.values)

Answered By: Golden Lion
