How can I transform 3-column pandas dataframe to matrix format in Python?

Question

I have 3 columns in my DataFrame, namely X, Y and Z, all holding numerical values. I want to transform Z into a matrix based on X and Y. X and Y have duplicate entries, hence a pivot table doesn’t work.
My code (n = number of rows):

mat = numpy.zeros((n, n))
for i in range(0, n):
    for j in range(0, n):
        if Y[j] == Y[i]:
            mat[i, j] = Z[j]
        if X[j] == X[i]:
            mat[i, j] = Z[i]

yields

[[6 10  0 0]
 [6 10 10 0]
 [0 10 10 0]
 [0  0  0 6]]

Data looks like:

X = array([100, 10, 10, 50])
Y = array([20, 20, 40, 60])
Z = array([6, 10, 10, 6])

So the correct matrix should be:

[[6 10 10 0]
 [6 10 10 0]
 [0 10 10 0]
 [0  0  0 6]]

which is obtained by:

   | 100  10  10  50
--------------------
20 | 6   10  10   0
--------------------
20 | 6   10  10   0
--------------------
40 | 0   10  10   0
--------------------
60 | 0    0   0   6
--------------------

Asked By: Ranja Sarkar


Answer 1

I currently don’t see how to do this faster than with two for-loops. This should work:

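import numpy as np
import pandas as pd

data = pd.DataFrame({
    'X': np.array([100, 10, 10, 50]),
    'Y': np.array([20, 20, 40, 60]),
    'Z': np.array([6, 10, 10, 6])
})

# Map each (x, y) pair to its z value, one entry per row of the data frame.
mapping = {(x, y): z for (x, y, z) in data[["X", "Y", "Z"]].values}

n = len(data)
mat = np.zeros((n, n))

# Column i follows the i-th X value, row j follows the j-th Y value;
# (x, y) pairs that do not occur in the data stay 0.
for i, x in np.ndenumerate(data["X"]):
    for j, y in np.ndenumerate(data["Y"]):
        mat[j, i] = mapping.get((x, y), 0)

print(mat)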

Output:

[[ 6. 10. 10.  0.]
 [ 6. 10. 10.  0.]
 [ 0. 10. 10.  0.]
 [ 0.  0.  0.  6.]]

I am creating a mapping that corresponds to the assignment of (x, y) ⟶ z.
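To make the mapping concrete, here is a small sketch of what it holds for the sample data. It uses plain Python lists and `zip` instead of the data frame purely for readability; the names `present` and `missing` are only there for illustration.

```python
# Same sample data as in the question, as plain Python lists.
X = [100, 10, 10, 50]
Y = [20, 20, 40, 60]
Z = [6, 10, 10, 6]

# One dictionary entry per row: the key is the (x, y) pair, the value is z.
mapping = {(x, y): z for x, y, z in zip(X, Y, Z)}

# A pair that occurs in the data returns its z value ...
present = mapping.get((10, 40), 0)
# ... and a pair that never occurs falls back to the default 0.
missing = mapping.get((50, 20), 0)

print(mapping)
print(present, missing)
```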

With this in place, filling the result matrix mat is pretty straightforward.

Note, however, that if there are multiple rows with the same values for both x and y, the z value of the last such row is the one that ends up in the matrix.
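If the double loop ever becomes a bottleneck, one possible loop-free variant is sketched below. It is an alternative to the approach above, not part of it. It assumes the same sample data, keeps the last row for each duplicated (X, Y) pair via `drop_duplicates(keep="last")`, pivots the remaining rows into a Y-by-X lookup table, and then uses `reindex` to repeat rows and columns so the result lines up with the original row order. The names `unique_pairs` and `table` are made up for this illustration.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({
    "X": [100, 10, 10, 50],
    "Y": [20, 20, 40, 60],
    "Z": [6, 10, 10, 6],
})

# Keep one row per (X, Y) pair; keep="last" mirrors the "last row wins"
# behaviour of the dictionary-based mapping.
unique_pairs = data.drop_duplicates(subset=["X", "Y"], keep="last")

# Lookup table with unique Y values as rows and unique X values as columns;
# pairs that do not occur in the data are NaN here.
table = unique_pairs.pivot(index="Y", columns="X", values="Z")

# Repeat rows/columns to follow the original order of Y and X,
# then replace the missing pairs with 0.
mat = (
    table.reindex(index=data["Y"].to_numpy(), columns=data["X"].to_numpy())
    .fillna(0)
    .to_numpy()
)

print(mat)
```

The result is a float array, because the missing pairs pass through NaN before being filled with 0.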

Answered By: Lydia van Dyke
