Array of arrays: how can I use the items in one array to match the items in another array using NumPy? (ML)

Question:

I have an array of shape (n, 1) containing my ids. Another array has shape (n, p). I want to pair each item of the first array with the corresponding row of the second array.

Example.

Input
Arr_1 = ([[100], [200], [300]])

Arr_2 = ([[1,2,3], [4,5,6], [7,8,9]])

Output

Arr_3 = ([[[100],[1,2,3]], [[200],[4,5,6]], [[300],[7,8,9]]])

In my code, 'Arr_1' (8000, 1) corresponds to the user_id, and 'Arr_2' (8000, 1000) corresponds to tokenized text data that is now an np array. Both of these arrays are meant to be the X input for an NN model.

Asked By: Giovanni Meoño


Answers:

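Arr_1 = [[100], [200], [300]]
Arr_2 = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Append each id (still a one-element list) to its matching row of Arr_2.
# Run this only once: every extra run appends the id again.
for i in range(len(Arr_1)):
    Arr_2[i].append(Arr_1[i])

print(Arr_2)
# [[1, 2, 3, [100]], [4, 5, 6, [200]], [7, 8, 9, [300]]]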

Hope this helps

Answered By: Uday Raj

I can’t see how this can be done with a comprehension. If Arr_1 can be modified then this works:

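# Create the iterator once; calling iter(Arr_2) inside the loop would restart at the first row every time
rows = iter(Arr_2)
for a in Arr_1:
    a.append(next(rows))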

Otherwise first make a deep copy of Arr_1.
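The deep-copy suggestion could be sketched roughly as follows. This is only an illustration on the question's sample data; the name `combined` is hypothetical, and `zip` is used here to pair the rows.

```python
import copy

Arr_1 = [[100], [200], [300]]
Arr_2 = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Deep-copy Arr_1 so that its inner lists are not mutated by append().
combined = copy.deepcopy(Arr_1)
for a, b in zip(combined, Arr_2):
    a.append(b)

print(combined)  # [[100, [1, 2, 3]], [200, [4, 5, 6]], [300, [7, 8, 9]]]
print(Arr_1)     # [[100], [200], [300]]  (unchanged)
```

A shallow copy such as `list(Arr_1)` would not be enough here, because the inner lists would still be shared with the original.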

Answered By: user19077881

x = [[100], [200], [300]]
y = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

z = [[i, j] for i, j in zip(x, y)]
print(z)
# [[[100], [1, 2, 3]], [[200], [4, 5, 6]], [[300], [7, 8, 9]]]
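If the inputs are NumPy arrays of shape (n, 1) and (n, p), as described in the question, the nested pairs above cannot be stored as a regular rectangular array. One possible alternative, sketched here under the assumption that a single (n, p + 1) array is acceptable, is to put each id in front of its row with `np.hstack`. The names `ids`, `tokens` and `combined` are hypothetical.

```python
import numpy as np

ids = np.array([[100], [200], [300]])                  # shape (n, 1)
tokens = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])   # shape (n, p)

# Stack column-wise: each row becomes [id, token_1, ..., token_p].
combined = np.hstack([ids, tokens])

print(combined.shape)     # (3, 4)
print(combined.tolist())  # [[100, 1, 2, 3], [200, 4, 5, 6], [300, 7, 8, 9]]
```

Row i of `combined` still corresponds to row i of both inputs, so the id can be recovered later as `combined[:, :1]` and the tokens as `combined[:, 1:]`.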

Answered By: Djinn

If you want this data to be used in an NN model, you could put it into a Dataset, together with the Y data.

from tensorflow.data import Dataset

Arr_1 = [100, 200, 300]
Arr_2 = [[1,2,3], [4,5,6], [7,8,9]]
Y = [5, 6, 7]

dataset = Dataset.from_tensor_slices(((Arr_1, Arr_2), Y)).batch(1)

You can print out some values:

for x, y in dataset.take(3):
    print(f'x = {x}')
    print(f'y = {y}')

Output:

x = (<tf.Tensor: shape=(1,), dtype=int32, numpy=array([100], dtype=int32)>, <tf.Tensor: shape=(1, 3), dtype=int32, numpy=array([[1, 2, 3]], dtype=int32)>)
y = [5]
x = (<tf.Tensor: shape=(1,), dtype=int32, numpy=array([200], dtype=int32)>, <tf.Tensor: shape=(1, 3), dtype=int32, numpy=array([[4, 5, 6]], dtype=int32)>)
y = [6]
x = (<tf.Tensor: shape=(1,), dtype=int32, numpy=array([300], dtype=int32)>, <tf.Tensor: shape=(1, 3), dtype=int32, numpy=array([[7, 8, 9]], dtype=int32)>)
y = [7]

The dataset contains both X and Y, so you call the fit method with only the dataset (without the y parameter):

model.fit(dataset, epochs=2000)

Full example:

import tensorflow.keras.layers as L
from tensorflow.keras import Model
from tensorflow.data import Dataset

Arr_1 = [100, 200, 300]
Arr_2 = [[1,2,3], [4,5,6], [7,8,9]]
Y = [5, 6, 7]

dataset = Dataset.from_tensor_slices(((Arr_1, Arr_2), Y)).batch(1)

for x, y in dataset.take(3):
    print(f'x = {x}')
    print(f'y = {y}')

input_1 = L.Input(shape=(1,))
input_2 = L.Input(shape=(3,))
concat = L.Concatenate(axis=1)([input_1, input_2])
output = L.Dense(1)(concat)

model = Model(inputs=[input_1, input_2], outputs=output)
model.compile(loss='mse', optimizer='Adam')
model.fit(dataset, epochs=2000)

x, y = next(iter(dataset))
print(f'x = {x}')
print(f'y_true = {y}')
print(f'model prediction: {model.predict(x)}')

Answered By: AndrzejO