Array of arrays: how can I use the items in one array to match the items in another array using NumPy? (ML)
Question:
I have an (n, 1)-dimensional array containing my ids and another array that is (n, p)-dimensional. I want to pair each item of the first array with the corresponding row of the second array.
Example.
Input
Arr_1 = ([[100], [200], [300]])
Arr_2 = ([[1,2,3], [4,5,6], [7,8,9]])
Output
Arr_3 = [[[100], [1,2,3]], [[200], [4,5,6]], [[300], [7,8,9]]]
In my code, Arr_1 (shape (8000, 1)) holds the user_id values, and Arr_2 (shape (8000, 1000)) holds tokenized text data that has been converted to a NumPy array. Both arrays are meant to be the X input for a NN model.
Answers:
for i in range(len(Arr_1)):
    Arr_2[i].append(Arr_1[i])
print(Arr_2)
[[1, 2, 3, [100]], [4, 5, 6, [200]], [7, 8, 9, [300]]]
Hope this helps
I can’t see how this can be done with a comprehension. If Arr_1 can be modified then this works:
for a, b in zip(Arr_1, Arr_2):
    a.append(b)
Otherwise first make a deep copy of Arr_1.
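The deep-copy suggestion above could look roughly like this. It is only a sketch, assuming plain Python lists as in the question's example; the names ids, rows and paired are illustrative and not from the original answer.

```python
import copy

ids = [[100], [200], [300]]
rows = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Deep-copy first so that appending does not modify the original id lists.
paired = copy.deepcopy(ids)
for id_item, row in zip(paired, rows):
    id_item.append(row)

print(paired)  # [[100, [1, 2, 3]], [200, [4, 5, 6]], [300, [7, 8, 9]]]
print(ids)     # [[100], [200], [300]]
```

Note that each id and its row end up in the same inner list, which is slightly different from the [[100], [1, 2, 3]] pairs shown in the question.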
x = [[100], [200], [300]]
y = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
z = [[i, j] for i, j in zip(x, y)]
print(z)
# [[[100], [1, 2, 3]], [[200], [4, 5, 6]], [[300], [7, 8, 9]]]
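Since the question mentions NumPy arrays, here is a hedged alternative that stays in NumPy. It assumes it is acceptable to place the id in the same row as the tokens, giving an (n, p + 1) array; a regular numeric array cannot hold the ragged [[100], [1, 2, 3]] pairs directly. The names arr_1, arr_2 and combined are illustrative.

```python
import numpy as np

arr_1 = np.array([[100], [200], [300]])              # shape (n, 1)
arr_2 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])  # shape (n, p)

# Stack column-wise: each row becomes [id, token_1, ..., token_p].
combined = np.hstack([arr_1, arr_2])

print(combined.shape)     # (3, 4)
print(combined.tolist())  # [[100, 1, 2, 3], [200, 4, 5, 6], [300, 7, 8, 9]]
```

Whether feeding a raw id as a numeric feature makes sense depends on the model; keeping the two arrays as separate inputs is the other option.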
If you want this data to be used in an NN model, you could put it into a Dataset, together with the Y data.
from tensorflow.data import Dataset
Arr_1 = [100, 200, 300]
Arr_2 = [[1,2,3], [4,5,6], [7,8,9]]
Y = [5, 6, 7]
dataset = Dataset.from_tensor_slices(((Arr_1, Arr_2), Y)).batch(1)
You can print out some values:
for x, y in dataset.take(3):
    print(f'x = {x}')
    print(f'y = {y}')
Output:
x = (<tf.Tensor: shape=(1,), dtype=int32, numpy=array([100], dtype=int32)>, <tf.Tensor: shape=(1, 3), dtype=int32, numpy=array([[1, 2, 3]], dtype=int32)>)
y = [5]
x = (<tf.Tensor: shape=(1,), dtype=int32, numpy=array([200], dtype=int32)>, <tf.Tensor: shape=(1, 3), dtype=int32, numpy=array([[4, 5, 6]], dtype=int32)>)
y = [6]
x = (<tf.Tensor: shape=(1,), dtype=int32, numpy=array([300], dtype=int32)>, <tf.Tensor: shape=(1, 3), dtype=int32, numpy=array([[7, 8, 9]], dtype=int32)>)
y = [7]
The dataset contains both X and Y, so you call the fit method with just the dataset (without the y parameter):
model.fit(dataset, epochs=2000)
Full example:
import tensorflow.keras.layers as L
from tensorflow.keras import Model
from tensorflow.data import Dataset
Arr_1 = [100, 200, 300]
Arr_2 = [[1,2,3], [4,5,6], [7,8,9]]
Y = [5, 6, 7]
dataset = Dataset.from_tensor_slices(((Arr_1, Arr_2), Y)).batch(1)
for x, y in dataset.take(3):
    print(f'x = {x}')
    print(f'y = {y}')
input_1 = L.Input(shape=(1,))
input_2 = L.Input(shape=(3,))
concat = L.Concatenate(axis=1)([input_1, input_2])
output = L.Dense(1)(concat)
model = Model(inputs=[input_1, input_2], outputs=output)
model.compile(loss='mse', optimizer='Adam')
model.fit(dataset, epochs=2000)
x, y = next(iter(dataset))
print(f'x = {x}')
print(f'y_true = {y}')
print(f'model prediction: {model.predict(x)}')