Extract only a portion of a numpy array from tf.data

Question:

I have a NumPy array of shape (500, 36, 24, 72) and want to build a data pipeline for it using tf.data. For every iteration, only a subset of the array is required. For example, the model is first trained on [500, x:y, 24, 72], where only a slice of the second dimension is taken.

ds1 = tf.data.Dataset.zip((tf.data.Dataset.from_tensor_slices(data)))

Applying a filter to the above dataset doesn’t seem to work:

ds2 = ds1.filter(lambda x: x[1:3][:][:])
Asked By: Manvendra

Answers:

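Use tf.data.Dataset.map instead of filter. filter expects a predicate that returns a scalar boolean and keeps or drops whole elements, whereas map transforms each element, which is what slicing requires: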

import numpy as np
import tensorflow as tf

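data = np.random.random((500, 36, 24, 72))
# from_tensor_slices splits along the first axis, so each element has shape (36, 24, 72)
ds1 = tf.data.Dataset.zip((tf.data.Dataset.from_tensor_slices(data)))
# Axis 0 of each element is the second dimension of the original array
ds2 = ds1.map(lambda x: x[1:3, ...])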
Answered By: AloneTogether
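
The question asks for a general x:y range rather than the fixed 1:3, so here is a minimal sketch of how the map approach might be parameterized. The helper name make_sliced_dataset and the smaller first dimension (10 instead of 500, to keep the example light) are assumptions made for illustration. They do not come from the original answer.

```python
import numpy as np
import tensorflow as tf

# Assumption: a smaller first dimension than the question's 500, to keep this light.
data = np.random.random((10, 36, 24, 72)).astype(np.float32)

def make_sliced_dataset(array, start, stop):
    """Hypothetical helper: yields elements of array restricted to [start:stop] on axis 1."""
    ds = tf.data.Dataset.from_tensor_slices(array)
    # Each element has shape (36, 24, 72), so axis 0 of the element
    # corresponds to axis 1 of the original array.
    return ds.map(lambda x: x[start:stop, ...])

ds_a = make_sliced_dataset(data, 1, 3)    # elements of shape (2, 24, 72)
ds_b = make_sliced_dataset(data, 3, 10)   # elements of shape (7, 24, 72)

# Batching restores a leading dimension in front of the sliced one.
batch = next(iter(ds_a.batch(4)))
print(batch.shape)  # (4, 2, 24, 72)
```

If the range is known before the dataset is built, slicing the NumPy array first, as in tf.data.Dataset.from_tensor_slices(data[:, 1:3]), should give the same elements without a map step.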