4x change in number of numpy array elements resulting in 32X increase in numpy file size

Question

I have a numpy array of shape (1684, 129, 522): 1684 frames, each of dimensions 129 x 522. There is only 1 channel, so I have not given it its own axis in the array.

I was writing a function that takes 4 of these frames (each 129 x 522) at a time and creates a new input numpy array of shape (4, 129, 522).

Hence, the net result would be a numpy array of shape (1684, 4, 129, 522), built from an original array of shape (1684, 129, 522).
Code below:

Function definition:

import numpy as np
from collections import deque

def create_frame_windows(episode, frame_window_length=4):
    episode_length, dim1, dim2=episode.shape
    new_episode=np.zeros((episode_length,frame_window_length,dim1, dim2))
    data_q_deque=deque(maxlen=4)
    for _ in range(frame_window_length):
        data_q_deque.append(np.zeros((dim1, dim2)))
    data_q=np.array(data_q_deque)
    print('Initial data queue',data_q.shape)
    for frame_no in range(len(episode)):
        frame=episode[frame_no]
        data_q[:-1]=data_q[1:]; data_q[-1]=frame
        new_episode[frame_no]=data_q
    print('New episode length',new_episode.shape)
    return new_episode

Run the function:

episode=np.load(os.path.join(paths.INPUT_DATA_PATH,epi_file))
print('Episode shape',episode.shape)
print('Initial size',sys.getsizeof(episode))
final_episode=create_frame_windows(episode,4)
print('Final episode shape',final_episode.shape)
print('Final size',sys.getsizeof(final_episode))

Code output:

Episode shape (1684, 129, 522)
Initial size 113397336
Final episode shape (1684, 4, 129, 522)
Final size 3628710304

My issue is that while the shape of the final episode is as expected, its size is 32X the size of the original episode array (3628710304 / 113397336 = 31.99), for only a 4X increase in the number of elements.

Have I written the function wrong, or is there a more logical explanation for why this is happening, i.e. a 32X increase in the array's size (as reported by sys.getsizeof) for a 4X increase in the number of elements?

Asked By: Swami


Answer 1

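It's likely that the original array consists of small integers. The reported sizes work out to about 1 byte per element for the original (113397336 bytes for 1684 x 129 x 522 = 113397192 elements), which fits a dtype such as uint8. np.zeros creates float64 by default, which takes 8 bytes per element, so 4X the elements at 8X the bytes per element gives the 32X you see.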
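A quick way to check this is to compare dtypes and byte counts directly. The sketch below uses a small stand-in array instead of the real file, and the uint8 dtype is an assumption about what the original episode holds, not something the question confirms.

```python
import numpy as np

# Stand-in for the loaded episode: fewer frames, and uint8 is an assumed dtype.
frames = np.zeros((10, 129, 522), dtype=np.uint8)

# Same call pattern as in the question: no dtype given, so float64 is used.
windows = np.zeros((10, 4, 129, 522))

elem_ratio = windows.size / frames.size            # growth in element count
bytes_ratio = windows.itemsize / frames.itemsize   # growth in bytes per element
size_ratio = windows.nbytes / frames.nbytes        # growth in total data bytes

print(frames.dtype, windows.dtype)
print(elem_ratio, bytes_ratio, size_ratio)
```

Running `episode.dtype` and `final_episode.dtype` on the real arrays would confirm whether this is what happened.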

You can pass a datatype to np.zeros, for example dtype=episode.dtype, so that the new array keeps the element size of the original.
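As a minimal sketch of that change, here is a version of the function from the question that allocates both arrays with the input's dtype. The deque is dropped in favour of a plain zero-filled buffer, which is a simplification of mine and not part of the original code. The small test episode is hypothetical.

```python
import numpy as np

def create_frame_windows(episode, frame_window_length=4):
    """Stack each frame with the frames before it, keeping the input dtype."""
    episode_length, dim1, dim2 = episode.shape
    # Allocate with the input dtype instead of the float64 default.
    new_episode = np.zeros(
        (episode_length, frame_window_length, dim1, dim2), dtype=episode.dtype
    )
    # Rolling buffer of the most recent frames, zero-padded at the start.
    data_q = np.zeros((frame_window_length, dim1, dim2), dtype=episode.dtype)
    for frame_no in range(episode_length):
        data_q[:-1] = data_q[1:]        # shift older frames towards the front
        data_q[-1] = episode[frame_no]  # newest frame goes last
        new_episode[frame_no] = data_q
    return new_episode

# Hypothetical small episode standing in for the real file.
episode = np.arange(5 * 2 * 3, dtype=np.uint8).reshape(5, 2, 3)
final_episode = create_frame_windows(episode)
print(final_episode.shape, final_episode.dtype)
print(final_episode.nbytes / episode.nbytes)
```

With the dtype preserved, the data size grows only with the number of elements, i.e. by the window length.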

Answered By: Tim Roberts
