Average consecutive nth row in numpy

Question:

I have a 3D numpy array.

x=np.random.randint(low=0,high=10,size=(100,64,1000))

I want to average every 4 consecutive rows along the first axis: rows 0-3, then 4-7, then 8-11, and so on.
I tried the following:

x = np.split(x, len(x)//4)
np.mean(np.stack(x), 1)

I am a bit confused about whether this is the correct way, or whether there is a better way. Also, how should I handle the case where the first dimension is not evenly divisible by 4?
For example, I can do it this way:

x = np.array_split(x, len(x)//4)
np.stack([np.mean(i, 0) for i in x], 0)

Thanks

EDIT:
Here is my use case.
This is sensor data, where 100 is the number of times data was collected (trials), 64 is the number of sensor channels, and 1000 is the length of the sensor signal. I want the sensor signal to be averaged over the first 4 trials, then the next 4 trials, and so on.

Asked By: Talha Anwar


Answers:

The fastest way is probably to reshape the array and then use mean.

import numpy as np
x = np.random.randint(low=0, high=10, size=(100, 64, 6144))
# reshape raises a ValueError if x.shape[0] is not divisible by 4!
# split axis 0 into (groups, 4) so that each group holds 4 consecutive trials
temp = x.reshape((-1, 4, x.shape[1], x.shape[2]))
# average over the axis of length 4
result = temp.mean(axis=1)  # shape (25, 64, 6144)

You will need to check whether the shape is divisible by 4 and, if it is not, make it divisible.
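One way to make the first dimension divisible is sketched below. It is not part of the original answer and assumes that dropping the leftover trials at the end is acceptable for your use case; the names group, n_keep and x_trimmed are illustrative, and the last axis is shortened to keep the example light.

```python
import numpy as np

x = np.random.randint(low=0, high=10, size=(109, 64, 50))

group = 4
# number of leading rows that form complete groups of 4 (108 here)
n_keep = (x.shape[0] // group) * group
x_trimmed = x[:n_keep]

# axis 1 of the reshaped array holds 4 consecutive trials
result = x_trimmed.reshape((-1, group) + x.shape[1:]).mean(axis=1)
print(result.shape)  # (27, 64, 50)
```

If the trailing trials must be kept, padding instead of trimming is the alternative.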

You can also do it along any single dimension, for example the channel axis:

result = x.reshape((100,-1,4,6144)).mean(axis=2)

Simply split one dimension into a -1 (i.e. let numpy calculate it by itself) and a 4, in that order so that each group holds 4 consecutive samples, and then average over the axis of length 4.
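The order of the two new axes matters. The small check below (a sketch on a toy array, not part of the original answer) suggests that placing -1 before 4 groups consecutive rows, whereas placing 4 first groups rows that are a fixed stride apart:

```python
import numpy as np

# 8 rows whose values equal their row index, 2 columns
x = np.repeat(np.arange(8), 2).reshape(8, 2)

# (-1, 4): groups are rows 0-3 and rows 4-7
consecutive = x.reshape((-1, 4, 2)).mean(axis=1)

# (4, -1): groups are rows (0, 2, 4, 6) and rows (1, 3, 5, 7)
strided = x.reshape((4, -1, 2)).mean(axis=0)

print(consecutive[:, 0])  # [1.5 5.5]
print(strided[:, 0])      # [3. 4.]
```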

Answered By: user_na

Using np.reshape and np.mean for multiples

Try this with a reshape and then a mean over the specific axis:

x = np.random.randint(low=0,high=10,size=(100,64,1000))

#Reshape to (25, 4, 64, 1000) and then reduce axis 1 with a mean
x = x.reshape(-1,4,64,1000).mean(1)
x.shape
(25, 64, 1000)
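As a sanity check (a sketch, not part of the original answer), a reshape-based mean over blocks of 4 consecutive trials can be compared against an explicit split-and-average loop like the one in the question. The variable names fast and reference are illustrative, and the last axis is shortened to keep the check quick.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.integers(low=0, high=10, size=(100, 64, 20))

# reshape-based block mean over 4 consecutive trials
fast = x.reshape(-1, 4, 64, 20).mean(1)

# reference: split into 25 chunks of 4 trials and average each chunk
reference = np.stack([chunk.mean(0) for chunk in np.split(x, 25)], 0)

print(np.allclose(fast, reference))  # True
```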

Using np.pad, np.reshape and np.nanmean for non multiples

Adding this general solution in case the axis length is not a multiple of 4. It first pads the array with np.nan up to the next multiple of 4, then reshapes as before, and finally uses np.nanmean, which ignores nan values when computing the mean.

x = np.random.randint(low=0,high=10,size=(109,64,1000))

n = -x.shape[0] % 4      #rows needed to reach the next multiple of 4 (0 if already a multiple)
p = ((0,n),(0,0),(0,0))  #padding config for each axis

x_padded = np.pad(x.astype(float), p, 'constant', constant_values=np.nan)
x_reshaped = x_padded.reshape((-1,4,64,1000))
x_avg = np.nanmean(x_reshaped, 1)
x_avg.shape
(28, 64, 1000)
Answered By: Akshay Sehgal
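For comparison, here is a possible alternative for the non-multiple case that avoids padding. It is a sketch rather than part of either answer: it uses np.add.reduceat to sum blocks of rows and then divides by each block's size, so the final short block is averaged over the rows it actually contains. The names starts, sums and counts are illustrative.

```python
import numpy as np

x = np.random.randint(low=0, high=10, size=(109, 64, 50))

group = 4
starts = np.arange(0, x.shape[0], group)         # 0, 4, 8, ..., 108
sums = np.add.reduceat(x, starts, axis=0)        # sum of each block of rows
counts = np.diff(np.append(starts, x.shape[0]))  # 4, 4, ..., 4, 1
x_avg = sums / counts[:, None, None]

print(x_avg.shape)  # (28, 64, 50)
```

This keeps the integer input as it is until the final division, at the cost of being less obvious to read than the reshape approach.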