Average every n consecutive rows in numpy
Question:
I have a 3D numpy array.
x=np.random.randint(low=0,high=10,size=(100,64,1000))
I want to average every 4 consecutive rows along the first axis: rows 0-3, then 4-7, 8-11 and so on.
I tried it the following way:
x =np.split(x,len(x)/4)
np.mean(np.stack(x),1)
I am a bit confused about whether this is the correct way, or whether there is a better one. Also, how should I handle the case where the first dimension is not evenly divisible by 4?
For example, I can do it this way:
x =np.array_split(x,len(x)/4)
np.stack([np.mean(i,0) for i in x],0)
Thanks
EDIT:
Here is my use case.
This is sensor data, where 100 is the number of times data was collected (trials), 64 is the number of sensor channels and 1000 is the length of the sensor signal. I want the sensor signal to be averaged over the first 4 trials, then the next 4 trials and so on.
Answers:
The fastest way is probably to reshape the array and then use mean.
import numpy as np
x=np.random.randint(low=0,high=10,size=(100,64,6144))
# reshape raises a ValueError if the first dimension is not divisible by 4!
# put every 4 consecutive trials on their own axis: (100, 64, 6144) -> (25, 4, 64, 6144)
grouped = x.reshape((x.shape[0]//4, 4, x.shape[1], x.shape[2]))
# average over the axis of length 4 to get shape (25, 64, 6144)
result = grouped.mean(axis=1)
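You will need to check whether the first dimension is divisible by 4 and, if it is not, make it divisible.
You can also do the same along any other single dimension, for example over groups of 4 consecutive channels:
result = x.reshape((100,-1,4,6144)).mean(axis=2)
Simply split one dimension into a -1 (i.e. let numpy calculate that size by itself) followed by a 4, and then average over the axis that holds your 4 samples. Putting the 4 before the -1 would instead average elements that are a fixed stride apart rather than neighbours.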
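Since the order of the 4 and the -1 in the reshape decides which rows are averaged together, a quick check on a tiny array can be useful. The following is only a sketch; the names small, consecutive and strided are made up for illustration.

```python
import numpy as np

# 8 "trials" with 1 channel and 1 sample each, holding the values 0..7
small = np.arange(8).reshape(8, 1, 1)

# (-1, 4): rows 0-3 form group 0 and rows 4-7 form group 1
consecutive = small.reshape(-1, 4, 1, 1).mean(axis=1)

# (4, -1): rows 0, 2, 4, 6 form one group and rows 1, 3, 5, 7 the other
strided = small.reshape(4, -1, 1, 1).mean(axis=0)

print(consecutive.ravel())  # [1.5 5.5]
print(strided.ravel())      # [3. 4.]
```

Both results have the same shape, so the shape alone does not tell you which grouping you got.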
Using np.reshape and np.mean for multiples
Try this with a reshape and then mean over the specific axis:
x = np.random.randint(low=0,high=10,size=(100,64,1000))
#Reshape to (25, 4, 64, 1000) and then reduce axis 1 with a mean, so each group holds 4 consecutive trials
x = x.reshape(-1,4,64,1000).mean(1)
x.shape
(25, 64, 1000)
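One way to check that a reshape groups the trials the way the question intends is to compare it against the split-and-stack version from the question. A minimal sketch, assuming a smaller array than in the question to keep it quick (by_reshape and by_split are illustrative names):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.integers(low=0, high=10, size=(100, 8, 50))

# reshape so that each group of 4 consecutive trials gets its own axis
by_reshape = x.reshape(-1, 4, *x.shape[1:]).mean(axis=1)

# reference: split into 25 blocks of 4 trials and average each block
by_split = np.stack([block.mean(axis=0) for block in np.split(x, len(x) // 4)])

print(by_reshape.shape)                   # (25, 8, 50)
print(np.allclose(by_reshape, by_split))  # True
```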
Using np.pad, np.reshape and np.nanmean for non multiples
Adding this general solution just in case the axis is not a multiple of 4. This starts with padding the numpy array to the next highest multiple of 4 with np.nan and then uses reshape as before, followed by np.nanmean, which ignores nan values for its mean.
x = np.random.randint(low=0,high=10,size=(109,64,1000))
n = -x.shape[0]%4 #rows needed to reach the next multiple of 4 (0 if already a multiple)
p = ((0,n),(0,0),(0,0)) #padding config for each axis
x_padded = np.pad(x.astype(float), p, 'constant', constant_values=np.nan)
x_reshaped = x_padded.reshape((-1,4,64,1000)) #groups of 4 consecutive trials
x_avg = np.nanmean(x_reshaped, 1)
x_avg.shape
(28, 64, 1000)
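As an alternative that avoids padding and the float copy, the group sums can be computed with np.add.reduceat and divided by the group sizes. This is a different technique from the one above, sketched under the assumption that a shorter last group should simply be averaged over the rows it has; group_mean is a hypothetical helper name, and the array is smaller than in the question.

```python
import numpy as np

def group_mean(x, n=4):
    """Average consecutive groups of n rows along axis 0; the last group may be shorter."""
    starts = np.arange(0, x.shape[0], n)             # first row of each group
    sums = np.add.reduceat(x, starts, axis=0)        # per-group sums
    counts = np.diff(np.append(starts, x.shape[0]))  # number of rows in each group
    # reshape counts so they broadcast over the remaining axes
    return sums / counts.reshape((-1,) + (1,) * (x.ndim - 1))

x = np.random.default_rng(0).integers(0, 10, size=(109, 8, 50))
avg = group_mean(x)
print(avg.shape)                     # (28, 8, 50)
print(np.allclose(avg[-1], x[108]))  # True: the last group holds only trial 108
```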