Numpy array loses shape after applying mask across axis
Question:
Problem
I have an np.array and a mask of the same shape. Once I apply the mask, the array loses its shape and becomes 1D (flattened).
Question
I want to reduce my array along some axis, based on a 1D mask whose length matches that axis.
How can I apply a mask, but keep dimensionality of the array?
Example
A small example in code:
# data ...
>>> data = np.ones((4, 4))
>>> data.shape
(4, 4)
# mask ...
>>> mask = np.ones((4, 4), dtype=bool)
>>> mask.shape
(4, 4)
# apply mask ...
>>> data[mask].shape
(16,)
My ideal shape would be (4, 4).
An example with array dimension reduction across an axis:
# data, mask ...
>>> data = np.ones((4, 4))
>>> mask = np.ones((4, 4), dtype=bool)
# remove last column from data ...
>>> mask[:, 3] = False
>>> mask
array([[ True,  True,  True, False],
       [ True,  True,  True, False],
       [ True,  True,  True, False],
       [ True,  True,  True, False]])
# equivalent mask in 1D ...
>>> mask[0]
array([ True,  True,  True, False])
# apply mask ...
>>> data[mask].shape
(12,)
The ideal shape of the array would be (4, 3), without calling reshape.
Help is appreciated, thanks!
Answers:
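I believe what you want can be done by calling new_data.reshape(837, -1) on the masked result, where the first argument is the number of rows to keep (8 in the example below). This only works if every row keeps the same number of elements. Here’s a brief example: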
import numpy as np

arr = np.arange(8*6).reshape(8,6)
# one row of the mask, repeated for every row of arr
maskpiece = np.array([True, False]*3)
mask = np.broadcast_to(maskpiece, (8,6))
print('the original array\n%s\n' % arr)
print('the flat masked array\n%s\n' % arr[mask])
print('the masked array reshaped into 2D\n%s\n' % arr[mask].reshape(8, -1))
Output:
the original array
[[ 0  1  2  3  4  5]
 [ 6  7  8  9 10 11]
 [12 13 14 15 16 17]
 [18 19 20 21 22 23]
 [24 25 26 27 28 29]
 [30 31 32 33 34 35]
 [36 37 38 39 40 41]
 [42 43 44 45 46 47]]

the flat masked array
[ 0  2  4  6  8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46]

the masked array reshaped into 2D
[[ 0  2  4]
 [ 6  8 10]
 [12 14 16]
 [18 20 22]
 [24 26 28]
 [30 32 34]
 [36 38 40]
 [42 44 46]]
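The reshape only makes sense when every row of the mask keeps the same number of elements. A minimal sketch of a guard for that, assuming a 2D array and a 2D mask of the same shape; `mask_rows_and_reshape` is a hypothetical helper name, not something NumPy provides:

```python
import numpy as np

def mask_rows_and_reshape(arr, mask):
    """Apply a 2D boolean mask, then restore one row per original row.

    Hypothetical helper: raises ValueError if the rows of `mask`
    do not all keep the same number of elements.
    """
    arr = np.asarray(arr)
    mask = np.asarray(mask, dtype=bool)
    if arr.ndim != 2 or arr.shape != mask.shape:
        raise ValueError("arr and mask must be 2D and have the same shape")
    counts = mask.sum(axis=1)  # number of kept elements per row
    if counts.size and not (counts == counts[0]).all():
        raise ValueError("rows keep different numbers of elements")
    n_keep = int(counts[0]) if counts.size else 0
    # Boolean indexing flattens in row-major order, so reshaping by row is safe.
    return arr[mask].reshape(arr.shape[0], n_keep)

arr = np.arange(8 * 6).reshape(8, 6)
mask = np.broadcast_to(np.array([True, False] * 3), (8, 6))
result = mask_rows_and_reshape(arr, mask)
print(result.shape)  # (8, 3)
```

Passing the explicit column count instead of -1 also covers the case where the mask keeps nothing, which reshape(8, -1) would reject.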
The ‘correct’ way of achieving your goal is to not expand the mask to 2D. Instead, index with [:, mask] using the 1D mask. This tells numpy that you want axis 0 unchanged and the mask applied along axis 1.
a = np.arange(12).reshape(3, 4)
# '?' is the dtype character code for bool
b = np.array((1,0,1,0),'?')
a
# array([[ 0,  1,  2,  3],
#        [ 4,  5,  6,  7],
#        [ 8,  9, 10, 11]])
b
# array([ True, False,  True, False])
a[:, b]
# array([[ 0,  2],
#        [ 4,  6],
#        [ 8, 10]])
If your mask is already 2D, numpy won’t check whether all its rows are the same because that would be inefficient. But obviously you can use [:, mask[0]] in that case.
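Applied to the data from the question, this could look like the sketch below. The explicit check that all mask rows are identical is an addition for safety, and np.compress is shown only as an equivalent spelling that names the axis explicitly:

```python
import numpy as np

data = np.ones((4, 4))
mask = np.ones((4, 4), dtype=bool)
mask[:, 3] = False  # drop the last column

# Using mask[0] is only valid if every row of the mask is identical.
assert (mask == mask[0]).all()

reduced = data[:, mask[0]]
print(reduced.shape)  # (4, 3)

# Same selection, with the axis given by name instead of by slice position.
same = np.compress(mask[0], data, axis=1)
print(same.shape)  # (4, 3)
```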
If your mask is 2D and just happens to have the same number of Trues in each row, then either use @tel’s answer or create an index array:
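# build a 2D mask whose rows differ but keep two elements each
B = b^b[:3, None]
B
# array([[False,  True, False,  True],
#        [ True, False,  True, False],
#        [False,  True, False,  True]])
# column indices of the True entries, one row of indices per row of B
J = np.where(B)[1].reshape(len(B), -1)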
And now either
np.take_along_axis(a, J, 1)
# array([[ 1,  3],
#        [ 4,  6],
#        [ 9, 11]])
or
I = np.arange(len(J))[:, None]
IJ = I, J
a[IJ]
# array([[ 1,  3],
#        [ 4,  6],
#        [ 9, 11]])
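The index-array approach can be wrapped in a small function. This is only a sketch under the assumption that both inputs are 2D with the same shape; `select_per_row` is a hypothetical name, and the equal-count check is added here because the reshape would otherwise fail or silently mix rows:

```python
import numpy as np

def select_per_row(a, mask2d):
    """Keep the masked elements of each row, preserving the row axis.

    Hypothetical helper: requires the same number of True values in every row.
    """
    a = np.asarray(a)
    mask2d = np.asarray(mask2d, dtype=bool)
    if a.ndim != 2 or a.shape != mask2d.shape:
        raise ValueError("a and mask2d must be 2D and have the same shape")
    counts = mask2d.sum(axis=1)
    if counts.size and not (counts == counts[0]).all():
        raise ValueError("rows of mask2d keep different numbers of elements")
    n_keep = int(counts[0]) if counts.size else 0
    # np.nonzero lists True positions in row-major order, so the column
    # indices arrive grouped by row and can be reshaped to one row each.
    J = np.nonzero(mask2d)[1].reshape(mask2d.shape[0], n_keep)
    return np.take_along_axis(a, J, axis=1)

a = np.arange(12).reshape(3, 4)
b = np.array([True, False, True, False])
B = b ^ b[:3, None]
picked = select_per_row(a, B)
print(picked)
```

For this `a` and `B`, the result matches the two outputs shown above.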