Split a numpy array into nonoverlapping arrays

Question:

I am trying to split a 2D numpy array that is not square into nonoverlapping chunks of smaller 2D numpy arrays. Example – split a 3×4 array into chunks of 2×2:

The array to be split:

[[ 34  15  16  17]
 [ 78  98  99 100]
 [ 23  78  79  80]]

This should output:

[[34 15]
 [78 98]]

[[ 16  17]
 [ 99 100]]

So [23 78 79 80] are dropped because they do not match the 2×2 requirement.

My current code is this:

import numpy as np

new_array = np.array([[34,15,16,17], [78,98,99,100], [23,78,79,80]])
window = 2
for x in range(0, new_array.shape[0], window):
    for y in range(0, new_array.shape[1], window):
        patch_im1 = new_array[x:x+window,y:y+window]
        print(patch_im1)

This outputs:

[[34 15]
 [78 98]]
[[ 16  17]
 [ 99 100]]
[[23 78]]
[[79 80]]

Ideally, I would like to have the chunks stored in a list.

Asked By: mummy


Answers:

I'm not sure which input array sizes you need to handle, but this approach should work with any 2D input size and any chunk shape. It uses view_as_blocks from the scikit-image (skimage) library to get non-overlapping block views of an array.

import numpy as np
from skimage.util.shape import view_as_blocks

new_array = np.array([[34,15,16,17], [78,98,99,100], [23,78,79,80]])

First, you need to trim the original array to get a size that is evenly divisible by the shape of your desired chunks. So, this 3x4 array will become a 2x4 array when we remove the last row.

chunk_shape = (2,2)
chunk_rows, chunk_cols = chunk_shape
rows_to_keep = new_array.shape[0] - new_array.shape[0] % chunk_rows
cols_to_keep = new_array.shape[1] - new_array.shape[1] % chunk_cols
temp = new_array[:rows_to_keep, :cols_to_keep]
print(temp)
# [[ 34  15  16  17]
#  [ 78  98  99 100]]

Now we can use the view_as_blocks function to obtain chunks of the desired size, then flatten each chunk and convert the result to a list of lists:

res = view_as_blocks(temp, chunk_shape).reshape(-1, np.prod(chunk_shape)).tolist()
print(res)
# [[34, 15, 78, 98], [16, 17, 99, 100]]
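
If scikit-image is not available, a similar result can be sketched with NumPy alone. This is an alternative to the approach above, not part of it. It assumes NumPy 1.20 or later for sliding_window_view, and it steps through the overlapping windows by the chunk size so that only non-overlapping ones remain. Windows that would run past the edge are never generated, so no explicit trimming is needed. The names windows, step, blocks and chunks are illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

new_array = np.array([[34, 15, 16, 17], [78, 98, 99, 100], [23, 78, 79, 80]])
chunk_shape = (2, 2)

# All overlapping 2x2 windows: shape (2, 3, 2, 2) for a 3x4 input.
windows = sliding_window_view(new_array, chunk_shape)

# Keep every chunk_rows-th / chunk_cols-th window so the blocks do not overlap.
step = tuple(slice(None, None, s) for s in chunk_shape)
blocks = windows[step]  # shape (1, 2, 2, 2)

# Collect the blocks as a list of 2D arrays (this reshape makes a copy).
chunks = list(blocks.reshape(-1, *chunk_shape))
print(chunks)
```

Here chunks holds the two 2x2 blocks from the question, and the last row is skipped because no full window fits there.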
Answered By: AlexK

This should work on any number of dimensions. Let’s take an array:

new_array = np.array(range(25)).reshape((5,5))

Output:

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])

First, calculate how many trailing rows/columns have to be dropped along each dimension:

N = 2
rem = np.array(new_array.shape) % N

Output:

array([1, 1])

Then remove that many rows/columns from the end of the array along each dimension:

for ax, v in enumerate(rem):
    if v != 0:
        # delete the last v indices along axis ax
        new_array = np.delete(new_array, range(-1, -v-1, -1), axis=ax)

Output:

array([[ 0,  1,  2,  3],
       [ 5,  6,  7,  8],
       [10, 11, 12, 13],
       [15, 16, 17, 18]])

Then use np.split on each dimension:

arr_list = [new_array]
for ax in range(len(new_array.shape)):
    arr_list_list = [np.split(arr, arr.shape[ax] // N, axis=ax) for arr in arr_list]
    arr_list = [arr for j in arr_list_list for arr in j]

Output:

[array([[0, 1],
        [5, 6]]),
 array([[2, 3],
        [7, 8]]),
 array([[10, 11],
        [15, 16]]),
 array([[12, 13],
        [17, 18]])]

Then flatten each chunk into a plain Python list:

[i.ravel().tolist() for i in arr_list]

Output:

[[0, 1, 5, 6], [2, 3, 7, 8], [10, 11, 15, 16], [12, 13, 17, 18]]
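
The steps above can be gathered into one function. This is a minimal sketch rather than the exact code above: the name split_into_chunks is hypothetical, it trims with slicing instead of np.delete, and it assumes every dimension is at least N long.

```python
import numpy as np

def split_into_chunks(arr, n):
    """Split arr into non-overlapping chunks of side n along every axis."""
    # Drop the trailing remainder on each axis (slicing returns a view).
    trimmed = arr[tuple(slice(0, s - s % n) for s in arr.shape)]
    chunks = [trimmed]
    for ax in range(trimmed.ndim):
        # Split every current piece into equal parts along this axis.
        chunks = [piece
                  for chunk in chunks
                  for piece in np.split(chunk, chunk.shape[ax] // n, axis=ax)]
    return chunks

chunks = split_into_chunks(np.arange(25).reshape(5, 5), 2)
flat = [c.ravel().tolist() for c in chunks]
print(flat)
# [[0, 1, 5, 6], [2, 3, 7, 8], [10, 11, 15, 16], [12, 13, 17, 18]]
```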
Answered By: AndrzejO

You can add extra dimensions to your array. There are several ways of doing this. A relatively simple one is to take a view of the data trimmed to the nearest multiple of the window size, then reshape and transpose. The catch is that the moment you ravel the leading dimensions together, the data is copied, because the strides of the trimmed, transposed view cannot describe the merged dimension:

import numpy as np

data = np.array([[34, 15, 16, 17],
                 [78, 98, 99, 100],
                 [23, 78, 79, 80]])
window = (2, 2)

# largest shape that is a whole multiple of window along each axis
trim = np.array(data.shape)
view_shape = trim - trim % window
view = data[tuple(slice(None, v) for v in view_shape)]

new_shape = np.stack((view_shape // window, window), -1).ravel()
axes = np.arange(len(new_shape))

result = view.reshape(new_shape).transpose(*axes[::2], *axes[1::2])

The shape of result starts with the number of times window fits into data along each axis, which is (1, 2) here. The remaining dimensions are the window itself. This should work for any number of dimensions, as long as the length of window matches the number of dimensions of data.
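
To see how the shape works out when the window is not square, the sketch below wraps the same reshape-and-transpose idea in a function and applies it to a 5×7 array with a 2×3 window. The function name blockify and the test array are illustrative assumptions, not part of the answer.

```python
import numpy as np

def blockify(data, window):
    """Return an array of shape (*counts, *window) of non-overlapping blocks."""
    window = np.asarray(window)
    shape = np.array(data.shape)
    view_shape = shape - shape % window  # trim down to a multiple of window
    view = data[tuple(slice(None, int(v)) for v in view_shape)]
    # Interleave (count, window) per axis: (n0, w0, n1, w1, ...).
    pairs = np.stack((view_shape // window, window), -1).ravel()
    new_shape = tuple(int(x) for x in pairs)
    axes = list(range(len(new_shape)))
    # Move all count axes to the front and all window axes to the back.
    return view.reshape(new_shape).transpose(*axes[::2], *axes[1::2])

data = np.arange(35).reshape(5, 7)
result = blockify(data, (2, 3))
print(result.shape)  # (2, 2, 2, 3)
print(result[1, 1])  # rows 2-3, columns 3-5 of data
```

The last row and last column of the 5×7 array are dropped, leaving a 2×2 grid of 2×3 blocks.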

If you need a 1D outer container instead of ND, you have two options. If you are OK copying data and want a monolithic array, you can do

result.reshape(-1, *result.shape[data.ndim:])

If you want views into the original data for each window, you’ll have to use a list:

[result[i, j] for i in range(result.shape[0]) for j in range(result.shape[1])]
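
The comprehension above is written for 2D data. For an arbitrary number of dimensions, one option is to iterate over the leading block indices with np.ndindex. This is a sketch that assumes result has the (counts..., window...) layout described above. For brevity it builds result for the 2D example with a fixed reshape, and it uses np.shares_memory to check that each chunk is still a view of data.

```python
import numpy as np

data = np.array([[34, 15, 16, 17],
                 [78, 98, 99, 100],
                 [23, 78, 79, 80]])

# Trim to 2x4, split each axis into (count, window), then group counts first.
result = data[:2, :4].reshape(1, 2, 2, 2).transpose(0, 2, 1, 3)

# One index tuple per block, whatever the number of leading dimensions.
chunks = [result[idx] for idx in np.ndindex(*result.shape[:data.ndim])]

for chunk in chunks:
    print(chunk)

# Each chunk should still refer to the memory of the original array.
print(all(np.shares_memory(chunk, data) for chunk in chunks))
```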
Answered By: Mad Physicist