Reshape a 3d array to a 2d array with leading points
Question:
I want to reshape this array (Python)
[[[0, 1, 2], [3, 4, 5], [6, 7, 8]],
[[0, 1, 2], [3, 4, 5], [6, 7, 8]],
[[0, 1, 2], [3, 4, 5], [6, 7, 8]]]
To this:
[
[0,0,0],
[1,1,1],
[2,2,2],
[3,3,3],
[4,4,4],
[5,5,5],
[6,6,6],
[7,7,7],
[8,8,8],
]
And then back again.
I couldn’t figure out how to do it with np.reshape.
It’s a series of height maps, and I want to interpolate each point with the corresponding point in the next map to create a smooth transition between them.
Answers:
import numpy as np
a = np.array(
[[[0, 1, 2], [3, 4, 5], [6, 7, 8]],
[[0, 1, 2], [3, 4, 5], [6, 7, 8]],
[[0, 1, 2], [3, 4, 5], [6, 7, 8]]]
)
b = np.vstack(np.moveaxis(a, 0, 2))
Reverse operation:
a2 = np.moveaxis(np.vsplit(b, 3), 2, 0)
I think the easiest way to understand how this works is to look at the examples for vstack and then figure out how we need to modify array a so that vstack can produce the desired output.
In this case,
>>> np.moveaxis(a, 0, 2)
array([[[0, 0, 0],
[1, 1, 1],
[2, 2, 2]],
[[3, 3, 3],
[4, 4, 4],
[5, 5, 5]],
[[6, 6, 6],
[7, 7, 7],
[8, 8, 8]]])
prepares the array a in such a way that vstack can simply stack (concatenate) the 3 subarrays on top of each other, producing the desired 2D array.
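As a minimal sketch of the stacking step described above: stacking the moved array gives the same result as concatenating its three 3x3 blocks along the first axis. The names moved, stacked and same are only illustrative.

```python
import numpy as np

a = np.array([[[0, 1, 2], [3, 4, 5], [6, 7, 8]]] * 3)

# Move the "map" axis (axis 0) to the end: moved[i, j, k] == a[k, i, j]
moved = np.moveaxis(a, 0, 2)

# vstack treats `moved` as a sequence of three (3, 3) blocks and
# concatenates them along the first axis, giving a (9, 3) array.
stacked = np.vstack(moved)
same = np.concatenate([moved[0], moved[1], moved[2]], axis=0)

print(stacked.shape)                  # (9, 3)
print(np.array_equal(stacked, same))  # True
```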
EDIT: Second solution and Timings
This solution is an order of magnitude faster than any previous solution:
import numpy as np
a = np.array(
[[[0, 1, 2], [3, 4, 5], [6, 7, 8]],
[[0, 1, 2], [3, 4, 5], [6, 7, 8]],
[[0, 1, 2], [3, 4, 5], [6, 7, 8]]]
)
b = a.reshape(3, -1).T
and for reverse:
a2 = b.T.reshape(3, 3, -1)
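As a sanity check, here is a small sketch that verifies the round trip. It assumes a is C-contiguous, which is the case for an array freshly built with np.array; under that assumption both reshape and .T return views, which is likely why this variant is so cheap.

```python
import numpy as np

a = np.array([[[0, 1, 2], [3, 4, 5], [6, 7, 8]]] * 3)

b = a.reshape(3, -1).T        # shape (9, 3): one row per point
a2 = b.T.reshape(3, 3, -1)    # back to shape (3, 3, 3)

print(b.shape)                   # (9, 3)
print(np.array_equal(a, a2))     # True
# For a C-contiguous `a`, reshape and .T both return views, so no data is copied.
print(np.shares_memory(a, b))    # True
```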
Some timings:

This solution:
In [3]: %timeit a.reshape(3, -1).T
277 ns ± 7.89 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

My previous solution (vstack and moveaxis):
In [4]: %timeit np.vstack(np.moveaxis(a, 0, 2))
9.56 µs ± 116 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

@chrsig’s solution (reshape and moveaxis):
In [5]: %timeit np.moveaxis(a, 0, -1).reshape(-1, 3)
3.83 µs ± 147 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

@chrsig’s 2nd solution (stride tricks):
In [6]: %timeit np.lib.stride_tricks.as_strided(a, shape=(len(a)*len(a[0]), 3), strides=(a.strides[2], a.strides[0]))
4.38 µs ± 116 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
With your last correction, it seems that what you want is something like
np.moveaxis(a, 0, -1).reshape(-1, 3)
Result
array([[0, 0, 0],
[1, 1, 1],
[2, 2, 2],
[3, 3, 3],
[4, 4, 4],
[5, 5, 5],
[6, 6, 6],
[7, 7, 7],
[8, 8, 8]])
You probably know how to use reshape. It reinterprets the data as an array with as many rows as needed and 3 columns. The reason reshape alone won’t do what you want is that the 0s would need to be consecutive in memory, then the 1s, then the 2s, and so on, which they are not.
But that is solved by moveaxis: those 0s, 1s, 2s, … are consecutive when you iterate along axis 0 of your input array. So all you have to do is move axis 0 to the end, so that iterating over the last axis visits the 0s, then the 1s, then the 2s, …
Note that moveaxis is very fast, because it does not really build a new array. It is just a different view of the existing array: the strides are changed so that the visiting order appears changed.
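To make the difference concrete, this short sketch compares reshape alone with moveaxis followed by reshape. The names plain and grouped are only illustrative.

```python
import numpy as np

a = np.array([[[0, 1, 2], [3, 4, 5], [6, 7, 8]]] * 3)

# reshape alone keeps the memory order, so each row is a row of one map
plain = a.reshape(-1, 3)
print(plain[:2].tolist())    # [[0, 1, 2], [3, 4, 5]]

# moving axis 0 to the end first groups the same point across the maps
grouped = np.moveaxis(a, 0, -1).reshape(-1, 3)
print(grouped[:2].tolist())  # [[0, 0, 0], [1, 1, 1]]
```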
Since you also asked for the other way, here it is. It is just the same two operations, inverted and applied in reverse order: undo the reshape, then undo the moveaxis.
res = np.moveaxis(a, 0, -1).reshape(-1, 3) # Just to start from here
np.moveaxis(res.reshape(-1, 3, 3), -1, 0)
Result
array([[[0, 1, 2],
[3, 4, 5],
[6, 7, 8]],
[[0, 1, 2],
[3, 4, 5],
[6, 7, 8]],
[[0, 1, 2],
[3, 4, 5],
[6, 7, 8]]])
as expected
Another answer (it is rare that I post two answers to the same question, but this really is a different approach, and it is not obvious which one is best, so I think both deserve their own post) is to rely on stride_tricks. It is a bit like what moveaxis already does, but not reshape.
A numpy array is just a block of data in memory that is traversed with a given memory offset for each axis, called the stride.
For example
a=np.array([[[0, 1, 2], [3, 4, 5], [6, 7, 8]],
[[0, 1, 2], [3, 4, 5], [6, 7, 8]],
[[0, 1, 2], [3, 4, 5], [6, 7, 8]]])
is, internally, a memory array with integers 0,1,2,3,4,5,6,7,8,0,1,2,…
And to iterate them we use strides
a.strides
# (72, 24, 8)
Meaning that a[i+1,j,k] is 72 bytes after a[i,j,k], that a[i,j+1,k] is 24 bytes after a[i,j,k], and that a[i,j,k+1] is 8 bytes after a[i,j,k].
Or, said otherwise, a[i,j,k] is at byte offset 72*i+24*j+8*k from the start of the data.
Usually the data are contiguous, so the stride is 8 for the last axis (when the data are 64-bit integers), 8*3 for the axis before, because there are 3 of those 8-byte integers per element of the 2nd axis, and 8*3*3 for the first.
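The stride arithmetic above can be checked directly. This sketch uses np.arange data instead of the height maps, so that every element is distinct, and fixes the dtype to int64 so that items are 8 bytes on every platform. Both choices are assumptions made for the illustration.

```python
import numpy as np

# dtype is fixed to int64 so that each item is 8 bytes on every platform
a = np.arange(27, dtype=np.int64).reshape(3, 3, 3)

print(a.strides)   # (72, 24, 8)

# Recompute the strides of a C-contiguous array from its shape and item size
expected = (a.shape[1] * a.shape[2] * a.itemsize, a.shape[2] * a.itemsize, a.itemsize)
print(expected == a.strides)   # True

# Check the offset formula on one element: byte offset -> index in the flat data
i, j, k = 2, 1, 0
offset = 72 * i + 24 * j + 8 * k
print(a.ravel()[offset // a.itemsize] == a[i, j, k])   # True
```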
But you can have arrays with different strides. That is what happens with moveaxis
np.moveaxis(a, 0, 2).strides
# (24, 8, 72)
That is in fact all that moveaxis does: it just changes the strides so that np.moveaxis(a,0,2)[i,j,k] is at memory offset 24*i+8*j+72*k, in other words, where a[k,i,j] is.
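A small sketch of that claim, again on int64 np.arange data (an assumption, chosen so that elements are distinct): the moved array has permuted strides, shares its buffer with a, and writes through it show up in a.

```python
import numpy as np

a = np.arange(27, dtype=np.int64).reshape(3, 3, 3)
m = np.moveaxis(a, 0, 2)

print(m.strides)                 # (24, 8, 72)
print(np.shares_memory(a, m))    # True: same buffer, no copy
print(m[1, 2, 0] == a[0, 1, 2])  # True: m[i, j, k] is a[k, i, j]

# Writing through the view changes the original array as well
m[1, 2, 0] = -1
print(a[0, 1, 2])                # -1
```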
NumPy provides a lower-level function, np.lib.stride_tricks.as_strided, to manipulate those strides as we want (not just reordering them as moveaxis does).
The equivalent of that np.moveaxis call, for example, is
np.lib.stride_tricks.as_strided(a, strides=(24,8,72))
We can force that result to be a 2d array, like this
res = np.lib.stride_tricks.as_strided(a, shape=(len(a)*len(a[0]), 3), strides=(a.strides[2], a.strides[0]))
One limitation: it works only if a is a contiguous array (that is, if a is not already the result of some stride manipulation).
One advantage: it is not a new array. No new memory is used here. res is just the same data as a, viewed differently.
In our case result is
>>> res
array([[0, 0, 0],
[1, 1, 1],
[2, 2, 2],
[3, 3, 3],
[4, 4, 4],
[5, 5, 5],
[6, 6, 6],
[7, 7, 7],
[8, 8, 8]])
But here, you don’t need to go back and forth. Those two views of the data refer to the same array.
So for example, if you change
res[0,1]=12
a[1,2,2]=15
Both operations impact both arrays
>>>a
array([[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8]],
[[12, 1, 2],
[ 3, 4, 5],
[ 6, 7, 15]],
[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8]]])
As you see, a[1,2,2] is now 15, as it should be. But a[1,0,0] (what was the 2nd 0) is also now 12.
Likewise
>>> res
array([[ 0, 12, 0],
[ 1, 1, 1],
[ 2, 2, 2],
[ 3, 3, 3],
[ 4, 4, 4],
[ 5, 5, 5],
[ 6, 6, 6],
[ 7, 7, 7],
[ 8, 15, 8]])
res[0,1] is now 12, as expected. But res[8,1] is also now 15.
I don’t know if this is useful for you, but I suspect it might be, since you wanted to be able to go back and forth between both formats. With this, there is no need to: you can have both at the same time, without conversion and without building new arrays.
And of course, that is even faster than moveaxis/reshape
So, tl;dr:
If having two views of the same array is OK for you, and if a is a contiguous array (not something obtained through other stride manipulation), then
res = np.lib.stride_tricks.as_strided(a, shape=(len(a)*len(a[0]), 3), strides=(a.strides[2], a.strides[0]))
might be an even better solution for you than moveaxis/reshape
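Since as_strided performs no checks of its own, one option is to wrap the one-liner in a small guard. The sketch below is only an illustration: points_view is a hypothetical helper name, not part of NumPy. It also takes the sizes from a.shape, because as far as I can tell the expression len(a)*len(a[0]) and the hard-coded 3 only give the right shape when a is a 3x3x3 cube.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided


def points_view(a):
    """Return an (n_points, n_maps) view of a C-contiguous (n_maps, h, w) array.

    Hypothetical helper, not part of NumPy.
    """
    if a.ndim != 3 or not a.flags.c_contiguous:
        raise ValueError("expected a C-contiguous 3D array")
    n_maps, h, w = a.shape
    # Moving down the rows walks the points of one map (last-axis stride);
    # moving across the columns jumps from one map to the next (first-axis stride).
    return as_strided(a, shape=(h * w, n_maps), strides=(a.strides[2], a.strides[0]))


a = np.arange(2 * 3 * 4).reshape(2, 3, 4)   # 2 maps of 3x4, not a cube
res = points_view(a)
print(res.shape)                                      # (12, 2)
print(np.array_equal(res, a.reshape(len(a), -1).T))   # True
print(np.shares_memory(a, res))                       # True
```

The comparison on the second-to-last line also shows that, for a contiguous input, a.reshape(len(a), -1).T produces the same layout without touching as_strided at all.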