How can I append a multidimensional array to a new dimension with Numpy?
Question:
I have an empty list: x = [].
I have a numpy array, y, of shape (180, 161). I can’t necessarily define x to be an np.empty of a particular shape, because I won’t know the shape of y ahead of time.
I want to append y to x so that x will have a .shape of (1, 180, 161).
Then if I append more, I want it to be (n, 180, 161).
I tried .append and .stack, but I’ve had a variety of errors:
TypeError: only size-1 arrays can be converted to Python scalars
ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 3 dimension(s) and the array at index 1 has 2 dimension(s)
And so on. It seems that this should be simple, but it’s strangely difficult.
Answers:
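You can reshape y to be (1, *y.shape), for example with y = y[None, ...].
Then, to append another array new_arr that has been reshaped the same way, you can say:
y_1 = np.vstack((y, new_arr))
where y_1 is a numpy array with a leading dimension of 2, i.e. shape (2, 180, 161) for the arrays in the question.
If you do not need the old array, you can rebind the name instead: y = np.vstack((y, new_arr)). Note that np.vstack still allocates a new array on every call.
Both arrays need the leading dimension of 1, so you might have to reshape each new array to (1, ...) before stacking.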
This is a very basic example:
import numpy as np
a = np.ones((1,2,3))
b = np.ones((1,2,3))
np.vstack((a,b)).shape # (2,2,3)
Let me know if this helps!
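To connect this with the 2-D array from the question, here is a minimal sketch. The arrays y and new_arr are hypothetical stand-ins filled with zeros and ones, and the names y3, new3 and stacked are only for illustration:

```python
import numpy as np

# Hypothetical stand-ins for the 2-D arrays in the question.
y = np.zeros((180, 161))
new_arr = np.ones((180, 161))

# Give each array a leading axis of length 1.
y3 = y[None, ...]          # same result as y.reshape(1, *y.shape)
new3 = new_arr[None, ...]

# Stacking along the first axis now grows the leading dimension.
stacked = np.vstack((y3, new3))
print(stacked.shape)  # (2, 180, 161)
```

Without the extra axis, np.vstack((y, new_arr)) would instead join the rows and give shape (360, 161).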
If you keep x as a list and just want to maintain the shape by appending, it is possible:
>>> import numpy as np
>>> x = []
>>> y = np.arange(12).reshape(3,4)
>>> x.append(y)
>>> np.shape(x)
(1, 3, 4)
>>> x.append(y)
>>> np.shape(x)
(2, 3, 4)
>>> for i in range(10):
...     x.append(y)
...
>>> np.shape(x)
(12, 3, 4)
But considering you are dealing with np.arrays, it may not be convenient for you to keep x as a list, so you may try this:
>>> x = np.array(x)
>>> x.shape
(12, 3, 4)
>>> y[None,...].shape
(1, 3, 4)
>>> np.append(x, y[None,...],axis=0).shape
(13, 3, 4)
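The ValueError quoted in the question is what you would likely see if the new axis is left out. A small sketch, assuming the same shapes as in the session above (the names raised and appended are only for illustration):

```python
import numpy as np

x = np.zeros((12, 3, 4))
y = np.arange(12).reshape(3, 4)

# Appending a 2-D array to a 3-D array along axis 0 fails,
# because the number of dimensions does not match.
try:
    np.append(x, y, axis=0)
    raised = False
except ValueError:
    raised = True

# With the extra leading axis the shapes are compatible.
appended = np.append(x, y[None, ...], axis=0)
print(raised, appended.shape)  # True (13, 3, 4)
```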
Word of caution:
As pointed out by @hpaulj, np.append should be avoided, as it is extremely slow, probably only faster than:
x = np.array([*x, y])
The correct usage would be:
x = np.concatenate([x, y[None,...]], axis=0)
Either way, concatenating or appending is generally a speed bump in numpy, because each call copies the whole array into a new one. So unless you absolutely need to build an array this way, you should work with lists. Also, most functions applied to np.arrays work on lists as well. Note that this holds for functions applied to arrays, not for methods of an np.array object. For example:
>>> x = list((1, 2, 3, 4))
>>> np.shape(x)
(4,)
>>> x.shape
Traceback (most recent call last):
  File "<ipython-input-100-9f2b259887ef>", line 1, in <module>
    x.shape
AttributeError: 'list' object has no attribute 'shape'
So I would suggest appending to a list and, once you have finished appending all the arrays, converting the list to an np.array if required.
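As a rough sketch of the two approaches compared in this answer, the hypothetical helpers collect and grow below build the same array, once by accumulating in a list and once by concatenating at every step. The sample data is made up for illustration:

```python
import numpy as np

def collect(arrays):
    """Accumulate in a Python list and convert once at the end."""
    buffer = []
    for arr in arrays:
        buffer.append(arr)
    return np.array(buffer)

def grow(arrays):
    """Concatenate at every step; each call copies the whole result."""
    result = arrays[0][None, ...]
    for arr in arrays[1:]:
        result = np.concatenate([result, arr[None, ...]], axis=0)
    return result

# Hypothetical sample data: five (3, 4) arrays filled with 0..4.
data = [np.full((3, 4), i) for i in range(5)]
print(collect(data).shape)                          # (5, 3, 4)
print(np.array_equal(collect(data), grow(data)))    # True
```

Both give the same result; the list version simply defers the copy to a single conversion at the end.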
Assuming all items in x have the same shape, you can first construct a list and then construct the NumPy array from the list.
There, you have two options:
np.array(), which is faster but not flexible
np.stack(), which is slower but allows you to choose the axis along which the stacking happens (it is roughly equivalent to np.array().transpose(...).copy())
The code would look like:
import numpy as np
n = 100
x = [np.random.randint(0, 10, (10, 20)) for _ in range(n)]
# same as: y = np.stack(x, 0)
y = np.array(x)
print(y.shape)
# (100, 10, 20)
Of course this line:
x = [np.random.randint(0, 10, (10, 20)) for _ in range(n)]
can be replaced with:
x = []
for _ in range(n):
    x.append(np.random.randint(0, 10, (10, 20)))
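The axis choice that np.stack() offers can be illustrated with a short sketch. The data and the names first, middle, last and same are assumptions made for this example, and the transpose order (1, 0, 2) is one concrete instance of the rough equivalence mentioned above:

```python
import numpy as np

# Hypothetical data: 100 arrays of shape (10, 20).
x = [np.full((10, 20), i) for i in range(100)]

first = np.stack(x, axis=0)    # new axis in front
middle = np.stack(x, axis=1)   # new axis in the middle
last = np.stack(x, axis=-1)    # new axis at the end

print(first.shape, middle.shape, last.shape)
# (100, 10, 20) (10, 100, 20) (10, 20, 100)

# Stacking along axis 1 matches building the array and moving axis 0.
same = np.array_equal(middle, np.array(x).transpose(1, 0, 2))
print(same)  # True
```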
You could also use np.append(), e.g.:
def stacker(arrs):
    result = arrs[0][None, ...]
    for arr in arrs[1:]:
        result = np.append(result, arr[None, ...], 0)
    return result
but with horrific performance:
n = 1000
shape = (100, 100)
x = [np.random.randint(0, n, shape) for _ in range(n)]
%timeit np.array(x)
# 10 loops, best of 3: 21.1 ms per loop
%timeit np.stack(x)
# 10 loops, best of 3: 21.6 ms per loop
%timeit stacker(x)
# 1 loop, best of 3: 11 s per loop
and, as you can see, performance-wise, the list-based method is way faster.