When should I use hstack/vstack vs append vs concatenate vs column_stack?
Question:
Simple question: what is the advantage of each of these methods? It seems that, given the right parameters (and ndarray shapes), they all work equivalently. Do some work in place? Do some have better performance? Which functions should I use when?
Answers:
In IPython you can look at the source code of a function by typing its name followed by ??. Taking a look at hstack, we can see that it’s actually just a wrapper around concatenate (similarly with vstack and column_stack):
np.hstack??
def hstack(tup):
    ...
    arrs = [atleast_1d(_m) for _m in tup]
    # As a special case, dimension 0 of 1-dimensional arrays is "horizontal"
    if arrs[0].ndim == 1:
        return _nx.concatenate(arrs, 0)
    else:
        return _nx.concatenate(arrs, 1)
So I guess just use whichever one has the most logical sounding name to you.
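To make the special case in that source concrete, here is a small sketch with made-up arrays (the names a, b, A and B are only for illustration). It checks that hstack matches concatenate on axis 0 for 1-D inputs and on axis 1 for 2-D inputs.

```python
import numpy as np

# Hypothetical 1-D inputs: hstack joins them along axis 0.
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
h1 = np.hstack((a, b))
c1 = np.concatenate((a, b), axis=0)

# Hypothetical 2-D inputs: hstack joins them along axis 1.
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
h2 = np.hstack((A, B))
c2 = np.concatenate((A, B), axis=1)

print(np.array_equal(h1, c1), h1.shape)  # True (6,)
print(np.array_equal(h2, c2), h2.shape)  # True (2, 4)
```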
All the functions are written in Python except np.concatenate. With an IPython shell you just use ??.
If not, here’s a summary of their code:
vstack
    concatenate([atleast_2d(_m) for _m in tup], 0)
    i.e. turn all inputs into 2d (or more) and concatenate on the first axis
hstack
    concatenate([atleast_1d(_m) for _m in tup], axis=<0 or 1>)
column_stack
    transform arrays with fewer than 2 dimensions with
    array(arr, copy=False, subok=True, ndmin=2).T
    and then concatenate on axis 1
append
    concatenate((asarray(arr), values), axis=axis)
In other words, they all work by tweaking the dimensions of the input arrays, and then concatenating on the right axis. They are just convenience functions.
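As a rough check of that summary, the sketch below rebuilds vstack, column_stack and append by hand from concatenate. The arrays are made up, and the hand-written versions only approximate the library code (for example, they skip the subok handling).

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# vstack: promote every input to at least 2-D, then join on axis 0.
v = np.vstack((a, b))
v_manual = np.concatenate([np.atleast_2d(x) for x in (a, b)], axis=0)

# column_stack: turn 1-D inputs into columns, then join on axis 1.
cs = np.column_stack((a, b))
cs_manual = np.concatenate([np.array(x, ndmin=2).T for x in (a, b)], axis=1)

# append with the default axis=None flattens both inputs first.
ap = np.append(a, b)
ap_manual = np.concatenate((a.ravel(), b.ravel()))

print(v.shape, cs.shape, ap.shape)  # (2, 3) (3, 2) (6,)
```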
And the newer np.stack:
arrays = [asanyarray(arr) for arr in arrays]
shapes = set(arr.shape for arr in arrays)
result_ndim = arrays[0].ndim + 1
axis = normalize_axis_index(axis, result_ndim)
sl = (slice(None),) * axis + (_nx.newaxis,)
expanded_arrays = [arr[sl] for arr in arrays]
concatenate(expanded_arrays, axis=axis, out=out)
That is, it expands the dims of all inputs (a bit like np.expand_dims), and then concatenates. With axis=0, the effect is the same as np.array.
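A short sketch of what that means in practice, using two made-up vectors of equal shape: np.stack adds a new axis, which can be reproduced with np.expand_dims followed by concatenate.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Stacking on a new leading axis gives the same result as np.array([a, b]).
s0 = np.stack((a, b), axis=0)
same_as_array = np.array_equal(s0, np.array([a, b]))

# Stacking on a new axis 1 matches expand_dims followed by concatenate.
s1 = np.stack((a, b), axis=1)
s1_manual = np.concatenate([np.expand_dims(x, 1) for x in (a, b)], axis=1)

print(s0.shape, s1.shape)  # (2, 3) (3, 2)
print(same_as_array, np.array_equal(s1, s1_manual))  # True True
```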
The hstack documentation now adds:
The functions concatenate, stack and block provide more general stacking and concatenation operations.
np.block is also new. It, in effect, recursively concatenates along the nested lists.
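For a feel of the recursion, here is a minimal sketch with four made-up blocks. The innermost lists are joined along the last axis and the outer list along the axis before it, so the result matches nested concatenate calls.

```python
import numpy as np

# Four hypothetical blocks whose shapes fit together as a 2x2 layout.
A = np.ones((2, 2))
B = np.zeros((2, 3))
C = np.zeros((1, 2))
D = np.ones((1, 3))

blocked = np.block([[A, B], [C, D]])

# The same result built by hand: join each row of blocks on axis 1,
# then join the two rows on axis 0.
top = np.concatenate([A, B], axis=1)
bottom = np.concatenate([C, D], axis=1)
manual = np.concatenate([top, bottom], axis=0)

print(blocked.shape)  # (3, 5)
```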
numpy.vstack: Stack arrays in sequence vertically (row wise). Equivalent to np.concatenate(tup, axis=0) once 1-D arrays of shape (N,) have been reshaped to (1, N). For an example, see: https://docs.scipy.org/doc/numpy/reference/generated/numpy.vstack.html
numpy.hstack: Stack arrays in sequence horizontally (column wise). Equivalent to np.concatenate(tup, axis=1), except for 1-D arrays, where it concatenates along the first axis. For an example, see: https://docs.scipy.org/doc/numpy/reference/generated/numpy.hstack.html
append is a method of Python’s built-in list data structure: each call adds one element to the list in place, and to add multiple elements you use extend. NumPy’s np.append is a separate function that returns a new array instead of modifying its input. Simply put, numpy’s functions are much more powerful.
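The difference between the list method and np.append also bears on the "do some work in place?" part of the question. A small sketch with made-up values: the list is modified in place, while np.append leaves the original array untouched.

```python
import numpy as np

# list.append and list.extend modify the list in place.
items = [1, 2]
items.append(3)
items.extend([4, 5])

# np.append returns a new array; the original is unchanged.
arr = np.array([1, 2])
new_arr = np.append(arr, [3, 4])

print(items)    # [1, 2, 3, 4, 5]
print(arr)      # [1 2]
print(new_arr)  # [1 2 3 4]
```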
Example: suppose gray.shape = (n0, n1).
np.vstack((gray, gray, gray)) will have shape (n0*3, n1); you can also do it with np.concatenate((gray, gray, gray), axis=0).
np.hstack((gray, gray, gray)) will have shape (n0, n1*3); you can also do it with np.concatenate((gray, gray, gray), axis=1).
np.dstack((gray, gray, gray)) will have shape (n0, n1, 3).
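Those shapes can be checked quickly. In this sketch n0 = 4 and n1 = 5 are arbitrary choices and gray is just an array of zeros standing in for a grayscale image.

```python
import numpy as np

n0, n1 = 4, 5               # arbitrary sizes for illustration
gray = np.zeros((n0, n1))   # stand-in for a grayscale image

v = np.vstack((gray, gray, gray))
h = np.hstack((gray, gray, gray))
d = np.dstack((gray, gray, gray))

print(v.shape)  # (12, 5)
print(h.shape)  # (4, 15)
print(d.shape)  # (4, 5, 3)

# For 2-D inputs, dstack matches stacking on a new last axis.
print(np.array_equal(d, np.stack((gray, gray, gray), axis=2)))  # True
```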
If you have two matrices, you’re good to go with just hstack and vstack.
If you’re stacking a matrix and a vector, hstack becomes tricky to use, so column_stack is a better option.
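A sketch of why the matrix-plus-vector case is awkward, using a made-up 3x2 matrix A and a length-3 vector v: hstack refuses the mixed dimensions unless the vector is reshaped into a column first, while column_stack does that reshaping itself.

```python
import numpy as np

A = np.arange(6).reshape(3, 2)   # hypothetical 3x2 matrix
v = np.array([10, 20, 30])       # hypothetical length-3 vector

# hstack fails here because a 2-D and a 1-D array cannot be concatenated.
try:
    np.hstack((A, v))
    hstack_failed = False
except ValueError:
    hstack_failed = True

# column_stack treats the 1-D vector as a column.
with_col = np.column_stack((A, v))

# hstack works once the vector is explicitly made into a column.
same = np.hstack((A, v[:, np.newaxis]))

print(hstack_failed)                   # True
print(with_col.shape)                  # (3, 3)
print(np.array_equal(with_col, same))  # True
```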
If you’re stacking two vectors, you’ve got three options.
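The text does not spell the three options out, so the sketch below assumes they are hstack, vstack and column_stack, applied to two made-up vectors. Each gives a different result shape.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Assumed option 1: one longer vector.
end_to_end = np.hstack((a, b))

# Assumed option 2: each vector becomes a row.
as_rows = np.vstack((a, b))

# Assumed option 3: each vector becomes a column.
as_cols = np.column_stack((a, b))

print(end_to_end.shape)  # (6,)
print(as_rows.shape)     # (2, 3)
print(as_cols.shape)     # (3, 2)
```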
And concatenate in its raw form is useful for 3D and above; see my article Numpy Illustrated for details.
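For the 3D case, a minimal sketch with two made-up arrays of shape (2, 3, 4): the axis argument of concatenate picks which dimension grows, which the hstack and vstack names no longer describe clearly.

```python
import numpy as np

x = np.zeros((2, 3, 4))
y = np.ones((2, 3, 4))

along0 = np.concatenate((x, y), axis=0)
along1 = np.concatenate((x, y), axis=1)
along2 = np.concatenate((x, y), axis=2)

print(along0.shape)  # (4, 3, 4)
print(along1.shape)  # (2, 6, 4)
print(along2.shape)  # (2, 3, 8)

# A negative axis counts from the end, so axis=-1 is the last axis.
print(np.array_equal(along2, np.concatenate((x, y), axis=-1)))  # True
```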