Python array/matrix dimension

Question:

I create two matrices

import numpy as np
arrA = np.zeros((9000,3))
arrB = np.zeros((9000,6))

I want to concatenate pieces of those matrices.
But when I try to do:

arrC = np.hstack((arrA, arrB[:,1]))

I get an error:

ValueError: all the input arrays must have same number of dimensions

I guess it’s because np.shape(arrB[:,1]) is (9000,) instead of (9000,1), but I cannot figure out how to resolve it.

Could you please comment on this issue?

Asked By: Leopoldo


Answers:

I would try something like this:

np.vstack((arrA.transpose(), arrB[:,1])).transpose()
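
This works because np.vstack treats a 1-D input as a single row, so the (9000,) slice can be stacked under the transposed arrA and the result transposed back. A minimal sketch of the same idea, using 5 rows instead of 9000 purely for brevity (the small arrays are illustrative, not from the question):

```python
import numpy as np

arrA = np.zeros((5, 3))
arrB = np.arange(30).reshape(5, 6)

col = arrB[:, 1]                     # 1-D slice, shape (5,)
stacked = np.vstack((arrA.T, col))   # col is stacked as one extra row
arrC = stacked.T                     # transpose back: one row per sample

print(stacked.shape)  # (4, 5)
print(arrC.shape)     # (5, 4)
```
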
Answered By: crispamares

You could preserve dimensions by passing a list of indices, not an index:

>>> arrB[:,1].shape
(9000,)
>>> arrB[:,[1]].shape
(9000, 1)
>>> out = np.hstack([arrA, arrB[:,[1]]])
>>> out.shape
(9000, 4)
Answered By: DSM

There are several ways of making your selection from arrB a (9000,1) array:

np.hstack((arrA,arrB[:,[1]]))
np.hstack((arrA,arrB[:,1][:,None]))
np.hstack((arrA,arrB[:,1].reshape(9000,1)))
np.hstack((arrA,arrB[:,1].reshape(-1,1)))

The first indexes with a list instead of a single integer, the second adds a new axis (None is an alias for np.newaxis), and the last two use reshape. These are all basic numpy array manipulation tasks.
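
To check that the four expressions above are interchangeable, here is a small sketch that builds each candidate column and stacks it onto arrA. It assumes 5-row arrays instead of 9000 rows, and the dictionary labels are only for illustration:

```python
import numpy as np

arrA = np.zeros((5, 3))
arrB = np.arange(30).reshape(5, 6)

candidates = {
    "list index": arrB[:, [1]],
    "new axis": arrB[:, 1][:, None],        # None is np.newaxis
    "reshape n": arrB[:, 1].reshape(5, 1),
    "reshape -1": arrB[:, 1].reshape(-1, 1),
}

for name, col in candidates.items():
    out = np.hstack((arrA, col))
    print(name, col.shape, out.shape)       # each column is (5, 1), each result (5, 4)
```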

Answered By: hpaulj

This is easier to see visually.

Assume:

>>> arrA=np.arange(9000*3).reshape(9000,3)
>>> arrA
array([[    0,     1,     2],
       [    3,     4,     5],
       [    6,     7,     8],
       ..., 
       [26991, 26992, 26993],
       [26994, 26995, 26996],
       [26997, 26998, 26999]])
>>> arrB=np.arange(9000*6).reshape(9000,6)
>>> arrB
array([[    0,     1,     2,     3,     4,     5],
       [    6,     7,     8,     9,    10,    11],
       [   12,    13,    14,    15,    16,    17],
       ..., 
       [53982, 53983, 53984, 53985, 53986, 53987],
       [53988, 53989, 53990, 53991, 53992, 53993],
       [53994, 53995, 53996, 53997, 53998, 53999]])

If you select a column of arrB with a single integer index, you get a 1-D array of shape (9000,), which prints more like a row:

>>> arrB[:,1]
array([    1,     7,    13, ..., 53983, 53989, 53995])
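
The mismatch can be confirmed by comparing the number of dimensions directly. The sketch below uses 5-row stand-ins for the arrays above and reproduces the failure from the question; the exact wording of the error message may differ between NumPy versions:

```python
import numpy as np

arrA = np.arange(15).reshape(5, 3)
arrB = np.arange(30).reshape(5, 6)

print(arrA.ndim, arrB[:, 1].ndim)   # 2 1

try:
    np.hstack((arrA, arrB[:, 1]))   # 2-D next to 1-D: dimensions disagree
except ValueError as exc:
    print("hstack failed:", exc)
```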

What you need is a 2-D column of shape (9000,1), so that it has the same number of dimensions and rows as arrA:

>>> arrB[:,[1]]
array([[    1],
       [    7],
       [   13],
       ..., 
       [53983],
       [53989],
       [53995]])

Then hstack works as expected:

>>> arrC=np.hstack((arrA, arrB[:,[1]]))
>>> arrC
array([[    0,     1,     2,     1],
       [    3,     4,     5,     7],
       [    6,     7,     8,    13],
       ..., 
       [26991, 26992, 26993, 53983],
       [26994, 26995, 26996, 53989],
       [26997, 26998, 26999, 53995]])

An alternative is to use .reshape(), passing -1 for one dimension and the desired number of rows or columns for the other; numpy infers the -1 dimension from the array's size:

>>> arrB[:,1].reshape(-1,1)  # one col
array([[    1],
       [    7],
       [   13],
       ..., 
       [53983],
       [53989],
       [53995]])
>>> arrB[:,1].reshape(-1,6)   # 6 cols
array([[    1,     7,    13,    19,    25,    31],
       [   37,    43,    49,    55,    61,    67],
       [   73,    79,    85,    91,    97,   103],
       ..., 
       [53893, 53899, 53905, 53911, 53917, 53923],
       [53929, 53935, 53941, 53947, 53953, 53959],
       [53965, 53971, 53977, 53983, 53989, 53995]])
>>> arrB[:,1].reshape(2,-1)  # 2 rows
array([[    1,     7,    13, ..., 26983, 26989, 26995],
       [27001, 27007, 27013, ..., 53983, 53989, 53995]])
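
As a rough sketch of how the -1 placeholder behaves: it is resolved from the total number of elements, so the other dimension must divide that total evenly. The 5-element column here is a stand-in for the 9000-element slice above:

```python
import numpy as np

col = np.arange(30).reshape(5, 6)[:, 1]   # 1-D array with 5 elements

print(col.reshape(-1, 1).shape)   # (5, 1)
print(col.reshape(1, -1).shape)   # (1, 5)

try:
    col.reshape(-1, 2)            # 5 elements cannot fill rows of 2
except ValueError as exc:
    print("reshape failed:", exc)
```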

There is more on array shaping and stacking here

Answered By: dawg