How to add a dimension to an array and fill up the new dimension with a set of same data

Question:

I have two 1D arrays. I need to combine every element of the first array (a) with every element of the second array (b) to create a new 1D array containing all the concatenated pairs.

Example below to be clearer:

a = np.array(['x', 'y'])
b = np.array(['a', 'b', 'c'])
# how to handle the above 1D-arrays to create the below array (c)?
c = np.array(['xa', 'xb', 'xc', 'ya', 'yb', 'yc'])
print(c)

The new array c would look like:

['xa' 'xb' 'xc' 'ya' 'yb' 'yc']

Of course, I can do it with loops, but I’m looking for a smarter way.
Thank you

Asked By: tibibou

||

Answers:

You can use numpy broadcasting:

>>> np.char.add(a[:, None], b).ravel()
array(['xa', 'xb', 'xc', 'ya', 'yb', 'yc'], dtype='<U2')
Answered By: Corralien

You can achieve this using broadcasting in numpy. Here is some example code that produces your desired result:

import numpy as np

a = np.array(['x', 'y'])
b = np.array(['a', 'b', 'c'])

a = a[:, np.newaxis]  # add a new axis to a
b = b[np.newaxis, :]  # add a new axis to b
c = np.char.add(a, b)  # concatenate a and b element-wise using np.char.add

c = c.flatten()  # flatten the resulting array into a 1D array

print(c)  # output: ['xa' 'xb' 'xc' 'ya' 'yb' 'yc']

In the code above, you first add a new axis to array a using np.newaxis. This converts the 1D array a into a 2D array with shape (2,1). Similarly, you add a new axis to array b, which results in a 2D array with shape (1,3).

Next, you use np.char.add to concatenate the two arrays element-wise. This results in a 2D array with shape (2,3). Finally, you flatten the resulting array into a 1D array using the flatten() method.
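To make the shape changes described above easier to follow, here is a minimal sketch that prints the shape at each step. The names col, row and grid are only illustrative and do not appear in the answer's code.

```python
import numpy as np

a = np.array(['x', 'y'])
b = np.array(['a', 'b', 'c'])

col = a[:, np.newaxis]   # shape (2, 1)
row = b[np.newaxis, :]   # shape (1, 3)

# Broadcasting stretches both operands to the common shape (2, 3).
grid = np.char.add(col, row)

# Row-major flattening keeps the 'x' row first, then the 'y' row.
c = grid.flatten()

print(col.shape, row.shape, grid.shape, c.shape)  # (2, 1) (1, 3) (2, 3) (6,)
print(c)  # ['xa' 'xb' 'xc' 'ya' 'yb' 'yc']
```

Note that flatten() always returns a copy, while ravel() (used in the other answer) returns a view when it can; for this result either works.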

Answered By: Warwulf

Keep it simple: use two nested loops.

import numpy as np

a = np.array(['x', 'y'])
b = np.array(['a', 'b', 'c'])

def find_combination(a, b):
    # Pair every element of a with every element of b, in order.
    result = []
    for x in a:
        for y in b:
            result.append(x + y)
    return result  # a plain Python list, not a NumPy array


print(find_combination(a, b))
Answered By: Golden Lion

For two lists, a neat option is a list comprehension:

In [234]: a = ['x', 'y']
     ...: b = ['a', 'b', 'c']
In [235]: [i+j for i in a for j in b]
Out[235]: ['xa', 'xb', 'xc', 'ya', 'yb', 'yc']

For arrays you can use np.char.add as shown in the other answers:

In [236]: A=np.array(a); B=np.array(b)
In [237]: np.char.add(A[:,None],B)
Out[237]: 
array([['xa', 'xb', 'xc'],
       ['ya', 'yb', 'yc']], dtype='<U2')

Timings on such a small example have to be viewed with caution. Lists are often faster for small examples, but don’t scale nearly as well. Still, I expect np.char.add will hurt the array scaling (the np.char functions just apply standard string methods to the array elements).

In [238]: timeit np.char.add(A[:,None],B)
23.2 µs ± 57.4 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
In [239]: timeit [i+j for i in a for j in b]
1.55 µs ± 35.3 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
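To see how the two approaches compare beyond a six-element example, a rough sketch like the following could be run on larger inputs. The input sizes (200 and 300) and the helper names with_lists and with_char_add are assumptions chosen for illustration, and the timings will vary by machine and NumPy version, so no numbers are quoted here.

```python
import timeit
import numpy as np

# Hypothetical larger inputs: 200 x 300 = 60,000 combinations.
a = [f"x{i}" for i in range(200)]
b = [f"a{j}" for j in range(300)]
A = np.array(a)
B = np.array(b)

def with_lists():
    # Pure-Python list comprehension over both lists.
    return [i + j for i in a for j in b]

def with_char_add():
    # Broadcast np.char.add, then flatten to 1D.
    return np.char.add(A[:, None], B).ravel()

# Both approaches should produce the same strings in the same order.
assert with_lists() == with_char_add().tolist()

for fn in (with_lists, with_char_add):
    seconds = timeit.timeit(fn, number=20) / 20
    print(f"{fn.__name__}: {seconds * 1e6:.0f} us per call")
```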

Specifying object dtype when making the arrays, we can use the + operator, and gain some speed:

In [240]: A=np.array(a,object); B=np.array(b,object)    
In [241]: A[:,None]+B
Out[241]: 
array([['xa', 'xb', 'xc'],
       ['ya', 'yb', 'yc']], dtype=object)    
In [242]: timeit A[:,None]+B
7.39 µs ± 76.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
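If the 1D string array from the question is wanted, the object-dtype result can be flattened and converted back. This is a small sketch of that extra step; the names grid and c are illustrative only.

```python
import numpy as np

a = ['x', 'y']
b = ['a', 'b', 'c']

A = np.array(a, dtype=object)
B = np.array(b, dtype=object)

# With object dtype, + applies Python string concatenation to each broadcast pair.
grid = A[:, None] + B          # shape (2, 3), dtype=object

# Flatten to 1D and convert back to a fixed-width unicode dtype.
c = grid.ravel().astype(str)

print(c)             # ['xa' 'xb' 'xc' 'ya' 'yb' 'yc']
print(c.dtype.kind)  # U
```

The conversion adds its own cost, so it may offset part of the speed gain shown above.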

For reference, adding two numeric arrays:

In [245]: %%timeit x=np.arange(2); y=np.arange(3)
     ...: x[:,None]+y
5.95 µs ± 8.71 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
In [246]: %%timeit x=np.arange(200); y=np.arange(300)
     ...: x[:,None]+y
100 µs ± 533 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

The second case has 10,000 times as many elements, but the time increases only about 17x.

Answered By: hpaulj

Using numpy's meshgrid function and then the transpose (.T):

np.array([x+y for x, y in np.array(np.meshgrid(a, b)).T.reshape(-1,2)])

Result

array(['xa', 'xb', 'xc', 'ya', 'yb', 'yc'], dtype='<U2')
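As a variation on the same idea, and not part of the one-liner above, meshgrid can be asked for matrix-style ('ij') indexing so that no transpose or Python-level loop is needed. A minimal sketch, with A and B as illustrative names:

```python
import numpy as np

a = np.array(['x', 'y'])
b = np.array(['a', 'b', 'c'])

# indexing='ij' makes the first axis follow a and the second follow b,
# so both grids have shape (2, 3).
A, B = np.meshgrid(a, b, indexing='ij')

# Concatenate element-wise, then flatten in row-major order.
c = np.char.add(A, B).ravel()
print(c)  # ['xa' 'xb' 'xc' 'ya' 'yb' 'yc']
```

This materializes both full grids, so it uses more memory than broadcasting a[:, None] against b directly.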
Answered By: Laurent B.