Concatenating two string columns of numpy array into single column in python


I have a NumPy array as follows:

2016-07-02  10:55:01
2016-07-02  10:55:01
2016-07-02  10:55:01
2016-07-02  17:01:34
2016-07-02  17:01:34
2016-07-02  16:59:52
2016-07-02  17:01:34
2016-07-02  16:59:52
2016-07-02  16:59:52
2016-07-02  10:40:00
2016-07-02  12:01:14

These are two columns of the array: date and time. I want both in a single column, joined by '\t'. Both values are strings.

I did it with a loop as follows, but that is a bad idea and takes a long time:

for D in Data:
    Data2 = np.append(Data2, np.array(D[0] + "\t" + D[1]))

Please suggest an efficient solution.

Asked By: KrunalParmar



Insert the tab characters ('\t') into your array using numpy.insert and then do a numpy.reshape from n by 3 to n*3 by 1.
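One way to read this suggestion is sketched below. The sample array `arr` is made up for illustration. Note that inserting and reshaping gives a flat sequence of separate elements (date, tab, time) and does not by itself produce one string per row, so the last step joins each row explicitly; that join is an added assumption, not part of the suggestion above.

```python
import numpy as np

# Hypothetical sample data: one row per record, columns are date and time.
arr = np.array([
    ["2016-07-02", "10:55:01"],
    ["2016-07-02", "17:01:34"],
])

# Insert a tab before column index 1, giving an array of shape (n, 3).
with_tabs = np.insert(arr, 1, "\t", axis=1)

# Reshape from (n, 3) to (n*3, 1), as the suggestion describes.
flat = with_tabs.reshape(-1, 1)

# Assumption: to get one string per row, join the three elements of each row.
joined = np.array(["".join(row) for row in with_tabs])
```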

Answered By: meetaig

Neat, but not more efficient than a simple loop (as Praveen pointed out in a comment):

import numpy as np

np.apply_along_axis(lambda d: d[0] + '\t' + d[1], 1, arr)
Answered By: featuredpeow
import numpy as np

Answered By: ahad alam
  1. The method below works for any two or more columns. It is convenient when you want to concatenate several columns at once, or even the whole row, because you don't have to write d[0] + '\t' + d[1] + … explicitly.

  2. On my computer it runs 50-60% faster than the apply_along_axis() approach given above.

To concatenate the whole row, delimited by '\t':

result = ['\t'.join(row) for row in data]

Or if the actual row is larger and you only want to concatenate the first two columns:

result = ['\t'.join(row[0:2]) for row in data]
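As a minimal runnable sketch of both variants: the array `data` and its third column are invented sample values, and converting the list back with `np.array` is an optional extra step for when an array is needed.

```python
import numpy as np

# Hypothetical sample data with an extra third column.
data = np.array([
    ["2016-07-02", "10:55:01", "extra"],
    ["2016-07-02", "17:01:34", "extra"],
])

# Join every column of each row with a tab.
whole_rows = ["\t".join(row) for row in data]

# Join only the first two columns (date and time).
first_two = ["\t".join(row[0:2]) for row in data]

# Optional: turn the list back into a one-dimensional NumPy array.
result = np.array(first_two)
```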

Performance comparison of both methods over 10,000 iterations with a very small data set (< 100 rows):

Method                 Time (ms)
'\t'.join() per row    350
apply_along_axis()     870
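A comparison like this could be reproduced roughly as follows. This is only a sketch: the 50-row sample array, the helper names `join_rows` and `along_axis`, and the iteration count are assumptions, and the timings will differ from machine to machine.

```python
import timeit
import numpy as np

# Hypothetical small data set (fewer than 100 rows).
data = np.array([["2016-07-02", "10:55:01"]] * 50)

def join_rows():
    # Join the columns of each row with a tab.
    return ["\t".join(row) for row in data]

def along_axis():
    # Apply the concatenating lambda to each row (axis 1).
    return np.apply_along_axis(lambda d: d[0] + "\t" + d[1], 1, data)

# Sanity check: both approaches should produce the same strings.
assert join_rows() == along_axis().tolist()

# Time each approach; increase `number` for steadier measurements.
t_join = timeit.timeit(join_rows, number=200)
t_axis = timeit.timeit(along_axis, number=200)
print(f"join: {t_join:.3f} s, apply_along_axis: {t_axis:.3f} s")
```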
Answered By: farqis