I have a NumPy array as follows:
2016-07-02 10:55:01
2016-07-02 10:55:01
2016-07-02 10:55:01
2016-07-02 17:01:34
2016-07-02 17:01:34
2016-07-02 16:59:52
2016-07-02 17:01:34
2016-07-02 16:59:52
2016-07-02 16:59:52
2016-07-02 10:40:00
2016-07-02 12:01:14
These are two columns of the array: date and time. I want both combined into a single column, joined by '\t'. Both values are strings.
I did it with a loop as follows, but that is a bad idea and takes a long time:
for D in Data:
    Data2 = np.append(Data2, np.array(D[0] + "\t" + D[1]))
Please suggest an efficient solution.
Insert the tabs (\t) into your array using numpy.insert and then do a numpy.reshape from n by 3 to n*3 by 1.
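A minimal sketch of what those two calls produce, assuming a small made-up array named `arr` with one (date, time) pair per row. Note that the tab ends up as its own element rather than being merged into a single string per row, so a join step would still be needed if one string per row is the goal.

```python
import numpy as np

# Hypothetical sample data: one (date, time) pair per row, all strings.
arr = np.array([
    ["2016-07-02", "10:55:01"],
    ["2016-07-02", "17:01:34"],
    ["2016-07-02", "16:59:52"],
])

# Insert a tab column between the date and time columns: shape becomes (n, 3).
with_tabs = np.insert(arr, 1, "\t", axis=1)

# Reshape from n by 3 to n*3 by 1: each date, tab and time is a separate row.
flat = with_tabs.reshape(-1, 1)
```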
Neat, but not more efficient than a simple loop (as Praveen pointed out in a comment):
import numpy as np
np.apply_along_axis(lambda d: d[0] + '\t' + d[1], 1, arr)
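As an alternative that is not from the answers in this thread, NumPy's element-wise string routine `np.char.add` can concatenate whole columns without a Python-level lambda. The following is only a sketch, assuming `arr` is a two-column string array like the one in the question; the sample values are made up for illustration.

```python
import numpy as np

# Hypothetical sample data: column 0 is the date, column 1 is the time.
arr = np.array([
    ["2016-07-02", "10:55:01"],
    ["2016-07-02", "17:01:34"],
])

# Element-wise string concatenation: date + tab, then + time.
date_tab = np.char.add(arr[:, 0], "\t")
combined = np.char.add(date_tab, arr[:, 1])
```

Here `combined` is a one-dimensional string array with one entry per row. Whether this is faster than the other approaches depends on the data size, so it is worth timing on your own data.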
import numpy as np
a=[,,]
b=[,,]
np.concatenate((a,b),axis=1)
The method below works for any two or more columns. It is very convenient if you want to concatenate multiple columns at a time, or even the whole row, because you don't have to explicitly write d[0] + '\t' + d[1] + …
On my computer it runs 50~60% faster than the apply_along_axis() approach given above.
To concatenate the whole row, delimited by '\t':
result = ['\t'.join(row) for row in data]
Or, if the actual row is larger and you only want to concatenate the first two columns:
result = ['\t'.join(row[0:2]) for row in data]
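For reference, here is a self-contained sketch of the join approach on a small array. The sample values and the names `joined_rows` and `joined_first_two` are made up for illustration.

```python
import numpy as np

# Hypothetical sample data: one (date, time) pair per row, all strings.
data = np.array([
    ["2016-07-02", "10:55:01"],
    ["2016-07-02", "17:01:34"],
    ["2016-07-02", "16:59:52"],
])

# Join every column of each row with a tab.
joined_rows = ["\t".join(row) for row in data]

# Join only the first two columns (useful when rows are wider).
joined_first_two = ["\t".join(row[:2]) for row in data]

# Convert back to a NumPy array if one is needed downstream.
result = np.array(joined_rows)
```

Because the rows here have exactly two columns, both comprehensions give the same list.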
Performance comparison of both methods over 10,000 iterations with a very small data set (< 100 rows):
|Above method||350 ms|
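To reproduce a comparison like this on your own machine, a rough timing sketch with `timeit` might look as follows. The data, the function names, and the iteration count are assumptions chosen for illustration, and the numbers you get will differ from those quoted here.

```python
import timeit
import numpy as np

# Hypothetical tiny data set (50 rows) of (date, time) string pairs.
data = np.array([["2016-07-02", "10:55:01"]] * 50)

def with_apply_along_axis():
    # Row-wise lambda applied along axis 1.
    return np.apply_along_axis(lambda d: d[0] + "\t" + d[1], 1, data)

def with_join():
    # Plain list comprehension using str.join.
    return ["\t".join(row) for row in data]

# Time each approach over the same number of iterations.
iterations = 200
timings = {
    "apply_along_axis": timeit.timeit(with_apply_along_axis, number=iterations),
    "join": timeit.timeit(with_join, number=iterations),
}

for name, seconds in timings.items():
    print(f"{name}: {seconds * 1000:.1f} ms for {iterations} iterations")
```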