Elegant way to convert a numpy array containing datetime.timedelta into seconds in Python 2.7

Question:

I have a numpy array called dt. Each element is of type datetime.timedelta. For example:

>>>dt[0]
datetime.timedelta(0, 1, 36000)

How can I convert dt into an array dt_sec that contains only seconds, without looping? My current solution (which works, but I don’t like it) is:

dt_sec = zeros((len(dt),1))
for i in range(0,len(dt),1):
    dt_sec[i] = dt[i].total_seconds()

I tried to use dt.total_seconds(), but of course it didn’t work. Any idea how to avoid this loop?

Thanks

Asked By: otmezger

Answers:

import numpy as np

helper = np.vectorize(lambda x: x.total_seconds())
dt_sec = helper(dt)
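
As a minimal sketch of the same idea with sample data: the `dt` array below is a hypothetical stand-in for the asker’s array, and passing `otypes` is an optional addition that pins the result dtype. Note that np.vectorize is a convenience wrapper around a Python-level loop, so it tidies the code rather than speeding it up.

```python
import datetime
import numpy as np

# Hypothetical sample data standing in for the asker's `dt` array (dtype=object).
dt = np.array([datetime.timedelta(0, 1, 36000),
               datetime.timedelta(0, 90),
               datetime.timedelta(1)])

# otypes pins the output dtype to float64, which also lets the helper
# handle an empty input array without raising.
to_seconds = np.vectorize(lambda delta: delta.total_seconds(), otypes=[float])

dt_sec = to_seconds(dt)
print(dt_sec)
```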
Answered By: prgao

You could use a “list comprehension”:

dt_sec = [delta.total_seconds() for delta in dt]

Note that this is an ordinary Python loop over the array’s elements and that it returns a list, not an array; numpy does not vectorize it behind the scenes.
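
If an array is wanted rather than a list, one possible variation is to feed the same generator expression to np.fromiter. This is only a sketch: the `dt` array is made-up sample data, and the loop still runs in Python.

```python
import datetime
import numpy as np

# Hypothetical sample data standing in for the asker's `dt` array.
dt = np.array([datetime.timedelta(0, 1, 36000), datetime.timedelta(0, 90)])

# Build a float64 array straight from the generator; count lets numpy
# allocate the result once instead of growing it.
dt_sec = np.fromiter((delta.total_seconds() for delta in dt),
                     dtype=float, count=len(dt))
print(dt_sec)
```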

Answered By: Reinout van Rees

numpy has its own datetime and timedelta formats. Just use them ;).

Set-up for example:

import datetime
import numpy

times = numpy.array([datetime.timedelta(0, 1, 36000)])

Code:

times.astype("timedelta64[ms]").astype(int) / 1000
#>>> array([ 1.036])
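
A variation on the same conversion, sketched here as an alternative rather than as part of the original answer: dividing a timedelta64 array by a one-second np.timedelta64 gives float seconds directly, so no integer cast or manual scaling is needed. Microsecond units are chosen because that is the resolution of datetime.timedelta.

```python
import datetime
import numpy as np

times = np.array([datetime.timedelta(0, 1, 36000)])

# One-off conversion from Python objects to numpy's native timedelta type.
numpy_times = times.astype("timedelta64[us]")

# timedelta64 / timedelta64 is a plain float64 ratio, here in seconds.
seconds = numpy_times / np.timedelta64(1, "s")
print(seconds)
```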

Since people don’t seem to realise that this is the best solution, here are some timings of a timedelta64 array vs an array of datetime.timedelta objects:

SETUP="
import datetime
import numpy

times = numpy.array([datetime.timedelta(0, 1, 36000)] * 100000)
numpy_times = times.astype('timedelta64[ms]')
"

python -m timeit -s "$SETUP" "numpy_times.astype(int) / 1000"
python -m timeit -s "$SETUP" "numpy.vectorize(lambda x: x.total_seconds())(times)"
python -m timeit -s "$SETUP" "[delta.total_seconds() for delta in times]"

Results:

100 loops, best of 3: 4.54 msec per loop
10 loops, best of 3: 99.5 msec per loop
10 loops, best of 3: 67.1 msec per loop

The initial conversion to timedelta64 takes about twice as long as the vectorized expression, but every subsequent operation on that timedelta64 array will be about 20 times faster.

If you’re never going to use those timedeltas again, consider asking yourself why you ever made datetime.timedelta objects (as opposed to timedelta64s) in the first place, and then use the numpy.vectorize expression. It’s less native, but for some reason it’s faster than the one-off conversion.

Answered By: Veedrac

I like the use of np.vectorize as suggested by prgao. If you just want a Python list, you can also do the following:

dt_sec = map(datetime.timedelta.total_seconds, dt)
Answered By: crayzeewulf

A convenient and elegant way is to wrap the array in a pandas.Series and call its dt.total_seconds() method:

import numpy as np
import pandas as pd

# create example datetime arrays
arr1 = np.array(['2007-07-13', '2006-01-13', '2010-08-13'], dtype='datetime64')
arr2 = np.array(['2007-07-15', '2006-01-18', '2010-08-22'], dtype='datetime64')

# timedelta array
td = arr2 - arr1

# get total seconds
pd.Series(td).dt.total_seconds()
0    172800.0
1    432000.0
2    777600.0
dtype: float64
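
For the case in the question, where the array holds datetime.timedelta objects rather than timedelta64 values, a possible sketch is to go through pd.to_timedelta first. The `dt` array below is made-up sample data.

```python
import datetime
import numpy as np
import pandas as pd

# Hypothetical object array of datetime.timedelta, as in the question.
dt = np.array([datetime.timedelta(0, 1, 36000), datetime.timedelta(2)])

# to_timedelta builds a TimedeltaIndex; total_seconds() returns float seconds.
dt_sec = np.asarray(pd.to_timedelta(dt).total_seconds())
print(dt_sec)
```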
Answered By: Erfan

Recommendation

It is recommended to convert as follows:

deltatime.astype("timedelta64[ms]").astype("int64")/1000

Problem with times.astype("timedelta64[ms]").astype(int)

The timedelta64 data type stores its values as 64-bit integers. On platforms where numpy’s default integer is 32-bit (for example Windows with NumPy versions before 2.0), astype(int) converts the data to 32-bit integers, so large values can silently overflow, as demonstrated below:

date_rng = np.arange(
    np.datetime64("2022-09-01"),
    np.datetime64("2022-09-30"),
    np.timedelta64(1, "D")
)
deltatime = date_rng - np.datetime64("2022-01-01")

print( deltatime.astype("timedelta64[ms]").astype(int) / 1000 )
# output:
#[-479636.48 -393236.48 -306836.48 -220436.48 -134036.48  -47636.48
#   38763.52  125163.52  211563.52  297963.52  384363.52  470763.52
#  557163.52  643563.52  729963.52  816363.52  902763.52  989163.52
# 1075563.52 1161963.52 1248363.52 1334763.52 1421163.52 1507563.52
# 1593963.52 1680363.52 1766763.52 1853163.52 1939563.52]

print( deltatime.astype("timedelta64[ms]").astype("int64")/1000 )
#[20995200. 21081600. 21168000. 21254400. 21340800. 21427200. 21513600.
# 21600000. 21686400. 21772800. 21859200. 21945600. 22032000. 22118400.
# 22204800. 22291200. 22377600. 22464000. 22550400. 22636800. 22723200.
# 22809600. 22896000. 22982400. 23068800. 23155200. 23241600. 23328000.
# 23414400.]
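
The wrap-around can be reproduced on any platform by casting to int32 explicitly. The sketch below uses only the first value from the example above; casting through int64 first is a choice made here so that the 32-bit truncation step is unambiguous.

```python
import numpy as np

# First element of the example: 2022-09-01 minus 2022-01-01 is 243 days.
delta = np.array([np.datetime64("2022-09-01") - np.datetime64("2022-01-01")])

# Milliseconds held in a 64-bit integer: the correct value.
ms64 = delta.astype("timedelta64[ms]").astype("int64")

# Forcing the same number into 32 bits wraps it around silently.
ms32 = ms64.astype("int32")

print(ms64 / 1000.0)  # seconds from the 64-bit value
print(ms32 / 1000.0)  # seconds from the wrapped 32-bit value

# Size in bytes of numpy's default integer on this platform (4 or 8).
print(np.dtype(int).itemsize)
```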
Answered By: Ken T