My data frame has a Time column, and I need to convert its HH:MM:SS.SS values to seconds. How can I do that in Python?

Question:

            Time    volts
0   15:15:10.951    368
1   15:15:11.950    373
2   15:15:12.950    368
3   15:15:13.949    316
4   15:15:14.949    368
... ... ...
2141    15:50:54.087    337
2142    15:50:55.069    343
2143    15:50:56.085    344
2144    15:50:57.081    339
2145    15:50:58.090    347

My conversion function:

def time_convert(x):
  h,m,s = map(int,x.split(':'))
  return int(h) * 3600 + int(m) * 60 + int(s)

The output I get:

ValueError                                Traceback (most recent call last)
<ipython-input-17-68cf4416cc88> in <module>
----> 1 df['Time'] = df['Time'].apply(time_convert)

4 frames
<ipython-input-12-42bee45f8bd8> in time_convert(x)
      1 def time_convert(x):
----> 2   h,m,s = map(int,x.split(':'))
      3   return int(h) * 3600 + int(m) * 60 + int(s)
      4 
      5 

ValueError: invalid literal for int() with base 10: '10.951'

I was expecting the values to be converted to seconds. I have only found solutions that convert HH:MM:SS to seconds, and none that handle fractional seconds (SS.SS).

Asked By: Manish Gurung

Answers:

I can only guess at what is in your dataframe. In any case, iterating over rows (even with apply) is generally a bad idea, because it is very slow.

But as for why it doesn’t work: the problem lies in your conversion function

def time_convert(x):
  h,m,s = map(int,x.split(':'))
  return int(h) * 3600 + int(m) * 60 + int(s)

You are converting to int twice here: once when mapping int over x.split(':'), and then again when converting each of h, m, s.

So, simply

def time_convert(x):
  h,m,s = x.split(':')
  return int(h) * 3600 + int(m) * 60 + int(s)

does the same thing, and still doesn’t work, because you cannot convert s to int: '10.951' is not an integer. Convert the seconds with float instead:

def time_convert(x):
  h,m,s = x.split(':')
  return int(h) * 3600 + int(m) * 60 + float(s)

With that change, your code works. There is a more efficient way, but it works.
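
As a quick sanity check, here is a minimal sketch that runs the float version on two values copied from the question. The names samples and converted are only for illustration; on the real dataframe you would use apply as in the question.

```python
def time_convert(x):
    # Split "HH:MM:SS.sss" into its three fields
    h, m, s = x.split(':')
    # Hours and minutes are whole numbers; seconds carry a fractional part
    return int(h) * 3600 + int(m) * 60 + float(s)

# Values copied from the question's Time column (illustrative sample only)
samples = ["15:15:10.951", "15:50:58.090"]
converted = [time_convert(t) for t in samples]
print(converted)

# On the dataframe this would be applied as:
# df['Time'] = df['Time'].apply(time_convert)
```

The result is a float number of seconds since midnight, so the millisecond part is preserved.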

More efficient way

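df.Time.str[:2].astype(int) is a Series holding the first two characters of each Time value (the hours), converted to int.

Likewise, df.Time.str[3:5].astype(int) does the same for the 4th and 5th characters (the minutes).

And df.Time.str[6:].astype(float) converts everything from the 7th character onward (the seconds) to float.

Note that this slicing relies on every value being zero-padded to the same HH:MM:SS.sss layout. You can do arithmetic on whole Series, so
3600*df.Time.str[:2].astype(int) + 60*df.Time.str[3:5].astype(int) + df.Time.str[6:].astype(float) is the Series of values you wanted.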

Hence, a faster version of what you wanted:

df['Time'] = 3600*df.Time.str[:2].astype(int) + 60*df.Time.str[3:5].astype(int) + df.Time.str[6:].astype(float)
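
To make the vectorized version easy to try, here is a small self-contained sketch. The three-row dataframe is a stand-in built from values in the question, not the asker's real data. It also shows an alternative that is not part of the approach above: pandas can parse such strings with pd.to_timedelta, and .dt.total_seconds() then gives the same numbers without depending on fixed character positions.

```python
import pandas as pd

# Stand-in data: a few rows copied from the question
df = pd.DataFrame({
    "Time": ["15:15:10.951", "15:15:11.950", "15:50:58.090"],
    "volts": [368, 373, 347],
})

# String-slicing approach: hours, minutes, seconds by character position
sliced = (3600 * df["Time"].str[:2].astype(int)
          + 60 * df["Time"].str[3:5].astype(int)
          + df["Time"].str[6:].astype(float))

# Alternative (an assumption about what fits your data): let pandas parse
# the strings as durations, then read them back as float seconds
parsed = pd.to_timedelta(df["Time"]).dt.total_seconds()

print(sliced.tolist())
print(parsed.tolist())
```

Both are whole-column operations, so neither calls a Python function once per row. The to_timedelta variant is probably the safer choice if some values might not be zero-padded.
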
Answered By: chrslg