Convert a Python dataframe date column to seconds

Question:

I am reading a .csv data file using pd.read_csv and I get these first 5 rows from my global dataframe (containing thousands of rows):

    time                   id   time_offset
0   2017-12-01 21:00:00     0   -60
1   2017-12-01 21:01:00     0   -59
2   2017-12-01 21:02:00     0   -58
3   2017-12-01 21:03:00     0   -57
4   2017-12-01 21:04:00     0   -56

I’m not very good at manipulating dates in Python and I haven’t found how to do this manipulation:

  1. create in my dataframe a new hour column from the existing time column, containing only the hours:minutes:seconds data, which should be: 21:00:00, 21:01:00, 21:02:00, etc…
  2. then create another column seconds from the newly created hour, containing the number of seconds elapsed since time 0, which should be: 75600 (calculated as 21×3600), 75660 (calculated as 21×3600 + 60), etc…

Any help in sorting this out would be much appreciated.

Asked By: Andrew

Answers:

Setting the datetime series as the index is often useful. pd.to_datetime() converts the column to a usable datetime format.

df.index = pd.to_datetime(df['time'])
df = df.drop('time', axis=1)
  1. You can use the strftime function (https://strftime.org/) to keep only the hours:minutes:seconds part:
df['time'] = df.index.strftime("%H:%M:%S")
  2. Since df.index[0] is the very first time, you can subtract it from the index and use the .seconds attribute. Note that this counts seconds from the first row, not from midnight:
df['seconds since'] = (df.index - df.index[0]).seconds
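
One caveat with the .seconds attribute is worth illustrating: it returns only the seconds component of each timedelta (0 to 86399), so whole days are dropped, whereas .total_seconds() counts the full elapsed time. The sketch below assumes hypothetical sample data whose last row falls on the next day. The column names seconds_component and seconds_total are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: the last row is exactly one day after the first.
df = pd.DataFrame(
    {"time": ["2017-12-01 21:00:00", "2017-12-01 21:01:00", "2017-12-02 21:00:00"]}
)
df.index = pd.to_datetime(df["time"])
df = df.drop("time", axis=1)

elapsed = df.index - df.index[0]  # TimedeltaIndex relative to the first row

# .seconds keeps only the seconds component, so the full day is lost
df["seconds_component"] = elapsed.seconds
# .total_seconds() includes days, giving the true elapsed time as floats
df["seconds_total"] = elapsed.total_seconds()

print(df)
```

If the data never spans more than 24 hours, both columns agree. Otherwise .total_seconds() is probably the safer choice.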
Answered By: ColtonNeary

You can try:

# convert `time` column to datetime (if necessary):
df["time"] = pd.to_datetime(df["time"])

df["hour"] = df["time"].dt.time
df["seconds"] = (
    df["time"].dt.hour * 60 * 60
    + df["time"].dt.minute * 60
    + df["time"].dt.second
)
print(df)

Prints:

                 time  id  time_offset      hour  seconds
0 2017-12-01 21:00:00   0          -60  21:00:00    75600
1 2017-12-01 21:01:00   0          -59  21:01:00    75660
2 2017-12-01 21:02:00   0          -58  21:02:00    75720
3 2017-12-01 21:03:00   0          -57  21:03:00    75780
4 2017-12-01 21:04:00   0          -56  21:04:00    75840
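
As a possible alternative to summing the hour, minute and second components, the same seconds-since-midnight value can be obtained by subtracting midnight of the same day from each timestamp. This is a minimal sketch on a small hypothetical sample, assuming the time column has already been converted with pd.to_datetime. The since_midnight variable is an illustrative name.

```python
import pandas as pd

# Small hypothetical sample mirroring the first rows of the question
df = pd.DataFrame(
    {"time": ["2017-12-01 21:00:00", "2017-12-01 21:01:00", "2017-12-01 21:02:00"]}
)
df["time"] = pd.to_datetime(df["time"])

# dt.normalize() resets each timestamp to midnight of the same day,
# so the difference is the time elapsed since midnight
since_midnight = df["time"] - df["time"].dt.normalize()

# total_seconds() returns floats; cast to int for whole seconds
df["seconds"] = since_midnight.dt.total_seconds().astype(int)

print(df)
```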
Answered By: Andrej Kesely

Example

import pandas as pd

data = {'time': {0: '2017-12-01 21:00:00', 1: '2017-12-01 21:01:00', 2: '2017-12-01 21:02:00',
                 3: '2017-12-01 21:03:00', 4: '2017-12-01 21:04:00'},
        'id': {0: 0, 1: 0, 2: 0, 3: 0, 4: 0},
        'time_offset': {0: -60, 1: -59, 2: -58, 3: -57, 4: -56}}
df = pd.DataFrame(data)

df

  time                   id   time_offset
0   2017-12-01 21:00:00     0   -60
1   2017-12-01 21:01:00     0   -59
2   2017-12-01 21:02:00     0   -58
3   2017-12-01 21:03:00     0   -57
4   2017-12-01 21:04:00     0   -56

Code

Make a timedelta and use dt.total_seconds(). In this example, since the time column has object (string) dtype, its time-of-day part can be converted to a timedelta as follows:

pd.to_timedelta(df['time'].str.split(' ').str[1])

You can then convert the timedelta to seconds using dt.total_seconds():

s = pd.to_timedelta(df['time'].str.split(' ').str[1]).dt.total_seconds()

s

0    75600.0
1    75660.0
2    75720.0
3    75780.0
4    75840.0
Name: time, dtype: float64
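
The string split above relies on the time column still holding strings. If the column has already been parsed to datetime, the .str accessor is not available. One possible variant, sketched here on a hypothetical two-row sample, formats the clock part with dt.strftime first and casts the result to integers. The clock variable is an illustrative name.

```python
import pandas as pd

# Hypothetical sample where `time` is already a datetime column
df = pd.DataFrame(
    {"time": pd.to_datetime(["2017-12-01 21:00:00", "2017-12-01 21:01:00"])}
)

# Format the clock part as "HH:MM:SS" strings, which pd.to_timedelta can parse
clock = df["time"].dt.strftime("%H:%M:%S")

# total_seconds() yields floats; astype(int) gives whole seconds
df["seconds"] = pd.to_timedelta(clock).dt.total_seconds().astype(int)

print(df)
```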
Answered By: Panda Kim