how to find the difference of dataframe which is given on start date and end date

Question:

I have to find the difference in data provided at 00:00:00 and 23:59:59 per day for seven days.
How to find the difference in the data frame, which is given on the start date and end date?

Sample Data

Date                         Data
2018-12-01 00:00:00          2
2018-12-01 12:00:00          5
2018-12-01 23:59:59          10
2018-12-02 00:00:00          12
2018-12-02 12:00:00          15
2018-12-02 23:59:59          22

Expected Output

Date                        Data
2018-12-01                   8
2018-12-02                   10              
Asked By: check_my_code

||

Answers:

You can use groupby and iterate over with min-max range.

import pandas as pd

df = pd.DataFrame({
    'Date': ['2018-12-01 00:00:00', '2018-12-01 12:00:00', '2018-12-01 23:59:59', 
             '2018-12-02 00:00:00', '2018-12-02 12:00:00', '2018-12-02 23:59:59'],
    'Data': [2, 5, 10, 12, 15, 22]
})

df['Date'] = pd.to_datetime(df['Date'])

df['Date_Only'] = df['Date'].dt.date
result = df.groupby('Date_Only').apply(lambda x: x['Data'].max() - x['Data'].min())

print(result)
Answered By: Hiren Namera

Example

data = {
    'Date': ['2018-12-01 00:00:00', '2018-12-01 12:00:00', '2018-12-01 23:59:59',
             '2018-12-02 00:00:00', '2018-12-02 12:00:00', '2018-12-02 23:59:59'],
    'Data': [2, 5, 10, 12, 15, 22]
}
df = pd.DataFrame(data)

Code

df['Date'] = pd.to_datetime(df['Date'])
out = (df.resample('D', on='Date')['Data']
       .agg(lambda x: x.iloc[-1] - x.iloc[0]).reset_index())

out

    Date        Data
0   2018-12-01  8
1   2018-12-02  10

Update

more efficient way

you can get same result following code:

g = df.resample('D', on='Date')['Data']
out = g.last().sub(g.first()).reset_index()
Answered By: Panda Kim
Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.