Calculate the centroid of a pandas Series

Question:

I’ve got a pandas Series with a datetime index (e.g. precipitation). I would like to calculate the centroid of this precipitation time series, i.e. the mean of the datetimes weighted by the values. My problem is that I cannot multiply a datetime object by a float. The (non-working) idea looks like this (the output should be a datetime object):

import pandas as pd
import datetime as dt
df = pd.DataFrame({'P':[1.2,10.1,5.6,0,0,0,0,0,3,4,6,4,8,2,0,0,0,0,0,0,0,0,0,0]}, index=[dt.datetime(2012,1,1,h) for h in range(24)])
P_sum = df['P'].sum()
prod = df.index * df['P']/P_sum
mean_date = prod.mean()
Asked By: NicoH

||

Answers:

Try converting the datetimes to epoch timestamps (seconds) before calculating the product, and convert the final result back to a datetime afterwards.

import pandas as pd
import datetime as dt
df = pd.DataFrame({'P':[1.2,10.1,5.6,0,0,0,0,0,3,4,6,4,8,2,0,0,0,0,0,0,0,0,0,0]}, index=[dt.datetime(2012,1,1,h) for h in range(24)])
P_sum = df['P'].sum()
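# '%s' (seconds since the epoch) is a platform-specific strftime extension; it is not available on Windows
df["epoch"] = [float(t.strftime('%s')) for t in df.index]
prod = df["epoch"] * df['P'] / P_sum
# the weights P/P_sum already sum to 1, so the weighted mean is the sum of the products, not their mean
mean_date = prod.sum()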
print(dt.datetime.fromtimestamp(mean_date).strftime('%Y-%m-%d %H:%M:%S'))

Update: this might be what you are looking for:

import pandas as pd
import datetime as dt
df = pd.DataFrame({'P':[1.2,10.1,5.6,0,0,0,0,0,3,4,6,4,8,2,0,0,0,0,0,0,0,0,0,0]}, index=[dt.datetime(2012,1,1,h) for h in range(24)])
df["epoch"] = [float(t.strftime('%s')) for t in df.index]
mean_date = (df["epoch"] * df['P']).sum() / df['P'].sum()
print(dt.datetime.fromtimestamp(mean_date).strftime('%Y-%m-%d %H:%M:%S'))

Output:

2012-01-01 07:00:00
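
The 07:00 result can be checked by hand: with hourly data, the centroid is the P-weighted mean of the hour offsets. A minimal sketch of that check in plain Python (the name `centroid_hour` is only illustrative):

```python
# Hourly precipitation from the question, hours 0..23
P = [1.2, 10.1, 5.6, 0, 0, 0, 0, 0, 3, 4, 6, 4, 8, 2] + [0] * 10
hours = range(24)

# Weighted mean of the hour offsets: sum(P * h) / sum(P) = 307.3 / 43.9
centroid_hour = sum(p * h for p, h in zip(P, hours)) / sum(P)
print(round(centroid_hour, 6))  # 7.0
```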

Update 2: code with better datetime conversion (the same output):

import pandas as pd
import datetime as dt
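df = pd.DataFrame({'P':[1.2,10.1,5.6,0,0,0,0,0,3,4,6,4,8,2,0,0,0,0,0,0,0,0,0,0]}, index=[dt.datetime(2012,1,1,h) for h in range(24)])
# the int64 view of the naive index is nanoseconds since the epoch, counted as if the times were UTC
df["epoch"] = df.index.astype('int64')//1e9
mean_date = (df["epoch"] * df['P']).sum() / df['P'].sum()
# convert back as UTC as well; a plain fromtimestamp() would shift the result by the local UTC offset
print(dt.datetime.fromtimestamp(mean_date, tz=dt.timezone.utc).strftime('%Y-%m-%d %H:%M:%S'))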
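
A possible variant, sketched here under the assumption that the index is a sorted, timezone-naive DatetimeIndex: weight the offsets from the first timestamp instead of the raw epoch seconds. This avoids both the local-time round trip and the loss of precision from multiplying values around 1e9. The helper name `weighted_time_centroid` is hypothetical, not a pandas function.

```python
import datetime as dt

import numpy as np
import pandas as pd


def weighted_time_centroid(series):
    """Value-weighted mean timestamp of a datetime-indexed Series (illustrative helper)."""
    t0 = series.index[0]
    # Seconds elapsed since the first timestamp, as a plain float array
    offsets = (series.index - t0).total_seconds().to_numpy()
    # np.average computes sum(weights * offsets) / sum(weights)
    mean_offset = np.average(offsets, weights=series.to_numpy())
    # Shift the reference timestamp by the weighted mean offset
    return t0 + pd.to_timedelta(float(mean_offset), unit="s")


P = [1.2, 10.1, 5.6, 0, 0, 0, 0, 0, 3, 4, 6, 4, 8, 2] + [0] * 10
series = pd.Series(P, index=[dt.datetime(2012, 1, 1, h) for h in range(24)])
centroid = weighted_time_centroid(series)
print(centroid)
```

The result is a pandas Timestamp, so it can be formatted with strftime as in the code above. Note that np.average raises ZeroDivisionError when all weights are zero, so an all-dry series needs separate handling.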
Answered By: Dmitry Duplyakin