TypeError: float() argument must be a string or a number, not 'datetime.time' in relation with a scatter plot

Question

I created the following DataFrame, date_merge, as an example. What I actually want is to display the temperature values over the course of a day.

df_time = filtered_df_date["longtime"].dt.time
df_date = filtered_df_date["longtime"].dt.date

filtered_df_date:

index	longtime
52754	2020-01-01 00:00:00
52755	2020-01-01 00:32:00
52756	2020-01-01 00:33:00
…	…
53261	2020-01-01 23:59:00

date_merge = pd.merge(df_time, df_date, left_index=True, right_index=True)
date_merge = pd.merge(date_merge, pickd_column_df, left_index=True, right_index=True)

index	longtime_time	longtime_date	value
52755	00:32:00	2020-01-01	23.3
52757	00:34:00	2020-01-01	23.3
52759	00:37:00	2020-01-01	NaN
52760	00:38:00	2020-01-01	NaN
52761	00:39:00	2020-01-01	NaN
…	…	…	…
53261	23:59:00	2020-01-01	23.9

Now I plot longtime_date on the x-axis as an example:

ax = date_merge.plot(x ="longtime_date" , y="value" , kind="scatter" ,figsize=[15, 5], linewidth=0.1, alpha=0.6, color="#003399")
plt.show()

This works without an error.
If I now use longtime_time instead of longtime_date for the x-axis, I get the following error message:

ax = date_merge.plot(x ="longtime_time" , y="value" , kind="scatter" ,figsize=[15, 5], linewidth=0.1, alpha=0.6, color="#003399")
plt.show()

TypeError: float() argument must be a string or a number, not 'datetime.time'

Some further information:

print(date_merge["longtime_time"].dtype)

output:

object

print(date_merge["longtime_date"].dtype)

output:

object

print(date_merge["temperature_ers_lite_1_wermser_0_elsys_0"].dtype)

output:

float64

Asked By: mika

Answer 1

OK, I think the issue is that you need to convert that column to str at some point before plotting:

date_merge['longtime_time'] = date_merge['longtime_time'].astype(str)
ax = date_merge.plot(x ="longtime_time" , y="value" , kind="scatter" ,figsize=[15, 5], linewidth=0.1, alpha=0.6, color="#003399")
plt.show()

Or you could keep the full timestamp on the x-axis and format the tick labels so that only the time is shown:

import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
import matplotlib.dates as mdates

date_merge = pd.DataFrame(
[['2020-01-01 00:32:00' ,23.3],
['2020-01-01 00:34:00'      ,23.3],
['2020-01-01 00:37:00'      ,np.nan],
['2020-01-01 00:38:00'      ,np.nan],
['2020-01-01 00:39:00'      ,np.nan],
['2020-01-01 23:59:00'      ,23.9]],
columns = ['longtime'       ,'value'])

date_merge["longtime"] = pd.to_datetime(date_merge["longtime"])

ax = date_merge.plot(x ="longtime" , y="value" , kind="scatter" ,figsize=[15, 5], linewidth=0.1, alpha=0.6, color="#003399")

timeFmt = mdates.DateFormatter('%H:%M:%S')
ax.xaxis.set_major_formatter(timeFmt)
plt.xticks(rotation=90)
plt.show()

Answered By: chitown88

Answer 2

That TypeError arises because matplotlib doesn’t handle times or timedeltas for the axes. It has to be a complete timestamp (date + optional time) or a numeric value.
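As a minimal sketch of the "numeric value" option, the time objects can be mapped to hours since midnight before plotting. The sample values and the name `hours` are assumptions for illustration, not part of the question's data.

```python
import datetime as dt

import pandas as pd

# Hypothetical sample resembling the question's longtime_time column.
times = pd.Series(
    [dt.time(0, 32), dt.time(0, 34), dt.time(23, 59)], name="longtime_time"
)

# float() rejects datetime.time objects; this is the conversion that
# fails inside the scatter plot.
try:
    float(times.iloc[0])
except TypeError as exc:
    print("TypeError:", exc)

# Numeric alternative: express each time as hours since midnight.
hours = times.map(lambda t: t.hour + t.minute / 60 + t.second / 3600)
print(hours)
```

A numeric column like `hours` can be passed as x to a scatter plot directly, at the cost of tick labels that read as decimal hours rather than clock times.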

A common approach is to:

1. Coerce the time/timedelta values to strings
2. Convert strings into full timestamp objects with an arbitrary date
3. Use a custom date tick formatter to display just time components
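The three steps above can be sketched roughly as follows. The sample data, the column name `plot_time`, and the placeholder date 1970-01-01 are assumptions chosen for illustration; any date works, since the formatter hides it.

```python
import datetime as dt

import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample shaped like the question's date_merge.
date_merge = pd.DataFrame({
    "longtime_time": [dt.time(0, 32), dt.time(0, 34), dt.time(12, 0), dt.time(23, 59)],
    "value": [23.3, 23.3, 24.1, 23.9],
})

# Step 1: coerce the datetime.time objects to strings such as "00:32:00".
time_str = date_merge["longtime_time"].astype(str)

# Step 2: prepend an arbitrary date and parse into full timestamps.
date_merge["plot_time"] = pd.to_datetime("1970-01-01 " + time_str)

# Step 3: plot against the timestamps and show only the time on the ticks.
fig, ax = plt.subplots(figsize=(15, 5))
ax.scatter(date_merge["plot_time"], date_merge["value"], alpha=0.6, color="#003399")
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))
fig.autofmt_xdate()  # rotate the tick labels so they do not overlap
# Call plt.show() here to display the figure interactively.
```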

You could also avoid expensive string manipulation and do arithmetic on the underlying data types (datetime64[ns] → POSIX-equivalent). But since you’re already using pandas, I’d suggest:

1. Promote the entire timestamp column longtime to a DatetimeIndex, then
2. Subtract out the date components (e.g. df.index -= df.index.floor('D'))

After step 2, the pd.DataFrame has a TimedeltaIndex. Starting in v0.20.0, pandas supports native x-axis tick labels for TimedeltaIndex which means you can simply call .plot() and obtain a plot like this:

import pandas as pd, numpy as np

idx = pd.date_range('2000-01-01', '2000-01-03T23:00', freq='H')
data = np.random.ranf(len(idx))
df = pd.DataFrame({'once': data, 'twice': data*2}, idx)
df.index -= df.index.floor('D')
df.plot()

(Note x-axis labels are automatically formatted, and also they’re labeled ‘0 days’ because date components were removed.)

However, you are probably more interested in aggregating by time of day:

import pandas as pd, numpy as np

idx = pd.date_range('2000-01-01', '2000-01-03T23:00', freq='H')
data = np.random.ranf(len(idx))
df = pd.DataFrame({'once': data, 'twice': data*2}, idx)
df.index -= df.index.floor('D')
df.groupby(level=0).mean().plot(rot=45)

Or in separately plotting individual days:

import pandas as pd, numpy as np, matplotlib.pyplot as plt

idx = pd.date_range('2000-01-01', '2000-01-03T23:00', freq='H')
data = np.random.ranf(len(idx))
df = pd.DataFrame({'once': data, 'twice': data*2}, idx)

fig = plt.figure()
ax1 = fig.add_subplot()

days = df.groupby(lambda ts: ts.dayofyear)
for doy, grp in days:
    grp.index -= grp.index.floor('D')
    grp.rename(columns=lambda s: "{0} ({1})".format(s, doy), inplace=True)
    grp.plot(rot=45, ax=ax1)

If you’re somehow constrained to building plots with matplotlib directly, you can approximate the above functionality with a custom tick formatter (e.g. pd.Timestamp(x).strftime('%H:%M')):

import pandas as pd, numpy as np, matplotlib.pyplot as plt

idx = pd.date_range('2000-01-01', '2000-01-03T23:00', freq='H')
data = np.random.ranf(len(idx))
df = pd.DataFrame({'once': data, 'twice': data*2}, idx)
days = df.groupby(lambda ts: ts.dayofyear)

fig = plt.figure()
ax1 = fig.add_subplot()
plt.setp(ax1.get_xticklabels(), rotation=45)
for doy, grp in days:
    grp.index -= grp.index.floor('D')
    ax1.plot(grp.index, grp)
    ax1.xaxis.set_major_formatter(lambda x,p: pd.Timestamp(x).strftime('%H:%M'))

Answered By: patricktokeeffe
