TypeError: float() argument must be a string or a number, not 'datetime.time' when creating a scatter plot

Question:

I have created the following DataFrame, date_merge, as an example; what I actually want is to display the temperature values over the course of a day.

df_time = filtered_df_date["longtime"].dt.time
df_date = filtered_df_date["longtime"].dt.date

filtered_df_date:

index longtime
52754 2020-01-01 00:00:00
52755 2020-01-01 00:32:00
52756 2020-01-01 00:33:00
53261 2020-01-01 23:59:00
date_merge = pd.merge(df_time, df_date, left_index=True, right_index=True)
date_merge = pd.merge(date_merge, pickd_column_df, left_index=True, right_index=True)
index longtime_time longtime_date value
52755 00:32:00 2020-01-01 23.3
52757 00:34:00 2020-01-01 23.3
52759 00:37:00 2020-01-01 NaN
52760 00:38:00 2020-01-01 NaN
52761 00:39:00 2020-01-01 NaN
….
53261 23:59:00 2020-01-01 23.9

Now I plot longtime_date on the x-axis as an example:

ax = date_merge.plot(x ="longtime_date" , y="value" , kind="scatter" ,figsize=[15, 5], linewidth=0.1, alpha=0.6, color="#003399")
plt.show()

This works without any error.
If I now use longtime_time instead of longtime_date for the x-axis, I get the following error message:

ax = date_merge.plot(x ="longtime_time" , y="value" , kind="scatter" ,figsize=[15, 5], linewidth=0.1, alpha=0.6, color="#003399")
plt.show()

TypeError: float() argument must be a string or a number, not ‘datetime.time’

Some further information:

print(date_merge["longtime_time"].dtype)

output:

object

print(date_merge["longtime_date"].dtype)

output:

object

print(date_merge["temperature_ers_lite_1_wermser_0_elsys_0"].dtype)

output:

float64

Asked By: mika


Answers:

OK, so I think the issue is that you need to convert that column to str.

So at some point before plotting:

date_merge['longtime_time'] = date_merge['longtime_time'].astype(str)
ax = date_merge.plot(x ="longtime_time" , y="value" , kind="scatter" ,figsize=[15, 5], linewidth=0.1, alpha=0.6, color="#003399")
plt.show()

Or you could do:

import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
import matplotlib.dates as mdates

date_merge = pd.DataFrame(
[['2020-01-01 00:32:00' ,23.3],
['2020-01-01 00:34:00'      ,23.3],
['2020-01-01 00:37:00'      ,np.nan],
['2020-01-01 00:38:00'      ,np.nan],
['2020-01-01 00:39:00'      ,np.nan],
['2020-01-01 23:59:00'      ,23.9]],
columns = ['longtime'       ,'value'])

date_merge["longtime"] = pd.to_datetime(date_merge["longtime"])

ax = date_merge.plot(x ="longtime" , y="value" , kind="scatter" ,figsize=[15, 5], linewidth=0.1, alpha=0.6, color="#003399")

timeFmt = mdates.DateFormatter('%H:%M:%S')
ax.xaxis.set_major_formatter(timeFmt)
plt.xticks(rotation=90)
plt.show()
Answered By: chitown88

That TypeError arises because matplotlib cannot place bare datetime.time values (or timedeltas) on an axis. The values have to be complete timestamps (a date plus an optional time) or numeric.
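
To illustrate the "numeric value" option, here is a minimal sketch that turns each datetime.time into a float number of hours since midnight. The small date_merge frame and the hours_since_midnight helper are made up for illustration and are not part of the question's code.

```python
import datetime as dt

import pandas as pd

# Hypothetical stand-in for the question's date_merge frame
date_merge = pd.DataFrame({
    "longtime_time": [dt.time(0, 32), dt.time(0, 34), dt.time(23, 59)],
    "value": [23.3, 23.3, 23.9],
})


def hours_since_midnight(t):
    """Convert a datetime.time into a float number of hours (illustrative helper)."""
    return t.hour + t.minute / 60 + t.second / 3600


# The new column is float64, so float() no longer sees datetime.time objects
date_merge["hours"] = date_merge["longtime_time"].map(hours_since_midnight)

# Plotting would then use the numeric column, for example:
# date_merge.plot(x="hours", y="value", kind="scatter")
```

The trade-off is that the tick labels are plain numbers (0 to 24) rather than clock times, which may or may not be acceptable.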

A common approach is to:

  1. Coerce the time/timedelta values to strings
  2. Convert strings into full timestamp objects with an arbitrary date
  3. Use a custom date tick formatter to display just time components
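
The three steps above can be sketched roughly as follows. The sample times and the placeholder date 1970-01-01 are assumptions chosen for illustration; any date works, as long as it is the same for every row.

```python
import datetime as dt

import pandas as pd

# Hypothetical sample of datetime.time values, like the longtime_time column
times = pd.Series([dt.time(0, 32), dt.time(0, 34), dt.time(23, 59)])

# Step 1: coerce the time objects to strings such as "00:32:00"
as_text = times.astype(str)

# Step 2: prepend an arbitrary date and parse into full timestamps
stamps = pd.to_datetime("1970-01-01 " + as_text, format="%Y-%m-%d %H:%M:%S")

# Step 3 (not executed here): plot against `stamps` and show only the time part,
# for example with matplotlib.dates.DateFormatter:
#   import matplotlib.dates as mdates
#   ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))
```

Note that this sketch assumes the times carry no fractional seconds; otherwise the format string in step 2 would need a "%f" component.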

You could also avoid expensive string manipulation and do arithmetic on the underlying data types (datetime64[ns] → POSIX-equivalent). But since you’re already using pandas, I’d suggest:

  1. Promote the entire timestamp column longtime to a DatetimeIndex, then
  2. Subtract out the date components (e.g. df.index -= df.index.floor('D'))

After step 2, the pd.DataFrame has a TimedeltaIndex. Starting in v0.20.0, pandas supports native x-axis tick labels for TimedeltaIndex which means you can simply call .plot() and obtain a plot like this:

import pandas as pd, numpy as np

idx = pd.date_range('2000-01-01', '2000-01-03T23:00', freq='H')
data = np.random.ranf(len(idx))
df = pd.DataFrame({'once': data, 'twice': data*2}, idx)
df.index -= df.index.floor('D')
df.plot()

(Note x-axis labels are automatically formatted, and also they’re labeled ‘0 days’ because date components were removed.)

Line plot with time delta x-axis ticks, ten hours apart
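
Applied to a frame shaped like the one in the question, the two steps might look like the sketch below. The three sample rows are illustrative, and the name by_time is made up here.

```python
import pandas as pd

# Hypothetical miniature of the question's data, with a full timestamp column
date_merge = pd.DataFrame({
    "longtime": pd.to_datetime([
        "2020-01-01 00:32:00",
        "2020-01-01 00:34:00",
        "2020-01-01 23:59:00",
    ]),
    "value": [23.3, 23.3, 23.9],
})

# Step 1: promote the timestamp column to a DatetimeIndex
by_time = date_merge.set_index("longtime")

# Step 2: subtract midnight of each day, leaving a TimedeltaIndex
by_time.index = by_time.index - by_time.index.floor("D")

# by_time.plot() would then draw value against time of day
```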

However, you are probably more interested in aggregating by time of day:

import pandas as pd, numpy as np

idx = pd.date_range('2000-01-01', '2000-01-03T23:00', freq='H')
data = np.random.ranf(len(idx))
df = pd.DataFrame({'once': data, 'twice': data*2}, idx)
df.index -= df.index.floor('D')
df.groupby(level=0).mean().plot(rot=45)

Line plot with data averaged by time-of-day, using automatic time delta tick labels

Or in plotting individual days separately:

import pandas as pd, numpy as np, matplotlib.pyplot as plt

idx = pd.date_range('2000-01-01', '2000-01-03T23:00', freq='H')
data = np.random.ranf(len(idx))
df = pd.DataFrame({'once': data, 'twice': data*2}, idx)

fig = plt.figure()
ax1 = fig.add_subplot()

days = df.groupby(lambda ts: ts.dayofyear)
for doy, grp in days:
    grp.index -= grp.index.floor('D')
    grp.rename(columns=lambda s: "{0} ({1})".format(s, doy), inplace=True)
    grp.plot(rot=45, ax=ax1)

Line plots of series vs. time of day, grouped by day, with automatic x-axis tick label formatting

If you’re somehow constrained to building plots with matplotlib directly, you can approximate the above functionality with a custom tick formatter (e.g. pd.Timestamp(x).strftime('%H:%M')):

import pandas as pd, numpy as np, matplotlib.pyplot as plt

idx = pd.date_range('2000-01-01', '2000-01-03T23:00', freq='H')
data = np.random.ranf(len(idx))
df = pd.DataFrame({'once': data, 'twice': data*2}, idx)
days = df.groupby(lambda ts: ts.dayofyear)

fig = plt.figure()
ax1 = fig.add_subplot()
plt.setp(ax1.get_xticklabels(), rotation=45)
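for doy, grp in days:
    # Drop the date part so every day shares the same time-of-day axis
    grp.index -= grp.index.floor('D')
    ax1.plot(grp.index, grp)
# The tick formatter only needs to be set once, after plotting
ax1.xaxis.set_major_formatter(lambda x,p: pd.Timestamp(x).strftime('%H:%M'))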

Line plot of series vs. time-of-day, grouped by day, with tick formatters provided by matplotlib

Answered By: patricktokeeffe