TypeError: float() argument must be a string or a number, not 'datetime.time' in relation with a scatter plot
Question:
I have created the following DataFrame, date_merge, as an example, because I want to display the temperature values throughout the day.
df_time = filtered_df_date["longtime"].dt.time
df_date = filtered_df_date["longtime"].dt.date
filtered_df_date:
index | longtime |
---|---|
52754 | 2020-01-01 00:00:00 |
52755 | 2020-01-01 00:32:00 |
52756 | 2020-01-01 00:33:00 |
… | … |
53261 | 2020-01-01 23:59:00 |
date_merge = pd.merge(df_time, df_date, left_index=True, right_index=True)
date_merge = pd.merge(date_merge, pickd_column_df, left_index=True, right_index=True)
index | longtime_time | longtime_date | value |
---|---|---|---|
52755 | 00:32:00 | 2020-01-01 | 23.3 |
52757 | 00:34:00 | 2020-01-01 | 23.3 |
52759 | 00:37:00 | 2020-01-01 | NaN |
52760 | 00:38:00 | 2020-01-01 | NaN |
52761 | 00:39:00 | 2020-01-01 | NaN |
… | … | … | … |
53261 | 23:59:00 | 2020-01-01 | 23.9 |
Now I plot longtime_date on the x-axis as an example:
ax = date_merge.plot(x ="longtime_date" , y="value" , kind="scatter" ,figsize=[15, 5], linewidth=0.1, alpha=0.6, color="#003399")
plt.show()
This works without error.
If I now use longtime_time instead of longtime_date for the x-axis, I get the following error message:
ax = date_merge.plot(x ="longtime_time" , y="value" , kind="scatter" ,figsize=[15, 5], linewidth=0.1, alpha=0.6, color="#003399")
plt.show()
TypeError: float() argument must be a string or a number, not ‘datetime.time’
Some further information:
print(date_merge["longtime_time"].dtype)
output:
object
print(date_merge["longtime_date"].dtype)
output:
object
print(date_merge["temperature_ers_lite_1_wermser_0_elsys_0"].dtype)
output:
float64
Answers:
OK, so I think the issue is that you need to convert that column to str.
So at some point before plotting:
date_merge['longtime_time'] = date_merge['longtime_time'].astype(str)
ax = date_merge.plot(x ="longtime_time" , y="value" , kind="scatter" ,figsize=[15, 5], linewidth=0.1, alpha=0.6, color="#003399")
plt.show()
Or you could do:
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
import matplotlib.dates as mdates
date_merge = pd.DataFrame(
    [['2020-01-01 00:32:00', 23.3],
     ['2020-01-01 00:34:00', 23.3],
     ['2020-01-01 00:37:00', np.nan],
     ['2020-01-01 00:38:00', np.nan],
     ['2020-01-01 00:39:00', np.nan],
     ['2020-01-01 23:59:00', 23.9]],
    columns=['longtime', 'value'])
date_merge["longtime"] = pd.to_datetime(date_merge["longtime"])
ax = date_merge.plot(x ="longtime" , y="value" , kind="scatter" ,figsize=[15, 5], linewidth=0.1, alpha=0.6, color="#003399")
timeFmt = mdates.DateFormatter('%H:%M:%S')
ax.xaxis.set_major_formatter(timeFmt)
plt.xticks(rotation=90)
plt.show()
That TypeError arises because matplotlib doesn’t handle times or timedeltas for the axes. It has to be a complete timestamp (date + optional time) or a numeric value.
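To see the failure in isolation, here is a minimal sketch that reproduces the float() conversion on a datetime.time and then shows one numeric alternative. The series times and the helper seconds_since_midnight are hypothetical names made up for illustration; they are not part of the question's code.

```python
import datetime

import pandas as pd

# Hypothetical stand-in for the question's object-dtype column of datetime.time values.
times = pd.Series([datetime.time(0, 32), datetime.time(0, 34), datetime.time(23, 59)])

# float() cannot interpret a datetime.time; this is the conversion named in the error.
try:
    float(times.iloc[0])
    float_failed = False
except TypeError:
    float_failed = True


def seconds_since_midnight(t):
    """Turn a datetime.time into a plain number that an axis can handle."""
    return t.hour * 3600 + t.minute * 60 + t.second + t.microsecond / 1e6


# A numeric column like this can be used as x in a scatter plot.
seconds = times.map(seconds_since_midnight)
print(float_failed)
print(seconds.tolist())
```

The numeric column plots without the error, but the tick labels are raw seconds, so they would still need formatting to read as times of day.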
A common approach is to:
- Coerce the time/timedelta values to strings
- Convert strings into full timestamp objects with an arbitrary date
- Use a custom date tick formatter to display just time components
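The three steps above could look roughly like the following sketch. The frame demo, the column plot_time, and the placeholder date 1970-01-01 are assumptions chosen for illustration; any date works, since only the time part is displayed.

```python
import datetime

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for date_merge from the question.
demo = pd.DataFrame({
    "longtime_time": [datetime.time(0, 32), datetime.time(0, 34), datetime.time(23, 59)],
    "value": [23.3, np.nan, 23.9],
})

# Step 1: coerce the datetime.time objects to strings such as "00:32:00".
as_text = demo["longtime_time"].astype(str)

# Step 2: prepend an arbitrary date (an assumption) and parse into full timestamps.
demo["plot_time"] = pd.to_datetime("1970-01-01 " + as_text)

# Step 3: plot against the full timestamps and show only the time on the ticks.
fig, ax = plt.subplots(figsize=(15, 5))
ax.scatter(demo["plot_time"], demo["value"], alpha=0.6, color="#003399")
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))
fig.autofmt_xdate()  # rotate the tick labels so they do not overlap
```

Because every row shares the same placeholder date, readings from different days would land on the same 24-hour axis, which is usually what a time-of-day plot is after.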
You could also avoid expensive string manipulation and do arithmetic on the underlying data types (datetime64[ns] → POSIX-equivalent). But since you’re already using pandas, I’d suggest:
1. Promote the entire timestamp column longtime to a DatetimeIndex, then
2. Subtract out the date components (e.g. df.index -= df.index.floor('D'))
After step 2, the pd.DataFrame has a TimedeltaIndex. Starting in v0.20.0, pandas supports native x-axis tick labels for TimedeltaIndex, which means you can simply call .plot() and obtain a plot like this:
import pandas as pd, numpy as np
idx = pd.date_range('2000-01-01', '2000-01-03T23:00', freq='H')
data = np.random.ranf(len(idx))
df = pd.DataFrame({'once': data, 'twice': data*2}, idx)
df.index -= df.index.floor('D')
df.plot()
(Note x-axis labels are automatically formatted, and also they’re labeled ‘0 days’ because date components were removed.)
However, you are probably more interested in aggregating by time of day:
import pandas as pd, numpy as np
idx = pd.date_range('2000-01-01', '2000-01-03T23:00', freq='H')
data = np.random.ranf(len(idx))
df = pd.DataFrame({'once': data, 'twice': data*2}, idx)
df.index -= df.index.floor('D')
df.groupby(level=0).mean().plot(rot=45)
Or in separately plotting individual days:
import pandas as pd, numpy as np, matplotlib.pyplot as plt
idx = pd.date_range('2000-01-01', '2000-01-03T23:00', freq='H')
data = np.random.ranf(len(idx))
df = pd.DataFrame({'once': data, 'twice': data*2}, idx)
fig = plt.figure()
ax1 = fig.add_subplot()
days = df.groupby(lambda ts: ts.dayofyear)
for doy, grp in days:
    grp.index -= grp.index.floor('D')
    grp.rename(columns=lambda s: "{0} ({1})".format(s, doy), inplace=True)
    grp.plot(rot=45, ax=ax1)
If you’re somehow constrained to building plots with matplotlib, you can approximate the above functionality with a custom tick formatter (e.g. pd.Timestamp(x).strftime('%H:%M')):
import pandas as pd, numpy as np, matplotlib.pyplot as plt
idx = pd.date_range('2000-01-01', '2000-01-03T23:00', freq='H')
data = np.random.ranf(len(idx))
df = pd.DataFrame({'once': data, 'twice': data*2}, idx)
days = df.groupby(lambda ts: ts.dayofyear)
fig = plt.figure()
ax1 = fig.add_subplot()
plt.setp(ax1.get_xticklabels(), rotation=45)
for doy, grp in days:
    grp.index -= grp.index.floor('D')
    ax1.plot(grp.index, grp)
ax1.xaxis.set_major_formatter(lambda x,p: pd.Timestamp(x).strftime('%H:%M'))
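For the scatter plot in the question specifically, the arithmetic route mentioned earlier (subtracting the date instead of going through strings) might be sketched as follows. The frame demo and the column hour_of_day are hypothetical names, and expressing the x-axis in hours is a choice made here, not something the answers prescribe.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's data, keeping the full timestamp column.
demo = pd.DataFrame({
    "longtime": pd.to_datetime([
        "2020-01-01 00:32:00",
        "2020-01-01 12:00:00",
        "2020-01-01 23:59:00",
    ]),
    "value": [23.3, np.nan, 23.9],
})

# Subtract midnight of each row's own day, leaving a timedelta column.
since_midnight = demo["longtime"] - demo["longtime"].dt.normalize()

# Express the offset as a float number of hours, which a scatter plot accepts.
demo["hour_of_day"] = since_midnight.dt.total_seconds() / 3600

ax = demo.plot(x="hour_of_day", y="value", kind="scatter",
               figsize=[15, 5], alpha=0.6, color="#003399")
ax.set_xlim(0, 24)
ax.set_xticks(range(0, 25, 3))  # one tick every three hours
```

This keeps the original plot call almost unchanged and avoids both the datetime.time objects and any string parsing.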