TypeError: cannot convert the series to <class 'int'> when processing a DataFrame
Question:
I want to use is_holiday() from the chinesecalendar library to flag holidays based on the date column of a DataFrame (the function requires a datetime.date to be passed).
I extract three integers from the DataFrame and combine them into a datetime.date, but I get TypeError: cannot convert the series to <class 'int'>.
init_time_data["is_holiday"] = is_holiday(
    datetime.date(
        init_time_data["记账日期"].dt.year.astype(int),
        init_time_data["记账日期"].dt.month.astype(int),
        init_time_data["记账日期"].dt.day.astype(int),
    )
)
Error:
  File "F:\Anaconda3\envs\torch\lib\site-packages\pandas\core\series.py", line 185, in wrapper
    raise TypeError(f"cannot convert the series to {converter}")
TypeError: cannot convert the series to <class 'int'>
Answers:
The error message
TypeError: cannot convert the series to <class 'int'>
says that a whole Series is being converted to a single integer, which is not possible.
is_holiday(datetime.date(init_time_data["记账日期"].dt.year.astype(int)   <-- incorrect
datetime.date() expects a single integer for each of year, month, and day, but here each argument is an entire Series, because the function is being applied to the full column at once.
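To see the failure in isolation, here is a minimal sketch that assumes only pandas is installed. The sample dates and the helper name try_build_date are made up for illustration. The exact wording of the TypeError may differ between Python and pandas versions, so the sketch only captures the message rather than asserting its text.

```python
import datetime

import pandas as pd

# Illustrative data: a datetime64 column like the one in the question.
dates = pd.Series(pd.to_datetime(["2004-01-02", "2012-03-04", "2020-05-06"]))


def try_build_date(series):
    """Pass whole Series objects to datetime.date and return the error text."""
    try:
        datetime.date(series.dt.year, series.dt.month, series.dt.day)
    except TypeError as exc:
        return str(exc)
    return None


# Passing three Series fails: each argument must be one integer.
error_message = try_build_date(dates)
print(error_message)

# Passing the fields of a single element works.
first = dates.iloc[0]
single_date = datetime.date(first.year, first.month, first.day)
print(single_date)
```

The second call succeeds because first is one Timestamp, so year, month, and day are plain integers.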
To fix this, build the date for each row independently: call the DataFrame's apply() method with axis=1 and pass a function that applies datetime.date() to one row at a time.
import datetime
import chinese_calendar as cal  # the module name is chinese_calendar

def is_holiday_wrap(row):
    # Build a plain datetime.date from this row's timestamp
    ts = row['记账日期']
    date = datetime.date(ts.year, ts.month, ts.day)
    return cal.is_holiday(date)

# axis=1 passes one row at a time to the function
init_time_data['is_holiday'] = init_time_data.apply(is_holiday_wrap, axis=1)
I think this will solve your problem.
I’m not familiar with this package but it seems the function is_holiday returns True/False based on an input datetime.
Since your column is already in datetime format, there’s no need to convert it to int – you can just apply the function:
import pandas as pd
from chinese_calendar import is_holiday

df = pd.DataFrame({
    'id': ['a', 'b', 'c'],
    'datetime_col': [
        pd.to_datetime('2004-01-02'),
        pd.to_datetime('2012-03-04'),
        pd.to_datetime('2020-05-06'),
    ],
})

df['is_holiday'] = df['datetime_col'].apply(is_holiday)
Output:
  id datetime_col  is_holiday
0  a   2004-01-02       False
1  b   2012-03-04        True
2  c   2020-05-06       False
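If the function you call strictly requires datetime.date objects, as the question states, one option is to convert the column with Series.dt.date first and then map over the result. The sketch below is only an illustration: is_weekend is a hypothetical stand-in for is_holiday so that it runs without chinese_calendar installed, and the column names are made up.

```python
import datetime

import pandas as pd


def is_weekend(day: datetime.date) -> bool:
    """Hypothetical stand-in for is_holiday: True on Saturday or Sunday."""
    return day.weekday() >= 5


df = pd.DataFrame({
    "datetime_col": pd.to_datetime(["2004-01-02", "2012-03-04", "2020-05-06"]),
})

# .dt.date turns each Timestamp into a plain datetime.date object.
as_dates = df["datetime_col"].dt.date

# map() calls the function once per element, so it only ever sees one date.
df["is_weekend"] = as_dates.map(is_weekend)
print(df)
```

With the real library, swapping is_weekend for is_holiday should be the only change needed.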