TypeError: cannot convert the series to <class 'int'> when processing a DataFrame

Question:

I want to use is_holiday() from the chinesecalendar library to check whether each date in the time column of my DataFrame is a holiday (the function requires a datetime.date argument).

I extract three integers from the column and combine them into a datetime.date, but I get TypeError: cannot convert the series to <class 'int'>.

init_time_data["is_holiday"] = is_holiday(
    datetime.date(
        init_time_data["记账日期"].dt.year.astype(int),
        init_time_data["记账日期"].dt.month.astype(int),
        init_time_data["记账日期"].dt.day.astype(int),
    )
)

Error:

File "F:\Anaconda3\envs\torch\lib\site-packages\pandas\core\series.py", line 185, in wrapper
    raise TypeError(f"cannot convert the series to {converter}")
TypeError: cannot convert the series to <class 'int'>
Asked By: jaycenice

Answers:

According to the error message:

TypeError: cannot convert the series to <class 'int'>

You are trying to convert a whole Series of values to a single integer, which is not possible.

is_holiday( datetime.date(init_time_data["记账日期"].dt.year.astype(int) <-- incorrect
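
The failure can be reproduced without the holiday library. This is a minimal sketch, assuming a hypothetical three-row frame df with a datetime column named when (both names are illustrative, not from the question). The exact wording of the TypeError may differ between Python and pandas versions.

```python
import datetime
import pandas as pd

# Hypothetical sample data; the column name "when" is an assumption.
df = pd.DataFrame({"when": pd.to_datetime(["2004-01-02", "2012-03-04", "2020-05-06"])})

# Each .dt accessor returns a whole Series, not one integer.
years = df["when"].dt.year

try:
    # datetime.date() needs three plain integers, so this call fails.
    datetime.date(years, df["when"].dt.month, df["when"].dt.day)
    raised = False
except TypeError:
    raised = True

# Taking one element of the column first gives scalar fields, which work.
first = df["when"].iloc[0]
single = datetime.date(first.year, first.month, first.day)
```

The call fails for the column as a whole but succeeds for a single element, which is why the fix is to call the function once per row.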

The error happens because you are passing a whole Series to datetime.date(), which expects a single integer for each argument. Instead of calling datetime.date() on the full column at once, call it for each row of the DataFrame independently. You can do this with the DataFrame's apply() method, passing a function that builds a datetime.date from each row:

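import datetime
import chinese_calendar as cal  # the PyPI package "chinesecalendar" is imported as chinese_calendar

def is_holiday_wrap(row):
    # row['记账日期'] is a single Timestamp here, so its fields are plain integers
    date = datetime.date(row['记账日期'].year, row['记账日期'].month, row['记账日期'].day)
    return cal.is_holiday(date)

# axis=1 calls the function once per row
init_time_data['is_holiday'] = init_time_data.apply(is_holiday_wrap, axis=1)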

I think this will solve your problem.
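
When only the date column is needed, the same per-row idea can also be written on the column itself. The sketch below assumes a hypothetical frame df with a column named when, and uses a stand-in function is_weekend in place of cal.is_holiday so that it runs without the library installed. None of these names come from the question.

```python
import datetime
import pandas as pd

def is_weekend(day: datetime.date) -> bool:
    """Hypothetical stand-in for cal.is_holiday: True on Saturday or Sunday."""
    return day.weekday() >= 5

# Hypothetical sample data; the column name "when" is an assumption.
df = pd.DataFrame({"when": pd.to_datetime(["2004-01-02", "2012-03-04", "2020-05-06"])})

# .dt.date yields one datetime.date per row; map() calls the function on each value.
df["is_weekend"] = df["when"].dt.date.map(is_weekend)
```

Swapping is_weekend for cal.is_holiday should give the holiday flag instead, assuming the library accepts datetime.date values as the question states.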

Answered By: Maneesha Indrachapa

I’m not familiar with this package, but it seems that is_holiday returns True/False for a given date or datetime.

Since your column is already in datetime format, there’s no need to convert it to int – you can just apply the function:

import pandas as pd
from chinese_calendar import is_holiday

df = pd.DataFrame({
    'id': ['a', 'b', 'c'],
    'datetime_col': [
        pd.to_datetime('2004-01-02'),
        pd.to_datetime('2012-03-04'),
        pd.to_datetime('2020-05-06'),
    ],
})

df['is_holiday'] = df['datetime_col'].apply(is_holiday)

Output:

    id  datetime_col    is_holiday
0   a   2004-01-02      False
1   b   2012-03-04      True
2   c   2020-05-06      False
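
A likely reason no conversion is needed: each element of a datetime column is a pandas Timestamp, which is a subclass of datetime.datetime. The short sketch below checks this and also shows Timestamp.date(), which returns a plain datetime.date for code that requires exactly that type. The variable names are illustrative.

```python
import datetime
import pandas as pd

ts = pd.Timestamp("2012-03-04")

# Timestamp inherits from datetime.datetime, so datetime-aware code can accept it.
is_datetime = isinstance(ts, datetime.datetime)

# .date() drops the time part and returns a plain datetime.date.
plain_date = ts.date()
```
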
Answered By: sharmu1