Getting TypeError when trying to create a month-year column from date ranges in pandas

Question:

I'm trying to follow the solution provided in "Find all months between two date columns and generate row for each month", but I'm hitting a wall because I keep getting an error. What I want is a Year-Month column with one entry for each year-month that falls within the StartDate to EndDate range of each row. When I follow the linked answer, I get this error:

TypeError: Cannot convert input … Name: ServiceStartDate, dtype: datetime64[ns]] of type <class 'pandas.core.series.Series'> to Timestamp

I have no idea how to fix this. Please help!

Sample Data

       ID   StartDate     EndDate
1  311566  2021-10-01  2024-09-30
2  235216  2020-11-01  2020-11-30
3  157054  2021-10-01  2023-09-30
4  159954  2021-01-01  2023-12-31
5  255815  2019-11-01  2022-10-31
Asked By: DataStraine


Answers:

I found a solution to my problem (sorry for the long delay in responding). The problem was that my dates had a time stamp attached to them. I needed to convert the date field to %Y-%m-01 format (the first day of each month) using the following code.

df['date'] = df['date'].apply(lambda x: x.strftime('%Y-%m-01'))
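Note that strftime turns the column into strings. As a minimal sketch of an alternative that keeps a datetime64 dtype, assuming the column is already datetime (the two sample values here are made up for illustration), the dates can be snapped to the first of the month through a monthly period:

```python
import pandas as pd

# Hypothetical sample values with a time component attached
df = pd.DataFrame({
    "date": pd.to_datetime(["2021-10-15 13:45:00", "2020-11-03 08:00:00"])
})

# Convert to a monthly period, then back to a timestamp at the month start
df["date"] = df["date"].dt.to_period("M").dt.to_timestamp()
```

The result is still a datetime column, so min(), max() and pd.date_range keep working on real dates rather than on strings.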

Then I used the solution below to get all the months/years that exist between the min and max dates as a single column.

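# Every row gets the same range: month starts between the overall min and max date
months = df.apply(lambda s: pd.date_range(df['date'].min(), df['date'].max(), freq='MS'), axis=1)
# explode() takes no column name on a Series; it yields one row per month
df.merge(months.explode().rename('Month'), left_index=True, right_index=True)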
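For the per-row version asked about in the question, here is a rough sketch using the sample data, assuming the columns are named StartDate and EndDate as shown there. The TypeError most likely comes from passing whole columns to pd.date_range, which expects one scalar start and one scalar end, so this sketch builds the range row by row instead. The YearMonth column name is my own choice.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "ID": [311566, 235216, 157054, 159954, 255815],
    "StartDate": pd.to_datetime(
        ["2021-10-01", "2020-11-01", "2021-10-01", "2021-01-01", "2019-11-01"]),
    "EndDate": pd.to_datetime(
        ["2024-09-30", "2020-11-30", "2023-09-30", "2023-12-31", "2022-10-31"]),
})

# Build one list of 'YYYY-MM' strings per row from that row's own start and end
df["YearMonth"] = [
    pd.date_range(start, end, freq="MS").strftime("%Y-%m").tolist()
    for start, end in zip(df["StartDate"], df["EndDate"])
]

# One output row per (ID, year-month) pair
out = df.explode("YearMonth", ignore_index=True)
```

With freq="MS" the range only contains month-start dates, so a StartDate that falls mid-month would lose its first month. In that case, snap StartDate to the first of the month before building the range.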
Answered By: DataStraine