TypeError: '<' not supported between instances of 'int' and 'datetime.datetime'
Question:
UPDATED QUESTION:
I am trying to filter data by year. Initially I wanted to use the "YEAR" column, which is an integer column ranging from 1946 to 2021, but my data also has a "REP_DATE" column, which is a date column ranging from 1946-06-06 to 2021-11-04. I am only interested in data points from 1980-01-01 to today, so I could either convert the integer column to dates or just work with the date column. Either solution would be helpful.
import pandas as pd
import datetime as dt
data.set_index('REP_DATE', inplace = True)
start = dt.datetime(1980, 1, 1)
end = dt.datetime.today()
data.loc[start:end]
print(data.loc[start:end])
I got:
KeyError: datetime.datetime(1980, 1, 1, 0, 0)
Answers:
As I understand it, you need to convert the 'YEAR' column to datetime before setting it as the index.
Please try the code snippet below, and accept this answer if it resolves the problem or let me know if you run into any issues.
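import pandas as pd
# YEAR holds integers such as 1980; cast to str and pass format='%Y' so they
# are parsed as calendar years rather than nanosecond offsets from 1970
data['YEAR'] = pd.to_datetime(data['YEAR'].astype(str), format='%Y')
data.set_index('YEAR', inplace=True)
data.sort_index(inplace=True)  # label slicing is only reliable on a sorted index
start = pd.Timestamp("1980-01-01")
end = pd.Timestamp.today()  # pd.to_datetime has no .today() method
print(data.loc[start:end])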
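One detail worth checking when converting integer years: without a format, pd.to_datetime reads integers as nanoseconds since 1970-01-01, so every year ends up in 1970. A minimal sketch of the difference, using made-up sample values in place of the real YEAR column:

```python
import pandas as pd

# Hypothetical sample values standing in for the integer YEAR column
years = pd.Series([1946, 1980, 2021])

# Without a format, integers are read as nanoseconds since 1970-01-01
as_offsets = pd.to_datetime(years)

# Casting to str and giving format='%Y' parses them as calendar years
as_years = pd.to_datetime(years.astype(str), format='%Y')

print(as_offsets.dt.year.tolist())  # every value lands in 1970
print(as_years.dt.year.tolist())    # the original years are preserved
```

Each converted value is January 1 of its year, so a slice starting at 1980-01-01 includes all of 1980.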
The KeyError you are getting most likely means the index is not a sorted DatetimeIndex. If REP_DATE holds strings or datetime.date objects, or the dates are out of order, .loc[start:end] looks up the exact label 1980-01-01 and fails when no row has that label. Convert REP_DATE to datetime first, then use the .loc accessor with a boolean condition on the index, which works regardless of row order.
import pandas as pd
import datetime as dt
# assuming your DataFrame is named "data"
data['REP_DATE'] = pd.to_datetime(data['REP_DATE']) # convert REP_DATE column to datetime type
data.set_index('REP_DATE', inplace=True)
start = dt.datetime(1980, 1, 1)
end = dt.datetime.today()
# boolean condition to filter by date range
mask = (data.index >= start) & (data.index <= end)
# filter the DataFrame
filtered_data = data.loc[mask]
print(filtered_data)
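Since the question allows either column, here is a small sketch of two other routes: filtering on the integer YEAR column directly, and label slicing after sorting the index. The sample frame is made up for illustration and only assumes the column names YEAR and REP_DATE from the question.

```python
import pandas as pd

# Hypothetical sample frame; the real "data" has more rows and columns
data = pd.DataFrame({
    "YEAR": [1990, 1946, 2021, 1979],
    "REP_DATE": ["1990-05-01", "1946-06-06", "2021-11-04", "1979-12-31"],
})

# Option 1: filter on the integer YEAR column, no date conversion needed
by_year = data[data["YEAR"] >= 1980]

# Option 2: convert REP_DATE, sort the index, then slice by label
indexed = data.assign(REP_DATE=pd.to_datetime(data["REP_DATE"]))
indexed = indexed.set_index("REP_DATE").sort_index()
by_date = indexed.loc[pd.Timestamp("1980-01-01"):pd.Timestamp.today()]

print(by_year)
print(by_date)
```

Option 1 is the simplest if whole years are all you need. Option 2 keeps a date index for later resampling or plotting, and sorting first is what lets the slice work even when 1980-01-01 itself is not in the data.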