Pandas DataFrame query between two dates using dictionary values
Question:
I have a Pandas DataFrame with two columns: ID and Dates.
Additionally, I have a dictionary with a start date and an end date, and I want to extract the rows of the DataFrame whose dates fall between the dictionary values.
Below is my data.
- Dataframe (df)
ID | Dates |
---|---|
1 | 20010302 |
2 | 20091212 |
3 | 20020924 |
- Dictionary
date_dictionary = {'start_date': '20020101', 'end_date': '20101231'}
If I use,
extracted = df.query('20020101<=Dates<=20101231')
then it works.
But I wonder why the following code gives me an error: UndefinedVariableError: name 'start_date' is not defined.
start_date=int(date_dictionary['start_date'])
end_date=int(date_dictionary['end_date'])
extracted = df.query('start_date<=Dates<=end_date')
I imported date_dictionary using from date_dictionary import *.
Could someone explain why it doesn't work? And is there a way to use the dictionary's values inside the query statement?
Answers:
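query treats bare names in the expression as column (or index) names, so start_date is looked up in the DataFrame rather than in your Python scope. You can refer to variables from the surrounding scope by prefixing them with an '@':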
extracted = df.query("@start_date<=Dates<=@end_date")
   ID     Dates
1   2  20091212
2   3  20020924
Have a look at the documentation of DataFrame.query.
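For reference, here is a minimal end-to-end sketch of the approach above. It assumes the Dates column holds integers in YYYYMMDD form, as the sample data suggests. The second option, Series.between, is not from the answer above; it is an alternative that avoids the query string altogether and is inclusive on both ends by default.

```python
import pandas as pd

# Sample data from the question; Dates assumed to be integers (YYYYMMDD)
df = pd.DataFrame({"ID": [1, 2, 3], "Dates": [20010302, 20091212, 20020924]})
date_dictionary = {"start_date": "20020101", "end_date": "20101231"}

# Convert the dictionary's string values to integers so they compare with the column
start_date = int(date_dictionary["start_date"])
end_date = int(date_dictionary["end_date"])

# Option 1: reference local variables inside the query string with '@'
extracted = df.query("@start_date <= Dates <= @end_date")

# Option 2 (alternative, no query string): boolean mask via Series.between
extracted_between = df[df["Dates"].between(start_date, end_date)]

print(extracted)
```

Both options select the rows with ID 2 and 3 for this data.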