SQLAlchemy – Querying with DateTime columns to filter by month/day/year

Question:

I’m building a Flask website that involves keeping track of payments, and I’ve run into an issue where I can’t really seem to filter one of my db models by date.

For instance, if this is what my table looks like:

payment_to, amount, due_date (a DateTime object)

company A, 3000, 7-20-2018
company B, 3000, 7-21-2018
company C, 3000, 8-20-2018

and I want to filter it so that I get all rows that are after July 20th, or all rows that are in August, etc.

I can think of a crude, brute-force way to filter all payments and THEN iterate through the list to filter by month/year, but I’d rather stay away from those methods.

This is my payment db model:

class Payment(db.Model, UserMixin):
    id = db.Column(db.Integer, unique = True, primary_key = True)

    payment_to = db.Column(db.String, nullable = False)
    amount = db.Column(db.Float, nullable = False)

    due_date = db.Column(db.DateTime, nullable = False, default = datetime.strftime(datetime.today(), "%b %d %Y"))
    week_of = db.Column(db.String, nullable = False)

And this is me attempting to filter Payment by date:

Payment.query.filter(Payment.due_date.month == today.month, Payment.due_date.year == today.year, Payment.due_date.day >= today.day).all()

where today is simply datetime.today().

I assumed the due_date column would have all DateTime attributes when I call it (e.g. .month), but it seems I was wrong.

What is the best way to filter the columns of Payment by date? Thank you for your help.

Asked By: LeetCoder


Answers:

SQLAlchemy effectively translates your query expressed in Python into SQL. But it does that at a relatively superficial level, based on the data type that you assign to the Column when defining your model.

This means that it won’t necessarily replicate Python’s datetime.datetime API on its DateTime construct – after all, those two classes are meant to do very different things! (datetime.datetime provides datetime functionality to Python, while SQLAlchemy’s DateTime tells its SQL-translation logic that it’s dealing with a SQL DATETIME or TIMESTAMP column).
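
To make that distinction concrete, here is a minimal sketch using plain SQLAlchemy (1.4 or later is assumed) rather than Flask-SQLAlchemy. The trimmed-down Payment class is a hypothetical stand-in for the model in the question:

```python
from datetime import datetime

from sqlalchemy import Column, DateTime, Integer
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class Payment(Base):
    # Hypothetical stand-in for the Flask-SQLAlchemy model in the question.
    __tablename__ = "payment"
    id = Column(Integer, primary_key=True)
    due_date = Column(DateTime, nullable=False)


# At class level, due_date is a SQL column expression, not a datetime,
# so it has no .month / .day / .year attributes.
has_month = hasattr(Payment.due_date, "month")

# Comparing the column to a datetime builds a SQL expression object.
expression = Payment.due_date >= datetime(2018, 7, 20)
sql_text = str(expression)

# On an instance, the same attribute holds a real datetime.datetime.
payment = Payment(due_date=datetime(2018, 7, 20))
instance_month = payment.due_date.month

print(has_month)       # False
print(sql_text)        # a SQL fragment comparing payment.due_date to a bound parameter
print(instance_month)  # 7
```

So Payment.due_date.month fails in a filter because the class-level attribute describes a column, while payment.due_date.month works on a loaded row.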

But don’t worry! There are quite a few different ways for you to achieve what you’re trying to do, and some of them are super easy. The three easiest I think are:

  1. Construct your filter using a complete datetime instance, rather than its component pieces (day, month, year).
  2. Use SQLAlchemy’s extract construct in your filter.
  3. Define three hybrid properties in your model that return the payment month, day, and year which you can then filter against.

Filtering on a datetime Object

This is the simplest of the three (easy) ways to achieve what you’re trying, and it should also perform the fastest. Basically, instead of trying to filter on each component (day, month, year) separately in your query, just use a single datetime value.

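The following is close to what you’re trying to do in your query above. Note that it returns every payment due today or later, without your query’s restriction to the current month and year:

from datetime import datetime

# Midnight at the start of today, built from a single call to today()
todays_datetime = datetime.today().replace(hour=0, minute=0, second=0, microsecond=0)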

payments = Payment.query.filter(Payment.due_date >= todays_datetime).all()

Now, payments should be all payments whose due date falls at or after the start (time 00:00:00) of your system’s current date.

If you want something more complicated, such as payments whose due date is no more than 30 days in the past, you can do that with the following code:

from datetime import datetime, timedelta

filter_after = datetime.today() - timedelta(days = 30)

payments = Payment.query.filter(Payment.due_date >= filter_after).all()

You can combine multiple filter targets using and_ and or_. For example, to return payments that were due within the last 30 days AND were due more than 15 days ago, you can use:

from datetime import datetime, timedelta
from sqlalchemy import and_

thirty_days_ago = datetime.today() - timedelta(days = 30)
fifteen_days_ago = datetime.today() - timedelta(days = 15)

# Using and_ IMPLICITLY:
payments = Payment.query.filter(Payment.due_date >= thirty_days_ago,
                                Payment.due_date <= fifteen_days_ago).all()

# Using and_ explicitly:
payments = Payment.query.filter(and_(Payment.due_date >= thirty_days_ago,
                                     Payment.due_date <= fifteen_days_ago)).all()
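
The examples above only show and_. As a rough sketch of the or_ counterpart, the following builds a condition for payments that were due more than 30 days ago OR are not yet due, and prints the SQL it would render. It uses plain SQLAlchemy (1.4 or later assumed) and a hypothetical cut-down Payment class so that it runs on its own:

```python
from datetime import datetime, timedelta

from sqlalchemy import Column, DateTime, Integer, or_
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class Payment(Base):
    # Hypothetical stand-in for the model in the question.
    __tablename__ = "payment"
    id = Column(Integer, primary_key=True)
    due_date = Column(DateTime, nullable=False)


now = datetime.today()
thirty_days_ago = now - timedelta(days=30)

# Due more than 30 days ago OR not yet due.
condition = or_(Payment.due_date < thirty_days_ago,
                Payment.due_date > now)

# Rendering the condition shows the OR that would be sent to the database.
condition_sql = str(condition)
print(condition_sql)

# The same kind of or_(...) condition can be passed to Payment.query.filter(...).
```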

The trick here – from your perspective – is to construct your filter target datetime instances correctly before executing your query.
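
For the “all rows in August” case from the question, one way to build those targets is a half-open range from the first day of the month to the first day of the next. The sketch below uses only the standard library; month_bounds is a hypothetical helper name, not part of SQLAlchemy:

```python
from datetime import datetime


def month_bounds(year, month):
    """Return (start, end) datetimes covering one calendar month.

    The range is half-open: start <= due_date < end.
    month_bounds is a hypothetical helper, not part of SQLAlchemy.
    """
    start = datetime(year, month, 1)
    # Roll over to January of the next year after December.
    if month == 12:
        end = datetime(year + 1, 1, 1)
    else:
        end = datetime(year, month + 1, 1)
    return start, end


start, end = month_bounds(2018, 8)
print(start, end)  # 2018-08-01 00:00:00 2018-09-01 00:00:00

# With the Payment model from the question, the query would then be:
# payments = Payment.query.filter(Payment.due_date >= start,
#                                 Payment.due_date < end).all()
```

Using < end rather than <= the last day of the month avoids missing payments whose due_date has a time later than 00:00:00 on the final day.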

Using the extract Construct

SQLAlchemy’s extract expression (documented in the SQLAlchemy reference) is used to emit a SQL EXTRACT expression, which is how in SQL you can extract a month, day, or year from a DATETIME/TIMESTAMP value.

Using this approach, SQLAlchemy tells your SQL database “first, pull the month, day, and year out of my DATETIME column and then filter on that extracted value”. Be aware that this approach will usually be slower than filtering on a datetime value as described above, because the database has to compute the extracted value for each row and typically cannot use an ordinary index on the column. Here’s how this works, mirroring the query in your question:

from datetime import datetime
from sqlalchemy import extract

today = datetime.today()

# Same month and year as today, due today or later in the month
payments = Payment.query.filter(extract('month', Payment.due_date) == today.month,
                                extract('year', Payment.due_date) == today.year,
                                extract('day', Payment.due_date) >= today.day).all()
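
As an end-to-end illustration, here is a small sketch that runs extract against the sample rows from the question in an in-memory SQLite database. It assumes plain SQLAlchemy 1.4 or later instead of Flask-SQLAlchemy, and the cut-down Payment class is a stand-in for the real model:

```python
from datetime import datetime

from sqlalchemy import Column, DateTime, Integer, String, create_engine, extract
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Payment(Base):
    # Hypothetical stand-in for the model in the question.
    __tablename__ = "payment"
    id = Column(Integer, primary_key=True)
    payment_to = Column(String, nullable=False)
    due_date = Column(DateTime, nullable=False)


engine = create_engine("sqlite://")  # in-memory database
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        Payment(payment_to="company A", due_date=datetime(2018, 7, 20)),
        Payment(payment_to="company B", due_date=datetime(2018, 7, 21)),
        Payment(payment_to="company C", due_date=datetime(2018, 8, 20)),
    ])
    session.commit()

    # All payments due in August 2018.
    august = (
        session.query(Payment)
        .filter(extract("year", Payment.due_date) == 2018,
                extract("month", Payment.due_date) == 8)
        .order_by(Payment.id)
        .all()
    )
    august_names = [p.payment_to for p in august]

    # Payments due in July 2018 on or after the 21st.
    late_july = (
        session.query(Payment)
        .filter(extract("year", Payment.due_date) == 2018,
                extract("month", Payment.due_date) == 7,
                extract("day", Payment.due_date) >= 21)
        .order_by(Payment.id)
        .all()
    )
    late_july_names = [p.payment_to for p in late_july]

print(august_names)     # ['company C']
print(late_july_names)  # ['company B']
```

Note that the month and year are compared with ==, and only the day uses >=. Using >= on all three parts would wrongly exclude, for example, an early-January date in the following year.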

Using Hybrid Attributes

SQLAlchemy Hybrid Attributes are wonderful things. They allow you to transparently apply Python functionality without modifying your database. I suspect for this specific use case they might be overkill, but they are a third way to achieve what you want.

Basically, you can think of hybrid attributes as “virtual columns” that don’t actually exist in your database, but which SQLAlchemy can calculate on-the-fly from your database columns when it needs to.

In your specific question, we would define three hybrid properties: due_date_day, due_date_month, due_date_year in your Payment model. Here’s how that would work:

# ... your existing import statements

from sqlalchemy import extract
from sqlalchemy.ext.hybrid import hybrid_property

class Payment(db.Model, UserMixin):
    id = db.Column(db.Integer, unique = True, primary_key = True)

    payment_to = db.Column(db.String, nullable = False)
    amount = db.Column(db.Float, nullable = False)

    # Pass a callable so the default is a datetime evaluated at insert time, not a string
    due_date = db.Column(db.DateTime, nullable = False, default = datetime.today)
    week_of = db.Column(db.String, nullable = False)

    @hybrid_property
    def due_date_year(self):
        return self.due_date.year

    @due_date_year.expression
    def due_date_year(cls):
        return extract('year', cls.due_date)

    @hybrid_property
    def due_date_month(self):
        return self.due_date.month

    @due_date_month.expression
    def due_date_month(cls):
        return extract('month', cls.due_date)

    @hybrid_property
    def due_date_day(self):
        return self.due_date.day

    @due_date_day.expression
    def due_date_day(cls):
        return extract('day', cls.due_date)

today = datetime.today()

# Same month and year as today, due today or later in the month
payments = Payment.query.filter(Payment.due_date_year == today.year,
                                Payment.due_date_month == today.month,
                                Payment.due_date_day >= today.day).all()

Here’s what the above is doing:

  1. You’re defining your Payment model as you already do.
  2. But then you’re adding some read-only instance attributes called due_date_year, due_date_month, and due_date_day. Using due_date_year as an example, this is an instance attribute which operates on instances of your Payment class. This means that when you execute one_of_my_payments.due_date_year the property will extract the due_date value from the Python instance. Because this is all happening within Python (i.e. not touching your database) it will operate on the already-translated datetime.datetime object that SQLAlchemy has stored in your instance. And it will return back the result of due_date.year.
  3. Then you’re adding a class attribute. This is the bit that is decorated with @due_date_year.expression. This decorator tells SQLAlchemy that when it is translating references to due_date_year into SQL expressions, it should do so as defined in this method. So the example above tells SQLAlchemy “if you need to use due_date_year in a SQL expression, then extract('year', Payment.due_date) is how due_date_year should be expressed”.

(Note: The example above assumes due_date_year, due_date_month, and due_date_day are all read-only properties. You can of course also define custom setters using @due_date_year.setter, which decorates a method that accepts the arguments (self, value).)
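
To see both halves of a hybrid property at work, here is a compact, self-contained sketch with a single due_date_month property. It assumes plain SQLAlchemy 1.4 or later with an in-memory SQLite database, and the reduced Payment class is a hypothetical stand-in for the full model:

```python
from datetime import datetime

from sqlalchemy import Column, DateTime, Integer, create_engine, extract
from sqlalchemy.ext.hybrid import hybrid_property
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Payment(Base):
    # Hypothetical stand-in for the model in the question.
    __tablename__ = "payment"
    id = Column(Integer, primary_key=True)
    due_date = Column(DateTime, nullable=False)

    @hybrid_property
    def due_date_month(self):
        # Instance level: plain Python on a datetime.datetime value.
        return self.due_date.month

    @due_date_month.expression
    def due_date_month(cls):
        # Class level: rendered as SQL when used in a query.
        return extract("month", cls.due_date)


engine = create_engine("sqlite://")  # in-memory database
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        Payment(due_date=datetime(2018, 7, 20)),
        Payment(due_date=datetime(2018, 8, 20)),
    ])
    session.commit()

    # Class-level access: the filter runs in the database via EXTRACT.
    august = session.query(Payment).filter(Payment.due_date_month == 8).all()

    # Instance-level access: evaluated in Python on each loaded row.
    august_months = [p.due_date_month for p in august]

print(august_months)  # [8]
```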

In Conclusion

Of these three approaches, I think the first (filtering on a datetime value) is the easiest to understand, the easiest to implement, and the fastest to run. It’s probably the best way to go. But the principles behind all three approaches are very important, and I think they will help you get the most value out of SQLAlchemy. I hope this proves helpful!

Answered By: Chris Modzelewski