Django Query That Gets the Most Recent Objects From Different Categories

Question:

I have two models, A and B. All B objects have a foreign key to an A object. Given a set of A objects, is there any way to use the ORM to get a set of B objects containing the most recently created object for each A object?

Here’s a simplified example:

class Bakery(models.Model):
    town = models.CharField(max_length=255)

class Cake(models.Model):
    bakery = models.ForeignKey(Bakery, on_delete=models.CASCADE)
    baked_at = models.DateTimeField()

So I’m looking for a query that returns the most recent cake baked in each bakery in Anytown, USA.

Asked By: Zach


Answers:

This should do the job:

from django.db.models import Max
# Each Bakery gets a cake__baked_at__max attribute holding its latest bake time
Bakery.objects.annotate(Max('cake__baked_at'))
Answered By: Daniel Roseman

As far as I know, there is no one-step way of doing this in Django ORM, but you can split it into two queries:

from django.db.models import Max

bakeries = Bakery.objects.annotate(
    hottest_cake_baked_at=Max('cake__baked_at')
)
hottest_cakes = Cake.objects.filter(
    baked_at__in=[b.hottest_cake_baked_at for b in bakeries]
)

If cake ids increase along with baked_at timestamps, you can simplify and disambiguate the above code (with the version above, if two cakes are baked at the same time you can get both of them):

from django.db.models import Max

hottest_cake_ids = Bakery.objects.annotate(
    hottest_cake_id=Max('cake__id')
).values_list('hottest_cake_id', flat=True)

hottest_cakes = Cake.objects.filter(id__in=hottest_cake_ids)
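To illustrate why the id-based variant yields exactly one cake per bakery even when timestamps tie, here is a minimal sketch in plain SQL run through Python's sqlite3 module. The table layout and sample rows are assumptions made for the demonstration, and the SQL is handwritten rather than what Django would actually emit.

```python
import sqlite3

# Assumed schema and sample data: cakes 2 and 3 in bakery 1 share a timestamp.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cake (id INTEGER PRIMARY KEY, bakery_id INTEGER, baked_at TEXT);
    INSERT INTO cake VALUES
        (1, 1, '2020-01-01 08:00:00'),
        (2, 1, '2020-01-02 08:00:00'),
        (3, 1, '2020-01-02 08:00:00'),
        (4, 2, '2020-01-01 09:00:00');
""")

# One MAX(id) per bakery: a single row per group, regardless of tied timestamps.
hottest_cake_ids = [
    row[0]
    for row in conn.execute(
        "SELECT MAX(id) FROM cake GROUP BY bakery_id ORDER BY bakery_id"
    )
]
print(hottest_cake_ids)
```

This only picks the right cake if ids really do grow in baking order, which is the assumption stated above.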

BTW, credit for this goes to Daniel Roseman, who once answered a similar question of mine:

http://groups.google.pl/group/django-users/browse_thread/thread/3b3cd4cbad478d34/3e4c87f336696054?hl=pl&q=

If the above method is too slow, I also know a second method: you can write custom SQL that produces only those cakes that are the hottest in their bakeries, define it as a database VIEW, and then write an unmanaged Django model for it. It’s also mentioned in the above django-users thread. A direct link to the original concept is here:

http://web.archive.org/web/20130203180037/http://wolfram.kriesing.de/blog/index.php/2007/django-nice-and-critical-article#comment-48425
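As a rough sketch of the VIEW idea, the following uses Python's sqlite3 module with an assumed schema. The view name hottest_cake, the table names, and the sample rows are all made up for illustration; a real project would create the view in its own database, typically through a migration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bakery (id INTEGER PRIMARY KEY, town TEXT);
    CREATE TABLE cake (
        id INTEGER PRIMARY KEY,
        bakery_id INTEGER REFERENCES bakery(id),
        baked_at TEXT
    );
    INSERT INTO bakery VALUES (1, 'Anytown'), (2, 'Anytown');
    INSERT INTO cake VALUES
        (1, 1, '2020-01-01 08:00:00'),
        (2, 1, '2020-01-02 08:00:00'),
        (3, 2, '2020-01-01 09:00:00');

    -- Hypothetical view: keep only cakes with no newer cake in the same bakery
    CREATE VIEW hottest_cake AS
    SELECT c.id, c.bakery_id, c.baked_at
    FROM cake AS c
    WHERE NOT EXISTS (
        SELECT 1 FROM cake AS newer
        WHERE newer.bakery_id = c.bakery_id
          AND newer.baked_at > c.baked_at
    );
""")

# Querying the view returns one (id, bakery_id) pair per bakery here.
hottest = conn.execute(
    "SELECT id, bakery_id FROM hottest_cake ORDER BY bakery_id"
).fetchall()
print(hottest)
```

On the Django side, a model whose Meta sets managed = False and db_table = 'hottest_cake' could then be queried like any other model, while Django leaves creating and dropping the view to you.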

Hope this helps.

Answered By: Tomasz Zieliński

If you happen to be using PostgreSQL, you can use Django’s interface to DISTINCT ON:

recent_cakes = Cake.objects.order_by('bakery__id', '-baked_at').distinct('bakery__id')

As the docs say, the order_by() call must start with the same fields that you pass to distinct(). As Simon pointed out below, if you want to do additional sorting, you’ll have to do it in Python-space.
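The Python-space sorting could look roughly like this. CakeRow is a hypothetical stand-in for Cake instances so that the sketch runs without a database; with real data you would pass list(recent_cakes) to sorted() instead.

```python
from dataclasses import dataclass
from datetime import datetime
from operator import attrgetter


@dataclass
class CakeRow:
    """Hypothetical stand-in for a Cake instance fetched from the database."""
    bakery_id: int
    baked_at: datetime


# Pretend this is list(recent_cakes): one cake per bakery, ordered by bakery id.
recent_cakes = [
    CakeRow(bakery_id=1, baked_at=datetime(2020, 1, 2, 8, 0)),
    CakeRow(bakery_id=2, baked_at=datetime(2020, 1, 5, 9, 0)),
    CakeRow(bakery_id=3, baked_at=datetime(2020, 1, 3, 7, 30)),
]

# Re-sort in Python: newest cake first, independent of the DISTINCT ON ordering.
newest_first = sorted(recent_cakes, key=attrgetter("baked_at"), reverse=True)
print([cake.bakery_id for cake in newest_first])
```

Sorting in Python loads every row into memory, so this suits a modest number of bakeries better than a very large result set.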

Answered By: dbn

I was fighting with a similar problem and finally came to the following solution. It does not rely on order_by and distinct, so the results can be sorted as desired on the db side and it can also be used as a nested query for filtering. I also believe this implementation is db-engine independent, because it’s based on the standard SQL HAVING clause. The only drawback is that it will return multiple hottest cakes per bakery if they were baked in that bakery at exactly the same time.

from django.db.models import Max, F

Cake.objects.annotate(
    # annotate with MAX "baked_at" over all cakes in bakery
    # (the reverse lookup name is "cake", not "cake_set")
    latest_baketime_in_bakery=Max('bakery__cake__baked_at')
    # compare this cake "baked_at" with annotated latest in bakery
).filter(latest_baketime_in_bakery=F('baked_at'))
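To see the HAVING idea and the tie drawback in isolation, here is a small sketch in plain SQL via Python's sqlite3 module. The schema and rows are assumptions, and the query is a handwritten simplification (it joins cake to itself directly instead of going through the bakery table), not the SQL Django generates.

```python
import sqlite3

# Assumed sample data: cakes 2 and 3 were baked in bakery 1 at the same time.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cake (id INTEGER PRIMARY KEY, bakery_id INTEGER, baked_at TEXT);
    INSERT INTO cake VALUES
        (1, 1, '2020-01-01 08:00:00'),
        (2, 1, '2020-01-02 08:00:00'),
        (3, 1, '2020-01-02 08:00:00'),
        (4, 2, '2020-01-01 09:00:00');
""")

# For each cake, take MAX(baked_at) over all cakes of the same bakery and
# keep the cake only when its own timestamp equals that maximum.
rows = conn.execute("""
    SELECT c.id, c.bakery_id
    FROM cake AS c
    JOIN cake AS sibling ON sibling.bakery_id = c.bakery_id
    GROUP BY c.id, c.bakery_id, c.baked_at
    HAVING MAX(sibling.baked_at) = c.baked_at
    ORDER BY c.id
""").fetchall()
print(rows)
```

Bakery 1 contributes two rows because of the tie, which is the drawback mentioned above.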
Answered By: Ivan Klass

Starting from Django 1.11 and thanks to Subquery and OuterRef, we can finally build a latest-per-group query using the ORM.

from django.db.models import Max, OuterRef, Prefetch, Subquery

hottest_cakes = Cake.objects.filter(
    baked_at=Subquery(
        (Cake.objects
            .filter(bakery=OuterRef('bakery'))
            .values('bakery')
            .annotate(last_bake=Max('baked_at'))
            .values('last_bake')[:1]
        )
    )
)

# BONUS: we can now use this for prefetch_related()
bakeries = Bakery.objects.all().prefetch_related(
    Prefetch('cake_set',
        queryset=hottest_cakes,
        to_attr='hottest_cakes'
    )
)

# Usage
for bakery in bakeries:
    print('Bakery %s has %s hottest_cakes' % (bakery, len(bakery.hottest_cakes)))
Answered By: Todor

Cake.objects.filter(bakery__town="Anytown").order_by("-baked_at")[:1]

I haven’t built out the models on my end, but in theory this should work. Broken down:

  • Cake.objects.filter(bakery__town="Anytown") should return any cakes that belong to “Anytown”, assuming the country is not part of the string. The double underscores between bakery and town allow us to access the town field of bakery.
  • .order_by("-baked_at") will order the results by their baking date, most recent first (take note of the - (minus) sign in "-baked_at"). Without the minus sign, they’d be ordered from oldest to most recent.
  • [:1] on the end limits the result to the first item only (the most recent cake in Anytown overall, rather than one cake per bakery).

Note: This answer is for Django 1.11.
It is adapted from the queries shown in the Django 1.11 docs.

Answered By: twknab

@Tomasz Zieliński’s solution above solved your problem, but it did not solve mine, because I still needed to filter the cakes. So here is my solution:

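from django.db.models import Q, Max

# The filter argument on aggregates requires Django 2.0 or later.
# This assumes Cake also has color and shape fields.
hottest_yellow_round_cake = Max('cake__baked_at', filter=Q(cake__color='yellow', cake__shape='round'))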

bakeries = Bakery.objects.filter(town='Chicago').annotate(
    hottest_cake_baked_at=hottest_yellow_round_cake
)

hottest_cakes = Cake.objects.filter(
    baked_at__in=[b.hottest_cake_baked_at for b in bakeries]
)

With this approach, you can also implement other things such as filtering, ordering, and pagination for cakes.

Answered By: Nguyen Anh Vu