Django orm get latest for each group

Question:

I am using Django 1.6 with Mysql.

I have these models:

class Student(models.Model):
     username = models.CharField(max_length=200, unique = True)

class Score(models.Model):
     student = models.ForeignKey(Student)
     date = models.DateTimeField()
     score = models.IntegerField()

I want to get the latest score record for each student.
I have tried:

Score.objects.values('student').annotate(latest_date=Max('date'))

and:

Score.objects.values('student__username').annotate(latest_date=Max('date'))

as described Django ORM – Get the latest record for the group
but it did not help.

Asked By: yossi

||

Answers:

I believe this would give you the student and the data

Score.objects.values('student').annotate(latest_date=Max('date'))

If you want the full Score records, it seems you will have to use a raw SQL query: Filtering Django Query by the Record with the Maximum Column Value

Answered By: Paul Draper

If your DB is postgres which supports distinct() on field you can try

Score.objects.order_by('student__username', '-date').distinct('student__username')
Answered By: Rohan

This should work on Django 1.2+ and MySQL:

Score.objects.annotate(
  max_date=Max('student__score__date')
).filter(
  date=F('max_date')
)
Answered By: nitwit

Here’s an example using Greatest with a secondary annotate. I was facing and issue where annotate was returning duplicate records ( Examples ), but the last_message_time Greatest annotation was causing duplicates.

qs = (
            Example.objects.filter(
                Q(xyz=xyz)
            )
            .exclude(
                 Q(zzz=zzz)
            )
            # this annotation causes duplicate Examples in the qs
            # and distinct doesn't work, as expected
            # .distinct('id') 
            .annotate(
                last_message_time=Greatest(
                    "comments__created",
                    "files__owner_files__created",
                )
            )
            # so this second annotation selects the Max value of the various Greatest
            .annotate(
                last_message_time=Max(
                    "last_message_time"
                )
            )
            .order_by("-last_message_time")
    )

reference:

Answered By: jmunsch

Some great answers already, but none of them mentions Window functions.

The following example annotates all score objects with the latest score for the corresponding student:

from django.db.models import F, Window
from django.db.models.functions import FirstValue

scores = Score.objects.annotate(
    latest_score=Window(
        expression=FirstValue('score'),
        partition_by=['student'],
        order_by=F('date').desc(),
    )
)

This results in the following SQL (using Sqlite backend):

SELECT 
  "score"."id", 
  "score"."student_id", 
  "score"."date", 
  "score"."score", 
  FIRST_VALUE("score"."score") 
  OVER (PARTITION BY "score"."student_id" ORDER BY "score"."date" DESC) 
  AS "latest_score" 
FROM "score"

The required information is already there, but we can also reduce this queryset to a set of unique combinations of student_id and latest_score.

For example, on PostgreSQL we can use distinct with field names, as in scores.distinct('student').

On other db backends we can do something like set(scores.values_list('student_id', 'latest_score')), although this evaluates the queryset.

Unfortunately, at the time of writing, it is not yet possible to filter a windowed queryset.

EDIT: as of Django 4.2, windowed querysets can be filtered

Answered By: djvg