Django orm get latest for each group
Question:
I am using Django 1.6 with Mysql.
I have these models:
class Student(models.Model):
username = models.CharField(max_length=200, unique = True)
class Score(models.Model):
student = models.ForeignKey(Student)
date = models.DateTimeField()
score = models.IntegerField()
I want to get the latest score record for each student.
I have tried:
Score.objects.values('student').annotate(latest_date=Max('date'))
and:
Score.objects.values('student__username').annotate(latest_date=Max('date'))
as described Django ORM – Get the latest record for the group
but it did not help.
Answers:
I believe this would give you the student and the data
Score.objects.values('student').annotate(latest_date=Max('date'))
If you want the full Score
records, it seems you will have to use a raw SQL query: Filtering Django Query by the Record with the Maximum Column Value
If your DB is postgres which supports distinct()
on field you can try
Score.objects.order_by('student__username', '-date').distinct('student__username')
This should work on Django 1.2+ and MySQL:
Score.objects.annotate(
max_date=Max('student__score__date')
).filter(
date=F('max_date')
)
Here’s an example using Greatest
with a secondary annotate
. I was facing and issue where annotate was returning duplicate records ( Examples ), but the last_message_time
Greatest annotation was causing duplicates.
qs = (
Example.objects.filter(
Q(xyz=xyz)
)
.exclude(
Q(zzz=zzz)
)
# this annotation causes duplicate Examples in the qs
# and distinct doesn't work, as expected
# .distinct('id')
.annotate(
last_message_time=Greatest(
"comments__created",
"files__owner_files__created",
)
)
# so this second annotation selects the Max value of the various Greatest
.annotate(
last_message_time=Max(
"last_message_time"
)
)
.order_by("-last_message_time")
)
reference:
- https://docs.djangoproject.com/en/3.1/ref/models/database-functions/#greatest
from django.db.models import Max
Some great answers already, but none of them mentions Window functions.
The following example annotates all score objects with the latest score for the corresponding student:
from django.db.models import F, Window
from django.db.models.functions import FirstValue
scores = Score.objects.annotate(
latest_score=Window(
expression=FirstValue('score'),
partition_by=['student'],
order_by=F('date').desc(),
)
)
This results in the following SQL (using Sqlite backend):
SELECT
"score"."id",
"score"."student_id",
"score"."date",
"score"."score",
FIRST_VALUE("score"."score")
OVER (PARTITION BY "score"."student_id" ORDER BY "score"."date" DESC)
AS "latest_score"
FROM "score"
The required information is already there, but we can also reduce this queryset to a set of unique combinations of student_id
and latest_score
.
For example, on PostgreSQL we can use distinct with field names, as in scores.distinct('student')
.
On other db backends we can do something like set(scores.values_list('student_id', 'latest_score'))
, although this evaluates the queryset.
Unfortunately, at the time of writing, it is not yet possible to filter a windowed queryset.
I am using Django 1.6 with Mysql.
I have these models:
class Student(models.Model):
username = models.CharField(max_length=200, unique = True)
class Score(models.Model):
student = models.ForeignKey(Student)
date = models.DateTimeField()
score = models.IntegerField()
I want to get the latest score record for each student.
I have tried:
Score.objects.values('student').annotate(latest_date=Max('date'))
and:
Score.objects.values('student__username').annotate(latest_date=Max('date'))
as described Django ORM – Get the latest record for the group
but it did not help.
I believe this would give you the student and the data
Score.objects.values('student').annotate(latest_date=Max('date'))
If you want the full Score
records, it seems you will have to use a raw SQL query: Filtering Django Query by the Record with the Maximum Column Value
If your DB is postgres which supports distinct()
on field you can try
Score.objects.order_by('student__username', '-date').distinct('student__username')
This should work on Django 1.2+ and MySQL:
Score.objects.annotate(
max_date=Max('student__score__date')
).filter(
date=F('max_date')
)
Here’s an example using Greatest
with a secondary annotate
. I was facing and issue where annotate was returning duplicate records ( Examples ), but the last_message_time
Greatest annotation was causing duplicates.
qs = (
Example.objects.filter(
Q(xyz=xyz)
)
.exclude(
Q(zzz=zzz)
)
# this annotation causes duplicate Examples in the qs
# and distinct doesn't work, as expected
# .distinct('id')
.annotate(
last_message_time=Greatest(
"comments__created",
"files__owner_files__created",
)
)
# so this second annotation selects the Max value of the various Greatest
.annotate(
last_message_time=Max(
"last_message_time"
)
)
.order_by("-last_message_time")
)
reference:
- https://docs.djangoproject.com/en/3.1/ref/models/database-functions/#greatest
from django.db.models import Max
Some great answers already, but none of them mentions Window functions.
The following example annotates all score objects with the latest score for the corresponding student:
from django.db.models import F, Window
from django.db.models.functions import FirstValue
scores = Score.objects.annotate(
latest_score=Window(
expression=FirstValue('score'),
partition_by=['student'],
order_by=F('date').desc(),
)
)
This results in the following SQL (using Sqlite backend):
SELECT
"score"."id",
"score"."student_id",
"score"."date",
"score"."score",
FIRST_VALUE("score"."score")
OVER (PARTITION BY "score"."student_id" ORDER BY "score"."date" DESC)
AS "latest_score"
FROM "score"
The required information is already there, but we can also reduce this queryset to a set of unique combinations of student_id
and latest_score
.
For example, on PostgreSQL we can use distinct with field names, as in scores.distinct('student')
.
On other db backends we can do something like set(scores.values_list('student_id', 'latest_score'))
, although this evaluates the queryset.
Unfortunately, at the time of writing, it is not yet possible to filter a windowed queryset.