Django query annotation with boolean field
Question:
Let’s say I have a Product model with products in a storefront, and a ProductImage table with images of the product. A product can have zero or more images. Here’s a simplified example:
from django.db import models

class Product(models.Model):
    product_name = models.CharField(max_length=255)
    # ...

class ProductImage(models.Model):
    # on_delete is required from Django 2.0 onward
    product = models.ForeignKey(Product, related_name='images', on_delete=models.CASCADE)
    image_file = models.CharField(max_length=255)
    # ...
When displaying search results for products, I want to prioritize products which have images associated with them. I can easily get the number of images:
from django.db.models import Count
Product.objects.annotate(image_count=Count('images'))
But that’s not actually what I want. I’d like to annotate it with a boolean field, have_images, indicating whether the product has one or more images, so that I can sort by that:
Product.objects.annotate(have_images=(?????)).order_by('-have_images', 'product_name')
How can I do that? Thanks!
Answers:
Read the docs about extra
qs = Product.objects.extra(select={'has_images': 'CASE WHEN images IS NOT NULL THEN 1 ELSE 0 END' })
I tested it and it works. But ordering (order_by) or filtering (where) by this field does not work for me on Django 1.8. According to the docs:
If you need to order the resulting queryset using some of the new
fields or tables you have included via extra() use the order_by
parameter to extra() and pass in a sequence of strings. These strings
should either be model fields (as in the normal order_by() method on
querysets), of the form table_name.column_name or an alias for a
column that you specified in the select parameter to extra().
qs = qs.extra(order_by = ['-has_images'])
qs = qs.extra(where = ['has_images=1'])
FieldError: Cannot resolve keyword 'has_images' into field.
I found that https://code.djangoproject.com/ticket/19434 is still open.
So if you run into the same trouble, you can use raw.
If performance matters, my suggestion is to add a hasPictures boolean field (with editable=False).
Then keep its value correct through ProductImage model signals (or by overriding the save and delete methods).
Advantages:
- Index friendly.
- Better performance. Avoid joins.
- Database agnostic.
- Coding it will raise your Django skills to the next level.
I eventually found a way to do this using Django 1.8’s new conditional expressions:
from django.db.models import Case, Count, IntegerField, Value, When
q = (
    Product.objects
    .filter(...)
    .annotate(image_count=Count('images'))
    .annotate(
        have_images=Case(
            When(image_count__gt=0, then=Value(1)),
            default=Value(0),
            output_field=IntegerField()))
    .order_by('-have_images')
)
And that’s how I finally found incentive to upgrade to 1.8 from 1.7.
When you have to annotate existence with some filters, a Sum annotation can be used. For example, the following annotates whether there are any GIFs in images:
from django.db.models import Case, IntegerField, Sum, Value, When
Product.objects.filter(
).annotate(
    animated_images=Sum(
        Case(
            When(images__image_file__endswith='gif', then=Value(1)),
            default=Value(0),
            output_field=IntegerField()
        )
    )
)
This actually counts them, but any Pythonic check such as if product.animated_images: works the same as if it were a boolean.
Use conditional expressions and set output_field to BooleanField:
from django.db.models import BooleanField, Case, Count, Value, When
Product.objects.annotate(image_count=Count('images')).annotate(
    has_image=Case(
        When(image_count=0, then=Value(False)),
        default=Value(True),
        output_field=BooleanField(),
    )
).order_by('-has_image')
As of Django 1.11 it is possible to use Exists. The example below comes from the Exists documentation:
>>> from django.db.models import Exists, OuterRef
>>> from datetime import timedelta
>>> from django.utils import timezone
>>> one_day_ago = timezone.now() - timedelta(days=1)
>>> recent_comments = Comment.objects.filter(
...     post=OuterRef('pk'),
...     created_at__gte=one_day_ago,
... )
>>> Post.objects.annotate(recent_comment=Exists(recent_comments))