how to subquery in queryset in django?
Question:
how can i have a subquery in django’s queryset? for example if i have:
select name, age from person, employee where person.id = employee.id and
employee.id in (select id from employee where employee.company = 'Private')
this is what i have done yet.
Person.objects.value('name', 'age')
Employee.objects.filter(company='Private')
but it not working because it returns two output…
Answers:
ids = Employee.objects.filter(company='Private').values_list('id', flat=True)
Person.objects.filter(id__in=ids).values('name', 'age')
as mentioned by ypercube your use case doesn’t require subquery.
but anyway since many people land into this page to learn how to do sub-query here is how its done.
employee_query = Employee.objects.filter(company='Private').only('id').all()
Person.objects.value('name', 'age').filter(id__in=employee_query)
Source:
http://mattrobenolt.com/the-django-orm-and-subqueries/
You can create subqueries in Django by using an unevaluated queryset to filter your main queryset. In your case, it would look something like this:
employee_query = Employee.objects.filter(company='Private')
people = Person.objects.filter(employee__in=employee_query)
I’m assuming that you have a reverse relationship from Person
to Employee
named employee
. I found it helpful to look at the SQL query generated by a queryset when I was trying to understand how the filters work.
print people.query
As others have said, you don’t really need a subquery for your example. You could just join to the employee table:
people2 = Person.objects.filter(employee__company='Private')
The correct answer on your question is here https://docs.djangoproject.com/en/2.1/ref/models/expressions/#subquery-expressions
As an example:
>>> from django.db.models import OuterRef, Subquery
>>> newest = Comment.objects.filter(post=OuterRef('pk')).order_by('-created_at')
>>> Post.objects.annotate(newest_commenter_email=Subquery(newest.values('email')[:1]))
hero_qs = Hero.objects.filter(category=OuterRef("pk")).order_by("-benevolence_factor")
Category.objects.all().annotate(most_benevolent_hero=Subquery(hero_qs.values('name')[:1]))
the generated sql
SELECT "entities_category"."id",
"entities_category"."name",
(SELECT U0."name"
FROM "entities_hero" U0
WHERE U0."category_id" = ("entities_category"."id")
ORDER BY U0."benevolence_factor" DESC
LIMIT 1) AS "most_benevolent_hero"
FROM "entities_category"
For more details, see this article.
Take good care with only
if your subqueries don’t select the primary key.
Example:
class Customer:
pass
class Order:
customer: Customer
pass
class OrderItem:
order: Order
is_recalled: bool
- Customer has Orders
- Order has OrderItems
Now we are trying to find all customers with at least one recalled order-item.(1)
This will not work properly
order_ids = OrderItem.objects
.filter(is_recalled=True)
.only("order_id")
customer_ids = OrderItem.objects
.filter(id__in=order_ids)
.only('customer_id')
# BROKEN! BROKEN
customers = Customer.objects.filter(id__in=customer_ids)
The code above looks very fine, but it produces the following query:
select * from customer where id in (
select id -- should be customer_id
from orders
where id in (
select id -- should be order_id
from order_items
where is_recalled = true))
Instead one should use select
order_ids = OrderItem.objects
.filter(is_recalled=True)
.select("order_id")
customer_ids = OrderItem.objects
.filter(id__in=order_ids)
.select('customer_id')
customers = Customer.objects.filter(id__in=customer_ids)
(1) Note: in a real case we might consider ‘WHERE EXISTS’
how can i have a subquery in django’s queryset? for example if i have:
select name, age from person, employee where person.id = employee.id and
employee.id in (select id from employee where employee.company = 'Private')
this is what i have done yet.
Person.objects.value('name', 'age')
Employee.objects.filter(company='Private')
but it not working because it returns two output…
ids = Employee.objects.filter(company='Private').values_list('id', flat=True)
Person.objects.filter(id__in=ids).values('name', 'age')
as mentioned by ypercube your use case doesn’t require subquery.
but anyway since many people land into this page to learn how to do sub-query here is how its done.
employee_query = Employee.objects.filter(company='Private').only('id').all()
Person.objects.value('name', 'age').filter(id__in=employee_query)
Source:
http://mattrobenolt.com/the-django-orm-and-subqueries/
You can create subqueries in Django by using an unevaluated queryset to filter your main queryset. In your case, it would look something like this:
employee_query = Employee.objects.filter(company='Private')
people = Person.objects.filter(employee__in=employee_query)
I’m assuming that you have a reverse relationship from Person
to Employee
named employee
. I found it helpful to look at the SQL query generated by a queryset when I was trying to understand how the filters work.
print people.query
As others have said, you don’t really need a subquery for your example. You could just join to the employee table:
people2 = Person.objects.filter(employee__company='Private')
The correct answer on your question is here https://docs.djangoproject.com/en/2.1/ref/models/expressions/#subquery-expressions
As an example:
>>> from django.db.models import OuterRef, Subquery
>>> newest = Comment.objects.filter(post=OuterRef('pk')).order_by('-created_at')
>>> Post.objects.annotate(newest_commenter_email=Subquery(newest.values('email')[:1]))
hero_qs = Hero.objects.filter(category=OuterRef("pk")).order_by("-benevolence_factor")
Category.objects.all().annotate(most_benevolent_hero=Subquery(hero_qs.values('name')[:1]))
the generated sql
SELECT "entities_category"."id",
"entities_category"."name",
(SELECT U0."name"
FROM "entities_hero" U0
WHERE U0."category_id" = ("entities_category"."id")
ORDER BY U0."benevolence_factor" DESC
LIMIT 1) AS "most_benevolent_hero"
FROM "entities_category"
For more details, see this article.
Take good care with only
if your subqueries don’t select the primary key.
Example:
class Customer:
pass
class Order:
customer: Customer
pass
class OrderItem:
order: Order
is_recalled: bool
- Customer has Orders
- Order has OrderItems
Now we are trying to find all customers with at least one recalled order-item.(1)
This will not work properly
order_ids = OrderItem.objects
.filter(is_recalled=True)
.only("order_id")
customer_ids = OrderItem.objects
.filter(id__in=order_ids)
.only('customer_id')
# BROKEN! BROKEN
customers = Customer.objects.filter(id__in=customer_ids)
The code above looks very fine, but it produces the following query:
select * from customer where id in (
select id -- should be customer_id
from orders
where id in (
select id -- should be order_id
from order_items
where is_recalled = true))
Instead one should use select
order_ids = OrderItem.objects
.filter(is_recalled=True)
.select("order_id")
customer_ids = OrderItem.objects
.filter(id__in=order_ids)
.select('customer_id')
customers = Customer.objects.filter(id__in=customer_ids)
(1) Note: in a real case we might consider ‘WHERE EXISTS’