How to create a dictionary with two querysets from different models, using a common DateTimeField as the dictionary keys?

Question:

I would like to create a dictionary from two models with a field dt in common. This field should be the dictionary keys, and fields value and last the keys’s value. What is the most efficient way to do that ?

class Balance(models.Model):
    value = models.FloatField(default=0)
    dt = models.DateTimeField()

class Price(models.Model):
    last = models.FloatField(default=0)
    dt = models.DateTimeField()

The desired output would be something like this :

{
    "2022-10-11T00:00:00Z": {
        "value": 151.05,
        "last": 1,
    },
    "2022-10-10T00:00:00Z": {
        "value": 151.1,
        "last": 1.1,
    },
    "2022-10-09T00:00:00Z": {
        "value": 152,
        "last": 1.1,
    },
    "2022-10-08T00:00:00Z": {
        "value": 154,
        "last": 1.23,
    }
}

I could iterate through each dictionaries of the querysets with a nested loop and search the items with a common dt then populate key:value inside a new dictionary, but it’s not elegant and I don’t believe it’s efficient.

Asked By: Florent

||

Answers:

This will do but not so sure about efficiency.

import datetime
balances = Balance.objects.annotate(date_only=Cast('dt',DateField())).values("date_only", "value")
prices = Price.objects.annotate(date_only=Cast('dt',DateField())).values("date_only", "last")


data = {}
for (balance, price) in zip(balances, prices):
    str_date = balance['date_only'].strftime("%Y/%m/%d")
    if balance['date_only'] == price['date_only']:
        data[str_date] = {}
        data[str_date]['last'] = price['last']
        data[str_date]['value'] = balance['value']


print(data)
Answered By: Hemal Patel

This version should cope with the extra unbalanced Price rows described by OP in comments:

balances = (
    Balance
    .objects
    .annotate(date_only=Cast('dt', DateField()))
    .order_by("dt")
    .values("date_only", "value")
)
balance_dates = {balance.dt for balance in balances}
prices = (
    Price
    .objects
    .annotate(date_only=Cast('dt', DateField()))
    .filter(dt__in=balance_dates)
    .order_by("dt")
    .values("date_only", "last")
)

data = {
    balance['date_only'].strftime("%Y/%m/%d"): {
        "last": price["last"],
        "value": balance["value"],
    }
    for (balance, price) in zip(balances, prices)
}

Efficiency on the Python side is about the same as Hemal Patel’s answer (which this is based on)

But the db query will benefit from an index on dt column and could be slow if the tables are very large.

In that case you are probably better off using raw SQL to output the dataset that you want.

Answered By: Anentropic
Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.