Best away to distinct a list of objects by two fields equals with sqlalchemy

Question:

I have some duplicated records in my database with different Ids, like:

id      | title         | date              |flag |  
--------+---------------+-------------------+-----+
    2099|Test           |2022-11-29 00:00:00| 1   |
    2100|Test           |2022-11-29 00:00:00| 1   |
    2101|Test           |2022-11-29 00:00:00| 1   |
    2102|Test           |2022-11-29 00:00:00| 1   |

and this is the code to get all records from my database:

event_get = Event.query.filter(Event.flag == 1)
result = events_schema.dump(event_get)
data = dict(event=result)

I wanna distinct the results by title and date, so as the example with four records, I would have all of them with the same title and date, returning only one record

I did not find a way of do this with sqlalchemy, is It possible or I would need to use a alternative code in the returned list?

Asked By: Possenti

||

Answers:

Pass the columns to distinct() to generate distinct on like this:

events = session.query(Event).distinct(Event.title, Event.date).filter(Event.flag == 1).all()

This uses a subquery that finds the smallest id for each group of title/date. The main query then only returns events with ids in that subquery. I guess check and see if this works in mysql. My small test seems to work.

from sqlalchemy.sql import select, func
subq = select(func.min(Event.id)).group_by(Event.title, Event.date)
events = session.query(Event).where(Event.id.in_(subq)).all()

Also if you just need those exact values you can probably just use regular DISTINCT like this:

for (title, date) in session.query(Event.title, Event.date).distinct().all():
    print (title, date)
Answered By: Ian Wilson
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.