How to GROUP BY a computed boolean expression in SQLite?

Question:

I have an SQLite DB of data from a month-long festival that looks in part like this:

CREATE TABLE IF NOT EXISTS performance_status (
    id INTEGER PRIMARY KEY,
    name TEXT UNIQUE       -- available, soldout, past, etc
)
CREATE TABLE IF NOT EXISTS performances (
    id INTEGER PRIMARY KEY,
    date INTEGER,           -- The date of the show
    sold_out_date INTEGER,  -- The date on which the showing sold out

    show_id INTEGER,
    status_id INTEGER,
    FOREIGN KEY(show_id) REFERENCES shows(id),
    FOREIGN KEY(status_id) REFERENCES performance_status(id),
    UNIQUE(date, show_id)
)
CREATE TABLE IF NOT EXISTS shows (
    id INTEGER PRIMARY KEY,
    name TEXT
)

I have a list of shows that have sold out ANY performance dates, sorted by the number of performances (days) that have sold out. Here is my query:

SELECT s.id, s.name, count(p.id) AS sellout_count
FROM shows AS s
    LEFT JOIN performances AS p ON s.id = p.show_id
    LEFT JOIN performance_status AS ps ON p.status_id = ps.id
WHERE ps.name = "soldout"
GROUP BY s.id
HAVING sellout_count > 0
ORDER BY sellout_count DESC
LIMIT 3

This works, and returns an array like this:

[(2100, 'Show 1', 25), (1286, 'Show 2', 25), (2936, 'Show 3', 24)]

So far so good. But I also need to know how many performance days a given show has, regardless of the sellout status. But the WHERE clause is limiting the selected rows to only those that are sold out.

If I GROUP BY s.id, ps.name, that will partition things beyond what I need, and return as many as 8 groups per show. I just want to group by the boolean "soldout or NOT soldout".

How can I do this in SQLite?

Asked By: Nick K9

||

Answers:

You can use the following subquery:

SELECT s.id,
       s.name,
       COUNT(p.id) AS sellout_count,
       (SELECT COUNT(p1.id)
        FROM performances AS p1
        WHERE p.show_id = p1.show_id
       ) AS performance_days
       ...
Answered By: choroba

Remove the WHERE clause, which not only filters out any statuses other than 'soldout' but it also changes your LEFT joins to INNER joins because it also filters out any non-matching rows.

You should use conditional aggregation with the aggregate function SUM() (or TOTAL() if you use the query without the HAVING clause) to get the column sellout_count and COUNT() to get the number of performance days:

SELECT s.id, 
       s.name, 
       SUM(ps.name = 'soldout') AS sellout_count,
       COUNT(p.id) performance_days -- or COUNT(DISTINCT p.id)
FROM shows AS s
LEFT JOIN performances AS p ON s.id = p.show_id
LEFT JOIN performance_status AS ps ON p.status_id = ps.id
GROUP BY s.id
HAVING sellout_count > 0
Answered By: forpas
Categories: questions Tags: , , , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.