How to calculate the average of the N most recent entries in SQLite using the sqlite3 module?
Question:
I’m coding in Python 3.11, using the tkinter and sqlite3 packages. I’ve created a database table with four columns, one of which is called weight and is defined as REAL (i.e. floating-point values). I want to write a function using cursor.execute that selects the 7 most recent entries in the weight column, then calculates and returns their average.
I understand SQLite has the built-in function AVG(), and I’ve tried to use it, but it takes the average of all entries in the weight column, and I haven’t been able to find a way to restrict it to the N most recent entries.
I also understand sqlite3 offers cursor.fetchmany(7), but sqlite3 returns every row as a tuple. So when I call fetchmany(7) and compute the average by hand, it throws errors about tuples being unable to interact with int/str/float values. Here’s what my function looks like so far. When I execute it, I get the average of all entries in the column rather than the last 7.
def average_query():
    #Create a database or connect to one
    conn = sqlite3.connect('weight_tracker.db')
    #Create cursor
    c = conn.cursor()
    my_average = c.execute("SELECT round(avg(weight)) FROM weights ORDER BY oid DESC LIMIT 7")
    my_average = c.fetchall()
    my_average = my_average[0][0]
    #Create labels on screen
    average_label = Label(root,text=f"Your average 7-day rolling weight is {my_average} pounds.")
    average_label.grid(row=9, column=0, columnspan=2)
    #Commit changes
    conn.commit()
    #Close connection
    conn.close()
Answers:
You can either fetch the 7 most recent rows and then compute the average in Python:
res = c.execute('SELECT weight FROM weights ORDER BY oid DESC LIMIT 7')
rows = res.fetchall()
# E.G. [(40,), (0,), (0,), (2500,), (1500,), (144,), (999,)]
avg = sum(r[0] for r in rows) / len(rows)
# 740.4285714285714
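If it helps, here is a minimal sketch of the same idea wrapped in a reusable function. The name average_last_n and the sample values are made up for illustration; the table and column names are the ones from the question. It unpacks the 1-tuples that sqlite3 returns (which is what caused the tuple errors with fetchmany) and returns None for an empty table instead of dividing by zero.

```python
import sqlite3


def average_last_n(conn, n=7):
    """Average the `weight` of the n newest rows, or None if the table is empty."""
    # LIMIT accepts a bound parameter, so n is not formatted into the SQL string.
    rows = conn.execute(
        "SELECT weight FROM weights ORDER BY oid DESC LIMIT ?", (n,)
    ).fetchall()
    # Each row is a 1-tuple such as (182.5,); unpack it before summing.
    values = [weight for (weight,) in rows]
    return sum(values) / len(values) if values else None


# Demo on a throwaway in-memory database with made-up values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weights (weight REAL)")
conn.executemany(
    "INSERT INTO weights (weight) VALUES (?)",
    [(w,) for w in (100.0, 200.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0)],
)
print(average_last_n(conn))     # 40.0 (the 7 newest rows)
print(average_last_n(conn, 2))  # 65.0 (the 2 newest rows)
```

In the question's function you could call something like this with the existing connection and put the result in the label.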
Or you can use a nested query to perform the average:
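# AVG() needs a column name; AVG(*) is not valid in SQLite.
res = c.execute('SELECT ROUND(AVG(weight)) FROM ( SELECT weight FROM weights ORDER BY oid DESC LIMIT 7 );')
rows = res.fetchall()
# [(740.0,)]  (ROUND() without a second argument rounds to a whole number)
avg = rows[0][0]
# 740.0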
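To see why the query in the question averaged the whole column: without a GROUP BY, AVG() collapses every row into a single result row first, and ORDER BY / LIMIT are only applied to that one row afterwards. The sketch below runs both forms side by side on made-up data in an in-memory database (the table and column names follow the question; everything else is illustrative).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weights (weight REAL)")
conn.executemany(
    "INSERT INTO weights (weight) VALUES (?)",
    [(w,) for w in (1000.0, 2000.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0)],
)

# Aggregate first, LIMIT second: AVG() sees all 9 rows and produces one row,
# so LIMIT 7 has nothing left to trim.
avg_all = conn.execute(
    "SELECT AVG(weight) FROM weights ORDER BY oid DESC LIMIT 7"
).fetchone()[0]

# LIMIT first (inside the subquery), aggregate second: only 7 rows reach AVG().
avg_last_7 = conn.execute(
    "SELECT AVG(weight) FROM (SELECT weight FROM weights ORDER BY oid DESC LIMIT 7)"
).fetchone()[0]

print(avg_all)     # average over every row, including the two old outliers
print(avg_last_7)  # 40.0
```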
I’d recommend keeping the calculation in SQL and computing the average there. SQL is designed and optimised for exactly this kind of operation.
To average the last N entries you can also use window functions, which compute an aggregate over a sliding frame of rows for each row (SQLite supports them from version 3.25.0). They take a while to get used to, but once you figure them out they are very powerful.
A query along these lines should give you the result:
SELECT
    AVG(weight) OVER (
        ORDER BY oid
        ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
    ) AS avg_weight
FROM
    weights
ORDER BY oid DESC
LIMIT 1
I put this together in a simple Python script if you want to test it out. Note that the data and column names are different, and the window here covers 4 rows (3 preceding plus the current one), but it should show you how you might do this in Python.
import sqlite3

connection = sqlite3.connect("demo.db")
cursor = connection.cursor()

## Setup database for testing ##
cursor.execute(
    """
    CREATE TABLE IF NOT EXISTS example (
        weight INT,
        timestamp INT
    )
    """
)
cursor.execute(
    """
    INSERT INTO example (weight, timestamp)
    VALUES
        (10, 0),
        (13, 1),
        (5, 2),
        (6, 3),
        (10, 4),
        (3, 5),
        (10, 6),
        (13, 7),
        (5, 8),
        (6, 9)
    """
)

## Get rolling average ##
cursor.execute(
    """
    SELECT
        AVG(weight) OVER (
            ORDER BY timestamp
            ROWS BETWEEN 3 PRECEDING AND CURRENT ROW
        )
    FROM
        example
    """
)
print(cursor.fetchall())
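The script above prints one rolling average per row. If you only want the most recent one, a possible variation (a sketch, not the only way to do it) is to sort the result newest-first and keep a single row. This version uses the same sample data in an in-memory database so it can be re-run without inserting duplicates; the alias rolling_avg is just an illustrative name, and it assumes a SQLite build with window function support (3.25.0 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE example (weight INT, timestamp INT)")
weights = [10, 13, 5, 6, 10, 3, 10, 13, 5, 6]
conn.executemany(
    "INSERT INTO example (weight, timestamp) VALUES (?, ?)",
    [(w, t) for t, w in enumerate(weights)],
)

# The window is evaluated over all rows in timestamp order; the outer
# ORDER BY ... DESC LIMIT 1 then keeps only the newest row's rolling value.
latest = conn.execute(
    """
    SELECT
        AVG(weight) OVER (
            ORDER BY timestamp
            ROWS BETWEEN 3 PRECEDING AND CURRENT ROW
        ) AS rolling_avg
    FROM example
    ORDER BY timestamp DESC
    LIMIT 1
    """
).fetchone()[0]

print(latest)  # 8.5, the mean of the last four weights (10, 13, 5, 6)
```

For a single number the subquery with LIMIT is simpler; the window form becomes more useful when you want the rolling average for every entry, for example to plot it.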