python SQLite3 how to getting records that match a list of values in a column then place that into pandas df

Question:

I am not experienced with SQL or SQLite3.

I have a list of ids from another table. I want to use the list as a key in my query and get all records based on the list. I want the SQL query to feed directly into a DataFrame.

import pandas as pd
import sqlite3

cnx = sqlite3.connect('c:/path/to/data.sqlite')

# the below values are ones found in "s_id"
id_list = ['C20','C23','C25','C28', ... ,'C83']
# change list to sql string. 
id_sql = ", ".join(str(x) for x in id_list)
df = pd.read_sql_query(f"SELECT * FROM table WHERE s_id in ({id_sql})", cnx)

I am getting a DatabaseError: Execution failed on sql ‘SELECT * FROM … : no such column: C20.
When I saw this error I thought the code just needs a simple switch. So I tried this

df = pd.read_sql_query(f"SELECT * FROM table WHERE ({id_sql}) in s_id", cnx)

it did not work.

So how can I get this to work?

The table is like.

id s_id date assigned_to date_complete notes
0 C10 1/6/2020 Jack 1/8/2020 None
1 C20 1/10/2020 Jane 1/12/2020 Call back
2 C23 1/11/2020 Henry 1/12/2020 finished
n C83 rows of more data
n+1 D85 9/10/2021 Jeni 9/12/2021 Call back
Asked By: Shane S

||

Answers:

So the problem is that it is missing single quote marks in the sql string. So for the in part needs on each side of the s_id values.

import pandas as pd
import sqlite3

cnx = sqlite3.connect('c:/path/to/data.sqlite')

# the below values are ones found in "s_id"
id_list = ['C20','C23','C25','C28', ... ,'C83']
# change list to sql string. 
id_sql = "', '".join(str(x) for x in id_list)
df = pd.read_sql_query(f"SELECT * FROM table WHERE s_id in ('{id_sql}')", cnx)
Answered By: Shane S

Currently, you are missing the single quotes around your literal values and consequently the SQLite engine assumes you are attempting to query columns. However, avoid concatenation of values altogether but bind them to parameters which pandas pandas.read_sql supports with the params argument:

# the below values are ones found in "s_id"
id_list = ['C20','C23','C25','C28', ... ,'C83']

# build equal length string of ? place holders
prm_list = ", ".join("?" for _ in id_list)

# build prepared SQL statement
sql = f"SELECT * FROM table WHERE s_id IN ({prm_list})"

# run query, passing parameters and values separately
df = pd.read_sql(sql, con=cnx, params=id_list)
Answered By: Parfait
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.