Pandas read_sql with multiple parameters and lists

Question:

I’ve the following script:

now = dt.datetime.now()
date_filter = now - timedelta(days=3)
list_ids = [1,2,3]
dq_connection = mysql.connector.connect(user='user', password='pass', host='localhost', database='db')
engine = create_engine('localhost/db')
cursor = connection.cursor(buffered=True)
query = ('''
SELECT *
FROM (SELECT * FROM myTable1 WHERE id in {%s}
WHERE date >= %s;
''')
df = pd.read_sql_query(query, connection,params=(list_ids,date_filter,))

And I want to have two filters on my query:
1) List all the IDs that I’ve on list_ids
2) Filter only the dates before date_filter.

The second Filter I can do it, but when I try with the list I got:

pandas.io.sql.DatabaseError: Execution failed on sql

What I am doing wrong?

Asked By: Pedro Alves

||

Answers:

Because IN clause receives multiple values, you need to adjust prepared statement with requisite number of placeholders, %s, and then unpack list for parameters with func(*list). Plus no subquery is needed for both WHERE clauses.

query = '''SELECT * FROM myTable1 
           WHERE id in (%s, %s, %s) AND date >= %s;
        '''

df = pd.read_sql_query(query, connection, params=(*list_ids, date_filter))

For dynamic placeholders equal to length of list, integrate a str.join:

placeholders = ", ".join(["%s" for _ in list_ids])

query = '''SELECT * FROM myTable1 
           WHERE id in ({}) AND date >= %s;
        '''.format(placeholders)

df = pd.read_sql_query(query, connection, params=(*list_ids, date_filter))
Answered By: Parfait

The accepted answer suggests creating a %s placeholder for every item in the list of ids. But that’s not necessary.

Each of the Python MySQL db packages supports passing lists or tuples into parameters. It’ll figure out how to "expand" the list into the appropriate SQL before sending it off to the database.

Here’s a complete example with pymysql:

import pymysql
import pandas


with pymysql.connect(
    host='',
    user='',
    passwd='',
    db='',
) as conn:
    data = pandas.read_sql(
        sql="select * from my_table where id in %(my_ids)s", 
        con=conn, 
        params={'my_ids': [1, 2, 3,]},
    )

(As an aside, and it’s been many years since the OP’s question, and so I think this specific point isn’t particularly useful to future readers, but I’m pretty sure the SQL error he received was more mundane. He’s missing a closing parens at the end of this phrase: FROM (SELECT * FROM myTable1 WHERE id in {%s}, and shouldn’t have curly braces around %s)

Answered By: Adam Easterling
Categories: questions Tags: , , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.