Simple SELECT statement on existing table with SQLAlchemy
Question:
Nowhere on the internet does there exist a simple few-line tutorial on a simple SELECT statement for SQLAlchemy 1.0.
Assuming I’ve established my database connection using create_engine(), and my database tables already exist, I’d like to know how to execute the following query:
select
    name,
    age
from
    users
where
    name = 'joe'
    and age = 100
Answers:
I think the following will work for querying the users database table:
from sqlalchemy.sql import select, and_

# `users` is an existing Table object and `conn` an open connection
s = select([users]).where(and_(users.c.name == 'joe', users.c.age == 100))
for row in conn.execute(s):
    print(row)
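For anyone who wants to try this end to end, here is a minimal sketch against an in-memory SQLite database. It assumes SQLAlchemy 1.4 or later, where select() takes columns as positional arguments rather than a list; the table definition and sample rows are made up for illustration.

```python
from sqlalchemy import (
    Column, Integer, MetaData, String, Table, and_, create_engine, select,
)

# Assumption: a throwaway in-memory SQLite database stands in for the real one.
engine = create_engine("sqlite://")
metadata = MetaData()

# Hypothetical `users` table matching the columns in the question.
users = Table(
    "users",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("name", String),
    Column("age", Integer),
)
metadata.create_all(engine)

# Insert a few sample rows; engine.begin() commits on exit.
with engine.begin() as conn:
    conn.execute(
        users.insert(),
        [
            {"name": "joe", "age": 100},
            {"name": "joe", "age": 30},
            {"name": "ann", "age": 100},
        ],
    )

# SELECT name, age FROM users WHERE name = 'joe' AND age = 100
stmt = select(users.c.name, users.c.age).where(
    and_(users.c.name == "joe", users.c.age == 100)
)

with engine.connect() as conn:
    rows = [tuple(row) for row in conn.execute(stmt)]

print(rows)
```

Only the row where both conditions hold comes back; the other two sample rows each fail one of them.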
Found this while trying to figure out the same thing.
To select data from a table via SQLAlchemy, you need to build a representation of that table within SQLAlchemy. With autoload_with=engine, the column definitions are reflected from your existing database when the Table object is created; the rows themselves are not fetched until the query is executed.
You need Table to build the table representation, select to build the query, and MetaData as the container that the Table registers itself in (a Table cannot be created without one).
from sqlalchemy import create_engine, select, MetaData, Table, and_

engine = create_engine("dburl://user:pass@database/schema")
metadata = MetaData(bind=None)

# Reflect the existing table's columns from the database
table = Table(
    'table_name',
    metadata,
    autoload=True,
    autoload_with=engine
)

# Build the SELECT with both filter conditions joined by AND
stmt = select([
    table.columns.column1,
    table.columns.column2
]).where(and_(
    table.columns.column1 == 'filter1',
    table.columns.column2 == 'filter2'
))

connection = engine.connect()
results = connection.execute(stmt).fetchall()
You can then iterate over the results. See the question "SQLAlchemy query to return only n results?" for how to return one or only a few rows of data, which is useful for slower or larger queries.
for result in results:
    print(result)
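As a rough illustration of both points (reflection and fetching only a few rows), here is a sketch that assumes SQLAlchemy 1.4 or later and uses an in-memory SQLite database. The table contents are invented; the table is created with raw SQL first so that there is something to reflect.

```python
from sqlalchemy import MetaData, Table, create_engine, select, text

# Assumption: an in-memory SQLite database with a pre-existing table.
engine = create_engine("sqlite://")
with engine.begin() as conn:
    conn.execute(text("CREATE TABLE table_name (column1 TEXT, column2 TEXT)"))
    conn.execute(
        text(
            "INSERT INTO table_name VALUES "
            "('filter1', 'filter2'), ('a', 'b'), ('c', 'd')"
        )
    )

metadata = MetaData()
# The column definitions are loaded from the database at this point.
table = Table("table_name", metadata, autoload_with=engine)
column_names = [c.name for c in table.columns]

stmt = select(table.c.column1, table.c.column2)

with engine.connect() as conn:
    # LIMIT is applied by the database, so only two rows are sent back.
    limited = conn.execute(stmt.limit(2)).fetchall()
    # first() returns the first row (or None) and discards the rest.
    first_row = conn.execute(stmt).first()

print(column_names, len(limited), first_row)
```

Using .limit() keeps the work on the database side, which is usually what you want for large tables; first() is convenient when only a single row is needed.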
I checked this with a local database, and the SQLAlchemy results are not equal to the raw SQL results. The difference, for my data set, was in how the numbers were formatted. SQL returned float64 (e.g., 633.07), while SQLAlchemy returned objects (I think Decimal, e.g. 633.0700000000).
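If the difference is indeed Decimal versus float (an assumption here, since it depends on the column type and the database driver), the two values are numerically equal and differ only in type and displayed precision. A small standard-library sketch with hypothetical values mirroring the ones quoted above:

```python
from decimal import Decimal

# Hypothetical values mirroring the ones quoted above.
from_sqlalchemy = Decimal("633.0700000000")
from_raw_sql = 633.07

# Decimal keeps the trailing zeros it was created with.
as_text = str(from_sqlalchemy)

# Decimal equality ignores trailing zeros.
same_number = from_sqlalchemy == Decimal("633.07")

# Converting to float gives the value the raw query returned.
as_float = float(from_sqlalchemy)

print(as_text, same_number, as_float == from_raw_sql)
```

If floats are preferred, SQLAlchemy's Numeric type accepts asdecimal=False; whether that applies here depends on how the column is declared or reflected.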
Some help from DataCamp’s Introduction to Databases in Python
Since the original question has two columns in the select statement, which can be confusing to write, here is that version:
from sqlalchemy import select, and_

stmt = select([users.columns.name, users.columns.age])
stmt = stmt.where(and_(users.columns.name == 'joe', users.columns.age == 100))
for res in connection.execute(stmt):
    print(res)
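Calling .where() more than once is another way to express the same condition, since successive calls are joined with AND. A sketch assuming SQLAlchemy 1.4 or later and a hypothetical users table in an in-memory SQLite database:

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, create_engine, select

# Assumption: in-memory SQLite with a made-up `users` table.
engine = create_engine("sqlite://")
metadata = MetaData()
users = Table("users", metadata, Column("name", String), Column("age", Integer))
metadata.create_all(engine)

with engine.begin() as conn:
    conn.execute(
        users.insert(),
        [{"name": "joe", "age": 100}, {"name": "joe", "age": 5}],
    )

# Each .where() call adds another condition joined with AND.
stmt = (
    select(users.c.name, users.c.age)
    .where(users.c.name == "joe")
    .where(users.c.age == 100)
)
compiled_sql = str(stmt)  # the generated SQL, with placeholders for the values

with engine.connect() as conn:
    matches = [tuple(r) for r in conn.execute(stmt)]

print(compiled_sql)
print(matches)
```

This form needs no and_ import, and conditions can be added one at a time, for example inside if blocks.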
Sticking with SQLAlchemy for this seems overcomplicated. What you can do instead is pass the SQLAlchemy engine to pandas.read_sql(sql, con):
https://pandas.pydata.org/docs/reference/api/pandas.read_sql.html
from sqlalchemy import create_engine
import pandas as pd

engine = create_engine(.....)
sql = "select name, age from users where name = 'joe' and age = 100"
df = pd.read_sql(sql, con=engine)
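If you go the raw-SQL route, it is usually safer to pass the values as bound parameters than to embed them in the string. Here is a minimal sketch using SQLAlchemy's text() construct, assuming SQLAlchemy 1.4 or later and a made-up users table in an in-memory SQLite database. The pandas documentation linked above also lists text objects as accepted input for read_sql, which is not exercised here.

```python
from sqlalchemy import create_engine, text

# Assumption: in-memory SQLite with a hypothetical `users` table.
engine = create_engine("sqlite://")
with engine.begin() as conn:
    conn.execute(text("CREATE TABLE users (name TEXT, age INTEGER)"))
    conn.execute(
        text("INSERT INTO users (name, age) VALUES (:name, :age)"),
        [{"name": "joe", "age": 100}, {"name": "ann", "age": 42}],
    )

# :name and :age are bound parameters; the driver handles quoting.
sql = text("select name, age from users where name = :name and age = :age")

with engine.connect() as conn:
    rows = [tuple(r) for r in conn.execute(sql, {"name": "joe", "age": 100})]

print(rows)
```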
While most answers point to the and_ solution, which works perfectly (and may have been the best answer at the time), it needs an extra import and some counterintuitive coding. It is now possible to use the more familiar & operator instead. Based on @Evans' answer, the code becomes (shorter import and shorter query statement):
from sqlalchemy import create_engine, select, MetaData, Table

engine = create_engine("dburl://user:pass@database/schema")
metadata = MetaData(bind=None)
table = Table(
    'table_name',
    metadata,
    autoload=True,
    autoload_with=engine
)

stmt = select([
    table.columns.column1,
    table.columns.column2
]).where(
    (table.columns.column1 == 'filter1')
    &
    (table.columns.column2 == 'filter2')
)

connection = engine.connect()
results = connection.execute(stmt).fetchall()
Please note that, as specified in the documentation, when using the Python & operator you should wrap each comparison in parentheses, because & binds more tightly than == under Python's operator precedence rules.
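The precedence issue can be seen with plain Python values, without SQLAlchemy at all. This is only an illustration of how Python parses the expression; the variable names are arbitrary.

```python
# Plain Python values are enough to show the precedence issue.
name, age = "joe", 100

# With parentheses: two comparisons combined with &.
with_parens = (name == "joe") & (age == 100)

a, b = 1, 2
# What was intended: both comparisons are true, so the result is True.
intended = (a == 1) & (b == 2)

# Without parentheses, `a == 1 & b == 2` is parsed as the chained comparison
# `a == (1 & b) == 2`, because & binds more tightly than ==.
without_parens = a == 1 & b == 2
explicit_form = a == (1 & b) == 2

print(with_parens, intended, without_parens, explicit_form)
```

With SQLAlchemy column objects the unparenthesized form likewise groups the & first, so it does not build the two-condition AND that was meant.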