Example of what SQLAlchemy can do, and Django ORM cannot

Question:

I’ve been doing a lot of research lately into using Pyramid with SQLAlchemy versus keeping a current application in Django. That by itself is an entire debate, but I’m not here to discuss that.

What I do want to know is, why is SQLAlchemy universally considered better than Django ORM? Almost every, if not every, comparison I’ve found between the two favors SQLAlchemy. I assume performance is a big one, as the structure of SQLAlchemy lets it translate to SQL much more smoothly.

But, I’ve also heard that with harder tasks, Django ORM is nearly impossible to use. I want to scope out how huge of an issue this can be. I’ve been reading one of the reasons to switch to SQLAlchemy is when Django ORM is no longer suiting your needs.

So, in short, could someone provide a query (doesn’t have to be actual SQL syntax) that SQLAlchemy can do, but Django ORM cannot possibly do without adding in additional raw SQL?

Update:

I’ve been noticing this question getting quite some attention since I first asked, so I’d like to throw in my extra two cents.

In the end we ended up using SQLAlchemy and I must say I’m happy with the decision.

I’m revisiting this question to provide an additional feature of SQLAlchemy that, so far, I’ve not been able to replicate in Django ORM. If someone can provide an example of how to do this I’ll gladly eat my words.

Let’s say you want to use some postgresql function, such as similarity(), which provides a fuzzy comparison (see: Finding similar strings with PostgreSQL quickly – tl;dr input two strings get back a percent similarity).

I’ve done some searching on how to do this using the Django ORM and have found nothing other than using raw sql as seems to be apparent from their documentation: https://docs.djangoproject.com/en/dev/topics/db/sql/.

i.e.

Model.objects.raw('SELECT * FROM app_model ORDER BY 
similarity(name, %s) DESC;', [input_name])

SQLalchemy, however, has func(), as described here: http://docs.sqlalchemy.org/en/latest/core/sqlelement.html#sqlalchemy.sql.expression.func

from sqlalchemy import desc, func
session.query(Model).order_by(func.similarity(Model.name, input_name))

This allows you to generate sql for any defined sql/postgresql/etc function and not require raw sql.

Asked By: limasxgoesto0

||

Answers:

This is dangerously close to being non-constructive, but I’ll bite.

Suppose we need to maintain inventories of certain items for a number of different, let’s say, accounts. DDL follows:

CREATE TABLE account (
    id serial PRIMARY KEY,
    ...
);

CREATE TABLE item (
    id serial PRIMARY KEY,
    name text NOT NULL,
    ...
);

CREATE TABLE inventory (
    account_id integer NOT NULL REFERENCES account(id),
    item_id integer NOT NULL REFERENCES item(id),
    amount integer NOT NULL DEFAULT 0 CHECK (amount >= 0),
    PRIMARY KEY (account_id, item_id)
);

First of all, Django ORM can’t work with composite primary keys. Yes, you can always add a surrogate key and unique constraint, but that’s one more column and one more index than you actually need. For a big table with a small number of columns this would add noticeable size and performance overhead. Also, ORMs generally have trouble with identity mapping using anything other than primary key.

Now, let’s say we want to query each item in the inventory of given account accompanied by its quantity, but also include all items not present there with quantity set to 0. And then sort this in descending order by quantity. Corresponding SQL:

SELECT item.id, item.name, ..., coalesce(inventory.amount, 0) AS amount
    FROM item LEFT OUTER JOIN inventory
        ON item.id = inventory.item_id AND inventory.team_id = ?
    ORDER BY amount DESC;

There is no way to express outer join with custom condition in Django ORM. Yes, you can make two simple separate queries and perform join by hand in Python loop. And performance probably won’t suffer much in this particular case. But that’s beside the point because results of every query could be reproduced on the application side using only basic SELECTs.

With SQLAlchemy:

class Account(Base):
    __tablename__ = 'account'
    id = Column(Integer, primary_key=True)
    ...

class Item(Base):
    __tablename__ = 'item'
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    ...

class Inventory(Base):
    __tablename__ = 'inventory'
    account_id = Column(Integer, ForeignKey('account.id'), primary_key=True,
            nullable=False)
    account = relationship(Account)
    item_id = Column(Integer, ForeignKey('item.id'), primary_key=True,
            nullable=False)
    item = relationship(Item)
    amount = Column(Integer, CheckConstraint('amount >= 0'), nullable=False,
            default=0)

account = session.query(Account).get(some_id)
result = (session
    .query(Item, func.coalesce(Inventory.amount, 0).label('amount'))
    .outerjoin(Inventory,
        and_(Item.id==Inventory.item_id, Inventory.account==account))
    .order_by(desc('amount'))
    .all())

As a side note, SQLAlchemy makes dictionary based collections very easy. With addition of the following code to the Account model you make relationship with Inventory appear as what it is: a mapping from items to their quantity.

items = relationship('Inventory',
    collection_class=attribute_mapped_collection('item_id'))
inventory = association_proxy('items', 'amount',
    creator=lambda k, v: Inventory(item_id=k, amount=v))

This enables you to write code such as:

account.inventory[item_id] += added_value

that transparently inserts or updates entries in the inventory table.

Complex joins, subqueries, window aggregates — Django ORM fails to deal with anything of that without falling back to raw SQL.

Answered By: rkhayrov

This should work in Django 1.11:

inventory_amount = Subquery(account.inventory_set.filter(item=OuterRef('pk')).values('amount'))
Item.objects.annotate(inventory_amount=Coalesce(inventory_amount, Value(0)))
Answered By: spiderlama
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.