Trigger a mapper event in SQLAlchemy for a "soft delete" feature mixin class

Question:

I’m implementing a soft delete mixin so that whenever I call Object.delete() it doesn’t actually delete it, but rather sets a deleted_at column to the current time.

Even though nothing actually gets deleted, I would like the app to act as if a deletion occurred and trigger before/after delete events. The idea is that by doing so, my hooks could handle the whole waterfall of operations that need to occur after a delete (even though nothing was truly deleted).

Say this is the soft delete Mixin:

class SoftDeleteMixin:
    deleted_at = Column(DateTime, nullable=True)

    def delete(self):
        self.deleted_at = func.now()
        # I'd like to trigger a before/after delete mapper event here, but I don't know how to do it.

    def undelete(self):
        self.deleted_at = None

Let’s say I have some class that inherits this Mixin

class User(Base, SoftDeleteMixin):
   ....

I want the following event to get triggered by User.delete()

@event.listens_for(User, "before_delete")
def before_delete(mapper, connection, target):
    # do something
    pass

Is there any way to achieve this?

Answers:

Going through the relevant documentation, the "before_delete" event is tied to a real DELETE: it fires during the session flush operation, just before a DELETE statement is emitted for an instance that was marked for deletion. Given that a "soft delete" never results in a DELETE statement being emitted, that is clearly the wrong event to attach a listener to. As this "soft delete" implementation relies on updating a field, the events to listen for are the *_update ones, and the counterpart of before_delete is the before_update event. To demonstrate, the following MCVE will be the starting point.

from datetime import datetime
from logging import getLogger
from sqlalchemy import Column, Integer, String, DateTime
from sqlalchemy import create_engine, event
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()
logger = getLogger(__name__)

class SoftDeleteMixin:
    deleted_at = Column(DateTime, nullable=True)

    def delete(self):
        self.deleted_at = datetime.now()

    def undelete(self):
        self.deleted_at = None

class Entity(Base, SoftDeleteMixin):
    __tablename__ = 'entity'
    id = Column(Integer, primary_key=True)
    field_a = Column(String(255))
    field_b = Column(String(255))
    created_at = Column(DateTime(), default=datetime.now)
    # def __repr__ omitted

@event.listens_for(Entity, "before_update")
def before_update(mapper, connection, target):
    # ensure the target inherits from `SoftDeleteMixin`
    if isinstance(target, SoftDeleteMixin):
        if target.deleted_at:
            logger.info('%r is deleted', target)
        else:
            logger.info('%r is undeleted', target)

Now the usage portion for the above class definitions (in my setup both portions live in a single file, so just combine the two parts):

def setup_db():
    engine = create_engine('sqlite://')
    Base.metadata.create_all(engine)
    Session = sessionmaker(bind=engine)
    session = Session()
    session.add(Entity(field_a='field_a1', field_b='field_b1'))
    session.add(Entity(field_a='field_a2', field_b='field_b2'))
    session.add(Entity(field_a='field_a3', field_b='field_b3'))
    session.commit()
    return Session

def run(Session):
    session = Session()

    records = session.query(Entity).all()
    records[0].delete()
    session.commit()

    records = session.query(Entity).all()
    records[0].undelete()
    records[1].undelete()
    records[2].field_a = 'id:3 updated field_a'
    session.commit()

    records = session.query(Entity).all()
    # following not committed, so no emitted statements as a result
    records[2].delete()

def main():
    Session = setup_db()
    run(Session)

if __name__ == '__main__':
    from logging import basicConfig, INFO
    basicConfig()
    getLogger().setLevel(INFO)
    getLogger('sqlalchemy.engine').setLevel(INFO)
    main()

The log may appear as follows (only relevant parts included):

... create/inserts skipped ...
INFO:sqlalchemy.engine.Engine:BEGIN (implicit)
INFO:sqlalchemy.engine.Engine:SELECT ...
INFO:__main__:<Entity id=1> is deleted
INFO:sqlalchemy.engine.Engine:UPDATE entity SET deleted_at=? WHERE entity.id = ?
INFO:sqlalchemy.engine.Engine:[generated in 0.00011s] ('2023-04-08 19:54:39.360807', 1)
INFO:sqlalchemy.engine.Engine:COMMIT
INFO:sqlalchemy.engine.Engine:BEGIN (implicit)
INFO:sqlalchemy.engine.Engine:SELECT ...
INFO:__main__:<Entity id=1> is undeleted
INFO:__main__:<Entity id=2> is undeleted
INFO:__main__:<Entity id=3> is undeleted
INFO:sqlalchemy.engine.Engine:UPDATE entity SET deleted_at=? WHERE entity.id = ?
INFO:sqlalchemy.engine.Engine:[cached since 0.001571s ago] (None, 1)
INFO:sqlalchemy.engine.Engine:UPDATE entity SET field_a=? WHERE entity.id = ?
INFO:sqlalchemy.engine.Engine:[generated in 0.00011s] ('id:3 updated field_a', 3)
INFO:sqlalchemy.engine.Engine:COMMIT
INFO:sqlalchemy.engine.Engine:BEGIN (implicit)
INFO:sqlalchemy.engine.Engine:SELECT ...

The event listener fired as expected, with both conditions handled. Another thing of note is that the second entity, while never having been deleted, still got the undelete message… wait a minute, the unrelated update to field_a on the third entity also triggered the listener. That does not look right: before_update fires for every instance flushed as modified, whichever attribute was touched. Okay, so there must be a way to constrain the listener to handle just a particular field…

Perhaps sqlalchemy.orm.AttributeEvents.set can help. Let’s make the following modification to the event listener section:

# add a `set` event handler for the `deleted_at` field
@event.listens_for(Entity.deleted_at, "set")
def deleted_at_set_listener(target, value, old_value, initiator):
    if isinstance(target, SoftDeleteMixin):
        if value != old_value:
            target._deleted_at_update = True

# replace the previous `before_update` definition
@event.listens_for(Entity, "before_update")
def before_update(mapper, connection, target):
    if (isinstance(target, SoftDeleteMixin) and 
            getattr(target, '_deleted_at_update', False)):
        if target.deleted_at:
            logger.info('%r is deleted', target)
        else:
            logger.info('%r is undeleted', target)

Note that this example uses a dummy _deleted_at_update attribute as a flag. It may be better to define it on the mixin as a simple annotated field to satisfy type-hinting requirements (which also avoids the need for the getattr hack), but for a quick example this will do.

Running this again gives the following output:

INFO:sqlalchemy.engine.Engine:BEGIN (implicit)
INFO:sqlalchemy.engine.Engine:SELECT ...
INFO:__main__:<Entity id=1> is deleted
INFO:sqlalchemy.engine.Engine:UPDATE entity SET deleted_at=? WHERE entity.id = ?
INFO:sqlalchemy.engine.Engine:[generated in 0.00010s] ('2023-04-08 20:00:26.312881', 1)
INFO:sqlalchemy.engine.Engine:COMMIT
INFO:sqlalchemy.engine.Engine:BEGIN (implicit)
INFO:sqlalchemy.engine.Engine:SELECT ...
INFO:__main__:<Entity id=1> is undeleted
INFO:sqlalchemy.engine.Engine:UPDATE entity SET deleted_at=? WHERE entity.id = ?
INFO:sqlalchemy.engine.Engine:[cached since 0.001444s ago] (None, 1)
INFO:sqlalchemy.engine.Engine:UPDATE entity SET field_a=? WHERE entity.id = ?
INFO:sqlalchemy.engine.Engine:[generated in 0.00009s] ('id:3 updated field_a', 3)
INFO:sqlalchemy.engine.Engine:COMMIT

There we go, much better.
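
As an alternative to tracking a flag, the before_update listener can ask the ORM whether deleted_at itself has pending changes. The following is a minimal sketch, not part of the original answer, using the attribute history exposed through sqlalchemy.inspect(). The fired list is a hypothetical stand-in for the logger calls, and the trimmed-down Entity exists only for illustration.

```python
from datetime import datetime

from sqlalchemy import (Column, DateTime, Integer, String, create_engine,
                        event, inspect)
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()
fired = []  # hypothetical stand-in for the logger calls


class SoftDeleteMixin:
    deleted_at = Column(DateTime, nullable=True)

    def delete(self):
        self.deleted_at = datetime.now()

    def undelete(self):
        self.deleted_at = None


class Entity(Base, SoftDeleteMixin):
    __tablename__ = 'entity'
    id = Column(Integer, primary_key=True)
    field_a = Column(String(255))


@event.listens_for(Entity, "before_update")
def before_update(mapper, connection, target):
    # Pending (unflushed) changes of this one attribute only.
    history = inspect(target).attrs.deleted_at.history
    if history.has_changes():
        state = 'deleted' if target.deleted_at else 'undeleted'
        fired.append((target.id, state))


engine = create_engine('sqlite://')
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()
session.add_all([Entity(field_a='a1'), Entity(field_a='a2')])
session.commit()

# Querying loads the current column values, so history has something
# to compare against.
first, second = session.query(Entity).order_by(Entity.id).all()
first.delete()
second.field_a = 'updated'  # unrelated change, should not be reported
session.commit()

print(fired)  # [(1, 'deleted')]
```

One caveat, as far as I can tell: history can only compare against a value that has actually been loaded, so an attribute assigned while still expired may be reported as changed even when the new value matches what is in the database.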

Now, you might also be wondering whether this mixin can be made more reusable and more automatic, as manually attaching the listeners after every class definition is a sure-fire way to end up forgetting a handler somewhere. Fortunately, the before_mapper_configured event can be used so that every class that subclasses the mixin gains the additional event listeners for its .deleted_at field. This will, however, require rearranging the declarations made so far, so that all of the following is declared after the declarative_base() that was defined for the ORM classes.

Base = declarative_base()

class SoftDeleteMixin:
    deleted_at = Column(DateTime, nullable=True)

    def delete(self):
        self.deleted_at = datetime.now()

    def undelete(self):
        self.deleted_at = None

def deleted_at_set_listener(target, value, old_value, initiator):
    if isinstance(target, SoftDeleteMixin):
        if value != old_value:
            target._deleted_at_update = True

def before_update(mapper, connection, target):
    if (isinstance(target, SoftDeleteMixin) and
            getattr(target, '_deleted_at_update', False)):
        if target.deleted_at:
            logger.info('%r is deleted', target)
        else:
            logger.info('%r is undeleted', target)

@event.listens_for(Base, "before_mapper_configured", propagate=True)
def on_new_class(mapper, cls_):
    if issubclass(cls_, SoftDeleteMixin):
        event.listens_for(cls_, "before_update")(before_update)
        event.listens_for(cls_.deleted_at, "set")(deleted_at_set_listener)

With that done, additional entities that need this soft delete feature can finally be defined simply by subclassing the mixin class, e.g.

from sqlalchemy import ForeignKey
from sqlalchemy.orm import relationship

class User(Base, SoftDeleteMixin):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    name = Column(String(255))
    documents = relationship('Document', viewonly=True)
    comments = relationship('Comment', viewonly=True)

class Document(Base, SoftDeleteMixin):
    __tablename__ = 'document'
    id = Column(Integer, primary_key=True)
    title = Column(String(255))
    user_id = Column(Integer, ForeignKey('user.id'))
    author = relationship('User')
    comments = relationship('Comment', back_populates='document')

class Comment(Base, SoftDeleteMixin):
    __tablename__ = 'comment'
    id = Column(Integer, primary_key=True)
    content = Column(String(255))
    document_id = Column(Integer, ForeignKey('document.id'))
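    # pair up with Document.comments, which names this attribute in back_populates
    document = relationship('Document', back_populates='comments')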
    user_id = Column(Integer, ForeignKey('user.id'))
    author = relationship('User')

These additional document and user classes also get the benefit of the handlers, e.g. the following output may be produced when their .delete() methods are called:

INFO:__main__:<Document id=1> is deleted
INFO:sqlalchemy.engine.Engine:UPDATE document SET deleted_at=? WHERE document.id = ?
INFO:sqlalchemy.engine.Engine:[generated in 0.00012s] ('2023-04-08 20:33:49.607657', 1)
INFO:__main__:<User id=1> is deleted
INFO:sqlalchemy.engine.Engine:UPDATE user SET deleted_at=? WHERE user.id = ?
INFO:sqlalchemy.engine.Engine:[generated in 0.00010s] ('2023-04-08 20:33:49.609633', 1)

Now this is finally done properly. A slightly more refined version with the full examples used to generate the logs may be found in this gist, as the complete example is getting rather lengthy.
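
One loose end in the listeners above is that _deleted_at_update is never cleared, so once an instance has been soft deleted, a later unrelated update to that same Python object would be reported again. Below is a minimal sketch of one way to handle this, assuming it is acceptable to declare the flag as a plain class attribute on the mixin (which also removes the need for getattr) and to reset it in an after_update listener. The fired list is a hypothetical stand-in for the logger calls, and the listeners are attached directly to a single class only to keep the example short.

```python
from datetime import datetime

from sqlalchemy import Column, DateTime, Integer, String, create_engine, event
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()
fired = []  # hypothetical stand-in for the logger calls


class SoftDeleteMixin:
    deleted_at = Column(DateTime, nullable=True)
    # Plain Python class attribute (not a mapped column): every instance
    # starts with the flag cleared, so no getattr() default is needed.
    _deleted_at_update = False

    def delete(self):
        self.deleted_at = datetime.now()

    def undelete(self):
        self.deleted_at = None


class Entity(Base, SoftDeleteMixin):
    __tablename__ = 'entity'
    id = Column(Integer, primary_key=True)
    field_a = Column(String(255))


@event.listens_for(Entity.deleted_at, "set")
def deleted_at_set_listener(target, value, old_value, initiator):
    if value != old_value:
        target._deleted_at_update = True


@event.listens_for(Entity, "before_update")
def before_update(mapper, connection, target):
    if target._deleted_at_update:
        fired.append('deleted' if target.deleted_at else 'undeleted')


@event.listens_for(Entity, "after_update")
def after_update(mapper, connection, target):
    # Clear the flag so a later, unrelated update is not reported again.
    target._deleted_at_update = False


engine = create_engine('sqlite://')
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()
entity = Entity(field_a='a')
session.add(entity)
session.commit()

entity.delete()
session.commit()
entity.field_a = 'changed'  # unrelated update on the same instance
session.commit()

print(fired)  # ['deleted']
```

Without the after_update listener, the second commit in this sketch would append 'deleted' a second time, because the flag set by the first delete() would still be True.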


Addendum: I just noticed that you asked about handling "all of the waterfall of operations that need to occur after a delete". This can also be done in a generic way, provided the classes involved all follow the subclassing pattern above. The following modified deleted_at_set_listener can address that:

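# SQLAlchemy 2.0 names; needed for the isinstance() checks below
from sqlalchemy.orm import InstrumentedAttribute, Relationship

def deleted_at_set_listener(target, value, old_value, initiator):
    if isinstance(target, SoftDeleteMixin):
        # same flag name that before_update checks
        target._deleted_at_update = (value != old_value)
        # cascade to every list-based relationship declared on the class
        for name, attr in vars(type(target)).items():
            if (isinstance(attr, InstrumentedAttribute) and
                    isinstance(attr.property, Relationship) and
                    attr.property.uselist):
                for item in getattr(target, name):
                    if isinstance(item, SoftDeleteMixin):
                        if value:
                            item.delete()
                        else:
                            item.undelete()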

Example usage:

def run(Session):
    # setup
    session = Session()
    john = User(name='John')
    dave = User(name='Dave')
    doc1 = Document(title='A serious document', author=john)
    com11 = Comment(content="John's first remark", document=doc1, author=john)
    com12 = Comment(content="Dave's first remark", document=doc1, author=dave)
    doc2 = Document(title='Second document', author=dave)
    com21 = Comment(content="John's second remark", document=doc2, author=john)
    com22 = Comment(content="Dave's second remark", document=doc2, author=dave)
    session.add(john)
    session.add(dave)
    session.add(doc1)
    session.add(doc2)
    session.add(com11)
    session.add(com12)
    session.add(com21)
    session.add(com22)
    session.commit()
    logger.info('*** deleting the first document (auto deletes its comments)') 
    # delete() returns None, so there is nothing useful to assign here
    session.query(Document).first().delete()
    session.commit()
    logger.info('*** deleting everything dave did')
    dave = session.query(User).filter(User.name=='Dave').first()
    dave.delete()
    session.commit()
    logger.info('*** undeleting everything dave did')
    # done _without_ consideration for deletes triggered by deletion 
    # of first document
    dave = session.query(User).filter(User.name=='Dave').first()
    dave.undelete()
    session.commit()

The relevant output:

INFO:__main__:*** deleting the first document (auto deletes its comments)
INFO:__main__:<Document id=1> is deleted
INFO:__main__:<Comment id=1> is deleted
INFO:__main__:<Comment id=2> is deleted
INFO:__main__:*** deleting everything dave did
INFO:__main__:<Document id=2> is deleted
INFO:__main__:<Comment id=3> is deleted
INFO:__main__:<Comment id=4> is deleted
INFO:__main__:<User id=2> is deleted
INFO:__main__:*** undeleting everything dave did
INFO:__main__:<Document id=2> is undeleted
INFO:__main__:<Comment id=3> is undeleted
INFO:__main__:<Comment id=4> is undeleted
INFO:__main__:<User id=2> is undeleted
INFO:__main__:<Comment id=2> is undeleted

Note that this is a very naive example, so only use it as a proof of concept. It may in fact be better to write specific handlers for the event listeners, as cascading deletes may not always be desirable, and the current implementation is massively inefficient: going through the ORM triggers a bunch of unnecessary SELECTs and such. Moreover, this naive example completely ignores the fact that the undelete operation will also undelete a comment attached to another, still deleted document (see that Comment id=2 has been undeleted, even though it belongs to the still deleted Document id=1). These are the pitfalls that need serious consideration in a soft delete system with cascading operations like this.
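
The last pitfall, undelete restoring a comment whose document is still deleted, could be approached by remembering which cascade deleted each row. The sketch below is one possible approach and not part of the original answer: a cascading delete stamps every affected child with the parent's timestamp, and undelete only restores children whose timestamp matches. The soft_delete_cascade attribute and the optional when parameter are hypothetical names introduced for illustration, and the cascade lives in the methods rather than in event listeners to keep the example short.

```python
from datetime import datetime

from sqlalchemy import Column, DateTime, ForeignKey, Integer, create_engine
from sqlalchemy.orm import declarative_base, relationship, sessionmaker

Base = declarative_base()


class SoftDeleteMixin:
    deleted_at = Column(DateTime, nullable=True)
    # Hypothetical hook: names of the relationships to cascade over.
    soft_delete_cascade = ()

    def _children(self):
        for name in self.soft_delete_cascade:
            yield from getattr(self, name)

    def delete(self, when=None):
        if self.deleted_at is not None:
            return  # already deleted; keep the original timestamp
        self.deleted_at = when or datetime.now()
        for child in self._children():
            child.delete(self.deleted_at)

    def undelete(self):
        when = self.deleted_at
        if when is None:
            return
        self.deleted_at = None
        for child in self._children():
            # Restore only the children removed by this same cascade.
            if child.deleted_at == when:
                child.undelete()


class User(Base, SoftDeleteMixin):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    comments = relationship('Comment', back_populates='author')
    soft_delete_cascade = ('comments',)


class Document(Base, SoftDeleteMixin):
    __tablename__ = 'document'
    id = Column(Integer, primary_key=True)
    comments = relationship('Comment', back_populates='document')
    soft_delete_cascade = ('comments',)


class Comment(Base, SoftDeleteMixin):
    __tablename__ = 'comment'
    id = Column(Integer, primary_key=True)
    document_id = Column(Integer, ForeignKey('document.id'))
    user_id = Column(Integer, ForeignKey('user.id'))
    document = relationship('Document', back_populates='comments')
    author = relationship('User', back_populates='comments')


engine = create_engine('sqlite://')
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

dave = User()
doc1, doc2 = Document(), Document()
on_doc1 = Comment(document=doc1, author=dave)
on_doc2 = Comment(document=doc2, author=dave)
session.add_all([dave, doc1, doc2])
session.commit()

# Fixed timestamps keep the demonstration deterministic.
doc1.delete(datetime(2023, 4, 8, 20, 0))  # also deletes on_doc1
dave.delete(datetime(2023, 4, 8, 21, 0))  # deletes on_doc2 only
session.commit()
dave.undelete()                           # restores on_doc2 only
session.commit()

print(on_doc1.deleted_at)  # 2023-04-08 20:00:00
print(on_doc2.deleted_at)  # None
```

Matching on the timestamp assumes that two separate deletes never share the exact same value; a dedicated marker column identifying the cascade would be more robust, at the cost of extra schema.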

Answered By: metatoaster