How to speed up python and sqlalchemy?

Question:

The model in my source code is in the format below.
Array in dict Array in dict Array in dict…

# data structure
user_list = [{user_name: 'A', 
  email: '[email protected]', 
  items:[{name:'a_itme1', properties:[{1....},{2....}...]}
 ]} * 100]

I’m trying to put the above data into a postgresql db with SQLAlchemy.
There is a user table, an entity table, and an attribute table.
And there are tables that link users and items, and items and properties respectively.

for u in user_list:
  new_user = User(user_name=u.get('user_name'),....)
  session.add(new_user)
  session.flush()
  for item in u.get('items'):
    new_item = Item(name=item.get('name'),.....)
    session.add(new_item)
    session.flush()
    new_item_link = UserItemLink(user_id=new_user.id, item_id=new_item.id,...)
    session.add(new_item_link)
    session.flush()
    for prop in item.properties:
      new_properties = Properties(name=prop.get('name'),...)
      session.add(new_properties)
      session.flush()
      new_prop_link = ItemPropLink(item_id=new_item.id, prop_id=new_properties.id,...)
      session.add(new_prop_link)
      session.flush()
session.commit()

My models look like this:

class User(Base):
    __tablename__ = 'user'

    id = Column(Integer, Identity(always=True, start=1, increment=1, minvalue=1, maxvalue=2147483647, cycle=False, cache=1), primary_key=True)
    name = Column(String(20))
    email = Column(String(50))

    user_item_link = relationship('UserItemLink', back_populates='user')

class Item(Base):
    __tablename__ = 'item'

    id = Column(Integer, Identity(always=True, start=1, increment=1, minvalue=1, maxvalue=2147483647, cycle=False, cache=1), primary_key=True)
    name = Column(String(50))
    note = Column(String(50))

    user_item_link = relationship('UserItemLink', back_populates='item')

class Properties(Base):
    __tablename__ = 'properties'

    id = Column(Integer, Identity(always=True, start=1, increment=1, minvalue=1, maxvalue=2147483647, cycle=False, cache=1), primary_key=True)
    name = Column(String(50))
    value = Column(String(50))

    item_prop_link = relationship('ItemPropLink', back_populates='properties')

class UserItemLink(Base):
    __tablename__ = 'user_item_link'

    id = Column(Integer, Identity(always=True, start=1, increment=1, minvalue=1, maxvalue=2147483647, cycle=False, cache=1), primary_key=True)
    user_id = Column(ForeignKey('db.user.id'), nullable=False)
    item_id = Column(ForeignKey('db.item.id'), nullable=False)

The above sources have been simplified for better understanding.
When session.add() is performed sequentially with the above information, it takes a lot of time.
When 100 user information is input, there is a delay of 8 seconds or more.

Please advise to improve python speed and sqlalchemy speed.

Asked By: hangillw

||

Answers:

As you have relationships configured on the models you can compose complex objects using these relationships instead of relying on ids:

with Session() as s, s.begin():
    for u in user_list:
        user_item_links = []
        for item in u.get('items'):
            item_prop_links = []
            for prop in item['properties']:
                item_prop_link = ItemPropLink()
                item_prop_link.properties = Properties(name=prop.get('name'), value=prop.get('value'))
                item_prop_links.append(item_prop_link)
            item = Item(name=item.get('name'), item_prop_link=item_prop_links)
            user_item_link = UserItemLink()
            user_item_link.item = item
            user_item_links.append(user_item_link)
        new_user = User(name=u.get('user_name'), email=u.get('email'), user_item_link=user_item_links)
        s.add(new_user)

SQLAlchemy will automatically set the foreign keys when the session is flushed at commit time, removing the need to manually flush.

Answered By: snakecharmerb