sqlalchemy add index to existing sqlite3 database

Question:

I have created a database with pandas :

import numpy as np                                                                                                                                                                                          
import sqlite3                                                                                                                                                                                              
import pandas as pd                                                                                                                                                                                         
import sqlite3                                                                                                                                                                                              
import sqlalchemy                                                                                                                                                                                           
from sqlalchemy import create_engine                                                                                                                                                                        
from sqlalchemy.orm import sessionmaker                                                                                                                                                                     

df = pd.DataFrame(np.random.normal(0, 1, (10, 2)), columns=['A', 'B'])                                                                                                                                      

path = 'sqlite:////home/username/Desktop/example.db'                                                                                                                                                        

engine = create_engine(path, echo=False)                                                                                                                                                                    

df.to_sql('flows', engine, if_exists='append', index=False)                                                                                                                                                 

# This is only to show I am able to read the database                                                                                                                                                                                                            
df_l = pd.read_sql("SELECT * FROM flows WHERE A>0 AND B<0", engine)                                                                                                                                         

Now I would like to add one or more indexes to the database.
Is this case I would like to make first only the column A and then both the columns indices.

How can I do that?

If possible I would like a solution that uses only SqlAlchemy so that it is independent from the choice of the database.

Asked By: Donbeo

||

Answers:

You should use reflection to get hold of the table that pandas created for you.

With reference to:

SQLAlchemy Reflecting Database Objects

A Table object can be instructed to load information about itself from
the corresponding database schema object already existing within the
database. This process is called reflection. In the most simple case
you need only specify the table name, a MetaData object, and the
autoload=True flag. If the MetaData is not persistently bound, also
add the autoload_with argument:

you could try this:

meta = sqlalchemy.MetaData()
meta.reflect(bind=engine)
flows = meta.tables['flows']
# alternative of retrieving the table from meta:
#flows = sqlalchemy.Table('flows', meta, autoload=True, autoload_with=engine)

my_index = sqlalchemy.Index('flows_idx', flows.columns.get('A'))
my_index.create(bind=engine)

# lets confirm it is there
inspector = reflection.Inspector.from_engine(engine)
print(inspector.get_indexes('flows'))
Answered By: craigsparks

This seems to work for me. You will have to define the variables psql_URI, table, and col yourself. Here I assume that the table name / column name may be in (partial) uppercase but you want the name of the index to be lowercase.

Derived from the answer here: https://stackoverflow.com/a/72976667/3406189

import sqlalchemy
from sqlalchemy.orm import Session

engine_psql = sqlalchemy.create_engine(psql_URI)
autocommit_engine = engine_psql.execution_options(isolation_level="AUTOCOMMIT")
with Session(autocommit_engine) as session:
    session.execute(
        f'CREATE INDEX IF NOT EXISTS idx_{table.lower()}_{col.lower()} ON sdi_ai."{table}" ("{col}");'
    )

Answered By: grofte
Categories: questions Tags: , , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.