Django Raw SQL Query using multiple databases

Question

I am trying to pull information from 2 different databases in the same query.

Below is an example of the code I’m trying to use with identifying information changed to generic names.

I have my data bases set up in settings like this:

DATABASES = {
 'default': {},
 'primary': {
     'ENGINE': 'django.db.backends.oracle',
     'NAME': 'oracle.test.com:5432/erp',
     'USER':'test1',
     'PASSWORD':'',
 },
 'secondary': {
     'ENGINE': 'django.db.backends.oracle',
     'NAME': 'oracle2.test.com:5432/wms',
     'USER':'test2',
     'PASSWORD':'',
 },
}

Then in my views I have this:

def index(request):
    sql_query = ('select n'
             '  primary.table1.organization_id "Org", n'
             '  primary.table2.LOCATION "Loc", n'
             '  primary.table3.inventory "Inv", n'
             '  primary.table4.reorder_date "Reorder", n'
             '  secondary.table3.COMMENTS "Comments", n'
             'from n'
               '  primary.table2, primary.table1n'
               '  LEFT OUTER JOIN primary.table3 Onn'
               '    (primary.table1.SCHEDULE_NUMBER = primary.table3.SCHEDULE_NUMBER) n'
               '  LEFT OUTER JOIN primary.table4 On n'
               '    (primary.table1.SCHEDULE_NUMBER = primary.table4.PARENT_SCHEDULE_NUMBER) n'
               '  LEFT OUTER JOIN secondary.table1 On n'
               '    (primary.table1.SCHEDULE_NUMBER = primary.table1.SCHEDULE_NUMBER) n'
             'where
               'primary.table1.item_id = primary.table2.inventory_item and n'
               'primary.table1.organization_id = primary.table2.organization_id n')
    with connections['primary', 'secondary'].cursor() as cursor:
        cursor.execute(sql_query)
        field_names = [tuple(x[0] for x in cursor.description)]
        row = cursor.fetchall()
        result = field_names + row
        df = pd.DataFrame(result)
        df.rename(columns=df.iloc[0], inplace=True)
        df.drop([0], inplace=True)
        table = df.to_html(index=False, classes='mystyle', justify='left')
    return render(request, 'template.html', {'table': table})

I thought this would work because the syntax I found online for SQL says to prepend the database name to each table. Unfortunately, I get a database error saying that primary or secondary is an invalid identifier.

EDIT – I also tried to simplify things to test. When I try this I get

getattr(): attribute name must be string

and it lists this line of code

with connections[‘primary’, ‘secondary’].cursor() as cursor:

So it would appear that I cannot use 2 different databases at the same time. Is there a better solution for what I am trying to accomplish?

    test = 'select * from erp.wip_lines where organization_id = 81'

    with connections['primary', 'secondary'].cursor() as cursor:
        cursor.execute(test)
        field_names = [tuple(x[0] for x in cursor.description)]
        row = cursor.fetchall()
        result = field_names + row
        df = pd.DataFrame(result)
        df.rename(columns=df.iloc[0], inplace=True)
        df.drop([0], inplace=True)
        table = df.to_html(index=False, classes='mystyle', justify='left')
    return render(request, 'template.html', {'table': table})

Asked By: Logan

||

Source

Answer 1

First, connections is a dict so you should iterate on its keys to get both connections:

dbs_to_connect = ['primary','secondary']
for k in dbs:
    with connections[k].cursor() as cursor:
        # code here...

Second, I’m not sure how your dbs and schemas are configured in Oracle, but remember that the query should be always select <colums> from <schema>.<table>. Regardless, the query string should run directly in another DB software as well, so you can test it there to make sure the problem is not the query itself.

Answered By: Rodrigo Cava

Answer 2

Based on some advice from other forums. I learned that it is not possible to join tables across databases. So here is my solution. Run the 2 databases as 2 different queries and then store the outputs in separate dataframes. Then use pandas merge to join the 2 dataframes together with a left join.

    Filepath = 'example filepath'
    SQL_Query1 = 'example'
    SQL_Query2 = 'example'
    with connections['primary'].cursor() as cursor:
        cursor.execute(SQL_Query1)
        field_names = [tuple(x[0] for x in cursor.description)]
        row = cursor.fetchall()
        result = field_names + row
        df = pd.DataFrame(result)
        df.rename(columns=df.iloc[0], inplace=True)
        df.drop([0], inplace=True)
        df['SCHED_#'] = pd.to_numeric(df['SCHED_#'])

    with connections['secondary'].cursor() as cursor2:
        cursor2.execute(SQL_Query2)
        wms_field_names = [tuple(x[0] for x in cursor2.description)]
        wms_row = cursor2.fetchall()
        wms_result = wms_field_names + wms_row
        wms_df = pd.DataFrame(wms_result)
        wms_df.rename(columns=wms_df.iloc[0], inplace=True)
        wms_df.drop([0], inplace=True)
        wms_df['SCHED_#'] = pd.to_numeric(wms_df['SCHED_#'])

    merged_df = pd.merge(df, wms_df, on='SCHED_#', how='left', indicator=True)
    merged_df.to_csv(filepath, index=False)

Thanks everyone for your input

Answered By: Logan

Django Raw SQL Query using multiple databases

Question:

Answers: