Reading CSV data into a table in PostgreSQL via INSERT INTO with Python

Question:

I have a PostgreSQL table, created from Python, that I need to populate with data from a CSV file. The CSV file has 4 columns and a header row. When I use a for loop with INSERT INTO, it doesn’t work correctly.

It gives me an error saying that a certain column doesn’t exist, but the "column" it names is actually the first value in the ID column.

I’ve looked over all the similar issues reported on other questions and can’t seem to find something that works.

The table looks like this (with more lines):

ID  Gender  Weight  Age
A   F       121     20
B   M       156     31
C   F       110     18

The code I am running is the following:

import pandas as pd

df = pd.read_csv('df.csv')

for x in df.index:
    cursor.execute("""
    INSERT INTO iddata (ID, Gender, Weight, Age)
    VALUES (%s, %s, %d, %d)""" % (df.loc[x]['ID'],
                                  df.loc[x]['Gender'],
                                  df.loc[x]['Weight'],
                                  df.loc[x]['Age']))
    conn.commit

The error I’m getting says

UndefinedColumn: column "a" does not exist
LINE 3:     VALUES (A, F, 121, 20)
                    ^

Asked By: data_life


Answers:

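Replace the """ % with """, (a comma instead of the % operator).

Also add () after conn.commit so that it is actually called: conn.commit().

Finally, assuming the driver is psycopg2 (which the UndefinedColumn error suggests), use %s for every placeholder, including the numeric columns; psycopg2 does not accept %d. Converting the numbers with int() can also avoid problems adapting NumPy integer types.

The fixed loop code becomes:

for x in df.index:
    row = df.loc[x]
    # Pass the values as the 2nd argument; the driver quotes and escapes them
    cursor.execute("""
    INSERT INTO iddata (ID, Gender, Weight, Age)
    VALUES (%s, %s, %s, %s)""", (row['ID'],
                                 row['Gender'],
                                 int(row['Weight']),
                                 int(row['Age'])))
    conn.commit()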

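You need the comma so that the data (a 4-element tuple) is passed as the 2nd argument of cursor.execute. With the % operator, Python itself pastes the values into the SQL text without quotes, so PostgreSQL reads A as a column name, which is why the error says column "a" does not exist. When the tuple is passed separately, cursor.execute takes care of correct quoting and escaping. This matters for string values: proper escaping makes sure that strings containing any characters (including ', " and \) are sent intact to the database.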
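To make the difference concrete, here is a minimal sketch. It assumes a DB-API driver such as psycopg2 that uses %s placeholders; the helper name insert_rows and the QUERY constant are made up for illustration and are not part of any library. The first part only builds a string, so it runs without a database.

```python
# Assumption: a DB-API driver with %s placeholders (e.g. psycopg2).
QUERY = """
INSERT INTO iddata (ID, Gender, Weight, Age)
VALUES (%s, %s, %s, %s)"""

row = ("A", "F", 121, 20)

# What the question's code did: Python's % operator builds the SQL text,
# so the string values end up in the statement without quotes.
rendered = "VALUES (%s, %s, %d, %d)" % row
print(rendered)  # VALUES (A, F, 121, 20)


def insert_rows(cursor, rows):
    """Insert (id, gender, weight, age) tuples using driver-side parameters.

    Hypothetical helper: `cursor` is any DB-API cursor. Values are converted
    to plain Python types first, so NumPy scalars coming from pandas are safe.
    """
    params = [(str(i), str(g), int(w), int(a)) for i, g, w, a in rows]
    # The query and the data travel separately; the driver does the quoting.
    cursor.executemany(QUERY, params)
    return params
```

With pandas, the rows could come from df.itertuples(index=False, name=None), followed by a single conn.commit() after the call instead of one commit per row.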

Answered By: pts