Reading csv data into a table in postgresql via INSERT INTO with Python
Question:
I have a PostgreSQL table created in Python that I need to populate with data from a CSV file. The CSV file has 4 columns and a header row. When I use a for loop with INSERT INTO, it doesn't work correctly.
The error tells me that a certain column doesn't exist, but the "column" it names is actually the first value in the ID column.
I’ve looked over all the similar issues reported on other questions and can’t seem to find something that works.
The table looks like this (with more lines):
ID | Gender | Weight | Age
---|--------|--------|----
A  | F      | 121    | 20
B  | M      | 156    | 31
C  | F      | 110    | 18
The code I am running is the following:
import pandas as pd
df = pd.read_csv('df.csv')
for x in df.index:
    cursor.execute("""
        INSERT INTO iddata (ID, Gender, Weight, Age)
        VALUES (%s, %s, %d, %d)""" % (df.loc[x]['ID'],
                                      df.loc[x]['Gender'],
                                      df.loc[x]['Weight'],
                                      df.loc[x]['Age']))
conn.commit
The error I’m getting says
UndefinedColumn: column "a" does not exist
LINE 3: VALUES (A, F, 121, 20)
^
Answers:
With the % operator the values are formatted straight into the SQL text, so the statement becomes VALUES (A, F, 121, 20); PostgreSQL then tries to parse A as a column name, which is exactly the UndefinedColumn error you see.
Replace the """ % with """, so that the values are passed to cursor.execute as a separate parameter tuple instead of being formatted into the query. While you are at it, change the two %d placeholders to %s; psycopg2 (which the UndefinedColumn error suggests you are using) only supports %s placeholders, whatever the column type. Finally, add () after conn.commit: without the parentheses you only reference the method and never actually commit.
The fixed loop code becomes:
for x in df.index:
    cursor.execute("""
        INSERT INTO iddata (ID, Gender, Weight, Age)
        VALUES (%s, %s, %s, %s)""", (df.loc[x]['ID'],
                                     df.loc[x]['Gender'],
                                     df.loc[x]['Weight'],
                                     df.loc[x]['Age']))
conn.commit()
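Looping over df.index works, but for a larger CSV it is usually faster to send all rows in a single statement. Below is a minimal sketch of that approach, assuming you are using psycopg2 and that conn and cursor already exist as in your code; the table iddata and its column names are taken from the question:

import pandas as pd
from psycopg2.extras import execute_values

df = pd.read_csv('df.csv')

# Build plain Python tuples; the int() casts guard against psycopg2
# not knowing how to adapt numpy integer types out of the box.
rows = [(r.ID, r.Gender, int(r.Weight), int(r.Age))
        for r in df.itertuples(index=False)]

# execute_values expands the single %s into a multi-row VALUES list,
# with the same quoting/escaping guarantees as a parameterized execute.
execute_values(
    cursor,
    "INSERT INTO iddata (ID, Gender, Weight, Age) VALUES %s",
    rows,
)
conn.commit()

For a plain CSV you could also skip pandas entirely and let PostgreSQL ingest the file with COPY, e.g. cursor.copy_expert("COPY iddata FROM STDIN WITH CSV HEADER", open('df.csv')), but execute_values keeps the per-row control of your original loop.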
The reason you need the comma is that it passes the data (a 4-element tuple) as the second argument of cursor.execute. By doing so, cursor.execute takes care of correct quoting and escaping. This matters for string values: proper escaping makes sure that strings containing any characters (including ', " and \) are sent to the database intact, and it also protects you from SQL injection.
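If you want to see the effect, psycopg2's cursor.mogrify returns the exact statement that would be sent after parameter binding (shown here with the first row from the question):

# mogrify shows the bound statement: the strings come back quoted
# ('A', 'F') while the numbers stay as plain literals, so PostgreSQL
# no longer mistakes A for a column name.
print(cursor.mogrify(
    "INSERT INTO iddata (ID, Gender, Weight, Age) VALUES (%s, %s, %s, %s)",
    ('A', 'F', 121, 20),
))
# b"INSERT INTO iddata (ID, Gender, Weight, Age) VALUES ('A', 'F', 121, 20)"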