Syntax error while using psycopg2.copy_expert

Question:

I use psycopg2 to call PostgreSQL COPY command from my programs written in Python, and so far I have used copy_from function in that package to achieve this. As I understand, in recent versions qualified names for tables, that is, scheman_name.table_name are no more authorized while using copy_from for security reasons (SQL injection). According to what I read on their Github page, apparently copy_expert is henceforth the way to go for qualified names. I tried to adapt my program accordingly but I get a syntax error. Here I provide a test case.

Let’s say that this is my table (assuming that dev_schema is an already existing schema):

create table dev_schema.testtab(numval integer not null);

And our data file to import into this table:
Data.txt

100
120
200
800
500

To import the above file into the table directly via psql, all I have to do is running the following command in psql prompt:

 copy dev_schema.testtab from 'D:/Data.txt' with (format CSV, NULL '', delimiter '|', quote '"');

My purpose was to translate the above into a Python program using psycopg2.copy_expert.

And here is my program:

import psycopg2
from psycopg2 import sql


def main():
    connection = psycopg2.connect(
        dbname="testdb",
        user="dev_user",
        password="Here I write my password",
        host="localhost",
        port=5432
    )
    #
    # Create a cursor
    cursor = connection.cursor()
    #
    #
    # Importing the data file
    filepath = "D:/Data.txt"
    with open(
        file=filepath,
        mode="r",
        encoding="UTF-8"
    ) as file_desc:
        option_values = [
            "format CSV",
            "NULL ''",
            "delimiter '|'",
            "quote '"'"
        ]
        copy_options = sql.SQL(', ').join(
            sql.Identifier(n) for n in option_values
        )
        cursor.copy_expert(
            sql=sql.SQL(
                "copy {} from stdin with ({})"
            ).format(
                sql.Identifier("dev_schema", "testtab"),
                copy_options
            ),
            file=file_desc
        )

    cursor.close()
    connection.close()
    print("done")


if __name__ == "__main__":
    main()

But when I run this program I get the following error message:

cursor.copy_expert(
psycopg2.errors.SyntaxError: ERROR:  option « format CSV » not recognized
LINE 1: copy "dev_schema"."testtab" from stdin with ("format CSV", "...

But I don’t see what’s the problem with CSV option as it works pretty well when I use directly copy in psql prompt.

Could you kindly make some clarification?
Thanks in advance

EDIT:
Based on Adrian Klaver’s comment I changed my code accordingly by including directly as a string all options at the end but now I get a new error message:

psycopg2.errors.SyntaxError: ERROR : Syntax error on or near   'dev_schema'  
LINE 1: copy Identifier('dev_schema', 'testtab') from stdin with (fo...

Here is the new version of my code:

import psycopg2
from psycopg2 import sql


def main():
    connection = psycopg2.connect(
        dbname="testdb",
        user="dev_user",
        password="Here I write my password",
        host="localhost",
        port=5432
    )
    #
    # Create a cursor
    cursor = connection.cursor()
    #
    #
    # Importing the data file
    filepath = "D:/Data.txt"
    with open(
        file=filepath,
        mode="r",
        encoding="UTF-8"
    ) as file_desc:
        option_values = "".join(
            [
                "format CSV, ",
                "NULL '', ",
                "delimiter '|', ",
                "quote '"',"
            ]
        )
        cursor.copy_expert(
            sql=sql.SQL("".join(
                [
                    "copy {} from stdin with (",
                    option_values,
                    ")"
                ]).format(sql.Identifier("dev_schema", "testtab"))
            ),
            file=file_desc
        )

    cursor.close()
    connection.close()
    print("done")


if __name__ == "__main__":
    main()
Asked By: user17911

||

Answers:

General tip. Assign a composed query to a variable and print it to see what is wrong.

        copy_cmd = sql.SQL(
            "copy {} from stdin with ({})"
        ).format(
            sql.Identifier("dev_schema", "testtab"),
            copy_options
        )
        print(copy_cmd.as_string(connection))

Of course, copy_options should be built as plain text, not a list of identifiers:

    with open(
        file=filepath,
        mode="r",
        encoding="UTF-8"
    ) as file_desc:
        option_values = [
            "format CSV",
            "NULL ''",
            "delimiter '|'",
            "quote '"'"
        ]
        copy_options = sql.SQL(', '.join(
            n for n in option_values)
        )
        copy_cmd = sql.SQL(
            "copy {} from stdin with ({})"
        ).format(
            sql.Identifier("dev_schema", "testtab"),
            copy_options
        )
        # print(copy_cmd.as_string(connection))
        cursor.copy_expert(
            sql=copy_cmd,
            file=file_desc
        )

    connection.commit() # !!!
    connection.close()

Do not forget to commit!

Answered By: klin
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.