How to use Python to commit data in batches to a SQL Server database?

Question:

The data I want to insert into the database looks like this:

datalist =[['2012', '1', '3', '1', '832.0', '261.0', '100.00'],
            ['2012', '1', '5', '1', '507.0', '193.0', '92.50'],
            ['2012', '2', '3', '1', '412.0', '200.0', '95.00'],
            ['2012', '2', '5', '1', '560.0', '335.0', '90.00'],
            ['2012', '3', '3', '1', '584.0', '205.0', '100.00'],
            ['2012', '3', '5', '1', '595.0', '162.0', '92.50'],
            ['2012', '4', '3', '1', '504.0', '227.0', '100.00'],
            ['2012', '4', '5', '1', '591.0', '264.0', '92.50']]

In fact, datalist contains 500,000 rows; only a few of them are shown here.

The code I use to insert into the database looks like this:

import pymssql

server = '127.0.0.1'
user = "test"
password = "test"
database='SQLTest'
datalist = [['2012', '1', '3', '1', '832.0', '261.0', '100.00'],
            ['2012', '1', '5', '1', '507.0', '193.0', '92.50'],
            ['2012', '2', '3', '1', '412.0', '200.0', '95.00'],
            ['2012', '2', '5', '1', '560.0', '335.0', '90.00'],
            ['2012', '3', '3', '1', '584.0', '205.0', '100.00'],
            ['2012', '3', '5', '1', '595.0', '162.0', '92.50'],
            ['2012', '4', '3', '1', '504.0', '227.0', '100.00'],
            ['2012', '4', '5', '1', '591.0', '264.0', '92.50']]

# In fact, datalist contains 500,000 rows

try:
    conn = pymssql.connect(server, user, password, database)
    cursor = conn.cursor()
    for one_row in datalist:
        val1 = one_row[4]
        val2 = one_row[5]
        val3 = one_row[6]
        sql = "insert into table_for_test (col1, col2, col3) values (%s, %s, %s)"
        cursor.execute(sql, (val1, val2, val3))
        conn.commit()
except Exception as ex:
    conn.rollback()
    raise ex
finally:
    conn.close()

Because the amount of data is so large, I want to insert it in batches. How should I modify the code?

Asked By: user9270170


Answers:

One way to do this is to use the BULK INSERT statement.
https://learn.microsoft.com/en-us/sql/t-sql/statements/bulk-insert-transact-sql

The input must come from a file (e.g. CSV).

For example, if the data is in a CSV file:

BULK INSERT table_for_test
    FROM 'C:\user\admin\downloads\mycsv.csv'
    WITH (
        FIRSTROW=1
      , FIELDTERMINATOR=','
      , ROWTERMINATOR='\n'
    )
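
Since the data in the question lives in a Python list rather than a file, it would first have to be written out as CSV. The following is a minimal sketch of that step, not a definitive implementation: the helper names `write_rows_to_csv` and `build_bulk_insert` are hypothetical, and the sketch assumes the file path is one the SQL Server service itself can read, since BULK INSERT opens the file on the server side.

```python
import csv


def write_rows_to_csv(rows, path):
    """Write an iterable of row sequences to a CSV file with no header line."""
    # newline="" lets the csv module control line endings itself;
    # csv.writer ends each row with \r\n by default.
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)


def build_bulk_insert(table, path):
    """Return a BULK INSERT statement for a comma-separated file at `path`."""
    # Double any single quotes so the path is a valid T-SQL string literal.
    # The table name is inserted as-is, so it must come from trusted code.
    safe_path = str(path).replace("'", "''")
    return (
        f"BULK INSERT {table} "
        f"FROM '{safe_path}' "
        "WITH (FIRSTROW = 1, FIELDTERMINATOR = ',', ROWTERMINATOR = '\\n')"
    )
```

With an open connection, the statement would then be run with something like `cursor.execute(build_bulk_insert('table_for_test', path))` followed by `conn.commit()`. The file is written with CRLF line endings because, as far as I know, SQL Server treats a `'\n'` row terminator in bulk imports as CRLF; for a file with bare LF endings, `'0x0a'` is the usual alternative.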
Answered By: Saher Ahwal

Now I know how to do it.
Use executemany(). Each element of the list must be a tuple.

import pymssql

server = '127.0.0.1'
user = "test"
password = "test"
database='SQLTest'
datalist = [('2012', '1', '3', '1', '832.0', '261.0', '100.00'),
            ('2012', '1', '5', '1', '507.0', '193.0', '92.50'),
            ('2012', '2', '3', '1', '412.0', '200.0', '95.00'),
            ('2012', '2', '5', '1', '560.0', '335.0', '90.00'),
            ('2012', '3', '3', '1', '584.0', '205.0', '100.00'),
            ('2012', '3', '5', '1', '595.0', '162.0', '92.50'),
            ('2012', '4', '3', '1', '504.0', '227.0', '100.00'),
            ('2012', '4', '5', '1', '591.0', '264.0', '92.50')]

try:
    conn = pymssql.connect(server, user, password, database)
    cursor = conn.cursor()
    sql = "insert into table_for_test (col1, col2, col3, col4, col5, col6, col7) values (%s, %s, %s, %s, %s, %s, %s)"
    cursor.executemany(sql, datalist)
    conn.commit()
except Exception as ex:
    conn.rollback()
    raise ex
finally:
    conn.close()
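
The code above sends all rows and commits once at the end. To commit in batches, as the question asks, the list can be split into chunks with one executemany() call and one commit per chunk. This is a rough sketch under assumptions: `chunked` and `insert_in_batches` are hypothetical helper names, the default batch size of 1000 is arbitrary, and `conn` is assumed to be any DB-API 2.0 connection such as the one returned by pymssql.connect.

```python
from itertools import islice


def chunked(rows, size):
    """Yield successive lists of at most `size` rows from any iterable."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def insert_in_batches(conn, sql, rows, batch_size=1000):
    """Run executemany() once per batch and commit after each batch.

    Returns the number of rows sent to the database.
    """
    cursor = conn.cursor()
    total = 0
    try:
        for batch in chunked(rows, batch_size):
            # Convert each row to a tuple, so lists of lists are accepted too.
            cursor.executemany(sql, [tuple(row) for row in batch])
            conn.commit()
            total += len(batch)
    except Exception:
        # Earlier batches are already committed; only the failed one is undone.
        conn.rollback()
        raise
    return total
```

With pymssql this would be called as `insert_in_batches(conn, sql, datalist)`, using the same `%s` placeholders as in the statement above. Note that a failure partway through leaves the earlier batches in the table, which may or may not be what you want.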
Answered By: user9270170

executemany() is slow because it loops over the rows and calls execute() for each one.

The easiest way I found is to call execute() itself on batches of 1,000 rows.
For example, given data like this: data = [(1,"test1","123"),(2,"test2","543"),(3,"test3","876"),(4,"test4","098")]

you can insert it this way:

def bulk_batch_insertion(self, data, tablename):
    # Method of a class that holds an open connection (self.conn) and cursor (self.cursor).
    print("Total rows:", len(data))
    start = 0
    end = 1000  # rows per INSERT statement
    while data[start:end]:
        batch = data[start:end]
        print("start", start, "end", end)
        # join() needs strings, so render each tuple as text, e.g. (1, 'test1', '123').
        # Values are not escaped here, so only use this with trusted data.
        new_data = ','.join(str(row) for row in batch)
        print("Rows in this batch:", len(batch))
        query = f"INSERT INTO {tablename} VALUES {new_data}"
        print("Insert query:", query)
        self.cursor.execute(query)
        self.conn.commit()
        print("Successfully inserted the batch")
        start = end
        end = start + 1000
    print("Execution finished")
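
Pasting values directly into the SQL text breaks on strings that contain quotes and is open to SQL injection. A possible variation, sketched here under assumptions, keeps the single multi-row INSERT but uses `%s` placeholders and passes the values separately. The function name `build_multirow_insert` is hypothetical, and the table and column names are still inserted as plain text, so they must come from trusted code.

```python
def build_multirow_insert(table, columns, rows):
    """Return (sql, params) for one INSERT statement carrying several rows."""
    if not rows:
        raise ValueError("rows must not be empty")
    for row in rows:
        if len(row) != len(columns):
            raise ValueError("every row must have one value per column")
    # One "(%s, %s, ...)" group per row, joined into a single VALUES clause.
    row_placeholder = "(" + ", ".join(["%s"] * len(columns)) + ")"
    values_clause = ", ".join([row_placeholder] * len(rows))
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES {values_clause}"
    # Flatten the rows into one tuple, in the same order as the placeholders.
    params = tuple(value for row in rows for value in row)
    return sql, params
```

Each batch would then be sent with `cursor.execute(sql, params)`. As far as I know, SQL Server allows at most 1,000 rows in a single VALUES list, so batches should stay at or below that size.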

Note: I'm not using BULK INSERT because I need to apply some encoding and other manipulations to the data before passing it to the database.

Answered By: shadabB