How can we use Python/Pandas to push data to Azure SQL Server?

Question:

I read the documentation in the two links below.

https://docs.microsoft.com/en-us/sql/relational-databases/system-stored-procedures/sp-set-database-firewall-rule-azure-sql-database?view=azuresqldb-current

https://www.virtual-dba.com/blog/firewalls-database-level-azure-sql/

I just setup my firewall ruls this past weekend and everything looks ok, I think. When I run the script below, I get the start and end IP addresses from my laptop.

USE [ryan_sql_db]
GO
SELECT * FROM sys.database_firewall_rules

I think we’re good here. Now, I am trying to login to the Azure database using the code below, loop through several CSV files in a folder, and push each one into my database.

import os
import glob
import pandas as pd
import ntpath

import urllib.parse
params = urllib.parse.quote_plus(
    'Driver=%s;' % '{ODBC Driver 17 for SQL Server}' +
    'Server=%s,1433;' % 'ryan-server.database.windows.net' +
    'Database=%s;' % 'ryan_sql_db' +
    'Uid=%s;' % 'MyUid'  +
    'Pwd={%s};' % 'MyPwd' +
    'Encrypt=no;' +
    'TrustServerCertificate=no;'
    )
    
from sqlalchemy.engine import create_engine
conn_str = 'mssql+pyodbc:///?odbc_connect=' + params
engine = create_engine(conn_str)
        
connection = engine.connect()
connection

    
# specifying the path to csv files
path = 'C:\Users\ryans\Desktop\test\'
csvname= ntpath.basename(path)
  
# csv files in the path
files = glob.glob(path + "*.csv")
  
# defining an empty list to store content
df = pd.DataFrame()
content = []
i = 1  

# checking all the csv files in the specified path
for filename in files:
    print(filename)
    df = pd.read_csv(filename, usecols=range(15), header=1)
    df['filename'] = os.path.basename(filename)
    content.append(df)
    df = pd.concat(content, ignore_index=True)  
    print(i)
    df = df.set_index('filename') 
    i = i + 1  

    try:
        df.to_sql('health', engine, if_exists='append', chunksize=100000, method=None,index=False)
    except Exception as e:
        print(e)

The code just runs and runs. It never finishes and it never throws an error. What am I doing wrong here? If I run the script below in Azure, I’m seeing 0 records, even after several minutes.

SELECT TOP (1000) * FROM [dbo].[health]

Query succeeded: Affected rows: 0

Asked By: ASH

||

Answers:

Have a look on the below code to send the data to SQL table which is working.

    import pyodbc
    import pandas as pd
    
    df = pd.read_csv("c:\user\usernamedepartment.csv")
    
    server = 'yourservername' 
    database = 'AdventureWorks' 
    username = 'username' 
    password = 'yourpassword' 
    cnxn = pyodbc.connect('DRIVER={SQL Server};SERVER='+server+';DATABASE='+database+';UID='+username+';PWD='+ password)
    cursor = cnxn.cursor()
    #Insert Dataframe into SQL Server:
    for index, row in df.iterrows():
         cursor.execute("INSERT INTO HumanResources.DepartmentTest (DepartmentID,Name,GroupName) values(?,?,?)", row.DepartmentID, row.Name, row.GroupName)
    cnxn.commit()
    cursor.close() 

Refer to this official doc for detailed explanation.

Answered By: SaiKarri-MT