"Insert Into" statement causing errors due to "Parameter 7 (""): The supplied value is not a valid instance of data type float."

Question:

I’m loading a batch of CSV files into a SQL Server table using Python one row at a time. The files each contain a number of free text fields and erroneous data which I trim and rename before attempting to insert.

In general (about 95% of the time), the code seems to work however exceptions appear with the error message described below.

I’m confused as a) I only have four columns in my table, and can’t understand why it would be looking for Parameter 7, and b) the text columns are being loaded into nvarchar(max) formatted columns, so I wouldn’t expect a data type error.

I’ve checked the source files to see which rows threw an error, there seems to be no discernible difference between the problem rows and others that are successfully loaded.

I’ve trimmed the process right back to only insert the JobID (as a bigint) and it works without issue, but as soon as I bring in the text fields, it causes an error.

I’m using Python 3.7.0 and loading into SQL Server 14.0

import numpy as np
import pyodbc
import os
import glob
import pandas as pd
import csv
import config
import urllib
import shutil
import codecs

path = "C:\myFilePath"

allFiles = glob.glob(os.path.join(path, "*.csv"))

for file_ in allFiles:
    df = pd.concat((pd.read_csv(f, encoding='utf8') for f in allFiles))

cnxn = pyodbc.connect("Driver={ODBC Driver 13 for SQL Server};"                                             
                  "Server=myServer;"
                  "Database=myDatabase;"
                  "Trusted_Connection=yes;"
                  "SelectMethod=cursor;"
                  )

df2 = df[['JobID', 'NPS_score', 'Obtuse_Column_Name_1', 'Obtuse_Column_Name_2']].copy()

df2.columns = ['JobID', 'Score','Q1', 'Q2']

cursor = cnxn.cursor()
for index,row in df2.iterrows():
    try:
        counter = counter + 1
        cursor.execute("""insert into [myDB].[dbo].[input_test]( [JobID], [Score], [Q1], [Q2]) VALUES (?, ?, ?, ?)""", row['JobID'],row['Score'],row['Q1'], row['Q2'])
        cursor.commit()
        print(counter)
    except Exception as e:
        print(e) 
        continue    
cursor.close()  
cnxn.close()

I expect the data to be loaded but on some lines get the following error code:

(‘42000’, ‘[42000] [Microsoft][ODBC Driver 13 for SQL Server][SQL
Server]The incoming tabular data stream (TDS) remote procedure call
(RPC) protocol stream is incorrect. Parameter 7 (“”): The supplied
value is not a valid instance of data type float. Check the source
data for invalid values. An example of an invalid value is data of
numeric type with scale greater than precision. (8023)
(SQLExecDirectW)’)

Asked By: Ron Sibayan

||

Answers:

I have found and solved the issue. Following @TomRon’s advise, I printed the insert statement into the error block. This showed that despite me assuming Q2 was populated (as it was in the CSV) when I loaded it into a dataframe, for some reason it was NaN.

I simply included the function to replace NaN with 0 and it now loads all rows succesfully using the following line of code right after I create the dataframe I want to load (df2)

df2 = df2.fillna(value=0)
Answered By: Ron Sibayan

I have come across similar kind of error while dumping the data to SQL table

I exported the data to .csv file and found out there are inf values in two columns.

So, I replaced inf values with nan and it worked 🙂

df2.replace({np.inf: np.nan, -np.inf: np.nan}, inplace=True)

and fillna(0) for replacing nan with 0

df2 = df2.fillna(0)

PS: This answer might help someone else too.

Answered By: Suhas_Pote

You should replace nan to None by below code

df = df.where(pd.notnull(df), None)
Answered By: Jitendra Garg
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.