ProgrammingError when inserting into a MySQL database using Python

Question:

I have a dataframe with about 200M rows. A sample looks like this:

Date         tableName    attributeName
29/03/2019   tableA       attributeA
....

I want to save the dataframe to a table in a MySQL database. This is what I've tried to insert the dataframe into the table:

def insertToTableDB(tableName,dataFrame):
    mysqlCon = mysql.connector.connect(host='localhost',user='root',passwd='')
    cursor = mysqlCon.cursor()
    for index, row in dataFrame.iterrows():
        myList =[row.Date, row.tableName, row.attributeName]
        query = "INSERT INTO `{0}`(`Date`, `tableName`, `attributeName`) VALUES (%s,%s,%s);".format(tableName)
        cursor.execute(query,myList)
        print(myList)
    try:
        mysqlCon.commit()
        cursor.close()        
        print("Done")
        return tableName,dataFrame
    except:
        cursor.close()
        print("Fail")

This code succeeded when I inserted a dataframe with 2M rows. But when I inserted a dataframe with 200M rows, I got an error like this:

File "C:\Users\User\Anaconda3\lib\site-packages\mysql\connector\cursor.py", line 569, in execute
self._handle_result(self._connection.cmd_query(stmt))

File "C:\Users\User\Anaconda3\lib\site-packages\mysql\connector\connection.py", line 553, in cmd_query
result = self._handle_result(self._send_cmd(ServerCmd.QUERY, query))

File "C:\Users\User\Anaconda3\lib\site-packages\mysql\connector\connection.py", line 442, in _handle_result
raise errors.get_exception(packet)

ProgrammingError: Unknown column 'nan' in 'field list'

As far as I can tell, my dataframe doesn't contain any 'nan' values. Could someone help me solve this problem?

Thank you so much.

Asked By: elisa

Answers:

Try these steps:

  1. Drop rows containing NaN using dropna.
  2. Filter out rows that contain the string 'nan'.
  3. Convert any remaining NaN to None.

df.dropna(inplace=True)

df = df[(df['Date'] != 'nan') & (df['tableName'] != 'nan') & (df['attributeName'] != 'nan')]

df1 = df.where(pd.notnull(df), None)
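
The idea behind converting NaN to None can also be sketched without pandas, on the plain value lists that the question's loop passes to cursor.execute. The likely cause of the error is that a float NaN ends up in the statement as the bare token nan, which MySQL then reads as a column name, whereas None is sent as SQL NULL. The helper name nan_to_none and the sample rows below are hypothetical, made up for illustration only.

```python
import math


def nan_to_none(row):
    """Return the row as a list with float NaN values replaced by None."""
    return [
        None if isinstance(value, float) and math.isnan(value) else value
        for value in row
    ]


# Hypothetical sample rows; the second one has a missing attributeName.
rows = [
    ["29/03/2019", "tableA", "attributeA"],
    ["29/03/2019", "tableA", float("nan")],
]

cleaned = [nan_to_none(row) for row in rows]
print(cleaned)
```

In the question's loop this would mean calling cursor.execute(query, nan_to_none(myList)) instead of passing myList directly. The isinstance check should also catch numpy.nan, since NumPy's float64 is a subclass of Python's float.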

Answered By: tawab_shakeel

Replace every NaN with the string 'empty':

df = df.replace(np.nan, 'empty')

Remember to import numpy first:

import numpy as np

Answered By: Danrley Pereira

df = df.astype(str) solved the problem for me, assuming you've already set up your table schema.

Answered By: hq2nguye