ProgrammingError when inserting into a MySQL database using Python
Question:
I have a dataframe with about 200M rows; for example:
Date tableName attributeName
29/03/2019 tableA attributeA
....
and I want to save the dataframe to a table in a MySQL database. This is what I’ve tried in order to insert the dataframe into the table:
def insertToTableDB(tableName, dataFrame):
    mysqlCon = mysql.connector.connect(host='localhost', user='root', passwd='')
    cursor = mysqlCon.cursor()
    for index, row in dataFrame.iterrows():
        myList = [row.Date, row.tableName, row.attributeName]
        query = "INSERT INTO `{0}`(`Date`, `tableName`, `attributeName`) VALUES (%s,%s,%s);".format(tableName)
        cursor.execute(query, myList)
        print(myList)
    try:
        mysqlCon.commit()
        cursor.close()
        print("Done")
        return tableName, dataFrame
    except:
        cursor.close()
        print("Fail")
This code worked when I inserted a dataframe with 2M rows. But when I inserted the dataframe with 200M rows, I got an error like this:
File "C:\Users\User\Anaconda3\lib\site-packages\mysql\connector\cursor.py", line 569, in execute
    self._handle_result(self._connection.cmd_query(stmt))
File "C:\Users\User\Anaconda3\lib\site-packages\mysql\connector\connection.py", line 553, in cmd_query
    result = self._handle_result(self._send_cmd(ServerCmd.QUERY, query))
File "C:\Users\User\Anaconda3\lib\site-packages\mysql\connector\connection.py", line 442, in _handle_result
    raise errors.get_exception(packet)
ProgrammingError: Unknown column 'nan' in 'field list'
My dataframe doesn’t have any ‘nan’ values. Could someone help me solve this problem?
Thank you so much.
Answers:
Try these steps:
- Drop rows containing NaN using dropna.
- Filter out rows that contain the string 'nan'.
- Convert NaN into None.
df.dropna(inplace=True)
df = df[(df['Date'] != 'nan') & (df['tableName'] != 'nan') & (df['attributeName'] != 'nan')]
df1 = df.where(pd.notnull(df), None)
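The point of the last conversion is that mysql-connector sends a Python None as SQL NULL, whereas a float NaN appears to end up in the statement as the bare, unquoted token nan, which would explain the "Unknown column 'nan'" error. A minimal pandas-only sketch of the conversion (toy data reusing the question's column names, no database needed):

```python
import numpy as np
import pandas as pd

# Toy frame with one missing value, mimicking the question's columns.
df = pd.DataFrame({
    "Date": ["29/03/2019", "30/03/2019"],
    "tableName": ["tableA", np.nan],
    "attributeName": ["attributeA", "attributeB"],
})

# Cast to object first so None survives, then swap every NaN for None.
clean = df.astype(object).where(pd.notnull(df), None)

# These are the tuples that would be passed to cursor.execute().
rows = [tuple(r) for r in clean.itertuples(index=False, name=None)]
print(rows[1])  # ('30/03/2019', None, 'attributeB')
```

With None in the parameter list, the second row inserts a NULL instead of raising ProgrammingError.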
Replace every NaN with the string ’empty’:
df = df.replace(np.nan, 'empty')
Remember to:
import numpy as np
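A quick sketch of what that replace does (toy data, no database; the string 'empty' is then quoted by the connector like any other string parameter):

```python
import numpy as np
import pandas as pd

# One column with a missing value, as in the question.
df = pd.DataFrame({"attributeName": ["attributeA", np.nan]})

# Every NaN becomes the literal string 'empty'.
df = df.replace(np.nan, 'empty')
print(df["attributeName"].tolist())  # ['attributeA', 'empty']
```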
df = df.astype(str)
solved the problem for me, assuming you’ve already set up your table schema.
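This presumably works because astype(str) turns every NaN into the string 'nan', which gets quoted like any other string parameter; the trade-off is that the table then stores the literal text 'nan' rather than NULL. A small sketch (toy data, no database):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"tableName": ["tableA", np.nan]})

# astype(str) stringifies the whole column, including the NaN.
df = df.astype(str)
print(df["tableName"].tolist())  # ['tableA', 'nan']
```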