Pandas to_sql doesn't insert any data in my table
Question:
I am trying to insert some data in a table I have created.
I have a data frame that looks like this:
I created a table:
create table online.ds_attribution_probabilities
(
attribution_type text,
channel text,
date date ,
value float
)
And I am running this python script:
engine = create_engine("postgresql://@e.eu-central-1.redshift.amazonaws.com:5439/mdhclient_encoding=utf8")
connection = engine.raw_connection()
result.to_sql('online.ds_attribution_probabilities', con=engine, index = False, if_exists = 'append')
I get no error, but when I check there are no data in my table. What can be wrong? Do I have to commit or do an extra step?
Answers:
Try to specify a schema name:
result.to_sql('ds_attribution_probabilities', con=engine,
schema='online', index=False, if_exists='append')
I faced the same problem when I used .connect()
and .begin()
with engine.connect() as conn, conn.begin():
dataframe.to_sql(name='table_name', schema='schema',
con=conn, if_exists='append', index=False)
conn.close()
Just remove the .connect()
and .begin()
and it will work.
This could happen because it defaults to the public database, and there’s probably a table with that name under the public database/schema, with your data in it.
@MaxU’s answer does help some, but not the others.
For others, here is something else you can try:
When you create the engine, specify the schemaname like this:
engine = create_engine(*<connection_string>*,
connect_args={'options': '-csearch_path={}'.format(*<dbschema_name>*)})
I had a similar issue caused by the fact that I was passing sqlalchemy connection object instead of engine object to the con parameter. In my case tables were created but left empty.
Check the autocommit
setting: https://docs.sqlalchemy.org/en/latest/core/connections.html#understanding-autocommit
engine.execute(text("SELECT my_mutating_procedure()").execution_options(autocommit=True))
Hopefully this helps someone else. to_sql
will fail silently in the form of what looks like a successful insert if you pass a connection object. This is definitely true for Postgres, but i assume the same for others as well, based on the method docs:
con : sqlalchemy.engine.Engine or sqlite3.Connection
Using SQLAlchemy makes it possible to use any DB supported by that
library. Legacy support is provided for sqlite3.Connection objects.
This got me because the typing hints stated Union[Engine, Connection]
, which is “technically” true.
If you have a session with SQLAlchemy
try passing con=session.get_bind(),
In my case, writing data to the database was hampered by the fast option.
Why is this not fast loading interfering, I have not yet figured out.
This code doesn’t work:
engine = sqlalchemy.create_engine("mssql+pyodbc:///?odbc_connect={}".format(db_params), fast_executemany=True)
df.to_sql('tablename', engine, index=False, schema = 'dbo', if_exists='replace' )
Without fast_executemany=True
the code works well.
use method=None
None : Uses standard SQL INSERT clause (one per row).
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_sql.html
Mine worked like this:
df.to_sql(name=table_name, method=None, schema=schema, index=False, if_exists='append', chunksize=50, con=conn.get_bind())
*table_name without prepending the schema name
I am trying to insert some data in a table I have created.
I have a data frame that looks like this:
I created a table:
create table online.ds_attribution_probabilities
(
attribution_type text,
channel text,
date date ,
value float
)
And I am running this python script:
engine = create_engine("postgresql://@e.eu-central-1.redshift.amazonaws.com:5439/mdhclient_encoding=utf8")
connection = engine.raw_connection()
result.to_sql('online.ds_attribution_probabilities', con=engine, index = False, if_exists = 'append')
I get no error, but when I check there are no data in my table. What can be wrong? Do I have to commit or do an extra step?
Try to specify a schema name:
result.to_sql('ds_attribution_probabilities', con=engine,
schema='online', index=False, if_exists='append')
I faced the same problem when I used .connect()
and .begin()
with engine.connect() as conn, conn.begin():
dataframe.to_sql(name='table_name', schema='schema',
con=conn, if_exists='append', index=False)
conn.close()
Just remove the .connect()
and .begin()
and it will work.
This could happen because it defaults to the public database, and there’s probably a table with that name under the public database/schema, with your data in it.
@MaxU’s answer does help some, but not the others.
For others, here is something else you can try:
When you create the engine, specify the schemaname like this:
engine = create_engine(*<connection_string>*,
connect_args={'options': '-csearch_path={}'.format(*<dbschema_name>*)})
I had a similar issue caused by the fact that I was passing sqlalchemy connection object instead of engine object to the con parameter. In my case tables were created but left empty.
Check the autocommit
setting: https://docs.sqlalchemy.org/en/latest/core/connections.html#understanding-autocommit
engine.execute(text("SELECT my_mutating_procedure()").execution_options(autocommit=True))
Hopefully this helps someone else. to_sql
will fail silently in the form of what looks like a successful insert if you pass a connection object. This is definitely true for Postgres, but i assume the same for others as well, based on the method docs:
con : sqlalchemy.engine.Engine or sqlite3.Connection
Using SQLAlchemy makes it possible to use any DB supported by that
library. Legacy support is provided for sqlite3.Connection objects.
This got me because the typing hints stated Union[Engine, Connection]
, which is “technically” true.
If you have a session with SQLAlchemy
try passing con=session.get_bind(),
In my case, writing data to the database was hampered by the fast option.
Why is this not fast loading interfering, I have not yet figured out.
This code doesn’t work:
engine = sqlalchemy.create_engine("mssql+pyodbc:///?odbc_connect={}".format(db_params), fast_executemany=True)
df.to_sql('tablename', engine, index=False, schema = 'dbo', if_exists='replace' )
Without fast_executemany=True
the code works well.
use method=None
None : Uses standard SQL INSERT clause (one per row).
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_sql.html
Mine worked like this:
df.to_sql(name=table_name, method=None, schema=schema, index=False, if_exists='append', chunksize=50, con=conn.get_bind())
*table_name without prepending the schema name