Trying to upload a pandas dataframe using teradataml copy_to_sql function

Question:

I'm fairly new to uploading data to Teradata. The approach I know works is inserting row by row with INSERT statements, but I'd like to avoid that. I'm trying to upload my pandas DataFrame to Teradata directly and haven't succeeded yet. I've tried two methods. My preference is to get method 1 working, but I mainly want a working solution first.

1. teradataml module: copy_to_sql
2. teradata module: INSERT statement

Method 1: Create the table using the copy_to_sql function

from teradataml.dataframe.copy_to import copy_to_sql
from teradataml import create_context, remove_context

df_new  # some pandas DataFrame

table_name = "db.table"
copy_to_sql(df=df_new, table_name=table_name, primary_index="index", if_exists="replace")

Method 2: Add to an already created table using an INSERT statement

import numpy as np
import teradata

udaExec = teradata.UdaExec(appName=appname, version="1.0", logConsole=False)
connect = udaExec.connect(method="odbc", system=host, username=user,
                          password=passwrd)

num_of_chunks = 100
table_name = "db.table"
query = 'INSERT INTO ' + table_name + ' values(?,?,?,?,?);'
df_chunks = np.array_split(df_new2, num_of_chunks)
for chunk in df_chunks:
    # Convert each chunk to a list of row tuples for the batched insert
    data = [tuple(x) for x in chunk.to_records(index=False)]
    connect.executemany(query, data, batch=True)

**Method 1** gives the following access-related error. I'm not sure why the SQL statement adds quotes around the bolded table name below:
OperationalError: (teradatasql.OperationalError) [Version 16.20.0.48] [Session 5229096] [Teradata Database] [Error 3524] The user does not have CREATE TABLE access to database U378597.
[SQL: 
CREATE multiset TABLE **"db.table"** (
    "PBP" VARCHAR(1024) CHAR SET UNICODE, 
    recon VARCHAR(1024) CHAR SET UNICODE, 
    date2 TIMESTAMP(6), 
    "CF" FLOAT, 
    "index" VARCHAR(1024) CHAR SET UNICODE
)
primary index( "index" )

]
**Method 2** gives an error about inserting dates. I assume the datetime values need to be converted in some way to work with the Teradata table, but I'm unsure how:

DatabaseError: (6760, '[HY000] [Teradata][ODBC Teradata Driver][Teradata Database] Invalid timestamp. ')
Asked By: Joe Moss


Answers:

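The table_name argument is treated as an unqualified name, so "db.table" is quoted as a single identifier and the table is created in your default database. To specify the Teradata “database” in which the table should be created, use the separate schema_name parameter.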
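
A minimal sketch of that fix, assuming a teradataml context has already been created and that you have CREATE TABLE rights on the target database. The names split_qualified_name and upload are hypothetical helpers for illustration only; schema_name is the copy_to_sql parameter mentioned above.

```python
def split_qualified_name(qualified_name):
    """Split "db.table" into ("db", "table"); return (None, name) if unqualified.

    Hypothetical helper: splits on the first dot only and does not handle
    quoted identifiers that themselves contain dots.
    """
    schema, sep, table = qualified_name.partition(".")
    if not sep:
        return None, qualified_name
    return schema, table


def upload(df, qualified_name, primary_index=None):
    # Imported here so the helper above works without teradataml installed.
    from teradataml import copy_to_sql

    schema, table = split_qualified_name(qualified_name)
    copy_to_sql(
        df=df,
        table_name=table,      # unqualified table name only
        schema_name=schema,    # target database goes here
        primary_index=primary_index,
        if_exists="replace",
    )
```

With the question's values this would be called as upload(df_new, "db.table", primary_index="index"), which passes table_name="table" and schema_name="db".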

And for “method 2”, consider using the teradatasql package instead of teradata. Or I suppose you could .isoformat(' ') the timestamp.
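
A rough sketch of the .isoformat(' ') idea, using only the standard library. It assumes each row is a tuple whose timestamp values are datetime.datetime objects (pandas.Timestamp is a subclass, so it qualifies). The function name format_timestamps is hypothetical. Values produced by to_records() are numpy.datetime64 instead and would pass through unchanged, so the rows would have to be built another way, for example with df.itertuples(index=False, name=None). Whether the string form clears the "Invalid timestamp" error depends on the driver and the column type.

```python
from datetime import datetime


def format_timestamps(rows):
    """Return rows with every datetime value replaced by 'YYYY-MM-DD HH:MM:SS[.ffffff]'."""
    return [
        tuple(v.isoformat(" ") if isinstance(v, datetime) else v for v in row)
        for row in rows
    ]


rows = [("a", datetime(2020, 1, 2, 3, 4, 5), 1.5)]
print(format_timestamps(rows))  # [('a', '2020-01-02 03:04:05', 1.5)]
```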

Answered By: Fred

Here is my preferred way to connect to Teradata:

import getpass

import teradataml as tdml  # Teradata Python library
conn = tdml.create_context(host="hostname:port", username="USERNAME", password=getpass.getpass('Password:'), logmech='LDAP')

Use copy_to_sql for small datasets and fastload() for large ones: https://docs.teradata.com/r/Teradata-Package-for-Python-User-Guide/May-2022/teradataml-General-Functions/Data-Transfer-Utility/Saving-DataFrame-to-Vantage/fastexport

tdml.copy_to_sql(df, table_name='TableName', if_exists='replace')

from teradataml.dataframe.fastload import fastload
fastload(df = df, table_name = 'TableName')
Answered By: Jeremy Yu