Setting Snowflake converter_class Still Converts to Python Data Types
Question:
In the docs for the Python Snowflake connector, it says that setting the connection parameter converter_class when creating the connection object can be used to suppress conversion to Python types (it leaves the data as strings). But I see no difference between queries run with the following two connections (using snowflake-connector-python=2.7.0):
import snowflake.connector
from snowflake.connector.converter_null import SnowflakeNoConverterToPython

# Connection that is supposed to skip conversion to Python types
DBH1 = snowflake.connector.connect(
    user='username',
    password='password',
    account='account',
    converter_class=SnowflakeNoConverterToPython  # why isn't this working?
)

# Default connection, for comparison
DBH2 = snowflake.connector.connect(
    user='username',
    password='password',
    account='account'
)
Queries executed from both DBH1 and DBH2 return timestamps as Python datetime objects, not strings. I noticed that the doc on snowflake.connector parameters makes no mention of a converter_class option; this trick is only listed in the "optimizing data pulls" section here: https://docs.snowflake.com/en/user-guide/python-connector-example.html#improving-query-performance-by-bypassing-data-conversion. Is it possible that this feature was dropped without the docs being cleaned up?
Answers:
When this feature was first added, it was meant only for the JSON result set format. Since then we migrated the result set to ARROW, and for this format converter_class indeed has no effect (ARROW is now the default format).
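To demonstrate, I use this code:
import snowflake.connector
from snowflake.connector.converter_null import SnowflakeNoConverterToPython

# USER, PASSWORD, ACCOUNT, etc. are placeholders for your own settings
ctx = snowflake.connector.connect(
    user=USER,
    password=PASSWORD,
    account=ACCOUNT,
    role=ROLE,
    database=DATABASE,
    schema=SCHEMA,
    warehouse=WAREHOUSE,
    converter_class=SnowflakeNoConverterToPython
)
cs = ctx.cursor()
try:
    cs.execute("SELECT CURRENT_TIMESTAMP()")
    res = cs.fetchone()
    print(f'{res[0]}')              # the value itself
    print(type(res[0]))             # its Python type
    print(isinstance(res[0], str))  # True only if conversion was skipped
finally:
    cs.close()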
With the default ARROW format, this prints:
2021-11-24 21:34:44.314000+13:00
<class 'datetime.datetime'>
False
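Now I change the result set format back to the original JSON by running one extra statement before the query, on the same connection:
try:
    cs.execute("alter session set python_connector_query_result_format='JSON'")
    cs.execute("SELECT CURRENT_TIMESTAMP()")
    # ... fetch and print res[0] exactly as before
finally:
    cs.close()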
This time I get:
1637742958.657000000
<class 'str'>
True
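With conversion bypassed, the timestamp arrives as a raw string, so any column you still need as a datetime has to be parsed by hand. The following is only a minimal sketch: it assumes the raw value is seconds since the Unix epoch with a fractional part of up to nine digits, as the output above suggests, and the helper name raw_timestamp_to_datetime is hypothetical (it is not part of the connector). Other column types, such as TIMESTAMP_TZ, may use a different raw layout and are not handled here.

```python
from datetime import datetime, timedelta, timezone

# Unix epoch as a timezone-aware datetime in UTC
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def raw_timestamp_to_datetime(raw: str) -> datetime:
    """Parse an unconverted value such as '1637742958.657000000'.

    Hypothetical helper: assumes non-negative epoch seconds with an
    optional fractional part of up to nine digits.
    """
    seconds, _, fraction = raw.partition(".")
    # Pad or truncate the fraction to nanoseconds, then drop to microseconds,
    # which is the finest resolution datetime supports.
    nanos = int((fraction + "0" * 9)[:9])
    return EPOCH + timedelta(seconds=int(seconds), microseconds=nanos // 1000)


print(raw_timestamp_to_datetime("1637742958.657000000"))
```

Working on the string pieces instead of calling float() avoids rounding errors in the fractional seconds. The result is in UTC; call astimezone() on it if you need the session's local offset.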
The ARROW format has several advantages over JSON and you can read more here