amazon-athena

boto3 get_query_runtime_statistics sometimes not returning "rows" data

Question: I have a Lambda that attempts to find out whether a previously executed Athena query returned any rows. To do so, I am using the boto3 function get_query_runtime_statistics and then extracting the "Rows" data:
response = athena_client.get_query_runtime_statistics(QueryExecutionId=query_id)
row_count = response["QueryRuntimeStatistics"]["Rows"]["OutputRows"]
However, in a previous …

Total answers: 2

Run AWS Athena query by Lambda: error name 'response' is not defined

Question: I created an AWS Lambda function with Python 3.9 to run an Athena query and get the query result:
import time
import boto3
# create Athena client
client = boto3.client('athena')
# create Athena query variable
query = 'select * from mydatabase.mytable limit …

Total answers: 1

Airflow Jinja Templating in params

Question: I have an Airflow operator for querying Athena that accepts a Jinja-templated file as the query input. Usually, I pass variables such as table and database names to the template for create table and add partition statements. This works fine for defined strings. My task definition …

Total answers: 1

Write pandas dataframe into AWS Athena database

Question: I have run a query using pyathena and created a pandas dataframe. Is there a way to write the pandas dataframe to an AWS Athena database directly, like data.to_sql for a MySQL database? Sharing an example of the dataframe code below for reference; I need to write it into AWS Athena …

Total answers: 4

How to Create Dataframe from AWS Athena using Boto3 get_query_results method

Question: I'm using AWS Athena to query raw data from S3. Since Athena writes the query output into an S3 output bucket, I used to do:
df = pd.read_csv(OutputLocation)
But this seems like an expensive approach. Recently I noticed the get_query_results method of boto3 which …

Total answers: 7