etl

google translate API bottleneck

Question: I’m currently working on an ETL pipeline that takes too long to run. After checking which part of the code takes the longest, I found this: I’m using the Google Cloud Translate API to translate keywords that don’t have translations in my db, but I’m running into a bottleneck …

Total answers: 1
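
The excerpt is cut off before the bottleneck is described, so the following is only a sketch of one common mitigation: translate each distinct keyword once and send keywords in batches rather than one request per keyword. `translate_missing`, `translate_batch`, and `fake_translate_batch` are hypothetical names; the real `translate_batch` would wrap whatever batch request the translation client offers.

```python
from typing import Callable, Dict, Iterable, List


def translate_missing(
    keywords: Iterable[str],
    translate_batch: Callable[[List[str]], List[str]],
    batch_size: int = 100,
) -> Dict[str, str]:
    """Translate each distinct keyword once, sending them in batches."""
    # dict.fromkeys removes duplicates while keeping first-seen order.
    unique = list(dict.fromkeys(keywords))
    translations: Dict[str, str] = {}
    for start in range(0, len(unique), batch_size):
        batch = unique[start:start + batch_size]
        # One call per batch instead of one call per keyword.
        for source, target in zip(batch, translate_batch(batch)):
            translations[source] = target
    return translations


# Hypothetical stand-in for the real API call; it records every batch it receives.
calls: List[List[str]] = []


def fake_translate_batch(batch: List[str]) -> List[str]:
    calls.append(batch)
    return [text.upper() for text in batch]


result = translate_missing(
    ["chat", "chien", "chat", "oiseau"], fake_translate_batch, batch_size=2
)
```

With four input keywords, one of them a duplicate, the stand-in is called twice instead of four times. Whether this helps in the original pipeline depends on where the time is actually spent.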

Pass returned value from a previous Python operator task to another in airflow

Question: I am a new user to Apache Airflow. I am building a DAG like the following to schedule tasks: def add(): return 1 + 1 def multiply(a): return a * 999 dag_args = { 'owner': 'me', 'depends_on_past': False, 'start_date': datetime(2023, 2, …

Total answers: 1

Connect to Dynamics Database through Python and sqlalchemy

Question: I’m new to Dynamics and don’t understand why I need to pay for a connector to my database. Is it possible to make the following link for free? I looked online for alternative packages and options, only to find there is a cost for all …

Total answers: 1

macros are not recognised in dbt

Question: {{ config ( pre_hook = before_begin("{{audit_tbl_insert(1,'stg_news_sentiment_analysis_incr') }}"), post_hook = after_commit("{{audit_tbl_update(1,'stg_news_sentiment_analysis_incr','dbt_development','news_sentiment_analysis') }}") ) }} select rd.news_id ,rd.title, rd.description, ns.sentiment from live_crawler_output_rss.rss_data rd left join live_crawler_output_rss.news_sentiment ns on rd.news_id = ns.data_id limit 10000; This is my dbt model, which is configured with pre- and post-hooks that reference a …

Total answers: 2

split array from csv into columns

Question: I have this CSV data: number,event_date,event_timestamp,event_name,event_params 0,20220315,1668314165054758,eventTracking,"[{'key': 'test0', 'value': {'string_value': None, 'int_value': 1665374225, 'float_value': None, 'double_value': None}} {'key': 'test1', 'value': {'string_value': None, 'int_value': 0, 'float_value': None, 'double_value': None}} {'key': 'test2', 'value': {'string_value': 'http:test.com', 'int_value': None, 'float_value': None, 'double_value': None}} {'key': 'test3', 'value': {'string_value': '[email protected]', 'int_value': None, 'float_value': None, …

Total answers: 1

Explode column of objects python pandas dataframe

Question: I am trying to explode a column to create new rows within a pandas DataFrame. What would be the best approach to this? Input: SKU Quantity Name YY-123-671 5 drawer YY-345-111-WH,YY-345-111-RD,YY-345-111-BL 10 desk LK-896-001 1 lamp Desired Output: SKU Quantity Name YY-123-671 5 drawer YY-345-111-WH 10 desk …

Total answers: 1
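
A minimal sketch of one way to get that shape, assuming the SKU column holds comma-separated strings and that each SKU should get its own row with Quantity and Name repeated (the desired output in the excerpt is truncated, so the remaining rows are an assumption). It splits the string into a list and then uses `DataFrame.explode`.

```python
import pandas as pd

# Sample rows copied from the input shown in the question.
df = pd.DataFrame({
    "SKU": ["YY-123-671", "YY-345-111-WH,YY-345-111-RD,YY-345-111-BL", "LK-896-001"],
    "Quantity": [5, 10, 1],
    "Name": ["drawer", "desk", "lamp"],
})

exploded = (
    df.assign(SKU=df["SKU"].str.split(","))  # turn each SKU string into a list
      .explode("SKU")                        # one row per list element
      .reset_index(drop=True)                # drop the repeated index labels
)
```

Rows with a single SKU pass through unchanged, because splitting them yields a one-element list.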

How to transform Row data into column data using pandas?

Question: I exported many reports from my system as xls files in the same specific format and need to change them to another format. Basically, for every item description I need to insert the corresponding Account series (it is in column J) using pandas. Data CP …

Total answers: 2

Calling an inner function in Python

Question: I have this final main.py that combines every function I wrote separately, but I can’t make it work. It returns the Success at the end, but it does nothing in either my local folders or MongoDB. The function is this one: def gw2_etl(url): def log_scrape(url): HEADERS …

Total answers: 1
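
The code in the excerpt is truncated, so the cause cannot be confirmed from it. One common reason for this symptom is that nested functions are defined inside the outer function but never called; a `def` statement only creates the function object. The sketch below illustrates that behaviour with hypothetical names (`run_etl`, `extract`, `load`), not the asker's actual code.

```python
def run_etl(url):
    steps_run = []

    def extract(url):
        steps_run.append("extract")
        return {"source": url}

    def load(record):
        steps_run.append("load")
        return record

    # Defining extract and load above does not execute them;
    # they only run because they are called here.
    record = extract(url)
    load(record)
    return "Success", steps_run


status, steps = run_etl("https://example.com/data")
```

If the two calls at the end of `run_etl` were removed, the function would still return "Success" while `steps_run` stayed empty, which matches the behaviour described in the question.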

Airflow 2.0 task getting skipped after BranchPython Operator

Question: I’m fiddling with branches in the new version of Airflow and, no matter what I try, all the tasks after the BranchOperator get skipped. Here is a minimal example of what I’ve been trying to accomplish: from airflow.decorators import dag, task from datetime import timedelta, datetime …

Total answers: 2

Pandas Dataframe: Extract info from specific series

Question: I have this dataframe and need to extract package info (ML, KG, PZA, LT, UN, etc.) from the Description column, and I’m pretty new to pandas. This is the dataframe right now: SKU Description 1 TRIDENT 6S SANDIA 9GR 2 CANAST RABBIT F1 A 1UN 3 HAND SOAP …

Total answers: 1