data-cleaning

Polars: Fill missing months per group

Question: I want to fill in missing months in a data frame per group based on the minimum and maximum date in each group. This approach works but uses polars.apply.
    import polars as pl
    import numpy as np
    from datetime import date
    DATA_SIZE = 10000000
    raw_df = pl.DataFrame({ "id": …

Total answers: 1

Python function not accepting the second argument

Question: Sample data frame:
    df = pd.DataFrame({'City': ['York New', 'Parague', 'New Delhi', 'Venice', 'new Orleans'], 'Event': ['Music', 'Poetry', 'Theatre', 'Comedy', 'Tech_Summit'], 'Cost': [10000, 5000, 15000, 2000, 12000]})
    index_ = [pd.Period('02-2018'), pd.Period('04-2018'), pd.Period('06-2018'), pd.Period('10-2018'), pd.Period('12-2018')]
    df.index = index_
    print(df)
Problem statement: For those cities which start with the keyword 'New' …

Total answers: 2

Replacing null values with multiple values using different conditions in python

Question: The dataset contains 1581 rows and 14 columns and is loaded as df. The important columns for this problem are Partner_working (values: Yes, No) and Partner_salary. There are 106 null values in the Partner_salary column. I have to replace these null values with the median …

Total answers: 1

Automate fractal like nested JSON normalization

Question: The problem: I have 100+ JSON files with a fractal-like structure of lists of dicts. The width and the height of the data structure vary a lot from one JSON to another. Each label is part of a sentence. test = [ { "label": "I", "children": [ …

Total answers: 1

How to separate Numbers from string and move them to next column in Python?

Question: I am working on share market data, and in some rows the market cap has shifted into the previous column. I am trying to fetch those values into the next column, but the value it’s returning is completely different. This is the code …

Total answers: 2

How to remove rows with null values from a column?

Question: I have a small dataframe with null values in columns.
    Movie        Duration
    Avatar       178
    Spectre
    John Carter  132
    Tangled
    Titanic      195
I can remove rows with null values for one column at a time with this command:
    df.drop(df[df['duration'].isnull()].index)
But, suppose I had a large …

Total answers: 1
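
A minimal sketch of one common way to handle this, assuming the goal is to drop rows that have nulls in one or more chosen columns. The data mirrors the small example in the question; the variable names `cleaned` and `cleaned_any` are illustrative only.

```python
import pandas as pd

# Sample data rebuilt from the question; None marks the missing durations.
df = pd.DataFrame({
    "Movie": ["Avatar", "Spectre", "John Carter", "Tangled", "Titanic"],
    "Duration": [178, None, 132, None, 195],
})

# Drop rows with a null in the listed column(s); extend `subset` to cover more columns.
cleaned = df.dropna(subset=["Duration"])

# Drop rows that have a null in any column at all.
cleaned_any = df.dropna()
```

`dropna` returns a new frame and leaves `df` unchanged, so the result has to be assigned if it is needed later.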

Pandas Dataframe: Extract info from specific series

Question: I have this dataframe and need to extract the package info (ML, KG, PZA, LT, UN, etc.) from the description column, and I’m pretty new to pandas. This is the dataframe right now: SKU Description 1 TRIDENT 6S SANDIA 9GR 2 CANAST RABBIT F1 A 1UN 3 HAND SOAP …

Total answers: 1

Apply a function to each line of a CSV file in python

Question: I have a regular expression that I want to apply to each line of a CSV file. Here’s the function, which basically removes all commas encountered before a single-digit number. The function works perfectly fine on a string. Input: …

Total answers: 1
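
A rough sketch of applying a substitution to every line, since the question's own function and input are cut off above. The pattern is an assumption: it treats "a comma before a single-digit number" as a comma followed by exactly one digit that is not part of a longer number. The names `strip_commas` and `clean_lines` are hypothetical, and an in-memory buffer stands in for the CSV file.

```python
import io
import re

# Assumed pattern: a comma followed by one digit that is not followed by another digit.
PATTERN = re.compile(r",(?=\d(?!\d))")

def strip_commas(line: str) -> str:
    """Remove commas that directly precede a single-digit number."""
    return PATTERN.sub("", line)

def clean_lines(handle):
    """Apply strip_commas to each line of an open text file or file-like object."""
    return [strip_commas(line.rstrip("\n")) for line in handle]

# In-memory stand-in for a CSV file.
sample = io.StringIO("abc,1,xyz\nfoo,22,bar\n")
cleaned = clean_lines(sample)
```

With a real file, the same function can be given the handle from `open(...)` in place of the `io.StringIO` buffer.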