pandas

NaN values created when joining two dataframes

Question: I am trying to one-hot encode data using the scikit-learn library, with a dataset from Kaggle: https://www.kaggle.com/datasets/rkiattisak/salaly-prediction-for-beginer X is a two-column dataframe of the age and years-of-experience columns, with the rows containing null values removed using dropna(). My goal is to one-hot encode …

Total answers: 1

Python Groupby columns and apply function

Question: I have a dataframe that looks like this, which contains all divisions and both conferences from 2000-2022.
Tm        Conference  Division  W-L%   Year
Bills     AFC         East      0.813  2022
Dolphins  AFC         East      0.529  2022
Patriots  AFC         East      0.471  2022
Jets      AFC         East      0.412  2022
Cowboys   NFC         East      0.706  2022
Giants    …

Total answers: 2

Pandas truncates strings in numpy list

Question: Consider the following minimal example:
    @dataclass
    class ExportEngine:
        def __post_init__(self):
            self.list = pandas.DataFrame(columns=list(MyObject.CSVHeaders()))

        def export(self):
            self.prepare()
            self.list.to_csv("~/Desktop/test.csv")

        def prepare(self):
            values = numpy.concatenate(
                (
                    numpy.array(["Col1Value", "Col2Value", " Col3Value", "Col4Value"]),
                    numpy.repeat("", 24),
                )
            )
            for x in range(8):  # not the best way, but done due to other constraints
                start = …

Total answers: 2
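
One possible cause of this kind of truncation, offered as a guess because the question is cut off: `numpy.concatenate` on string arrays returns a fixed-width Unicode array, so a longer string assigned into it later is silently clipped. The sketch below reuses the array from the question; the longer string and the `object` variant are illustrative assumptions, not the asker's code.

```python
import numpy

# Same construction as in the question: the result has a fixed-width dtype.
values = numpy.concatenate(
    (
        numpy.array(["Col1Value", "Col2Value", " Col3Value", "Col4Value"]),
        numpy.repeat("", 24),
    )
)
fixed_dtype = values.dtype  # width is set by the longest input string (10 characters)

# Hypothetical longer value: it is clipped to the array's width on assignment.
values[4] = "A much longer string"
clipped = values[4]

# Assumed workaround: an object array stores arbitrary Python strings.
flexible = values.astype(object)
flexible[4] = "A much longer string"
kept = flexible[4]
```

If this is the cause, converting to `object` before filling in the values (or building a plain Python list) would avoid the clipping before the data ever reaches pandas.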

Create column that orders ID by first Start Date

Question: Imagine I have the following dataframe:
ID  Start Date
1   1990-01-01
1   1990-01-01
1   1991-01-01
2   1991-01-01
2   1990-01-01
3   2002-01-01
3   2000-01-01
4   1991-01-01
What would be the best way to create a column named Order that, for each unique ID in the ID …

Total answers: 1

How to drop_duplicates but keep a specified value in a pandas dataframe?

Question:
   date                 price_bl  price_ss  price_bs  price_sl
0  2022-03-09 03:00:00  41198.5   NaN       NaN       NaN
0  2022-03-10 01:00:00  NaN       NaN       NaN       40931.0
0  2022-03-10 01:00:00  NaN       NaN       40931.0   NaN
1  2022-03-16 02:00:00  40867.8   NaN       NaN       NaN
0  2022-03-16 02:00:00  NaN       40867.8   NaN       …

Total answers: 1

Sorting CSV with IP address column in pandas

Question: I have a CSV file with a column of IP addresses, MAC addresses and some other data. I want to sort all of the data by the IP addresses in ascending order. Input: | IP Address | MAC Address | ID | | — | — …

Total answers: 1
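
A minimal sketch of one way to approach this, assuming IPv4 addresses stored as strings in a column named `IP Address` (the name from the question's header row). Sorting the strings directly would put `10.0.0.10` before `10.0.0.2`, so the sort key converts each address to an integer with the standard-library `ipaddress` module. The sample rows are made up for illustration, and the `key` argument of `sort_values` needs pandas 1.1 or later.

```python
import ipaddress

import pandas as pd

# Hypothetical sample data; a real run would start from pd.read_csv(...).
df = pd.DataFrame(
    {
        "IP Address": ["10.0.0.10", "10.0.0.2", "192.168.1.1", "10.0.0.1"],
        "ID": [1, 2, 3, 4],
    }
)

# Sort numerically by converting each address to its integer value.
sorted_df = df.sort_values(
    by="IP Address",
    key=lambda col: col.map(lambda ip: int(ipaddress.ip_address(ip))),
).reset_index(drop=True)
```

The other columns travel with their rows, so the whole table ends up ordered by address.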

pandas calculate returns between two dates for multiple data points

Question: I have a dataframe with the following columns:
Date        Identifier             Price
28/02/2023  BBA LIBOR USD 3 MONTH  55
31/01/2023  BBA LIBOR USD 3 MONTH  63
28/02/2023  BBA LIBOR USD 1 Month  32
31/01/2023  BBA LIBOR USD 1 Month  59
28/02/2023  MSCI All Country World …

Total answers: 2
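
A rough sketch of one approach, assuming "return" means the simple percentage change in Price between consecutive dates for each Identifier (the question is truncated, so that definition is a guess). It uses the four complete rows from the excerpt; the `Return` column name is hypothetical.

```python
import pandas as pd

# The four complete rows visible in the question excerpt.
df = pd.DataFrame(
    {
        "Date": ["28/02/2023", "31/01/2023", "28/02/2023", "31/01/2023"],
        "Identifier": [
            "BBA LIBOR USD 3 MONTH",
            "BBA LIBOR USD 3 MONTH",
            "BBA LIBOR USD 1 Month",
            "BBA LIBOR USD 1 Month",
        ],
        "Price": [55, 63, 32, 59],
    }
)

# Parse day-first dates so that sorting is chronological, not alphabetical.
df["Date"] = pd.to_datetime(df["Date"], format="%d/%m/%Y")
df = df.sort_values(["Identifier", "Date"])

# Simple return per identifier: (later price / earlier price) - 1.
# The first date of each identifier has no earlier price, so it stays NaN.
df["Return"] = df.groupby("Identifier")["Price"].pct_change()

returns = df.dropna(subset=["Return"]).set_index("Identifier")["Return"]
```

With only two dates per identifier, `returns` holds a single value for each one; with more dates, the same code gives period-over-period returns.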

Boxplot with intervals based on timeseries

Question: I have a dataframe similar to the following:
    import pandas as pd
    import random

    dikt = {'Date': pd.date_range("2018-01-01", periods=1500, freq="5T"), 'Snd': [random.randrange(1, 50, 1) for i in range(1500)]}
    df = pd.DataFrame(dikt)
What I want to create is a plot consisting of a group of boxplots, where every boxplot is a time interval …

Total answers: 1