categorical-data

Python categorical variables NaN while creating box-plot

Question: After I successfully created categorical values, their result is NaN. I used this command: df['Memory'] = pd.cut(pd.to_numeric(df['RAM '], errors="coerce"), [0,4,8,12], include_lowest=True, labels=['Basic', 'Intermediate', 'Advaced']) After running df.head() here’s the table: When I try to box-plot them: sns.boxplot(x='Memory', y='Price', data=df['RAM '] = pd.to_numeric(df['RAM '], errors="coerce") ; df['Memory'] = pd.cut(df['RAM '], …

Total answers: 1

Python creating categorical variable error

Question: I need to create categorical variables for RAM category. Basic: RAM [0-4] Intermediate: RAM [5-8] Advanced: RAM [8-12] Command: df['Memory'] = pd.cut(df['RAM '], [0,4,8,12], include_lowest=True, labels=['Basic', 'Intermediate', 'Advaced']) Error: TypeError Traceback (most recent call last) <ipython-input-58-5c93d7c00ba2> in <cell line: 1>() ----> 1 df['Memory']=pd.cut(df['RAM '], [0,4,8,12], include_lowest=True, labels=['Basic', 'Intermediate', 'Advaced']) 1 frames /usr/local/lib/python3.9/dist-packages/pandas/core/reshape/tile.py …

Total answers: 1
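
A minimal sketch of one likely cause, assuming the 'RAM ' column holds strings rather than numbers (the sample values below are made up for illustration): pd.cut needs numeric input, so the column is converted with pd.to_numeric first. Values that cannot be parsed, or that fall outside the bin edges, come out as NaN.

```python
import pandas as pd

# Hypothetical sample data; the column name 'RAM ' (with trailing space) is from the question.
df = pd.DataFrame({"RAM ": ["4", "8", "12", "16", "n/a"]})

# Convert to numbers first; unparseable entries such as "n/a" become NaN.
ram = pd.to_numeric(df["RAM "], errors="coerce")

# Bin edges as in the question: [0, 4], (4, 8], (8, 12].
df["Memory"] = pd.cut(
    ram,
    bins=[0, 4, 8, 12],
    include_lowest=True,
    labels=["Basic", "Intermediate", "Advanced"],
)

print(df)
```

Here 16 lies above the last bin edge and "n/a" is not a number, so both rows get NaN in Memory. Rows like these are worth checking when a later plot shows missing categories.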

How to apply the same cat.codes to 2 different dataframes?

Question: I have 2 dataframes X_train and X_test. These 2 dataframes have the same columns. There is 1 column called levels that needs to be changed from str to int. However, each dataframe’s levels column has different unique values: X_train has: ['Level 0', 'Level 10', …

Total answers: 1
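
One possible approach, sketched here with made-up level values: build a single pd.CategoricalDtype from the union of both columns and cast both dataframes to it, so that cat.codes maps each label to the same integer in X_train and X_test. The output column name levels_code is an assumption for illustration.

```python
import pandas as pd

# Hypothetical sample data; only the column name 'levels' comes from the question.
X_train = pd.DataFrame({"levels": ["Level 0", "Level 10", "Level 2"]})
X_test = pd.DataFrame({"levels": ["Level 2", "Level 0", "Level 7"]})

# One shared set of categories covering both dataframes.
all_levels = sorted(set(X_train["levels"]) | set(X_test["levels"]))
level_dtype = pd.CategoricalDtype(categories=all_levels)

# Casting both columns to the same dtype gives consistent integer codes.
X_train["levels_code"] = X_train["levels"].astype(level_dtype).cat.codes
X_test["levels_code"] = X_test["levels"].astype(level_dtype).cat.codes

# Label -> code mapping shared by both dataframes.
mapping = {label: code for code, label in enumerate(level_dtype.categories)}
print(mapping)
```

Note that the codes follow string sort order here, so "Level 10" sorts before "Level 2". If numeric order matters, the category list would need to be sorted by the number instead.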

Losing my target variable when encoding categorical variables

Question: I am dealing with a little challenge. I am trying to create a logistic regression model (multiclass). Some of my variables are categorical, therefore I’m trying to encode them. My initial dataset looks like this: The column I want to predict is action1_preflop, it contains 3 …

Total answers: 2
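
The excerpt does not show how the encoding was done, so the following is only a sketch of one common way to avoid the problem: separate the target column before encoding, and one-hot encode only the feature columns. The feature names position and stack and all sample values are hypothetical; only action1_preflop comes from the question.

```python
import pandas as pd

df = pd.DataFrame({
    "position": ["BTN", "SB", "BB", "BTN"],                 # hypothetical categorical feature
    "stack": [100, 80, 120, 95],                            # hypothetical numeric feature
    "action1_preflop": ["fold", "call", "raise", "call"],   # target column named in the question
})

# Keep the target aside so the encoder never touches it.
y = df["action1_preflop"]

# One-hot encode only the categorical feature columns.
X = pd.get_dummies(df.drop(columns=["action1_preflop"]), columns=["position"])

print(X.columns.tolist())
print(y.tolist())
```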

Redefine categories of a categorical variable ignoring upper and lower case

Question: I have a dataset with a categorical variable that is not nicely coded. The same category appears sometimes with upper case letters and sometimes with lower case (and several variations of it). Since I have a large dataset, I would like to harmonize …

Total answers: 2
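
A small sketch of one way to harmonize such a column, assuming the variations differ only in letter case and surrounding whitespace (the sample values are invented): normalize the strings first, then convert to a categorical.

```python
import pandas as pd

# Hypothetical messy labels.
s = pd.Series(["Apple", "apple", "APPLE ", "Banana", "banana"])

# Strip whitespace and fold case before building the categorical,
# so every spelling variant collapses into one category.
clean = s.str.strip().str.casefold().astype("category")

print(clean.cat.categories.tolist())
```

Variations that go beyond case and whitespace, such as misspellings, would need an explicit mapping instead.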

case insensitive pandas.Series.replace

Question: I want to replace some values in categorical data columns with np.nan. What is the best method for replacing values in a case-insensitive manner while maintaining the same categories (in the same order)? import pandas as pd import numpy as np # set up a DF with ordered categories values = …

Total answers: 2
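
The question's own setup code is cut off, so the sketch below uses invented data. It shows one possible technique rather than the accepted answer: build a case-insensitive boolean mask with the .str accessor and blank out matches with Series.where, which leaves the categorical dtype, its categories, and their order untouched.

```python
import pandas as pd

# Hypothetical ordered categorical; the category order is deliberately not alphabetical.
s = pd.Series(
    pd.Categorical(
        ["a", "B", "b", "c"],
        categories=["a", "b", "B", "c"],
        ordered=True,
    )
)

# Values to blank out, compared case-insensitively.
targets = {"b"}
mask = s.str.lower().isin(targets)

# Series.where keeps entries where the condition is True and puts NaN elsewhere.
out = s.where(~mask)

print(out)
print(out.cat.categories.tolist())
```

The replaced labels remain in the category list as unused categories. Calling out.cat.remove_unused_categories() would drop them if that is preferred.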

De-duplicate some columns while doing a "hierarchical" one-hot encoding

Question: I have a pandas dataframe (df) with columns A, B, C, and D. I have a situation in which I wish to de-duplicate the values in the first two columns, one-hot-encode the third column, and do a one-hot-encoding of the last column in such a way …

Total answers: 1

Stacked Bar Chart based on Pandas Column

Question: I have a set of data which can be replicated with the following code: import numpy as np import pandas as pd Neurons = np.array([ 20, 600, 300, 300, 200, 50, 20, 100, 50, 300, 100, 600, 20, 20, 600, 200, 200, 600, 600, 100, 200, 200, 300, …

Total answers: 2

Creating a regression model using Day of Week, Hour of Day, and Type of Media?

Question: Working with Python 3 in a Jupyter notebook. I am trying to create a regression model (equation?) to predict the Eng as % of Followers variable. I’d be given Media Type, Hour Created, and Day of Week. These should …

Total answers: 1