scikit-learn

How to run sklearn.preprocessing.OrdinalEncoder on several columns?

Question: This code raises an error:

    import pandas as pd
    from sklearn.compose import ColumnTransformer
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import OrdinalEncoder

    # Define categorical columns and mapping dictionary
    categorical_cols = ['color', 'shape', 'size']
    mapping = {'red': 0, 'green': 1, 'blue': 2, 'circle': 0, 'square': 1, 'triangle': 2, 'small': 0, …

Total answers: 2

What is the best practice for applying cross-validation using TimeSeriesSplit() over a dataframe within an end-to-end pipeline in Python?

Question: Let's say I have a dataset in the following pandas dataframe format, with a non-standard timestamp column that is not in datetime format:

    +--------+-----+
    |TS_24hrs|count|
    +--------+-----+
    |0       |157  |
    |1       |334  |
    |2       |176  |
    |3       |86   |
    |4 …

Total answers: 2

After dropping columns with missing values, sklearn still throwing ValueError

Question: I am currently taking the Intermediate Machine Learning course on Kaggle and am quite new to machine learning. I'm trying to create a Random Forest model and apply one-hot (OH) encoding to my data, but as it is my first time, I have been struggling …

Total answers: 2

NaN values created when joining two dataframes

Question: I am trying to one-hot encode data from Kaggle (https://www.kaggle.com/datasets/rkiattisak/salaly-prediction-for-beginer) using the scikit-learn library. X is a two-column dataframe of the age and years of experience columns, with the rows containing null values removed with dropna(). My goal is to one-hot encode …

Total answers: 1

Data Science Data Analysis

Question: I have a dataset of people's characteristics, and I need to predict their breakfast. Here's an example of df. I am training a CatBoost model for that. Is it possible in my case to predict not only one kind of breakfast but also an additional one? By additional I …

Total answers: 2

SKLearn Linear Regression on Grouped Pandas Dataframe without aggregation?

Question: I'm trying to perform a linear regression over a set of grouped columns and put the coefficient results on each line without performing an aggregation (equivalent to a window function in SQL). I'm banging my head against a wall here. In a for loop this works …

Total answers: 1

AttributeError: 'CountVectorizer' object has no attribute 'get_feature_names'

Question: The code was working before without showing any errors. It's for a sentiment analysis machine learning project. The code is for a logistic regression model on word counts:

    c = CountVectorizer(stop_words='english')

    def text_fit(X, y, model, clf_model, coef_show=1):
        X_c = model.fit_transform(X)
        print('# features: {}'.format(X_c.shape[1]))
        X_train, X_test, y_train, y_test = …

Total answers: 1

NotFittedError (instance is not fitted yet) after invoking cross_validate

Question: This is my minimal reproducible example:

    x = np.array([ [1, 2], [3, 4], [5, 6], [6, 7] ])
    y = [1, 0, 0, 1]
    model = GaussianNB()
    scores = cross_validate(model, x, y, cv=2, scoring=("accuracy"))
    model.predict([8,9])

What I intended to do is instantiate a Gaussian Naive …

Total answers: 1
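For the cross_validate question above, the behavior follows from how cross_validate works: it fits clones of the estimator on each fold and leaves the object passed in unfitted. A minimal sketch of one way to get predictions afterwards, assuming the goal is to score with cross-validation and then predict with a model trained on all the data; note that predict also expects a 2D array, so the sample is written as [[8, 9]]:

```python
import numpy as np
from sklearn.exceptions import NotFittedError
from sklearn.model_selection import cross_validate
from sklearn.naive_bayes import GaussianNB

x = np.array([[1, 2], [3, 4], [5, 6], [6, 7]])
y = [1, 0, 0, 1]

model = GaussianNB()

# cross_validate fits internal clones; `model` itself stays unfitted.
scores = cross_validate(model, x, y, cv=2, scoring="accuracy")

try:
    model.predict([[8, 9]])
    fitted_after_cv = True
except NotFittedError:
    fitted_after_cv = False

# Fit explicitly on the full data before predicting.
model.fit(x, y)
prediction = model.predict([[8, 9]])  # 2D input: one sample, two features
```

If the per-fold models are needed instead, cross_validate accepts return_estimator=True and returns them under the "estimator" key.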

Sklearn-classifier, issue with freeze (pod pending in K8s)

Question: The Sklearn-classifier job in MLRun freezes (the job is still running after 5, 10, 20, … minutes); see the log output:

    2023-02-21 13:50:15,853 [info] starting run training uid=e8e66defd91043dda62ae8b6795c74ea DB=http://mlrun-api:8080
    2023-02-21 13:50:16,136 [info] Job is running in the background, pod: training-tgplm

See the freeze/pending issue in the Web UI: …

Total answers: 1

How do I extract meaningful simple rules from this classification problem?

Question: I have a problem of this type: A customer creates an order by hand, which might be erroneous. Submitting a wrong order is costly, which is why we try to reduce the error rate. I need to detect what factors cause an error, …

Total answers: 1