scikit-learn

Finding optimal weights in regression

Question: I am new to linear regression and sklearn. I have a problem with two input features: x1, which contains 101 ones, and x2, which contains 100 ones followed by a zero. The output y is 101 ones. I am trying to find the optimal value of w1 …

Total answers: 1
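
A minimal sketch of the setup described in this question, assuming the model is y = w1*x1 + w2*x2 with no separate intercept (the question is truncated, so that model form is an assumption). The last sample forces w1 = 1 and the others force w1 + w2 = 1, so an exact fit has w1 = 1 and w2 = 0.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Data as described in the question: x1 is 101 ones, x2 is 100 ones then a zero.
x1 = np.ones(101)
x2 = np.concatenate([np.ones(100), [0.0]])
X = np.column_stack([x1, x2])
y = np.ones(101)

# Assumption: no intercept term, so the weights alone must explain y.
model = LinearRegression(fit_intercept=False).fit(X, y)
w1, w2 = model.coef_
print(w1, w2)
```

With the default fit_intercept=True the constant column x1 competes with the intercept, so the reported coefficients can look different even though the predictions are the same.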

How to sklearn random split test/train set by class and id?

Question: I would like to split the data into training and test sets in a 50:50 ratio according to the class ‘fruit’, but in such a way that all entries with the same ID go into either the training set or the test set. Here is some example data: import pandas …

Total answers: 1
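
One possible approach to the grouped split asked about here, sketched with GroupShuffleSplit. The data frame below is hypothetical (the question's own example is truncated), and the column names "id" and "fruit" are assumptions taken from the question text.

```python
import pandas as pd
from sklearn.model_selection import GroupShuffleSplit

# Hypothetical data: each id carries one fruit class and appears twice.
df = pd.DataFrame({
    "id": [1, 1, 2, 2, 3, 3, 4, 4],
    "fruit": ["apple", "apple", "pear", "pear", "apple", "apple", "pear", "pear"],
})

# Split on groups so that an id never appears on both sides.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.5, random_state=0)
train_idx, test_idx = next(splitter.split(df, df["fruit"], groups=df["id"]))
train, test = df.iloc[train_idx], df.iloc[test_idx]
```

GroupShuffleSplit keeps ids together but does not balance the classes. If the class ratio also matters, StratifiedGroupKFold(n_splits=2) is worth trying, since it attempts to preserve class proportions while keeping groups intact.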

Why sklearn's KFold can only be enumerated once (also on using it in xgboost.cv)?

Question: I am trying to create a KFold object for my xgboost.cv, and I have: import pandas as pd from sklearn.model_selection import KFold df = pd.DataFrame([[1,2,3,4,5],[6,7,8,9,10]]) KF = KFold(n_splits=2) kf = KF.split(df) But it seems I can only enumerate once: for i, (train_index, …

Total answers: 1
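
The behaviour in this question follows from KFold.split returning a generator, which is exhausted after one pass. A short sketch using the question's own df and KF names (first_pass, second_pass and folds are names introduced here for illustration):

```python
import pandas as pd
from sklearn.model_selection import KFold

df = pd.DataFrame([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
KF = KFold(n_splits=2)

kf = KF.split(df)            # a generator, not a list
first_pass = list(kf)        # consumes the generator
second_pass = list(kf)       # nothing left to yield

# Either call split() again for each pass, or materialize the folds once.
folds = list(KF.split(df))
for i, (train_index, test_index) in enumerate(folds):
    print(i, train_index, test_index)
```

A materialized list can be iterated as often as needed. Whether xgboost.cv should receive that list or the KFold object itself depends on the xgboost version, so check the documentation of its folds argument.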

What to do about future warning when using sklearn.neighbors?

Question: I fear I have the same problem as in this post: getting a warning when using sklearn.neighbors about keepdims. I am trying to use KNN as part of an ensemble classifier, but every time I get the following warning: FutureWarning: Unlike other reduction functions (e.g. skew, kurtosis), …

Total answers: 2

VScode Jupyter Notebook crash in cell

Question: I get this error when I run sklearn to train on a very large dataset. If the dataset is small, it works, but if it is above a threshold, the kernel crashes. Error: info 16:24:11.630: Process Execution: > ~/miniconda3/envs/auto-sklearn/bin/python -m pip list > ~/miniconda3/envs/auto-sklearn/bin/python -m pip list info …

Total answers: 2

Test train split while retaining original dimension

Question: I am trying to split a pandas dataframe of size 610×9724 (610 users × 9724 movies), putting 80% of the non-null values of the dataset into the training set and the remaining 20% of the non-null values into the test set, while replacing the 20% removed values from training with …

Total answers: 1
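
A rough sketch of a cell-level split that keeps both matrices at the original shape. The small random matrix is a hypothetical stand-in for the 610×9724 ratings frame, and filling the held-out training cells with NaN is an assumption, since the question is cut off before it names the replacement value.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical stand-in for the user x movie ratings; NaN means "not rated".
values = rng.integers(1, 6, size=(6, 8)).astype(float)
values[rng.random(values.shape) < 0.5] = np.nan

# Coordinates of every non-null cell, then a random 20% of them for the test set.
rows, cols = np.nonzero(~np.isnan(values))
n_test = int(round(0.2 * len(rows)))
pick = rng.permutation(len(rows))[:n_test]

train = values.copy()
test = np.full_like(values, np.nan)
test[rows[pick], cols[pick]] = values[rows[pick], cols[pick]]
train[rows[pick], cols[pick]] = np.nan   # assumption: held-out cells become NaN

train_df, test_df = pd.DataFrame(train), pd.DataFrame(test)
```

Both frames keep the original dimensions, and every rated cell ends up in exactly one of them.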

Naive Bayes Gaussian Classification Prediction not taking array in SK-Learn

Question: I have made the following Gaussian prediction model in SK-learn: chess_gnb = GaussianNB().fit(raw[['elo', 'opponent_rating', 'winner_loser_elo_diff']],raw['winner']) I then made a test array and attempted to feed it into the model: test1 = [['elo', 1000], ['opponent_rating', 800], ['winner_loser_elo_diff', 200]] chess_gnb.predict(test1) However, I’m getting this error: ValueError: …

Total answers: 1
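
predict expects a 2D array of numeric values with one row per sample and one column per feature, in the same order used for fit, so the test1 list in this question mixes feature names into the data. A minimal sketch with a hypothetical raw frame (the real data and the values of the winner column are not shown in the question):

```python
import pandas as pd
from sklearn.naive_bayes import GaussianNB

features = ["elo", "opponent_rating", "winner_loser_elo_diff"]

# Hypothetical stand-in for the question's `raw` frame.
raw = pd.DataFrame({
    "elo": [1000, 1200, 900, 1500, 1100, 1300],
    "opponent_rating": [800, 1400, 1000, 1200, 1250, 1100],
})
raw["winner_loser_elo_diff"] = (raw["elo"] - raw["opponent_rating"]).abs()
raw["winner"] = ["white", "black", "black", "white", "black", "white"]

chess_gnb = GaussianNB().fit(raw[features], raw["winner"])

# One sample, three numeric features, with matching column names.
test1 = pd.DataFrame([[1000, 800, 200]], columns=features)
pred = chess_gnb.predict(test1)
```

A plain nested list such as [[1000, 800, 200]] has the right shape too. Passing a DataFrame with the same column names just avoids the feature-name warning that recent scikit-learn versions raise.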

X has 45 features, but LinearRegression is expecting 8 features as input

Question: I am trying to run a polynomial regression model on the California housing data (fetch_california_housing). However, I get X has 45 features, but LinearRegression is expecting 8 features as input. Does anybody know why? Any help would be greatly appreciated. Thanks. from …

Total answers: 1
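
The numbers in this error are consistent with a degree-2 PolynomialFeatures expansion: 8 inputs become 45 columns (1 bias + 8 linear + 36 quadratic terms). The message then suggests the regressor was fitted on the 8 raw features but asked to predict on expanded ones. The question's code is truncated, so that diagnosis is an assumption. A sketch on synthetic 8-feature data that keeps the two steps in one pipeline:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))   # synthetic stand-in with 8 features
y = X[:, 0] ** 2 + X[:, 1] + rng.normal(scale=0.1, size=200)

# Degree-2 expansion of 8 features, including the bias column.
n_poly = PolynomialFeatures(degree=2).fit_transform(X).shape[1]

# The pipeline applies the same expansion during fit and predict.
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X[:150], y[:150])
preds = model.predict(X[150:])
```

With the pipeline, the caller always passes raw 8-column data and the feature counts cannot drift apart between training and prediction.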

Return pipeline score as one of multiple evaluation metrics

Question: I am using a pipeline in a hyperparameter grid search in sklearn. I would like the search to return multiple evaluation scores: one from a custom scoring function that I wrote, and the other from the default score function of the pipeline. I tried using the parameter …

Total answers: 1
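
One way to get both scores is a scoring dictionary in which one entry is a callable that simply forwards to the estimator's own score method. This is a sketch, not necessarily what the accepted answer does. The pipeline, the parameter grid and the use of f1_score as the "custom" metric are all placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, make_scorer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=200, random_state=0)
pipe = Pipeline([("scale", StandardScaler()), ("clf", LogisticRegression())])

def default_score(estimator, X, y):
    # Scorer signature (estimator, X, y); defers to the pipeline's own score().
    return estimator.score(X, y)

scoring = {"custom": make_scorer(f1_score), "default": default_score}

# With several metrics, refit must name the one used to pick the best model.
search = GridSearchCV(pipe, {"clf__C": [0.1, 1.0]}, scoring=scoring,
                      refit="default", cv=3)
search.fit(X, y)
```

The results then appear in search.cv_results_ under keys such as mean_test_custom and mean_test_default.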

ColumnTransformer is normalizing columns that I set to not normalize

Question: I was trying to use ColumnTransformer from sklearn to preprocess data (normalize and convert a column to ordinal). However, I encountered an issue where the column that I specify not to standardize gets standardized. from sklearn.compose import ColumnTransformer from sklearn.preprocessing import StandardScaler, OrdinalEncoder …

Total answers: 1