cross-validation

What's the mistake I am doing in this CV code

Question: I am trying to do CV for my training and testing datasets. I am using LinearRegressor. However, when I run the code, I get the error below. But when I run the code on Decision Trees I don’t get any errors and the code …

Total answers: 1

Gaussian Process Regression: tune hyperparameters based on validation set

Question: In the standard scikit-learn implementation of Gaussian Process Regression (GPR), the hyperparameters (of the kernel) are chosen based on the training set. Is there an easy-to-use implementation of GPR (in Python), where the hyperparameters (of the kernel) are chosen based on a separate validation …

Total answers: 2

Retrieve cross validation performance (AUC) on h2o AutoML for holdout dataset

Question: I am training a binary classification model with h2o AutoML using the default cross-validation (nfolds=5). I need to obtain the AUC score for each holdout fold in order to compute the variability. This is the code I am using: h2o.init() prostate = h2o.import_file("https://h2o-public-test-data.s3.amazonaws.com/smalldata/prostate/prostate.csv") …

Total answers: 2

Why does LogisticRegressionCV's .score() differ from cross_val_score?

Question: I was using LogisticRegressionCV’s .score() method to yield an accuracy score for my model. I also used cross_val_score to yield an accuracy score with the same cv split (skf), expecting the same score to show up. But alas, they were different and I’m confused. I first did …

Total answers: 2

sklearn cross_val_score() returns NaN values

Question: I’m trying to predict the next customer purchase for my job. I followed a guide, but when I tried to use the cross_val_score() function, it returned NaN values. Variables: X_train is a dataframe X_test is a dataframe y_train is a list y_test is a list Code: X_train, X_test, …

Total answers: 9

GridSearchCV & RandomizedSearchCV – do you refit the model after running

Question: I have some test and train data; the test data does not have any dependent variables. I’m currently running a GridSearchCV or RandomizedSearchCV to find the best parameters. Should I pass all of my “test” X & y values into a GridSearchCV or …

Total answers: 2

Target transformation and feature selection in scikit-learn

Question: I am using RFECV for feature selection in scikit-learn. I would like to compare the result of a simple linear model (X, y) with that of a log-transformed model (using X, log(y)). Simple Model: RFECV and cross_val_score provide the same result (we need to compare the average …

Total answers: 2

How to perform SMOTE with cross validation in sklearn in python

Question: I have a highly imbalanced dataset and would like to perform SMOTE to balance the dataset and perform cross-validation to measure the accuracy. However, most of the existing tutorials make use of only a single training and testing iteration to perform SMOTE. Therefore, …

Total answers: 3