logistic-regression

Why does adding duplicated features improve Logistic Regression accuracy?

Question:

    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression

    X, y = load_iris(return_X_y=True)
    for i in range(5):
        X_redundant = np.c_[X, X[:, :i]]  # append copies of the first i features
        print(X_redundant.shape)
        clf = LogisticRegression(random_state=0, max_iter=1000).fit(X_redundant, y)
        print(clf.score(X_redundant, y))

Output:

    (150, 4)
    0.9733333333333334
    (150, 5)
    0.98
    (150, 6)
    0.98
    (150, 7)
    0.9866666666666667
    (150, 8)
    …

Total answers: 1

ValueError: shapes (120,6) and (7,) not aligned: 6 (dim 1) != 7 (dim 0)

Question: I’m trying to implement multiclass classification with logistic regression on an Iris.csv dataset from Kaggle. This is my code:

    import numpy as np
    import pandas as pd
    from sklearn.model_selection import train_test_split

    def standardize(X_tr):
        # (x - mean(x)) / std(x): normalizes the data
        for i in …

Total answers: 1
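The asker's code is truncated, so the actual cause is not visible here. As a hedged illustration only, the sketch below reproduces the same kind of `np.dot` shape mismatch with hypothetical arrays (`X`, `w`) and shows one common fix, assuming the extra weight is an intercept term that needs a matching column of ones.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 6))  # hypothetical design matrix: 120 samples, 6 columns
w = np.zeros(7)                # hypothetical weights: 6 coefficients plus 1 intercept

try:
    np.dot(X, w)               # inner dimensions 6 and 7 do not match
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print(err)

# Assumed fix: prepend a column of ones so the intercept weight has a column.
X_bias = np.hstack([np.ones((X.shape[0], 1)), X])
scores = np.dot(X_bias, w)
print(X_bias.shape, scores.shape)
```

If the seventh weight is not an intercept, the fix is different: the weight vector should be initialised from `X.shape[1]` after all preprocessing, so both sides always agree.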

Statsmodels Clustered Logit Model With Robust Standard Errors

Question: I have the following dataframe:

    df.head()

          id  case  volfluid  map   rr   o2  fluid
    1044  36     3       3.0  1.0  3.0  2.0    0.0
    1045  37     3       2.0  3.0  1.0  2.0    1.0
    1046  38     3       3.0  2.0  2.0  1.0    0.0
    1047  36     4       2.0  3.0  1.0  3.0    1.0
    1048  …

Total answers: 1

How can I code Vuong's statistical test in Python?

Question: I need to implement Vuong’s test for non-nested models. Specifically, I have logistic-regression models that I would like to compare. I have found implementations in R and Stata online, but unfortunately I work in Python and am not familiar with those frameworks/languages. Also unfortunate is …

Total answers: 1
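For the Vuong question above, a minimal sketch of the uncorrected statistic is given below, using only the standard library. It assumes each model supplies predicted probabilities for the same observations; the function names (`bernoulli_loglik`, `vuong_test`) and the example data are hypothetical, and no correction for the number of parameters is applied.

```python
import math

def bernoulli_loglik(y, p):
    """Per-observation log-likelihoods of binary outcomes y under probabilities p."""
    return [math.log(pi) if yi == 1 else math.log(1.0 - pi) for yi, pi in zip(y, p)]

def vuong_test(loglik_a, loglik_b):
    """Return (z, two_sided_p) for the uncorrected Vuong statistic.

    z > 0 favours model A, z < 0 favours model B.
    """
    if len(loglik_a) != len(loglik_b):
        raise ValueError("both models must be evaluated on the same observations")
    n = len(loglik_a)
    m = [a - b for a, b in zip(loglik_a, loglik_b)]  # pointwise log-likelihood ratios
    mean_m = sum(m) / n
    var_m = sum((x - mean_m) ** 2 for x in m) / n
    if var_m == 0.0:
        raise ValueError("log-likelihood ratios have zero variance")
    z = math.sqrt(n) * mean_m / math.sqrt(var_m)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))     # 2 * (1 - Phi(|z|))
    return z, p_value

# Hypothetical outcomes and predicted probabilities from two fitted models.
y = [1, 0, 1, 1, 0, 1, 0, 0]
p_a = [0.9, 0.2, 0.8, 0.7, 0.1, 0.9, 0.3, 0.2]
p_b = [0.6, 0.4, 0.5, 0.6, 0.5, 0.6, 0.4, 0.5]

ll_a = bernoulli_loglik(y, p_a)
ll_b = bernoulli_loglik(y, p_b)
z, p_value = vuong_test(ll_a, ll_b)
print(z, p_value)
```

With fitted scikit-learn or statsmodels models, the probabilities would come from the model's own prediction method; that wiring is left out here to keep the sketch dependency-free.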

logistic-regression converting a categorical column to numeric : single vs multiple column

Question: I want to train a logistic regression model on a dataset that has a categorical HomePlanet column containing 3 distinct values: Earth, Europa, Mars. When I do pd.get_dummies(train['HomePlanet']), it separates all categories into columns. Then I train the …

Total answers: 1

Solver lbfgs supports only 'l2' or 'none' penalties, got l1 penalty

Question: I’m building a logistic regression model to predict a binary target feature. I want to try different values of several parameters using the param_grid argument to find the best fit. This is my code:

    from sklearn.model_selection import train_test_split
    X_train, …

Total answers: 1
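For the solver and penalty question above, one way to avoid the error is to give the grid search a list of grids so that each solver is only paired with penalties it accepts. The sketch below builds such a list in plain Python. The compatibility table reflects the error message and the scikit-learn `LogisticRegression` documentation as I understand it for versions that take a `penalty` argument, and the `C` values are hypothetical.

```python
# Penalties each solver is assumed to accept (subset of the documented options).
compatible_penalties = {
    "lbfgs": ["l2"],
    "liblinear": ["l1", "l2"],
    "saga": ["l1", "l2"],
}

c_values = [0.01, 1.0, 100.0]  # hypothetical regularisation strengths

# One sub-grid per solver; a list of dicts is searched as separate grids,
# so incompatible solver/penalty pairs are never generated.
param_grid = [
    {"solver": [solver], "penalty": penalties, "C": c_values}
    for solver, penalties in compatible_penalties.items()
]

n_candidates = sum(len(g["penalty"]) * len(g["C"]) for g in param_grid)
print(param_grid)
print(n_candidates)
```

The resulting `param_grid` can be passed as the `param_grid` argument of `GridSearchCV`, which accepts a list of dictionaries.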

Improve scikit-learn Logistic Regression model vs Statsmodels – significant variables

Question: I’m working on a binary classification problem using logistic regression. I know that with statsmodels it is possible to identify the significant variables from their p-values and remove the non-significant ones to get a better-performing model.

    import statsmodels.api as sm
    …

Total answers: 1

logistic regression using more than one predictor

Question: I want to fit a logistic regression model that predicts Y using X1 and X2. What I know is that we use the following method:

    x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size)

and then

    model = LogisticRegression()
    model.fit(x_train, y_train)

To predict Y using X, I don’t know how to …

Total answers: 2

how to plot the decision boundary of a polynomial logistic regression in python?

Question: I have looked into the example on this website: https://scipython.com/blog/plotting-the-decision-boundary-of-a-logistic-regression-model/ I understand how they plot the decision boundary for a linear feature vector. But how would I plot the decision boundary if I apply

    from sklearn.preprocessing import PolynomialFeatures
    …
    poly = …

Total answers: 2