Multiclass regression using one vs one

Question:

I am trying to fit a logistic regression using the one-vs-one method in R or Python, but nothing seems to work. My outcome variable Y has 4 classes, and I have 16 X variables. I would like to start by using all X variables in the model to predict Y. My data is already split into training and test sets. I tried scikit-learn's OneVsOneClassifier, but I receive an error: TOTAL NO. of ITERATIONS REACHED LIMIT. Help would be greatly appreciated.

import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsOneClassifier

train = pd.read_csv("C:/Python/train.csv")
test = pd.read_csv("C:/Python/test.csv")

train_y = train.iloc[:, -1]
train_x = train.iloc[:, :-1]
test_x = test.iloc[:, :-1]

model = LogisticRegression()

ovo = OneVsOneClassifier(model)

ovo.fit(train_x, train_y)

# Predict on the test features, not on Y
predlog = ovo.predict(test_x)

Asked By: ProgrammingGirly


Answers:

Try increasing the maximum number of iterations: The logistic regression model in scikit-learn has a default maximum of 100 iterations. The warning you see means the solver did not converge within that limit. You can raise the limit with the max_iter parameter, like this:

model = LogisticRegression(max_iter=1000)

You can try setting max_iter to a higher value, such as 1000 or 10000, and see if that helps.
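Putting this together with the one-vs-one setup, a minimal runnable sketch (using synthetic data from make_classification in place of your CSVs, with 4 classes and 16 features as you described):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsOneClassifier

# Synthetic stand-in for your data: 4 classes, 16 features
X, y = make_classification(n_samples=500, n_features=16, n_informative=8,
                           n_classes=4, random_state=0)

# Raise max_iter so the solver has room to converge
model = LogisticRegression(max_iter=1000)
ovo = OneVsOneClassifier(model)
ovo.fit(X, y)

# One-vs-one fits one classifier per pair of classes: 4*3/2 = 6 here
print(len(ovo.estimators_))
```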

Standardize your input variables: It’s often helpful to standardize your input variables before fitting a logistic regression model. This can improve the convergence of the model and make it more stable. You can standardize your input variables using the StandardScaler class from scikit-learn, like this:

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
train_x = scaler.fit_transform(train_x)

This will standardize your input variables so that they have zero mean and unit variance. Remember to transform your test data with the same fitted scaler (scaler.transform(test_x)) rather than refitting it on the test set.
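A convenient way to apply the scaling consistently to training and test data is to wrap the scaler and the model in a Pipeline, so the test set is automatically transformed with the training statistics. A sketch with synthetic data standing in for yours:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsOneClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=500, n_features=16, n_informative=8,
                           n_classes=4, random_state=0)

# The pipeline scales inside fit/predict, so each pairwise classifier
# sees standardized features without any manual bookkeeping
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
ovo = OneVsOneClassifier(pipe)
ovo.fit(X, y)
```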

Try using a different solver: The solver is the algorithm used to optimize the logistic regression model. Different solvers have different strengths and weaknesses, so it’s possible that a different solver will work better for your data. The solver parameter in scikit-learn’s LogisticRegression class can be set to different values, such as lbfgs, liblinear, newton-cg, sag, or saga. You can try using a different solver to see if it helps:

model = LogisticRegression(solver='lbfgs', max_iter=1000)

Check your data: Finally, you may want to double-check your data to make sure there are no missing values or other issues that could be causing the convergence problem. You can use functions like train_x.isnull().sum() to check for missing values. You can also try removing variables that are highly correlated, or strengthening the regularization (lowering the C parameter of LogisticRegression).
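A short sketch of those checks (the frame and column names here are hypothetical, standing in for your train_x):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for train_x, with one missing value injected
rng = np.random.default_rng(0)
train_x = pd.DataFrame(rng.normal(size=(100, 3)),
                       columns=["x1", "x2", "x3"])
train_x.loc[5, "x2"] = np.nan

# Count missing values per column
print(train_x.isnull().sum())

# Flag highly correlated feature pairs (|r| > 0.9), looking only
# at the upper triangle to avoid duplicate pairs
corr = train_x.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
print(upper.stack()[upper.stack() > 0.9])
```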