scikit-learn

Nearest neighbor for list of arrays

Question: I have a list of arrays like this (in x, y coordinates): coordinates = [array([[ 300, 2300], [ 670, 2360], [ 400, 2300]]), array([[1500, 1960], [1620, 2200], [1505, 1975]]), array([[ 980, 1965], [1060, 2240], [1100, 2250], [ 980, 1975]]), array([[ 565, 1940], [ 680, 2180], [ 570, 1945]])] I …

Total answers: 1
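A minimal sketch of one way to approach this: stack all the groups into a single array, remember which group each point came from, and fit scikit-learn's NearestNeighbors on the stacked points. The coordinate values below mirror the question; the query point is a hypothetical example.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Two of the question's coordinate groups (list of (x, y) arrays)
coordinates = [
    np.array([[300, 2300], [670, 2360], [400, 2300]]),
    np.array([[1500, 1960], [1620, 2200], [1505, 1975]]),
]

# Stack all groups into one array and record each point's group index
points = np.vstack(coordinates)
group_ids = np.repeat(np.arange(len(coordinates)),
                      [len(c) for c in coordinates])

# Fit on every point, then query the nearest neighbour of a new point
nn = NearestNeighbors(n_neighbors=1).fit(points)
dist, idx = nn.kneighbors([[310, 2305]])
nearest_point = points[idx[0, 0]]
nearest_group = group_ids[idx[0, 0]]
```

Keeping the `group_ids` array alongside the stacked points lets you map any neighbour index back to the original array it belonged to.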

Masking NaN values from an xarray dataset for scikit-learn multiple linear regression following scipy

Question: I’m attempting to use scikit-learn.linear_model’s LinearRegression to find the multiple linear regression coefficients for different variables at each latitude and longitude point along the time dimension, like so: for i in range(len(data.lat)): for j in range(len(data.lon)): storage_dframe[i, j, :] = LinearRegression().fit(np.array((data.ivar1.values[:, …

Total answers: 2
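The usual pattern for this kind of problem is to build a finite-value mask across all variables at a grid point and fit only on the masked time steps. A hedged sketch with a hypothetical 1-D series standing in for one (lat, lon) point:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical time series with NaNs at one grid point
x = np.array([1.0, 2.0, np.nan, 4.0, 5.0])
y = np.array([2.0, 4.1, 6.0, np.nan, 10.2])

# Keep only time steps where every variable is finite
mask = np.isfinite(x) & np.isfinite(y)
model = LinearRegression().fit(x[mask].reshape(-1, 1), y[mask])
```

With several predictors, the same mask is the logical AND of `np.isfinite` over each predictor column plus the target.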

A problem with the user input during the random forest classifier demonstration

Question: I got over 90% accuracy with the Random Forest classifier, but I worry that the rest of the algorithms give much lower results (a table with the results). But this is not the main concern. The problem is that when I used user …

Total answers: 1
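A common pitfall when feeding user input to a fitted classifier is passing a 1-D array where scikit-learn expects a 2-D array of shape (n_samples, n_features), with features in the same order as at training time. A minimal sketch on the iris dataset (the sample values are hypothetical user input):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
clf = RandomForestClassifier(random_state=0).fit(X, y)

# A single user-supplied sample must be reshaped to (1, n_features)
user_input = np.array([5.1, 3.5, 1.4, 0.2]).reshape(1, -1)
pred = clf.predict(user_input)
```

Any preprocessing applied to the training data (scaling, encoding) must be applied with the same fitted transformers to the user input as well.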

ValueError while computing Mean Poisson Deviance with the sklearn package

Question: I am trying to compute the mean Poisson deviance for the predictions that I got from a random forest regression using the metric implemented in sklearn. However, I got this error: ValueError: Mean Tweedie deviance error with power=2 can only be used on strictly …

Total answers: 1
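This error is raised because the Tweedie-family deviances require strictly positive predictions (and, for some powers, strictly positive targets). One common workaround, sketched here on hypothetical count data, is to clip predictions to a small positive epsilon before scoring:

```python
import numpy as np
from sklearn.metrics import mean_poisson_deviance

# Hypothetical counts and regression predictions
y_true = np.array([0, 2, 5, 1])
y_pred = np.array([0.0, 1.8, 4.9, 1.2])

# mean_poisson_deviance needs y_pred > 0; clip zeros to a small epsilon
y_pred = np.clip(y_pred, 1e-6, None)
d = mean_poisson_deviance(y_true, y_pred)
```

Whether clipping is appropriate depends on the model; zero predictions from a regressor meant to model a Poisson rate may also signal that a Poisson-aware estimator would be a better fit.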

How to plot a confusion matrix

Question: I am trying to evaluate my ResNet50 model with a confusion matrix, but the confusion matrix looks like this: matrix = confusion_matrix(y_test, y_pred, normalize="pred") print(matrix) # output array([[1, 0], [1, 2]], dtype=int64) I am using scikit-learn for generating the confusion matrix and tf.keras for making the model …

Total answers: 2
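scikit-learn ships a plotting helper for exactly this: ConfusionMatrixDisplay renders the matrix as a labelled heatmap instead of a raw array. A minimal sketch with hypothetical labels:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, for running in scripts
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

# Hypothetical binary labels and predictions
y_test = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]

cm = confusion_matrix(y_test, y_pred)
disp = ConfusionMatrixDisplay(confusion_matrix=cm)
disp.plot(cmap="Blues")
plt.savefig("confusion_matrix.png")
```

In recent scikit-learn versions, `ConfusionMatrixDisplay.from_predictions(y_test, y_pred)` collapses the first two steps into one call.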

Multiclass confusion matrix in python

Question: I’m trying to create a single multiclass confusion matrix in Python. df_true = pd.DataFrame({ "y_true": [0,0,1,1,0,2] }) df_pred = pd.DataFrame({ "y_pred": [0,1,2,0,1,2] }) And I want a single confusion matrix that tells me the actual and predicted value for each case. Like this: Asked By: juanmac || Source Answers: …

Total answers: 1
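Using the question's own data, `confusion_matrix` already handles the multiclass case in a single call; passing `labels` fixes the row/column order. A short sketch:

```python
import pandas as pd
from sklearn.metrics import confusion_matrix

df_true = pd.DataFrame({"y_true": [0, 0, 1, 1, 0, 2]})
df_pred = pd.DataFrame({"y_pred": [0, 1, 2, 0, 1, 2]})

# Rows are actual classes, columns are predicted classes, in label order
cm = confusion_matrix(df_true["y_true"], df_pred["y_pred"], labels=[0, 1, 2])
```

Wrapping `cm` in a DataFrame with the labels as index and columns gives a readable "actual vs. predicted" table.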

How to resolve "cannot import name '_MissingValues' from 'sklearn.utils._param_validation'" issue when trying to import imblearn?

Question: I am trying to import imblearn into my Python notebook after installing the required modules. However, I am getting the following error: Additional info: I am using a virtual environment in Visual Studio Code. I’ve made sure that venv …

Total answers: 3
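This import error typically comes from a version mismatch: imbalanced-learn imports private scikit-learn helpers (such as `_MissingValues`) that move or disappear between scikit-learn releases. A hedged sketch of the usual fix, assuming a pip-managed virtual environment:

```shell
# Inside the activated venv: upgrade both packages together so the
# installed imbalanced-learn release matches the scikit-learn release.
pip install --upgrade scikit-learn imbalanced-learn
```

If a specific scikit-learn version must be kept, pinning imbalanced-learn to a release that supports it is the alternative.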

Precision, Recall and F1 with Sklearn for a Multiclass problem

Question: I have a multiclass problem, where 0 is my negative class and 1 and 2 are positive. Check the following code: import numpy as np from sklearn.metrics import confusion_matrix from sklearn.metrics import ConfusionMatrixDisplay from sklearn.metrics import f1_score from sklearn.metrics import precision_score from sklearn.metrics import …

Total answers: 1
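When only some classes count as positive, the metric functions accept a `labels` list to restrict the averaging to those classes. A sketch with hypothetical labels, treating 1 and 2 as the positive classes and micro-averaging over them:

```python
from sklearn.metrics import precision_score, recall_score, f1_score

# Hypothetical multiclass labels; class 0 is the negative class
y_true = [0, 1, 2, 1, 0, 2]
y_pred = [0, 1, 1, 1, 0, 2]

# Restrict averaging to the positive classes 1 and 2
p = precision_score(y_true, y_pred, labels=[1, 2], average="micro")
r = recall_score(y_true, y_pred, labels=[1, 2], average="micro")
f = f1_score(y_true, y_pred, labels=[1, 2], average="micro")
```

`average="macro"` with the same `labels` argument gives the unweighted mean of the per-class scores instead of pooling the counts.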

Label encoding a pandas dataframe, same label for same value

Question: Here is a snippet of my df: 0 1 2 3 4 5 … 11 12 13 14 15 16 0 BSO PRV BSI TUR WSP ACP … HLR HEX HEX None None None 1 BSO PRV BSI TUR WSP ACP … HLF HLR HEX …

Total answers: 2
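To get the same integer for the same string in every column, one approach is to fit a single LabelEncoder on all values in the frame and then transform each column with it. A sketch on a hypothetical three-column frame (the real data's `None` entries would need to be filled or handled first):

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical frame with the question's code-like string values
df = pd.DataFrame({0: ["BSO", "PRV"], 1: ["PRV", "BSI"], 2: ["BSI", "BSO"]})

# Fit one encoder on every value so identical strings share one integer
le = LabelEncoder().fit(df.values.ravel())
encoded = df.apply(le.transform)
```

Fitting per column instead (the common mistake) would assign codes independently, so "BSO" could map to different integers in different columns.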

Why does adding duplicated features improve Logistic Regression accuracy?

Question: from sklearn.datasets import load_iris from sklearn.linear_model import LogisticRegression X, y = load_iris(return_X_y=True) for i in range(5): X_redundant = np.c_[X,X[:,:i]] # repeating redundant features print(X_redundant.shape) clf = LogisticRegression(random_state=0,max_iter=1000).fit(X_redundant, y) print(clf.score(X_redundant, y)) Output (150, 4) 0.9733333333333334 (150, 5) 0.98 (150, 6) 0.98 (150, 7) 0.9866666666666667 (150, 8) …

Total answers: 1
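A plausible explanation is regularization: scikit-learn's LogisticRegression applies an L2 penalty by default, and duplicating a feature lets its weight be split across two coefficients, shrinking the penalty paid for the same decision function. Weakening the penalty directly (a larger `C`) should therefore move the training score in the same direction, which this hypothetical sketch checks:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# Baseline (default C=1.0) vs. weaker regularization (C=10.0): duplicating
# features dilutes the effective L2 penalty much like raising C does.
base = LogisticRegression(random_state=0, max_iter=1000).fit(X, y).score(X, y)
weak = LogisticRegression(random_state=0, max_iter=1000, C=10.0).fit(X, y).score(X, y)
```

If that explanation holds, `weak` should be at least as high as `base` on the training data, mirroring the small gains seen when columns are duplicated.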