machine-learning

Trying to apply fit_transform() function from sklearn.compose.ColumnTransformer class on array but getting "tuple index out of range" error

Question: I am a beginner in ML/AI and am trying to do preprocessing on a dataset of digits that I made myself. I want to apply OneHotEncoding to my categorical variable (which is a dependent one, I don't know if it is …

Total answers: 1
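
The excerpt above is truncated, so the actual cause is not visible here. One common way to hit "tuple index out of range" with ColumnTransformer is passing a 1-D array, since it expects 2-D input with a column axis. A minimal sketch under that assumption follows; the label values and the step name "onehot" are hypothetical, and newer scikit-learn versions may report a different error message for 1-D input.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical 1-D labels standing in for the digit classes.
y = np.array([3, 1, 3, 7])

# ColumnTransformer selects columns by index, so the input needs a
# column axis: reshape (4,) into (4, 1).
y_2d = y.reshape(-1, 1)

# sparse_threshold=0 asks ColumnTransformer to always return a dense array.
ct = ColumnTransformer([("onehot", OneHotEncoder(), [0])], sparse_threshold=0)
encoded = ct.fit_transform(y_2d)

print(encoded.shape)  # one column per distinct label
```

Whether one-hot encoding the dependent variable is needed at all depends on the model; many scikit-learn classifiers accept integer class labels directly.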

Does the preprocessing of one algorithm change the conditions of the experiment?

Question: As an example, we have two algorithms that use the same dataset and the same train and test data: 1 – uses k-NN and returns the accuracy; 2 – applies preprocessing before k-NN and adds a few more things before returning the accuracy. …

Total answers: 1

Housing Data Set Not Able to Load From 'Hands-On Machine Learning'

Question: I have followed other solutions posted on Stack Overflow for loading the housing dataset, which mostly involved calling 'fetch_housing_data()' as well. However, even after I do that, I still get a FileNotFoundError indicating that there is no …

Total answers: 1

How to use MLflow in a functional style / functional programming?

Question: Is there a reliable way to use MLflow in a functional style? As it is not possible to pass the run ID, for example, to the function which logs a parameter, I wonder whether it is possible to separate code executed in my …

Total answers: 2

Plot nlargest is showing the inverse output

Question: I am trying to plot the feature importances generated by a random forest using the code below. However, the largest values are shown at the bottom, and I want them at the top. feat_importances = pd.Series(g_search.best_estimator_.feature_importances_, index=X_train.columns) feat_importances.nlargest(20).plot(kind='barh') You can see the graph below that …

Total answers: 1
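
A horizontal bar plot draws the first row at the bottom, and nlargest returns values in descending order, which is why the largest bar ends up lowest. A minimal sketch of one possible fix, sorting the selected values ascending before plotting, is shown below. The importance values and feature names are made up for illustration and stand in for g_search.best_estimator_.feature_importances_ and X_train.columns.

```python
import pandas as pd

# Hypothetical importances; not taken from the original question.
feat_importances = pd.Series(
    [0.10, 0.40, 0.05, 0.30, 0.15],
    index=["a", "b", "c", "d", "e"],
)

# Take the top 3, then sort ascending so the largest value is the last
# row, which barh places at the top of the chart.
top = feat_importances.nlargest(3).sort_values(ascending=True)
print(list(top.index))

# Plotting step (requires matplotlib), left commented out here:
# top.plot(kind="barh")
```

Calling invert_yaxis() on the Axes returned by the original plot call would be an alternative that leaves the data order untouched.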

Can a parquet file exceed 2.1GB?

Question: I’m having an issue storing a large dataset (around 40GB) in a single parquet file. I’m using the fastparquet library to append pandas.DataFrames to this parquet dataset file. The following is a minimal example program that appends chunks to a parquet file until it crashes as the file size …

Total answers: 1

Memory Error when parsing a large number of files

Question: I am parsing 6k CSV files to merge them into one. I need this for their joint analysis and for training an ML model. There are too many files, and my computer ran out of memory by simply concatenating them. S = '' for f …

Total answers: 1
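
Building one large string or list in memory is what tends to exhaust RAM in a merge like this. A minimal sketch of a streaming alternative follows, assuming every file shares the same header row. The function name merge_csv_files and its parameters are hypothetical and not from the original question.

```python
import csv

def merge_csv_files(paths, out_path):
    """Stream rows from each CSV into out_path, writing the header once.

    Only one row is held in memory at a time. Returns the number of
    data rows written (header excluded).
    """
    rows_written = 0
    header = None
    with open(out_path, "w", newline="") as out_file:
        writer = csv.writer(out_file)
        for path in paths:
            with open(path, newline="") as in_file:
                reader = csv.reader(in_file)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip completely empty files
                if header is None:
                    header = file_header
                    writer.writerow(header)
                elif file_header != header:
                    # Assumption: all inputs must share the same columns.
                    raise ValueError(f"header mismatch in {path}")
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

The merged file can then be read in pieces for analysis, for instance with the chunksize parameter of pandas.read_csv, so the full dataset never has to sit in memory at once.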

Can spaCy's text categorizer learn the logic of recognizing two words in order?

Question: I’m trying to determine if spaCy’s text categorizer can learn a simple logic to detect the presence of two consecutive words in order: "jhon died". After training, for this experiment, the only results that matter are the output for the same …

Total answers: 1

Linear regression for time series

Question: I am pretty new to machine learning and have some confusion, so sorry for the trivial question. I have a very simple time series dataset with two columns – Date and Price. I’m predicting the price and want to add some features to my model, like a moving average for the last …

Total answers: 2

AxisError: axis -1 is out of bounds

Question: I have already referred to the posts here, here and here. I am trying to run a LassoCV model and fit it on my training dataset. So, I tried the code below (this works): from numpy import arange from sklearn.linear_model import LassoCV from sklearn.model_selection import RepeatedKFold # define model evaluation method …

Total answers: 1