h2o | py4u

How to get proper feature importance information when using categorical features in H2O

Question: When I have categorical features in my dataset, H2O implicitly one-hot encodes them and starts the training process. When I call the summary method to see the feature importance, though, it treats each encoded categorical level as a separate feature. My question is that …

Total answers: 1

H2OTypeError: Argument `x` should be a None | integer | string | ModelBase | list(string | integer) | set(integer | string), got H2OFrame

Question: I am working with the Titanic dataset and did some basic preprocessing (normalization, one-hot encoding, etc.). Then I tried to use an H2O algorithm and got the following error: from h2o.estimators.gbm import H2OGradientBoostingEstimator …

Total answers: 1

I am getting an error while defining H2OContext in a Python Spark script

Question: Code: from pyspark.sql import SparkSession; from pysparkling import *; hc = H2OContext.getOrCreate(). I am using Spark standalone cluster 3.2.1 and trying to initialize H2OContext in a Python file. When I run the script with spark-submit, I get the following error: hc = H2OContext.getOrCreate() …

Total answers: 1

Retrieve cross validation performance (AUC) on h2o AutoML for holdout dataset

Question: I am training a binary classification model with h2o AutoML using the default cross-validation (nfolds=5). I need to obtain the AUC score for each holdout fold in order to compute the variability. This is the code I am using: h2o.init() prostate = h2o.import_file("https://h2o-public-test-data.s3.amazonaws.com/smalldata/prostate/prostate.csv") …

Total answers: 2

h2o frame from pandas casting

Question: I am using h2o to perform predictive modeling from Python. I have loaded some data from a CSV using pandas, specifying some column types: dtype_dict = {'SIT_SSICCOMP':'object', 'SIT_CAPACC':'object', 'PTT_SSIRMPOL':'object', 'PTT_SPTCLVEI':'object', 'cap_pad':'object', 'SIT_SADNS_RESP_PERC':'object', 'SIT_GEOCODE':'object', 'SIT_TIPOFIRMA':'object', 'SIT_TPFRODESI':'object', 'SIT_CITTAACC':'object', 'SIT_INDIRACC':'object', 'SIT_NUMCIVACC':'object'}; date_cols = ["SIT_SSIDTSIN", "SIT_SSIDTDEN", "PTT_SPTDTEFF", "PTT_SPTDTSCA", "SIT_DTANTIFRODE", "PTT_DTELABOR"]; columns_to_drop = ['SIT_TPFRODESI', 'SIT_CITTAACC', 'SIT_INDIRACC', 'SIT_NUMCIVACC', 'SIT_CAPACC', 'SIT_LONGITACC', …

Total answers: 2