'(slice(None, None, None), 0)' is an invalid key

Question:

I’m writing code to implement k-fold cross-validation.

data = pd.read_csv('Data_assignment1.csv')
k=10

np.random.shuffle(data.values)  # Shuffle all rows
folds = np.array_split(data, k) # split the data into k folds

for i in range(k):
    x_cv = folds[i][:, 0]  # Set ith fold for testing
    y_cv = folds[i][:, 1]
    new_folds = np.row_stack(np.delete(folds, i, 0)) # Remove ith fold for training
    x_train = new_folds[:, 0]  # Set the remaining folds for training
    y_train = new_folds[:, 1]

When trying to set the values for x_cv and y_cv, I get the error:

TypeError: '(slice(None, None, None), 0)' is an invalid key      

In an attempt to solve this, I tried using folds.iloc[i][:, 0].values etc:

for i in range(k):
    x_cv = folds.iloc[i][:, 0].values  # Set ith fold for testing
    y_cv = folds.iloc[i][:, 1].values
    new_folds = np.row_stack(np.delete(folds, i, 0)) # Remove ith fold for training
    x_train = new_folds.iloc[:, 0].values  # Set the remaining folds for training
    y_train = new_folds.iloc[:, 1].values

I then got the error:

AttributeError: 'list' object has no attribute 'iloc'  

How can I get around this?

Asked By: sabrina

||

Answers:

  1. folds = np.array_split(data, k) returns a list of DataFrames.
  2. type(folds) == list
  3. This is why you got AttributeError: 'list' object has no attribute 'iloc'.
    List objects don't have an iloc attribute.
  4. So you first need to index the list to get each DataFrame object: folds[i].
  5. type(folds[i]) == pandas.DataFrame
  6. Now use iloc on the DataFrame object. Plain folds[i][:, 0] raises the original TypeError because [] on a DataFrame looks up column labels, not NumPy-style positions.
  7. folds[i].iloc[:, 0].values
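Putting the steps above together, the loop from the question could look roughly like the sketch below. The small DataFrame is a made-up stand-in for Data_assignment1.csv, and a few things are my own substitutions rather than part of the original code: rows are shuffled with DataFrame.sample (np.random.shuffle(data.values) may act on a copy, depending on dtypes and pandas version), the split is done on row positions instead of passing the DataFrame itself to np.array_split, and pd.concat replaces the np.row_stack/np.delete combination for building the training set.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for pd.read_csv('Data_assignment1.csv'):
# two numeric columns, 20 rows.
data = pd.DataFrame({"x": np.arange(20.0), "y": np.arange(20.0) * 2})
k = 10

# Shuffle all rows and get a new DataFrame back (assumption: this replaces
# np.random.shuffle(data.values), which may shuffle only a copy).
data = data.sample(frac=1, random_state=0).reset_index(drop=True)

# Split the row positions into k chunks, then build a plain Python list
# of DataFrames, one per fold.
position_chunks = np.array_split(np.arange(len(data)), k)
folds = [data.iloc[chunk] for chunk in position_chunks]

for i in range(k):
    # folds is a list, so index it first; the result is a DataFrame,
    # which supports positional access through .iloc.
    x_cv = folds[i].iloc[:, 0].values
    y_cv = folds[i].iloc[:, 1].values

    # Every fold except the i-th one forms the training set.
    train = pd.concat(folds[:i] + folds[i + 1:])
    x_train = train.iloc[:, 0].values
    y_train = train.iloc[:, 1].values
```

Inside the loop, x_cv/y_cv and x_train/y_train are ordinary NumPy arrays, so they can be passed to whatever model is being validated.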

I hit the same error when plotting with matplotlib and solved it by using
plt.scatter(X.iloc[:,0], X.iloc[:,1], s=40, c=y, cmap='winter')
instead of:
plt.scatter(X[:,0], X[:,1], s=40, c=y, cmap=plt.cm.BuGn)
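The same point can be checked without matplotlib. This is a minimal sketch with a made-up three-row DataFrame X: NumPy-style X[:, 0] fails on a DataFrame, while .iloc (or converting to a NumPy array first) works. The exact exception type raised by X[:, 0] seems to vary between pandas and Python versions, so the sketch catches Exception broadly instead of assuming TypeError.

```python
import pandas as pd

# Hypothetical data; any DataFrame with at least two columns would do.
X = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]})

# NumPy-style indexing is treated as a column-label lookup and fails.
try:
    X[:, 0]
    numpy_style_failed = False
except Exception as exc:  # exception type depends on the pandas version
    numpy_style_failed = True
    error_name = type(exc).__name__

# Option 1: positional indexing through .iloc returns a pandas Series.
first_column = X.iloc[:, 0]

# Option 2: convert once to a NumPy array, then use [:, 0] as usual.
values = X.to_numpy()
second_column = values[:, 1]
```

Either option gives something plt.scatter accepts; .iloc keeps the pandas index, while to_numpy() drops it.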

Answered By: Amir shahcheraghian