iloc giving 'IndexError: single positional indexer is out-of-bounds'
Question:
I am trying to encode some data to feed into a machine learning model, using the following code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as py
Dataset = pd.read_csv('filename.csv', sep = ',')
X = Dataset.iloc[:,:-1].values
Y = Dataset.iloc[:,18].values
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_X = LabelEncoder()
X[:, 0] = labelencoder_X.fit_transform(X[:, 0])
onehotencoder = OneHotEncoder(categorical_features = [0])
X = onehotencoder.fit_transform(X).toarray()
However, I am getting an error that reads:
IndexError: single positional indexer is out-of-bounds
Answers:
This error is caused by:
Y = Dataset.iloc[:,18].values
The index is out of bounds here, most probably because your Dataset has fewer than 19 columns, so the column at position 18 does not exist. The rest of the code you provided doesn't use Y at all, so you can just comment out this line for now.
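To confirm this, you can check how many columns the frame actually has before indexing. The sketch below uses a small made-up frame in place of filename.csv (the column names and values are assumptions, not the asker's data) and takes the target as the last column with -1 instead of a hard-coded position:

```python
import pandas as pd

# Hypothetical stand-in for the CSV in the question:
# three feature columns followed by one target column.
dataset = pd.DataFrame({
    "color": ["red", "blue", "red"],
    "size": [1, 2, 3],
    "weight": [0.5, 0.7, 0.9],
    "label": [0, 1, 0],
})

# shape is (rows, columns); valid column positions are 0 .. n_cols - 1.
n_rows, n_cols = dataset.shape
print(n_cols)

X = dataset.iloc[:, :-1].values  # every column except the last
y = dataset.iloc[:, -1].values   # the last column, wherever it sits

# A hard-coded position beyond the last column reproduces the error.
try:
    dataset.iloc[:, 18]
except IndexError as exc:
    error_message = str(exc)
    print(error_message)
```

If the target really is the last column, -1 keeps working even when the number of columns in the file changes.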
This happens when you index a row or column with a position that lies outside the dimensions of your dataframe, for instance, asking for the eleventh column when you have only three.
import pandas as pd
df = pd.DataFrame({'Name': ['Mark', 'Laura', 'Adam', 'Roger', 'Anna'],
'City': ['Lisbon', 'Montreal', 'Lisbon', 'Berlin', 'Glasgow'],
'Car': ['Tesla', 'Audi', 'Porsche', 'Ford', 'Honda']})
You have five rows and three columns:
    Name      City      Car
0   Mark    Lisbon    Tesla
1  Laura  Montreal     Audi
2   Adam    Lisbon  Porsche
3  Roger    Berlin     Ford
4   Anna   Glasgow    Honda
Let’s try to index the eleventh column (it doesn’t exist):
df.iloc[:, 10] # there is obviously no 11th column
IndexError: single positional indexer is out-of-bounds
If you are a beginner with Python, remember that positions are zero-based, so df.iloc[:, 10] refers to the eleventh column.
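If the position comes from elsewhere (a loop counter, a config value), a small bounds check avoids the exception. This is only a sketch: column_or_none is a hypothetical helper, not part of pandas. It relies on the fact that iloc accepts positions from -n to n - 1 for n columns:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Mark', 'Laura', 'Adam', 'Roger', 'Anna'],
                   'City': ['Lisbon', 'Montreal', 'Lisbon', 'Berlin', 'Glasgow'],
                   'Car': ['Tesla', 'Audi', 'Porsche', 'Ford', 'Honda']})


def column_or_none(frame, position):
    """Return the column at `position`, or None if that position does not exist."""
    n_cols = frame.shape[1]
    # iloc accepts -n_cols .. n_cols - 1; anything else is out of bounds.
    if -n_cols <= position < n_cols:
        return frame.iloc[:, position]
    return None


third = column_or_none(df, 2)      # the 'Car' column
eleventh = column_or_none(df, 10)  # None instead of an IndexError
```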
This does not solve the question here, but for anyone who arrives because of the error rather than the example: I got IndexError: single positional indexer is out-of-bounds when I tried to find a row in dataframe2 while looping over the rows of dataframe1, using many criteria in the filter on dataframe2, and adding each found row to a new, empty dataframe3 (do not ask me why!). One of the values in the row was NaN in both dataframe1 and dataframe2, so I could no longer filter on it or add a new row.
Solution:
# fillna returns a new dataframe, so assign the result back
dataframe1 = dataframe1.fillna("nan")  # or whatever you want as a fill value
dataframe2 = dataframe2.fillna("nan")
and the script ran through without the error.
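The original code for this case was not posted, so the following is only a guess at the mechanism, using made-up frames with a single key column. NaN never compares equal to NaN, so an equality filter on a NaN value matches nothing, and taking the first row of the empty result with iloc[0] raises the same error:

```python
import numpy as np
import pandas as pd

# Hypothetical data: the second key is missing in both frames.
dataframe1 = pd.DataFrame({"key": ["a", np.nan]})
dataframe2 = pd.DataFrame({"key": ["a", np.nan], "value": [1, 2]})

# NaN == NaN is False, so this filter returns an empty frame ...
matches = dataframe2[dataframe2["key"] == dataframe1.loc[1, "key"]]
print(len(matches))

# ... and asking for its first row is out of bounds.
try:
    matches.iloc[0]
except IndexError as exc:
    error_message = str(exc)
    print(error_message)

# After filling NaN with the same sentinel on both sides, the row is found.
filled1 = dataframe1.fillna("nan")
filled2 = dataframe2.fillna("nan")
filled_matches = filled2[filled2["key"] == filled1.loc[1, "key"]]
print(filled_matches.iloc[0]["value"])
```

Checking whether the filtered result is empty before calling iloc[0] is another way to avoid the exception when no fill value makes sense.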
This was a really good explanation! I was also trying to loop over a row that did not exist. Reduce the upper bound of the loop index (i) so that it stays within the number of rows, and it will work.