Index out of bounds when dropping rows in a DataFrame
Question:
I can't understand why I get the error "IndexError: index 159 is out of bounds for axis 0 with size 159" when dropping a list of rows from a DataFrame.
import pandas as pd
# Open the Excel file
xls = pd.ExcelFile(file_path)
# Parse the sheet, skipping the first 5 rows
df = xls.parse('Daten', skiprows=5, index_col=None, na_values=['NA'])
# Select the rows where the value in column "Punktrolle_SO" is not 'UK_Schwelle_Wehr_Blockrampe'
row_numbers = [x+1 for x in df[df['Punktrolle_SO'] != 'UK_Schwelle_Wehr_Blockrampe'].index]
# Shift the index so that it starts at 1 instead of 0
df.index = df.index + 1
# Drop the rows where the value is not 'UK_Schwelle_Wehr_Blockrampe'
dataframe = df.drop(df.index[row_numbers], inplace=True)
The list row_numbers contains the correct 156 values and the DataFrame index goes from 1 to 159, so why do I get an IndexError?
runfile('O:/GIS/GEP/Risikomanagement/Flussvermessung/ALD/Analyses/ReadMultileFilesInOne.py', wdir='O:/GIS/GEP/Risikomanagement/Flussvermessung/ALD/Analyses')
Traceback (most recent call last):
  File "O:\GIS\GEP\Risikomanagement\Flussvermessung\ALD\Analyses\ReadMultileFilesInOne.py", line 73, in <module>
    dataframe = df.drop(df.index[row_numbers], inplace=True)
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\range.py", line 708, in __getitem__
    return super().__getitem__(key)
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 3941, in __getitem__
    result = getitem(key)
IndexError: index 159 is out of bounds for axis 0 with size 159
Can anyone help me see what I am doing wrong?
Thank you,
Davide
I expect a DataFrame containing the rows of the Excel file where the value in the column "Punktrolle_SO" is equal to 'UK_Schwelle_Wehr_Blockrampe'.
Answers:
Wouldn't it be simpler to just keep the rows that contain UK_Schwelle_Wehr_Blockrampe, using:
df[df["Punktrolle_SO"].str.contains("UK_Schwelle_Wehr_Blockrampe")]
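A minimal sketch of this keep-instead-of-drop approach, run on made-up data (only the column name and the target value come from the question; the other cells are hypothetical). One thing to be aware of: str.contains does a substring match, so it can keep more rows than an exact comparison with ==, which is what the question literally asks for.

```python
import pandas as pd

# Toy stand-in for the Excel sheet; the rows are invented for illustration.
df = pd.DataFrame({
    "Punktrolle_SO": [
        "UK_Schwelle_Wehr_Blockrampe",
        "Other",
        None,
        "UK_Schwelle_Wehr_Blockrampe_alt",
    ],
    "value": [1.0, 2.0, 3.0, 4.0],
})

target = "UK_Schwelle_Wehr_Blockrampe"

# Exact match: the boolean mask keeps only rows equal to the target.
exact = df[df["Punktrolle_SO"] == target]

# Substring match: na=False treats missing cells as non-matches,
# regex=False makes the pattern a literal string.
contains = df[df["Punktrolle_SO"].str.contains(target, na=False, regex=False)]

print(list(exact.index))
print(list(contains.index))
```

Neither variant needs row numbers or a shifted index, so the IndexError cannot occur here.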
If the DataFrame has a size of 159, then the highest position is 158. This is because positions start at 0 instead of 1, and df.index[row_numbers] looks values up by position, not by label. You are trying to access a position one higher than the maximum.
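The same error can be reproduced on a small scale. This is only a sketch on a hypothetical 5-row frame, not the asker's data; it shows that square brackets on an Index select by position even after the labels have been shifted by one.

```python
import pandas as pd

df = pd.DataFrame({"value": range(5)})  # 5 rows, default labels 0..4
df.index = df.index + 1                 # labels are now 1..5

# Square brackets on an Index select by position, not by label.
first_label = df.index[0]  # position 0 holds label 1
last_label = df.index[4]   # position 4 holds label 5

try:
    # Position 5 does not exist in an index of size 5,
    # just as position 159 does not exist in one of size 159.
    df.index[[1, 5]]
    error_message = None
except IndexError as exc:
    error_message = str(exc)

print(first_label, last_label)
print(error_message)
```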
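The positions in the DataFrame do not go from 1 to 159; they go from 0 to 158, even after the index labels have been shifted. Thus position 159 is out of bounds. You need to offset your accesses by 1, or pass the labels to df.drop directly instead of going through df.index[...].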
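One possible fix, sketched on invented data (only the column name and target value are taken from the question): collect the labels of the unwanted rows and hand them straight to drop, without shifting the index. As far as the pandas documentation describes it, drop(..., inplace=True) returns None, so the sketch omits inplace and assigns the returned frame instead.

```python
import pandas as pd

# Hypothetical miniature of the sheet from the question.
df = pd.DataFrame({
    "Punktrolle_SO": [
        "UK_Schwelle_Wehr_Blockrampe",
        "A",
        "B",
        "UK_Schwelle_Wehr_Blockrampe",
    ],
})

# Labels (not positions) of the rows that should be removed.
mask = df["Punktrolle_SO"] != "UK_Schwelle_Wehr_Blockrampe"
labels_to_drop = df[mask].index

# drop() expects labels; without inplace=True it returns a new DataFrame
# and leaves df itself unchanged.
dataframe = df.drop(labels_to_drop)

print(list(dataframe.index))
```

If a 1-based index is wanted for display, it can be shifted after the drop, for example with dataframe.index = dataframe.index + 1.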