What am I doing wrong? I am very new to Python
Question:
I am trying to run this
final_df = pd.DataFrame() #empty dataframe
for csv_file in file_list:
    df = pd.read_csv(csv_file)
    csv_file_name = csv_file.split('\\')[7]
    print('Processing File : {}'.format(csv_file_name))
    df.columns = df.columns.str.replace(' ', '')
    df['TIMESTAMP'] = pd.to_datetime(df['TIMESTAMP'])
    df.set_index(['TIMESTAMP'], inplace=True)
    if 'Unnamed:13' in df.columns:
        df.drop(['Unnamed:13'], axis=1, inplace=True)
    df_trim = df.apply(lambda x: x.str.strip() if x.dtype == 'object' else x)
    new_df = df_trim[df_trim['SERIES'].isin(['EQ', 'BE', 'SM'])]
    final_df = final_df.append(new_df)
final_df.sort_index(inplace=True) #to sort by dates
========================================
Getting error
IndexError Traceback (most recent call last)
Cell In[17], line 5
      3 for csv_file in file_list:
      4     df = pd.read_csv(csv_file)
----> 5     csv_file_name = csv_file.split('\\')[7]
      6     print('Processing File : {}'.format(csv_file_name))
      7     df.columns = df.columns.str.replace(' ', '')
IndexError: list index out of range
What am I doing wrong? What does this error mean?
Answers:
csv_file_name = csv_file.split('\\')[7]
IndexError: list index out of range
This means that the list csv_file.split('\\') has fewer than 8 items, so 7 is out of the range of valid indexes.
csv_file.split('\\') turns a string into a list, broken up at the delimiter \, so this\is\a\string turns into ['this', 'is', 'a', 'string']. If this list has fewer than 8 items, there must be fewer than 7 backslashes in the string.
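To make the counting concrete, here is a minimal sketch. The path is made up for illustration (the real contents of file_list are not shown in the question); the point is only that a string with 2 backslashes splits into 3 items, so index 7 does not exist.

```python
# Hypothetical Windows-style path, used only for illustration.
path = 'C:\\data\\file.csv'

parts = path.split('\\')
print(parts)       # ['C:', 'data', 'file.csv']
print(len(parts))  # 3 items, so the valid indexes are 0, 1 and 2

try:
    parts[7]       # same kind of access as csv_file.split('\\')[7]
    raised = False
except IndexError as exc:
    raised = True
    print('IndexError:', exc)
```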
You are trying to access the 8th element (index 7 + 1, as lists start at index 0 in Python) of the list csv_file.split('\\'). Based on the error (IndexError: list index out of range), this list does not have 8 elements.
If I were you, I would check what csv_file.split('\\') looks like, either:
- by adding a breakpoint and checking with a debugger
- by adding a print(csv_file.split('\\')) just before the line that raises the error
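The print approach could look roughly like the sketch below. The entries in file_list here are invented for the example; with the real list, any entry that reports fewer than 8 parts is one that would fail at index 7.

```python
# Hypothetical stand-in for the real file_list from the question.
file_list = ['C:\\data\\2023\\prices.csv', 'prices.csv']

part_counts = []
for csv_file in file_list:
    parts = csv_file.split('\\')
    part_counts.append(len(parts))
    # Show how many pieces each path splits into, and what they are.
    print(len(parts), parts)
```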
It seems that after splitting the string into a list, you are asking for an element beyond the end of the list. So, the solution can be (if you want to extract the last item):
csv_file_name = csv_file.split('\\')[len(csv_file.split('\\'))-1]
IndexError: list index out of range
This error means you are trying to reach an element of a list that does not exist, e.g.
some_array = [ "a", "b", "c" ]
len(some_array) will show 3, but as lists are 0-based, the last element is some_array[2], which is "c".
In your case you have split csv_file, which I can only assume is a full path to a file, e.g. /my/path/to/some/file.txt.
The problem is that you have asked for the hard-coded index 7, csv_file.split('\\')[7]. Hardcoding is VERY bad practice because it is fragile. For at least one of the paths there are fewer than 8 elements, hence your error. You could get the length of the list, or use [-1] to get the last element.
However, if all you want is the filename, then there are much better Python functions to get it, e.g.
import os
print(os.path.basename(your_path))
The -1 index will return the last element, in your case the file name.
However, on Linux, paths use forward slashes (/), so it's better to use os.path.basename to get the file name.
import os
print(os.path.basename(csv_file))  # prints e.g. file_name.file_extension
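One caveat worth sketching: os.path.basename follows the separator rules of the operating system the script runs on, so a backslash-separated path is only split as expected on Windows. If the paths are Windows-style but the code may run elsewhere, the standard library's ntpath module or pathlib.PureWindowsPath apply Windows rules on any platform. The path below is made up for illustration.

```python
import ntpath
from pathlib import PureWindowsPath

# Hypothetical Windows-style path, used only for illustration.
csv_file = 'C:\\data\\2023\\prices.csv'

# Both of these apply Windows path rules regardless of the host OS.
name_ntpath = ntpath.basename(csv_file)
name_pathlib = PureWindowsPath(csv_file).name

print(name_ntpath)   # prices.csv
print(name_pathlib)  # prices.csv
```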