Import all sheets from multiple Excel files and append them into one dataframe in Python
Question:
I have three .xls files: the first has 2 sheets, the second has 9 sheets, and the third has 11 sheets. I need to read all of these files and all of their sheets. The method I know is not efficient.
What I have tried:
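import os
import pandas as pd

df = pd.DataFrame()
for file_name in os.listdir(file_path):
    # Skip the temporary lock files Excel creates for open workbooks
    if '~$' in file_name:
        continue
    else:
        xls = pd.ExcelFile(os.path.join(file_path, file_name))
        file1 = xls.parse(0)
        file2 = xls.parse(1)
        # pd.append does not exist; pd.concat joins the frames instead
        file3 = pd.concat([file1, file2])
        df = pd.concat([df, file3])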
How can I make this dynamic, so that I don’t have to specify variables like file1, file2, and so on?
Answers:
Create an empty dictionary, store every sheet from every file in it, and then combine them all into one dataframe.
This way you don't have to create file1, file2, and so on each time.
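import os
import pandas as pd

# file_path is the folder that holds the workbooks, as in the question
d = {}
for file_name in os.listdir(file_path):
    # Skip the temporary lock files Excel creates for open workbooks
    if '~$' in file_name:
        continue
    xls = pd.ExcelFile(os.path.join(file_path, file_name))
    for sheet in xls.sheet_names:
        # Key by (file, sheet) so same-named sheets in different files are not overwritten
        d[(file_name, sheet)] = xls.parse(sheet)

# pd.append does not exist and DataFrame.append was removed in pandas 2.0,
# so concatenate all sheets in a single call
df = pd.concat(list(d.values()), ignore_index=True)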
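As a possible variation on the dictionary approach above, pd.read_excel can return every sheet of a workbook at once when called with sheet_name=None, which gives a dict mapping sheet names to dataframes. The sketch below wraps that in a helper. The names read_all_sheets, folder, reader, source_file and source_sheet are made up for illustration and do not come from the question or from pandas. It also assumes an Excel engine is installed (pandas typically relies on xlrd for .xls and openpyxl for .xlsx).

```python
from pathlib import Path

import pandas as pd


def read_all_sheets(folder, reader=pd.read_excel):
    """Read every sheet of every workbook in `folder` into one DataFrame.

    `reader` is a hypothetical hook (defaulting to pd.read_excel) so the
    loading step can be swapped out, for example in tests.
    """
    frames = []
    for path in sorted(Path(folder).iterdir()):
        # Skip Excel's temporary lock files and anything that is not a workbook
        if path.name.startswith("~$") or path.suffix.lower() not in {".xls", ".xlsx"}:
            continue
        # sheet_name=None asks pandas for a dict of {sheet name: DataFrame}
        sheets = reader(path, sheet_name=None)
        for sheet_name, frame in sheets.items():
            # Tag each row with its origin so it stays traceable after combining
            frames.append(
                frame.assign(source_file=path.name, source_sheet=sheet_name)
            )
    if not frames:
        # Nothing to combine: return an empty frame instead of letting concat raise
        return pd.DataFrame()
    # A single concat at the end avoids re-copying a growing DataFrame in a loop
    return pd.concat(frames, ignore_index=True)
```

With the question's setup this would be called as df = read_all_sheets(file_path). The two extra columns are optional; they are only there because sheets from different files often share names, and dropping them is a one-line change.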