How do I create several DataFrames for each sheet in an Excel file?

Question:

I have an Excel file with several sheets. I would like to iterate over each sheet of the Excel file and create a DataFrame for each sheet in the file. Also, I would like the DataFrames to be named “hfi_(name of the sheet)”. This is the for loop I wrote so far, but it is not running due to a syntax error. Would you help me solve this?

for sheet in hfi_file:    
  hfi_file=pd.read_excel("/Users/glosophy/Desktop/hfi/hfi_2016.xlsx",sheet_name=sheet)
return str("hfi"+sheet)=pd.DataFrame(hfi_file)
Asked By: Guillermina

||

Answers:

I will recommend store those dataframes into a dict

d={"hfi" + sheet:pd.DataFrame(pd.read_excel("/Users/glosophy/Desktop/hfi/hfi_2016.xlsx", sheet_name=sheet)) for sheet in hfi_file}

Then you can using .get to access the data

d.get('your dict key')
Answered By: BENY

You need to import the Excel file first, then read each sheet by name

multi_sheet_file = pd.ExcelFile("/Users/glosophy/Desktop/hfi/hfi_2016.xlsx")

# Gets the sheet names in the file
excel_sheet_names = multi_sheet_file.sheet_names

dict_of_sheets = {}
for sheet in excel_sheet_names:
    dict_of_sheets["hfi" + sheet] = pd.read_excel(multi_sheet_file, sheetname=sheet')
Answered By: ZaxR

I’m clearly slow on the draw, but here’s another answer:

import pandas as pd

# importing these so sessionInfo comes up for them
import openpyxl
import xlrd

file = 'output.xlsx'
df1 = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})
df2 = pd.DataFrame({'sheet2-col1': [0, 1], 'sheet2-col2': [30, 40]})

writer = pd.ExcelWriter(file)
df1.to_excel(writer, 'Sheet1')
df2.to_excel(writer, 'Sheet2')
writer.save()

reader = pd.ExcelFile(file)
dict = {"hfi" + i : pd.read_excel(file, sheet_name=i) for i in reader.sheet_names}
print(dict)
#> {'hfiSheet1':    col1  col2
#> 0     1     3
#> 1     2     4, 'hfiSheet2':    sheet2-col1  sheet2-col2
#> 0            0           30
#> 1            1           40}

Created on 2018-08-30 by the reprexpy package

import reprexpy
print(reprexpy.SessionInfo())
#> Session info --------------------------------------------------------------------
#> Platform: Darwin-17.7.0-x86_64-i386-64bit (64-bit)
#> Python: 3.6
#> Date: 2018-08-30
#> Packages ------------------------------------------------------------------------
#> openpyxl==2.5.6
#> pandas==0.23.4
#> reprexpy==0.1.1
#> xlrd==1.1.0
Answered By: Chris

Here is the easiest way I can share with you:

# read the sheet file
import pandas as pd
sheets = pd.ExcelFile('sheet_filename.xlsx')

# save each sheet to a dataframe
df1 = pd.read_excel(sheets, 'Sheet1')
df2 = pd.read_excel(sheets, 'Sheet2')

Hope this helped you!

Answered By: Jade Cacho
# Read the excel file
filename = 'yourpath/yourfile.xlsx'

df_raw = pd.read_excel(filename)

# iterate through the sheets in the file
for sheet in df_raw.sheet_names:
    sheet = pd.read_excel(filename, sheet_name=sheet)
Answered By: makaraduman
Categories: questions Tags: , , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.