Loop function to rename dataframes

Question:

I am new to coding and want to create an individual dataframe from each Excel tab. Searching this forum got me most of the way (I found a sample that uses a dictionary), but I need one more step that I can’t figure out.

This is the code I am using:

import pandas as pd

excel = 'sample.xlsx'

xls = pd.ExcelFile(excel)
d = {}
for sheet in xls.sheet_names:
    print(sheet)
    d[f'{sheet}'] = pd.read_excel(xls, sheet_name=sheet)

Let’s say I have 3 Excel tabs called ‘alpha’, ‘beta’ and ‘charlie’.
The code above gives me 3 dataframes, which I can access by typing d['alpha'], d['beta'] and d['charlie'].
What I want is to rename the dataframes so that instead of typing (for example) d['alpha'], I can just write alpha (without any extras).

Edit: The Excel file I want to parse has 50+ tabs, and that number can grow.
Edit 2: Thank you all for the links and the answers! They are a great help.

Asked By: E_Ri

||

Answers:

You need to create variables which correspond to the three dataframes:

alpha, beta, charlie = d.values()
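
One caveat with this unpacking: the names on the left are matched to the dictionary's values purely by position (insertion order), and the counts must agree exactly. A minimal sketch follows, with plain strings standing in for the dataframes; the stand-in values are an assumption made only so the snippet runs without an Excel file.

```python
# Stand-in for the dictionary of dataframes; strings are used only so the
# example runs without an Excel file.
d = {"alpha": "df_alpha", "beta": "df_beta", "charlie": "df_charlie"}

# Values come out in insertion order, so the names line up by position.
alpha, beta, charlie = d.values()

# Adding a fourth sheet breaks the hard-coded unpacking.
d["delta"] = "df_delta"
try:
    alpha, beta, charlie = d.values()
    unpack_error = None
except ValueError as exc:
    unpack_error = str(exc)

print(unpack_error)
```

So this form only suits a workbook whose tabs are fixed and few.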

Edit:

Since you mentioned that the Excel file could have 50+ tabs and could grow, you may prefer to do this inside your original loop. This can be done dynamically using exec, provided the code runs at module level and every sheet name is a valid Python identifier:

import pandas as pd

excel = 'sample.xlsx'

xls = pd.ExcelFile(excel)
for sheet in xls.sheet_names:
    print(sheet)
    # Builds and runs e.g. "alpha = pd.read_excel(xls, sheet_name=sheet)"
    exec(f"{sheet} = pd.read_excel(xls, sheet_name=sheet)")

It might be better practice, however, to store the sheets in a list and access them by position. A collection of 50+ sheets is probably easier to manage that way:

d = []
for sheet in xls.sheet_names:
    print(sheet)
    d.append(pd.read_excel(xls, sheet_name=sheet))

# d[0] is alpha, d[1] is beta, and so on...
Answered By: luke

I think you are looking for the built-in exec function, which executes a string as Python code.
However, I do not recommend using exec; why it should be avoided, or at least used cautiously, has been discussed widely.

I do not have your data, so this is untested, but I think the following code achieves it:

import pandas as pd

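excel = 'sample.xlsx'
xls = pd.ExcelFile(excel)

for sheet in xls.sheet_names:
    print(sheet)
    # {sheet!r} inserts the sheet name as a quoted string literal;
    # a bare {sheet} would be read as an undefined variable name.
    code_to_execute = f'{sheet} = pd.read_excel(xls, sheet_name={sheet!r})'
    exec(code_to_execute)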

But again, I stress that this is not the cleanest way to do it. Your approach is definitely cleaner; I would always use dicts for this kind of assignment. See here for more about exec.

In general, exec takes a string of Python code and runs it:

possible_string = 'a=10'
exec(possible_string)
print(a) # 10
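
Part of why exec is fragile here is that the sheet name becomes a variable name, and Excel allows tab names that Python does not. Below is a minimal sketch of a check that uses only the standard library; the helper name is_usable_variable_name and the sample sheet names are made up for illustration.

```python
import keyword


def is_usable_variable_name(name: str) -> bool:
    """Return True if `name` could appear on the left of an assignment."""
    return name.isidentifier() and not keyword.iskeyword(name)


# Hypothetical sheet names, chosen to show typical failure cases.
sheet_names = ["alpha", "Q1 sales", "2023", "class", "beta-2"]

unusable = [name for name in sheet_names if not is_usable_variable_name(name)]
print(unusable)
```

Each name in unusable would make the exec-based loop raise a SyntaxError, while all of them work fine as dictionary keys. Note that a sheet called pd would pass this check and then silently overwrite the pandas import.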
Answered By: ko3

Don’t rename them.

I can think of two scenarios here:

1. The sheets are fundamentally different

When people ask how to dynamically assign to variable names, the usual (and best) answer is "Use a dictionary". Here’s one example.

Indeed, this is the reason Pandas does it this way!

In this case, I think your best move is to do nothing and just use the dictionary you already have.
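
To illustrate why the dictionary is already enough, even with 50+ tabs, here is a small sketch in which two in-memory dataframes stand in for the sheets (the data and the column name x are invented for the example). pandas can also build such a dictionary in one call: pd.read_excel(path, sheet_name=None) returns a dict mapping sheet names to dataframes.

```python
import pandas as pd

# Invented stand-ins for the sheets read from the workbook.
d = {
    "alpha": pd.DataFrame({"x": [1, 2, 3]}),
    "beta": pd.DataFrame({"x": [10, 20]}),
}

# One loop handles every sheet, however many tabs are added later.
row_counts = {name: len(df) for name, df in d.items()}
print(row_counts)

# If a short name is convenient for one sheet, bind it explicitly.
alpha = d["alpha"]
print(alpha["x"].sum())
```

Binding a name by hand for the one or two sheets you work with most keeps the code readable without generating variables dynamically.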

2. The sheets are roughly the same

If the sheets are all basically the same, and only differ by one attribute (e.g. they represent monthly sales and the names of the sheets are ‘May’, ‘June’, etc), then your best move is to combine them into a single dataframe, adding a column that records the sheet name (month, in my example).
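
One way to do that combining, sketched with pd.concat and DataFrame.assign. The two monthly dataframes and their columns (product, sales) are invented here; in practice d would be the dictionary of sheets you already have.

```python
import pandas as pd

# Invented monthly sheets that share the same columns.
d = {
    "May": pd.DataFrame({"product": ["a", "b"], "sales": [10, 20]}),
    "June": pd.DataFrame({"product": ["a", "b"], "sales": [15, 25]}),
}

# Tag each sheet's rows with the sheet name, then stack them.
combined = pd.concat(
    [df.assign(month=name) for name, df in d.items()],
    ignore_index=True,
)
print(combined)

# An individual month is still easy to get back by filtering.
june = combined[combined["month"] == "June"]
```

With everything in one dataframe, per-month work becomes a filter or a groupby on the month column instead of a lookup across 50+ separate objects.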

Whatever you do, don’t use exec or eval, no matter what anyone tells you. They are not options for beginner programmers.

Answered By: Matt Hall