Using a context manager to read Excel files in Python

Question:

I’m currently using the following line to read Excel files:

df = pd.read_excel(f"myfile.xlsx")

The problem is the enormous slowdown that occurs when I use data from this Excel file, for example inside functions. I think this happens because I’m not reading the file via a context manager. Is there a way of combining a ‘with’ statement with the pandas ‘read’ function so the code runs more smoothly? Sorry that this is vague; I’m just learning about context managers.

Edit: Here is an example of a piece of code that does not run…

import pandas as pd
import numpy as np

def fetch_excel(x):
    df_x = pd.read_excel(f"D00{x}_balance.xlsx")
    return df_x

T = np.zeros(3000)

for i in range(0, 3000):
    T[i] = fetch_excel(1).iloc[i+18, 0]

print(fetch_excel(1).iloc[0,0])

…or it takes more than 5 minutes, which seems excessive to me. In any case, I can’t work with a delay like that. If I comment out the for loop, the code does work.

Asked By: In the blind


Answers:

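Usually the main reason to use a standard context manager when reading files is convenience: it opens the underlying file descriptor and guarantees that it is closed again afterwards. You can create context managers to do anything you’d like, though. They’re just objects that define __enter__ and __exit__ methods (or generator functions wrapped with contextlib.contextmanager).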

Unfortunately, they aren’t likely to solve the problem of slow loading times when reading in your Excel file.
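
To illustrate why, here is a minimal sketch of a hand-written context manager. The class ManagedResource and its read method are hypothetical names invented for this example, not part of pandas. All the ‘with’ statement does is run __enter__ before the block and __exit__ after it; the work inside the block takes exactly as long as it would without it.

```python
class ManagedResource:
    """Hypothetical resource that records when it is opened and closed."""

    def __init__(self, log):
        self.log = log

    def __enter__(self):
        # Setup step: runs once, when the with-block is entered.
        self.log.append("open")
        return self

    def __exit__(self, exc_type, exc, tb):
        # Teardown step: runs once on exit, even if the block raised.
        self.log.append("close")
        return False  # do not suppress exceptions

    def read(self):
        # Stand-in for the real work; the context manager does not speed it up.
        self.log.append("read")
        return "data"


events = []
with ManagedResource(events) as res:
    value = res.read()

print(events)  # ['open', 'read', 'close']
```

The context manager only brackets the work with setup and cleanup, so wrapping a slow read in ‘with’ leaves it just as slow.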

Answered By: Mike L

You are accessing the disk and opening, reading and converting the SAME file, D001_balance.xlsx, 3000 times, each time to fetch a single value (a different row each time, from 18 to 3017). This is pointless, as all the data is in the DataFrame after one read. Just use:

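import numpy as np
import pandas as pd

# Read the file once; every later lookup uses the in-memory DataFrame.
df_x = pd.read_excel("D001_balance.xlsx")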

T = np.zeros(3000)

for i in range(0, 3000):
    T[i] = df_x.iloc[i+18, 0]

print(df_x.iloc[0,0])
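
Once the data is in memory, the loop itself can also be replaced by a single slice. The sketch below uses a made-up in-memory DataFrame as a stand-in for the result of pd.read_excel, and the column name "balance" is an assumption for illustration only, so it runs without the actual file.

```python
import numpy as np
import pandas as pd

# Stand-in for pd.read_excel("D001_balance.xlsx"); the real file is assumed
# to have at least 3018 rows. The column name "balance" is hypothetical.
df_x = pd.DataFrame({"balance": np.arange(3100, dtype=float)})

# Rows 18..3017 of the first column, taken in one slice instead of a loop.
T = df_x.iloc[18:3018, 0].to_numpy()

print(T.shape)  # (3000,)
```

This should give the same T as the loop above, provided the first column is numeric.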
Answered By: user19077881