Error concatenating specific sheet from multiple workbooks into one df

Question:

I am trying to extract a specific sheet from each of about 300 Excel workbooks and combine them into a single DataFrame.

I have tried this code:

import pandas as pd
import glob
import openpyxl
from openpyxl import load_workbook

pd.set_option("display.max_rows", 100, "display.max_columns", 100)
allexcelfiles = glob.glob(r"C:UsersLELI Laptop 5DesktopDTP1*.xlsx")
cefdf = []

for ExcelFile in allexcelfiles:
    wb = load_workbook(ExcelFile)
    for sheet in wb:
        list_of_sheetnames = [sheet for sheet in wb.sheetnames if "SAR" in sheet]
        df = pd.read_excel(ExcelFile, sheet_name = list_of_sheetnames, nrows = 24)
        cefdf.append(df)
df = pd.concat(cefdf)

From which I get this error:

TypeError: cannot concatenate object of type '<class 'dict'>'; only Series and DataFrame objs are valid

I then tried this:

df = pd.DataFrame(pd.read_excel(ExcelFile, sheet_name = list_of_sheetnames, nrows = 24))

From which I get this error:

ValueError: If using all scalar values, you must pass an index
Asked By: doctor of spin


Answers:

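When sheet_name is a list, pd.read_excel returns a dictionary of DataFrames (one per sheet name) rather than a single DataFrame, which is why pd.concat rejects the list. You can pass each dictionary to pd.concat first and then concatenate the results. The inner "for sheet in wb" loop is not needed: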

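cefdf = []

for ExcelFile in allexcelfiles:
    wb = load_workbook(ExcelFile)

    # Names of all sheets in this workbook that contain "SAR"
    list_of_sheetnames = [sheet for sheet in wb.sheetnames if "SAR" in sheet]

    # A list of sheet names makes read_excel return a dict: {sheet name: DataFrame}
    dfs = pd.read_excel(ExcelFile, sheet_name=list_of_sheetnames, nrows=24)
    # Concatenating the dict gives one DataFrame; sheet names become the outer index level
    cefdf.append(pd.concat(dfs))

df = pd.concat(cefdf)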
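
A minimal sketch of the same idea that runs without any Excel files. The dictionaries below stand in for what pd.read_excel returns for a list of sheet names. The sheet names, column name, and values are made up for illustration. It also skips workbooks with no matching sheet, since pd.concat raises a ValueError on an empty dictionary. Whether that can happen in the real data is an assumption.

```python
import pandas as pd

# Stand-ins for pd.read_excel(..., sheet_name=[...]) results:
# each is a dict mapping sheet name -> DataFrame (hypothetical data).
workbook_a = {
    "SAR 1": pd.DataFrame({"x": [1, 2]}),
    "SAR 2": pd.DataFrame({"x": [3, 4]}),
}
workbook_b = {"SAR 1": pd.DataFrame({"x": [5, 6]})}
workbook_c = {}  # a workbook with no sheet containing "SAR"

frames = []
for sheets in (workbook_a, workbook_b, workbook_c):
    if not sheets:
        # pd.concat raises ValueError on an empty dict, so skip this workbook
        continue
    # The dict keys (sheet names) become the outer level of the row index
    frames.append(pd.concat(sheets))

combined = pd.concat(frames)
print(combined)
```

If the sheet name is not wanted in the index, combined.reset_index(level=0, drop=True) removes that outer level.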
Answered By: jezrael