Read and concat multiple Excel files based on a specific column

Question:

path = '/Desktop/somefolder'
for filename in os.listdir(path):
    with open(path+filename) as f:
         - read the 3-4 excel files and attach the path
         - be able to concat them based on a specific column

filename gives me the name of each file in the directory. My idea was to join the filename to the path so that I can read the files and concat them.

I am not sure how to use the filename to load each file as a DataFrame and then concat the results.

Asked By: FalloutATS21

Answers:

Read the data into data frames

import pandas as pd

df1 = pd.read_excel('file1.xlsx')
df2 = pd.read_excel('file2.xlsx')

Create filtered data frames

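# Assumes the filter is a membership test; isin() takes a list-like of the values to keep
df1Filtered = df1[df1["YourColumnName"].isin(["YourColumnValues"])]
df2Filtered = df2[df2["YourColumnName"].isin(["YourColumnValues"])]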

Concat the filtered data frames

NewDF = pd.concat([df1Filtered, df2Filtered])

Write a new file to excel

NewDF.to_excel('NewFile.xlsx')
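
The same steps extend to any number of files. Below is a minimal sketch, assuming the filter is a membership test on a single column. The names filter_and_concat, key_column and allowed_values are hypothetical, and the two small sample frames stand in for the results of pd.read_excel:

```python
import pandas as pd


def filter_and_concat(frames, key_column, allowed_values):
    """Keep rows whose key_column value is in allowed_values, then stack the frames."""
    filtered = [df[df[key_column].isin(allowed_values)] for df in frames]
    # ignore_index=True renumbers the rows 0..n-1 in the combined frame
    return pd.concat(filtered, ignore_index=True)


# Stand-ins for pd.read_excel('file1.xlsx') and pd.read_excel('file2.xlsx')
df1 = pd.DataFrame({"region": ["east", "west"], "sales": [10, 20]})
df2 = pd.DataFrame({"region": ["west", "north"], "sales": [30, 40]})

combined = filter_and_concat([df1, df2], "region", ["west"])
```

Passing a list of frames avoids writing one filtered variable per file, which probably matters once there are 3-4 files rather than two.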
Answered By: SteveSchoepfer

Try:

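import os

import pandas as pd

path = '/Desktop/somefolder/'

dfs = []
for filename in os.listdir(path):
    # skip anything in the folder that is not an Excel workbook
    if filename.endswith('.xlsx'):
        dfs.append(pd.read_excel(os.path.join(path, filename), engine='openpyxl'))

# stack the frames row-wise and keep the result
df = pd.concat(dfs, axis=0)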
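
The question also asks to attach the path to the data. One possible way to do that is sketched below, assuming the folder holds .xlsx files and openpyxl is installed. The names read_folder, tag_and_concat and the source_file column are hypothetical:

```python
from pathlib import Path

import pandas as pd


def tag_and_concat(frames_by_path):
    """Stack frames, recording each row's origin in a 'source_file' column."""
    tagged = [df.assign(source_file=str(p)) for p, df in frames_by_path.items()]
    return pd.concat(tagged, ignore_index=True)


def read_folder(folder):
    """Read every .xlsx file in folder and combine them with tag_and_concat."""
    paths = sorted(Path(folder).glob("*.xlsx"))
    return tag_and_concat({p: pd.read_excel(p, engine="openpyxl") for p in paths})
```

Calling read_folder('/Desktop/somefolder') would then return one frame in which every row records the file it came from. Note that pd.concat raises a ValueError when the folder contains no matching files.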
Answered By: user16367225