How to concatenate a list of csv files (including empty ones) using Pandas

Question:

I have a list of .csv files stored in a local folder and I’m trying to concatenate them into one single dataframe.

Here is the code I’m using:

import pandas as pd
import os

folder = r'C:Users_M92DesktopmyFolder'

df = pd.concat([pd.read_csv(os.path.join(folder, f), delimiter=';') for f in os.listdir(folder)])
display(df)

The only problem is that one of the files is sometimes empty (0 columns, 0 rows), and in that case pandas throws an EmptyDataError: No columns to parse from file at line 6.

Do you have any suggestions on how to skip the empty csv file?
I would also welcome a simpler or more efficient way to concatenate the csv files.

Ideally, I would also like to add a column (to the dataframe df) to carry the name of each .csv.

Asked By: L'Artiste


Answers:

You can check if a file is empty with:

import os

os.stat(FILE_PATH).st_size == 0
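
As a minimal sketch of that check, the snippet below wraps it in a helper and tries it on two throwaway files. The helper name is_empty_file and the file names are made up for illustration. Note that a file containing only a newline has a nonzero size, so this test would not flag it, even though pandas may still find no columns in it.

```python
import os
import tempfile


def is_empty_file(path):
    """Return True when the file at `path` contains zero bytes."""
    return os.stat(path).st_size == 0


# Demonstration on throwaway files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    empty_path = os.path.join(tmp, "empty.csv")   # hypothetical file name
    blank_path = os.path.join(tmp, "blank.csv")   # hypothetical file name

    open(empty_path, "w").close()                 # zero bytes on disk
    with open(blank_path, "w") as fh:
        fh.write("\n")                            # a single newline, so size > 0

    empty_result = is_empty_file(empty_path)
    blank_result = is_empty_file(blank_path)

print(empty_result, blank_result)
```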

In your use case:

import os
import pandas as pd

# folder is the directory path defined in the question
df = pd.concat([
    pd.read_csv(os.path.join(folder, f), delimiter=';')
    for f in os.listdir(folder)
    if os.stat(os.path.join(folder, f)).st_size != 0
])
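
The question also asks for a column carrying each file's name. One possible way to get it with the same size filter, sketched here under assumptions, is to pass pd.concat a dict keyed by file name and then move that key out of the index. The function name concat_with_origin, the column name origin, and the sample files are all illustrative, not part of the original code.

```python
import os
import tempfile

import pandas as pd


def concat_with_origin(folder, delimiter=";"):
    """Concatenate every non-empty file in `folder`, tagging rows by file name."""
    frames = {
        f: pd.read_csv(os.path.join(folder, f), delimiter=delimiter)
        for f in sorted(os.listdir(folder))
        if os.stat(os.path.join(folder, f)).st_size != 0
    }
    # The dict keys become the outer index level, named "origin" here.
    combined = pd.concat(frames, names=["origin", "row"])
    # Turn the "origin" level into a column and discard the per-file row index.
    return combined.reset_index(level="origin").reset_index(drop=True)


# Demonstration with made-up files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "a.csv"), "w") as fh:
        fh.write("x;y\n1;2\n")
    open(os.path.join(tmp, "b.csv"), "w").close()   # the empty file
    with open(os.path.join(tmp, "c.csv"), "w") as fh:
        fh.write("x;y\n3;4\n")

    result = concat_with_origin(tmp)

print(result)
```

Like the original snippet, this raises a ValueError if every file in the folder is empty, because pd.concat then has nothing to concatenate.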
Answered By: ASGM

Personally, I would skip the empty files with a basic try-except while reading, then concatenate whatever was read successfully.

import pandas as pd
import os

folder = r'C:Users_M92DesktopmyFolder'
data = []

for f in os.listdir(folder):
    try:
        temp = pd.read_csv(os.path.join(folder, f), delimiter=';')
        # adding original filename column as per request
        temp['origin'] = f
        data.append(temp)
    except pd.errors.EmptyDataError:
        continue

df = pd.concat(data)

display(df)
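
A variation on the same idea, offered as a sketch rather than a definitive version: it reads only files matching *.csv (so other files in the folder are ignored), renumbers the rows with ignore_index=True, and returns an empty dataframe when nothing could be read, since pd.concat raises a ValueError on an empty list. The function name read_folder and the sample files are hypothetical.

```python
import tempfile
from pathlib import Path

import pandas as pd


def read_folder(folder, delimiter=";"):
    """Read every parseable *.csv in `folder` into one dataframe."""
    frames = []
    for path in sorted(Path(folder).glob("*.csv")):
        try:
            frame = pd.read_csv(path, delimiter=delimiter)
        except pd.errors.EmptyDataError:
            continue  # nothing to parse in this file, skip it
        frame["origin"] = path.name  # keep track of the source file
        frames.append(frame)

    if not frames:
        # pd.concat([]) would raise ValueError, so return an empty frame instead.
        return pd.DataFrame()
    return pd.concat(frames, ignore_index=True)


# Demonstration with made-up files in temporary directories.
with tempfile.TemporaryDirectory() as tmp:
    folder_path = Path(tmp)
    (folder_path / "a.csv").write_text("x;y\n1;2\n")
    (folder_path / "empty.csv").write_text("")
    (folder_path / "b.csv").write_text("x;y\n3;4\n")
    (folder_path / "notes.txt").write_text("not a csv")

    combined = read_folder(folder_path)

with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "empty.csv").write_text("")
    nothing = read_folder(tmp)

print(combined)
```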
Answered By: Wonhyeong Seo