reading multiple files contained in a zip file with pandas


I have multiple zip files, each containing different types of txt files, like below:

  - file1.txt
  - file2.txt
  - file3.txt

How can I use pandas to read in each of those files without extracting them?

I know that if there were one file per zip, I could use the compression argument of read_csv, like below:

df = pd.read_csv(zip_path, compression='zip')  # zip_path: placeholder for the path to the zip file

Any help on how to do this would be great.

Asked By: johnnyb



I had a similar problem with XML files a while ago. The zipfile module can get you there.

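from zipfile import ZipFile

z = ZipFile(yourfile)  # yourfile: path to the zip archive

text_files = z.infolist()  # one ZipInfo entry per file in the archive

for text_file in text_files:
    z.read(text_file.filename)  # contents of this member, as bytes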

If you want to concatenate them into a pandas object then it might get a bit more complex, but that should get you started. Note that the read method returns bytes, so you may have to handle that as well.
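As a rough sketch of the bytes handling and concatenation mentioned above: wrapping the bytes in io.BytesIO gives read_csv a file-like object to parse. The in-memory archive, its member names, and its contents are invented here so the example runs on its own; with a real archive you would pass its path to ZipFile instead.

```python
import io
from zipfile import ZipFile

import pandas as pd

# Build a tiny archive in memory (hypothetical names and data).
buffer = io.BytesIO()
with ZipFile(buffer, "w") as archive:
    archive.writestr("file1.txt", "a,b\n1,2\n")
    archive.writestr("file2.txt", "a,b\n3,4\n")

frames = []
with ZipFile(buffer) as z:
    for text_file in z.infolist():
        raw = z.read(text_file.filename)  # bytes, not str
        # BytesIO turns the bytes into a file-like object for pandas.
        frames.append(pd.read_csv(io.BytesIO(raw)))

# Stack the per-file frames and renumber the rows.
combined = pd.concat(frames, ignore_index=True)
```

This assumes every member is comma-separated with the same columns; other delimiters or encodings would need the matching read_csv arguments.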

Answered By: Iain Dwyer

You can pass the file object returned by ZipFile.open() to pandas.read_csv() to construct a pandas.DataFrame from a CSV file packed into a multi-file zip.



Example to read all .csv into a dict:

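import pandas as pd
from zipfile import ZipFile

zip_file = ZipFile('')
# Map each .csv member's name to a DataFrame read straight from the archive.
dfs = {text_file.filename: pd.read_csv(zip_file.open(text_file.filename))
       for text_file in zip_file.infolist()
       if text_file.filename.endswith('.csv')}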
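If a single table is wanted afterwards, one possible follow-up is to hand the dict to pd.concat, which uses the dict keys as an outer index level. The sketch below assumes all CSV members share the same columns; the archive, the member names, and the source_file column name are made up for illustration.

```python
import io
from zipfile import ZipFile

import pandas as pd

# Hypothetical in-memory archive with two CSV members and one other file.
buffer = io.BytesIO()
with ZipFile(buffer, "w") as archive:
    archive.writestr("part1.csv", "x\n1\n")
    archive.writestr("part2.csv", "x\n2\n")
    archive.writestr("notes.txt", "not a csv\n")

with ZipFile(buffer) as zip_file:
    dfs = {
        info.filename: pd.read_csv(zip_file.open(info.filename))
        for info in zip_file.infolist()
        if info.filename.endswith(".csv")  # skips notes.txt
    }

# The dict keys become an index level, which is then moved into a column
# so each row records the member it came from.
combined = (
    pd.concat(dfs, names=["source_file", "row"])
    .reset_index(level="source_file")
    .reset_index(drop=True)
)
```

Keeping the filename as a column makes it possible to trace a row back to its member after the frames are merged.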
Answered By: Stephen Rauch

The simplest way to handle this, if you have multiple parts of one big CSV file compressed into a single zip file:

import pandas as pd
from zipfile import ZipFile

df = pd.concat(
    [pd.read_csv(ZipFile('').open(i)) for i in ZipFile('').namelist()],
    ignore_index=True,  # renumber rows across the concatenated parts
)
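A slightly longer variant opens the archive once and closes it when done, instead of constructing ZipFile for every member. This is only a sketch: read_zip_parts is a hypothetical helper name, not part of pandas or zipfile, and the demo archive is built in memory with invented data.

```python
import io
from zipfile import ZipFile

import pandas as pd


def read_zip_parts(archive):
    """Concatenate every member of a zip archive into one DataFrame.

    `archive` may be a path or a binary file-like object. Hypothetical
    helper; assumes every member is a CSV with the same columns.
    """
    frames = []
    with ZipFile(archive) as z:  # opened once, closed on exit
        for name in z.namelist():
            with z.open(name) as member:
                frames.append(pd.read_csv(member))
    return pd.concat(frames, ignore_index=True)


# Demo archive built in memory; member names and data are invented.
buffer = io.BytesIO()
with ZipFile(buffer, "w") as archive:
    archive.writestr("part1.csv", "id,value\n1,10\n2,20\n")
    archive.writestr("part2.csv", "id,value\n3,30\n")

df = read_zip_parts(buffer)
```

The with blocks make sure the archive and each member handle are released even if one of the reads fails.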
Answered By: valentinmk

For those who have empty txt files in the zipfile:

import pandas as pd
from zipfile import ZipFile

z = ZipFile('')
df = pd.concat(
    [pd.read_csv(z.open(i)) for i in z.infolist() if i.compress_size > 0],
)

Otherwise, pandas raises "pandas.errors.EmptyDataError: No columns to parse from file".
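One caveat, offered as a sketch and not as a correction of the approach: depending on how the archive was written, an empty member may still have a nonzero compress_size (deflate can emit a couple of bytes even for empty input), so checking the uncompressed file_size may be the safer filter. The archive and member names below are invented for illustration.

```python
import io
from zipfile import ZIP_DEFLATED, ZipFile

import pandas as pd

# Hypothetical archive containing one real file and one empty file.
buffer = io.BytesIO()
with ZipFile(buffer, "w", compression=ZIP_DEFLATED) as archive:
    archive.writestr("data.txt", "a,b\n1,2\n")
    archive.writestr("empty.txt", "")

with ZipFile(buffer) as z:
    # file_size is the uncompressed size, so it is 0 for empty members
    # regardless of the compression method used.
    non_empty = [info for info in z.infolist() if info.file_size > 0]
    df = pd.concat(
        [pd.read_csv(z.open(info)) for info in non_empty],
        ignore_index=True,
    )
```

A member that contains only whitespace would pass this size check and could still trigger EmptyDataError, so catching pd.errors.EmptyDataError around read_csv is another option.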

Answered By: Songhua Hu