Adding folders to a zip file using python

Question:

I want to create a zip file. Add a folder to the zip file and then add a bunch of files to that folder.

So I want to end up with a zip file with a single folder with files in.

I dont know if its bad practice to have folders in zip files or something but google gives me nothing on the subject.

I started out with this:

def addFolderToZip(myZipFile,folder):
    folder = folder.encode('ascii') #convert path to ascii for ZipFile Method
    for file in glob.glob(folder+"/*"):
            if os.path.isfile(file):
                print file
                myZipFile.write(file, os.path.basename(file), zipfile.ZIP_DEFLATED)
            elif os.path.isdir(file):
                addFolderToZip(myZipFile,file)

def createZipFile(filename,files,folders):
    curTime=strftime("__%Y_%m_%d", time.localtime())
    filename=filename+curTime;
    print filename
    zipFilename=utils.getFileName("files", filename+".zip")
    myZipFile = zipfile.ZipFile( zipFilename, "w" ) # Open the zip file for writing 
    for file in files:
        file = file.encode('ascii') #convert path to ascii for ZipFile Method
        if os.path.isfile(file):
            (filepath, filename) = os.path.split(file)
            myZipFile.write( file, filename, zipfile.ZIP_DEFLATED )

    for folder in  folders:   
        addFolderToZip(myZipFile,folder)  
    myZipFile.close()
    return (1,zipFilename)


(success,filename)=createZipFile(planName,files,folders);

Taken from: http://mail.python.org/pipermail/python-list/2006-August/396166.html

Which gets rid of all folders and puts all files in the target folder (and its subfolders) into a single zip file. I couldnt get it to add an entire folder.

If I feed the path to a folder in myZipFile.write, I get

IOError: [Errno 13] Permission denied: ‘..packedbin’

Any help is much welcome.

Related question: How do I zip the contents of a folder using python (version 2.5)?

Asked By: Mizipzor

||

Answers:

after adding some imports your code runs fine for me, how do you call the script, maybe you could tell us the folder structure of the ‘..packedbin’ directory.

I called your code with the following arguments:

planName='test.zip'
files=['z.py',]
folders=['c:\temp']
(success,filename)=createZipFile(planName,files,folders)

`

Answered By: RSabet

A zip file has no directory structure, it just has a bunch of pathnames and their contents. These pathnames should be relative to an imaginary root folder (the ZIP file itself). “../” prefixes have no defined meaning in a zip file.

Consider you have a file, a and you want to store it in a “folder” inside a zip file. All you have to do is prefix the filename with a folder name when storing the file in the zipfile:

zipi= zipfile.ZipInfo()
zipi.filename= "folder/a" # this is what you want
zipi.date_time= time.localtime(os.path.getmtime("a"))[:6]
zipi.compress_type= zipfile.ZIP_DEFLATED
filedata= open("a", "rb").read()

zipfile1.writestr(zipi, filedata) # zipfile1 is a zipfile.ZipFile instance

I don’t know of any ZIP implementations allowing the inclusion of an empty folder in a ZIP file. I can think of a workaround (storing a dummy filename in the zip “folder” which should be ignored on extraction), but not portable across implementations.

Answered By: tzot

Ok, after i understood what you want, it is as simple as using the second argument of zipfile.write, where you can use whatever you want:

import zipfile
myZipFile = zipfile.ZipFile("zip.zip", "w" )
myZipFile.write("test.py", "dir\test.py", zipfile.ZIP_DEFLATED )

creates a zipfile where test.py would be extracted to a directory called dir

EDIT:
I once had to create an empty directory in a zip file: it is possible.
after the code above just delete the file test.py from the zipfile, the file is gone, but the empty directory stays.

Answered By: RSabet

Heres the edited code I ran. Its based on the code above, taken from the mailing list. I added the imports and made a main routine. I also cut out the fiddling with the output filename to make the code shorter.

#!/usr/bin/env python

import os, zipfile, glob, sys

def addFolderToZip(myZipFile,folder):
    folder = folder.encode('ascii') #convert path to ascii for ZipFile Method
    for file in glob.glob(folder+"/*"):
            if os.path.isfile(file):
                print file
                myZipFile.write(file, os.path.basename(file), zipfile.ZIP_DEFLATED)
            elif os.path.isdir(file):
                addFolderToZip(myZipFile,file)

def createZipFile(filename,files,folders):
    myZipFile = zipfile.ZipFile( filename, "w" ) # Open the zip file for writing 
    for file in files:
        file = file.encode('ascii') #convert path to ascii for ZipFile Method
        if os.path.isfile(file):
            (filepath, filename) = os.path.split(file)
            myZipFile.write( file, filename, zipfile.ZIP_DEFLATED )

    for folder in  folders:   
        addFolderToZip(myZipFile,folder)  
    myZipFile.close()
    return (1,filename)

if __name__=="__main__":
    #put everything in sys.argv[1] in out.zip, skip files
    print createZipFile("out.zip", [], sys.argv[1])

At work, on my Windows box, this code ran fine but didnt create any “folders” in the zipfile. At least I recall it did. Now at home, on my Linux box, the zip file created seems to be bad:

$ unzip -l out.zip 
Archive:  out.zip
  End-of-central-directory signature not found.  Either this file is not
  a zipfile, or it constitutes one disk of a multi-part archive.  In the
  latter case the central directory and zipfile comment will be found on
  the last disk(s) of this archive.
unzip:  cannot find zipfile directory in one of out.zip or
        out.zip.zip, and cannot find out.zip.ZIP, period.

I dont know if I accidently broke the code, I think its the same. Crossplatform issues? Either way, its not related to my original question; getting folders in the zip file. Just wanted to post the code I actually ran, not the code I based my code on.

Answered By: Mizipzor
import zipfile
import os


class ZipUtilities:

    def toZip(self, file, filename):
        zip_file = zipfile.ZipFile(filename, 'w')
        if os.path.isfile(file):
                    zip_file.write(file)
            else:
                    self.addFolderToZip(zip_file, file)
        zip_file.close()

    def addFolderToZip(self, zip_file, folder): 
        for file in os.listdir(folder):
            full_path = os.path.join(folder, file)
            if os.path.isfile(full_path):
                print 'File added: ' + str(full_path)
                zip_file.write(full_path)
            elif os.path.isdir(full_path):
                print 'Entering folder: ' + str(full_path)
                self.addFolderToZip(zip_file, full_path)

def main():
    utilities = ZipUtilities()
    filename = 'TEMP.zip'
    directory = 'TEMP'
    utilities.toZip(directory, filename)

main()

I’m running:

python tozip.py

This is the log:

havok@fireshield:~$ python tozip.py

File added: TEMP/NARF (7ª copia)
Entering folder: TEMP/TEMP2
File added: TEMP/TEMP2/NERF (otra copia)
File added: TEMP/TEMP2/NERF (copia)
File added: TEMP/TEMP2/NARF
File added: TEMP/TEMP2/NARF (copia)
File added: TEMP/TEMP2/NARF (otra copia)
Entering folder: TEMP/TEMP2/TEMP3
File added: TEMP/TEMP2/TEMP3/DOCUMENTO DEL FINAL
File added: TEMP/TEMP2/TEMP3/DOCUMENTO DEL FINAL (copia)
File added: TEMP/TEMP2/NERF
File added: TEMP/NARF (copia) (otra copia)
File added: TEMP/NARF (copia) (copia)
File added: TEMP/NARF (6ª copia)
File added: TEMP/NERF (copia) (otra copia)
File added: TEMP/NERF (4ª copia)
File added: TEMP/NERF (otra copia)
File added: TEMP/NERF (3ª copia)
File added: TEMP/NERF (6ª copia)
File added: TEMP/NERF (copia)
File added: TEMP/NERF (5ª copia)
File added: TEMP/NARF (8ª copia)
File added: TEMP/NARF (3ª copia)
File added: TEMP/NARF (5ª copia)
File added: TEMP/NERF (copia) (3ª copia)
File added: TEMP/NARF
File added: TEMP/NERF (copia) (copia)
File added: TEMP/NERF (8ª copia)
File added: TEMP/NERF (7ª copia)
File added: TEMP/NARF (copia)
File added: TEMP/NARF (otra copia)
File added: TEMP/NARF (4ª copia)
File added: TEMP/NERF
File added: TEMP/NARF (copia) (3ª copia)

As you can see, it work, the archive is ok too. This is a recursive function that can zip an entire folder. The only problem is that it doesn’t create a empty folder.

Cheers.

Answered By: havok-cr

When you wanna create an empty folder, you can do it like this:

    storage = api.Storage.open("empty_folder.zip","w")
    res = storage.open_resource("hannu//","w")
    storage.close()

Folder not showe in winextractor, but when you extract it it is showed.

Answered By: Hannnu Ala-Harja

Below is some code for zipping an entire directory into a zipfile.

This seems to work OK creating zip files on both windows and linux. The output
files seem to extract properly on windows (built-in Compressed Folders feature,
WinZip, and 7-Zip) and linux. However, empty directories in a zip file appear
to be a thorny issue. The solution below seems to work but the output of
“zipinfo” on linux is concerning. Also the directory permissions are not set
correctly for empty directories in the zip archive. This appears to require
some more in depth research.

I got some info from this velocity reviews thread and this python mailing list thread.

Note that this function is designed to put files in the zip archive with
either no parent directory or just one parent directory, so it will trim any
leading directories in the filesystem paths and not include them inside the
zip archive paths. This is generally the case when you want to just take a
directory and make it into a zip file that can be extracted in different
locations.

Keyword arguments:

dirPath — string path to the directory to archive. This is the only
required argument. It can be absolute or relative, but only one or zero
leading directories will be included in the zip archive.

zipFilePath — string path to the output zip file. This can be an absolute
or relative path. If the zip file already exists, it will be updated. If
not, it will be created. If you want to replace it from scratch, delete it
prior to calling this function. (default is computed as dirPath + “.zip”)

includeDirInZip — boolean indicating whether the top level directory should
be included in the archive or omitted. (default True)

(Note that StackOverflow seems to be failing to pretty print my python with
triple quoted strings, so I just converted my doc strings to the post text here)

#!/usr/bin/python
import os
import zipfile

def zipdir(dirPath=None, zipFilePath=None, includeDirInZip=True):

    if not zipFilePath:
        zipFilePath = dirPath + ".zip"
    if not os.path.isdir(dirPath):
        raise OSError("dirPath argument must point to a directory. "
            "'%s' does not." % dirPath)
    parentDir, dirToZip = os.path.split(dirPath)
    #Little nested function to prepare the proper archive path
    def trimPath(path):
        archivePath = path.replace(parentDir, "", 1)
        if parentDir:
            archivePath = archivePath.replace(os.path.sep, "", 1)
        if not includeDirInZip:
            archivePath = archivePath.replace(dirToZip + os.path.sep, "", 1)
        return os.path.normcase(archivePath)

    outFile = zipfile.ZipFile(zipFilePath, "w",
        compression=zipfile.ZIP_DEFLATED)
    for (archiveDirPath, dirNames, fileNames) in os.walk(dirPath):
        for fileName in fileNames:
            filePath = os.path.join(archiveDirPath, fileName)
            outFile.write(filePath, trimPath(filePath))
        #Make sure we get empty directories as well
        if not fileNames and not dirNames:
            zipInfo = zipfile.ZipInfo(trimPath(archiveDirPath) + "/")
            #some web sites suggest doing
            #zipInfo.external_attr = 16
            #or
            #zipInfo.external_attr = 48
            #Here to allow for inserting an empty directory.  Still TBD/TODO.
            outFile.writestr(zipInfo, "")
    outFile.close()

Here’s some sample usages. Note that if your dirPath argument has several leading directories, only the LAST one will be included by default. Pass includeDirInZip=False to omit all leading directories.

zipdir("foo") #Just give it a dir and get a .zip file
zipdir("foo", "foo2.zip") #Get a .zip file with a specific file name
zipdir("foo", "foo3nodir.zip", False) #Omit the top level directory
zipdir("../test1/foo", "foo4nopardirs.zip")
Answered By: Peter Lyons

here is my function i use to zip a folder:

import os
import os.path
import zipfile

def zip_dir(dirpath, zippath):
    fzip = zipfile.ZipFile(zippath, 'w', zipfile.ZIP_DEFLATED)
    basedir = os.path.dirname(dirpath) + '/' 
    for root, dirs, files in os.walk(dirpath):
        if os.path.basename(root)[0] == '.':
            continue #skip hidden directories        
        dirname = root.replace(basedir, '')
        for f in files:
            if f[-1] == '~' or (f[0] == '.' and f != '.htaccess'):
                #skip backup files and all hidden files except .htaccess
                continue
            fzip.write(root + '/' + f, dirname + '/' + f)
    fzip.close()
Answered By: Dmitry Nedbaylo

Thank you wery much for this useful function! I found it very useful as I was also searching for help. However, maybe it would be useful to change it a little bit so that

basedir = os.path.dirname(dirpath) + '/'

would be

basedir = os.path.dirname(dirpath + '/')

Because found that if I want to zip folder ‘Example’ which is located at ‘C:folderpathnotWantedtozipExample’,

I got in Windows:

dirpath = 'C:folderpathnotWantedtozipExample'
basedir = 'C:folderpathnotWantedtozipExample/'
dirname = 'C:folderpathnotWantedtozipExampleExampleSubfolder_etc'

But I suppose your code should give

dirpath = 'C:folderpathnotWantedtozipExample'
basedir = 'C:folderpathnotWantedtozipExample'
dirname = 'Subfolder_etc'
Answered By: note

You can also use shutil

import shutil

zip_name = 'pathtozip_file'
directory_name = 'pathtodirectory'

# Create 'pathtozip_file.zip'
shutil.make_archive(zip_name, 'zip', directory_name)

This will put the whole folder in the zip.

Answered By: Gideon

If you look at a zip file created with Info-ZIP, you’ll see that directories are indeed listed:

$ zip foo.zip -r foo
  adding: foo/ (stored 0%)
  adding: foo/foo.jpg (deflated 84%)
$ less foo.zip
  Archive:  foo.zip
 Length   Method    Size  Cmpr    Date    Time   CRC-32   Name
--------  ------  ------- ---- ---------- ----- --------  ----
       0  Stored        0   0% 2013-08-18 14:32 00000000  foo/
  476320  Defl:N    77941  84% 2013-08-18 14:31 55a52268  foo/foo.jpg
--------          -------  ---                            -------
  476320            77941  84%                            2 files

Notice that the directory entry has zero length and is not compressed. It seems you can achieve the same thing with Python by writing the directory by name, but forcing it to not use compression.

if os.path.isdir(name):
    zf.write(name, arcname=arcname, compress_type=zipfile.ZIP_STORED)
else:
    zf.write(name, arcname=arcname, compress_type=zipfile.ZIP_DEFLATED)

It might be worth making sure arcname ends in a /.

Answered By: z0r
import os
import zipfile

zf = zipfile.ZipFile("file.zip", "w")
for file in os.listdir(os.curdir):
    if not file.endswith('.zip') and os.path.isfile(os.curdir+'/'+file):
        print file
        zf.write(file)
    elif os.path.isdir(os.curdir+'/'+file):
        print f
        for f in os.listdir(os.curdir+'/'+file):
            zf.write(file+'\'+f)
zf.close()
Answered By: sohom

Easiest way for me is using zipfile CLI (Command-Line Interface). The zipfile CLI can take either files or folders as arguments, and add them recursively to the archive.

So if you have a file hierarcy of:

- file1.txt
- folder1 
   - file2.txt
   - file3.txt

And you want all of it to be archived into ‘result.zip’, you simply write:

python -m zipfile -c result.zip file1.txt folder

In case you want to use it within python code and use the zipfile module imported, you can call its main function in the following way:

import zipfile
zipfile.main(['-c', 'result.zip', 'file1.md', 'folder'])
Answered By: Razko
Categories: questions Tags: , , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.