How to eliminate absolute path in zip archive if absolute paths for files are provided?

Question:

I have two files in two different directories, one is '/home/test/first/first.pdf', the other is '/home/text/second/second.pdf'. I use following code to compress them:

import zipfile, StringIO
buffer = StringIO.StringIO()
first_path = '/home/test/first/first.pdf'
second_path = '/home/text/second/second.pdf'
zip = zipfile.ZipFile(buffer, 'w')
zip.write(first_path)
zip.write(second_path)
zip.close()

After I open the zip file that I created, I have a home folder in it, then there are two sub-folders in it, first and second, then the pdf files. I don’t know how to include only two pdf files instead of having full path zipped into the zip archive. I hope I make my question clear, please help.

Asked By: Shang Wang

||

Answers:

I suspect there might be a more elegant solution, but this one should work:

def add_zip_flat(zip, filename):
    dir, base_filename = os.path.split(filename)
    os.chdir(dir)
    zip.write(base_filename)

zip = zipfile.ZipFile(buffer, 'w')
add_zip_flat(zip, first_path)
add_zip_flat(zip, second_path)
zip.close()
Answered By: shx2

The zipfile write() method supports an extra argument (arcname) which is the archive name to be stored in the zip file, so you would only need to change your code with:

from os.path import basename
...
zip.write(first_path, basename(first_path))
zip.write(second_path, basename(second_path))
zip.close()

When you have some spare time reading the documentation for zipfile will be helpful.

Answered By: João Pinto

I use this function to zip a directory without include absolute path

import zipfile
import os 
def zipDir(dirPath, zipPath):
    zipf = zipfile.ZipFile(zipPath , mode='w')
    lenDirPath = len(dirPath)
    for root, _ , files in os.walk(dirPath):
        for file in files:
            filePath = os.path.join(root, file)
            zipf.write(filePath , filePath[lenDirPath :] )
    zipf.close()
#end zipDir
Answered By: himnabil

Can be done that way also (this allow for creating archives >2GB)

import os, zipfile
def zipdir(path, ziph):
    """zipper"""
    for root, _, files in os.walk(path):
        for file_found in files:
            abs_path = root+'/'+file_found
            ziph.write(abs_path, file_found)
zipf = zipfile.ZipFile(DEST_FILE.zip, 'w', zipfile.ZIP_DEFLATED, allowZip64=True)
zipdir(SOURCE_DIR, zipf)
zipf.close()
Answered By: jon doe

You can override the filename in the archive with the arcname parameter:

with zipfile.ZipFile(file="sample.zip", mode="w", compression=zipfile.ZIP_DEFLATED) as out_zip:
for f in Path.home().glob("**/*.txt"):
    out_zip.write(f, arcname=f.name)

Documentation reference: https://docs.python.org/3/library/zipfile.html#zipfile.ZipFile.write

Answered By: Guts

As João Pinto said, the arcname argument of ZipFile.write is what you need. Also, reading the documentation of pathlib is helpful. You can easily get the relative path to something also with pathlib.Path.relative_to, no need to switch to os.path.

import zipfile
from pathlib import Path

folder_to_compress = Path("/path/to/folder")
path_to_archive = Path("/path/to/archive.zip")

with zipfile.ZipFile(
        path_to_archive,
        mode="w",
        compression=zipfile.ZIP_DEFLATED,
        compresslevel=7,
    ) as zip:
    for file in folder_to_compress.rglob("*"):
        relative_path = file.relative_to(folder_to_compress)
        print(f"Packing {file} as {relative_path}")
        zip.write(file, arcname=relative_path)
Answered By: Zababa