How to extract a multi-part zip file in Python?

Question:

Suppose that I have some files that I downloaded from a server, zipped with 7-Zip in multiple parts and named like myfile.zip.001, myfile.zip.002, …, myfile.zip.00n. I need to extract their contents into the same folder where the parts are stored.

I tried using zipfile, patoolib and pyunpack without success. Here is what I’ve done:

file_path = r"C:\Users\user\Documents\myfile.zip.001" # I also tested with only .zip
extract_path = r"C:\Users\user\Documents"

import zipfile
with zipfile.ZipFile(file_path, "r") as zip_ref:
  zip_ref.extractall(extract_path) # myfile.zip.001 file isn't zip file.

from pyunpack import Archive
Archive(file_path).extractall(extract_path) # File is not a zip file

import patoolib
patoolib.extract_archive(file_path, outdir=extract_path) # unknown archive format for file `myfile.zip.001'

Another way (that works, but it’s very ugly) is this one:

import os
import subprocess

path_7zip = r"C:\Program Files (x86)\7-Zip\7z.exe"

cmd = [path_7zip, 'x', 'myfile.zip.001']
sp = subprocess.Popen(cmd, stderr=subprocess.STDOUT, stdout=subprocess.PIPE)

But this requires the user to have 7-Zip installed on their computer, which isn’t a good fit for what I’m looking for.

So, the question is: is there a way to extract/unzip multi-part files named like x.zip.001 in Python?

Asked By: NewbieInFlask

||

Answers:

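You seem to be on the right track with zipfile, but you most likely have to concatenate the parts into a single zip file before calling extractall. The .001, .002, … files that 7-Zip writes are a plain byte-wise split of one archive, so no single part is a complete zip file on its own, which is why zipfile rejects myfile.zip.001.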

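import glob
import shutil
import zipfile

zip_prefix = "myfile.zip."
extract_path = "."  # same folder as the parts (run this from that folder)

# Collect the parts; sorting puts .001, .002, ... in the right order
parts = sorted(glob.glob(zip_prefix + "[0-9][0-9][0-9]"))

# Concatenate the parts into a single archive
with open("myfile.zip", "wb") as outfile:
    for filename in parts:
        with open(filename, "rb") as infile:
            # Stream the data instead of reading a whole part into memory
            shutil.copyfileobj(infile, outfile)

# Extract the joined archive (not the .001 part)
with zipfile.ZipFile("myfile.zip", "r") as zip_ref:
    zip_ref.extractall(extract_path)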
Answered By: iohans
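
For reuse, the concatenate-then-extract idea from the answer above can be wrapped in a function. The following is only a sketch under a few assumptions: the parts are a raw byte split of one archive (as 7-Zip produces for .zip.001 volumes), they all sit in the same folder, and the helper name extract_split_zip is hypothetical, not part of any library. It joins the parts into a temporary file rather than leaving a myfile.zip behind.

```python
import re
import shutil
import tempfile
import zipfile
from pathlib import Path


def extract_split_zip(first_part, extract_dir=None):
    """Join myfile.zip.001, .002, ... and extract the result.

    Hypothetical helper: assumes the parts are a raw byte split of one zip.
    Returns the names of the parts that were joined, in order.
    """
    first_part = Path(first_part)
    folder = first_part.parent
    # Default to the folder where the parts are stored
    extract_dir = Path(extract_dir) if extract_dir else folder
    stem = first_part.name.rsplit(".", 1)[0]  # e.g. "myfile.zip"

    # Keep only files with a purely numeric suffix and order them by number
    pattern = re.compile(re.escape(stem) + r"\.(\d+)$")
    parts = sorted(
        (p for p in folder.iterdir() if pattern.match(p.name)),
        key=lambda p: int(pattern.match(p.name).group(1)),
    )
    if not parts:
        raise FileNotFoundError(f"no parts found for {stem} in {folder}")

    # Join the parts in a temporary file, streaming each one
    with tempfile.TemporaryFile() as joined:
        for part in parts:
            with open(part, "rb") as infile:
                shutil.copyfileobj(infile, joined)
        joined.seek(0)
        # ZipFile accepts any seekable binary file object
        with zipfile.ZipFile(joined) as zip_ref:
            zip_ref.extractall(extract_dir)

    return [p.name for p in parts]
```

Calling extract_split_zip(file_path) with the file_path from the question would extract next to the parts. If a part is missing, zipfile will most likely raise BadZipFile, so it may be worth catching that and reporting which parts were found.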