Getting Error: [Errno 95] Operation not supported while writing a zip file in Databricks
Question:
Here I am trying to zip files and write the archive to a folder (a mount point) in Databricks, using the code below.
# List all files which need to be compressed
import os
modelPath = '/dbfs/mnt/temp/zip/'
filenames = [os.path.join(root, name) for root, dirs, files in os.walk(top=modelPath, topdown=False) for name in files]
print(filenames)
zipPath = '/dbfs/mnt/temp/compressed/demo.zip'
import zipfile
with zipfile.ZipFile(zipPath, 'w') as myzip:
    for filename in filenames:
        print(filename)
        print(myzip)
        myzip.write(filename)
But I am getting the error [Errno 95] Operation not supported.
Error details:
OSError Traceback (most recent call last)
<command-2086761864237851> in <module>
15 print(myzip)
---> 16 myzip.write(filename)
/usr/lib/python3.8/zipfile.py in write(self, filename, arcname, compress_type, compresslevel)
1775 with open(filename, "rb") as src, self.open(zinfo, 'w') as dest:
-> 1776 shutil.copyfileobj(src, dest, 1024*8)
1777
/usr/lib/python3.8/zipfile.py in close(self)
1181 self._fileobj.write(self._zinfo.FileHeader(self._zip64))
-> 1182 self._fileobj.seek(self._zipfile.start_dir)
1183
OSError: [Errno 95] Operation not supported
During handling of the above exception, another exception occurred:
OSError Traceback (most recent call last)
/usr/lib/python3.8/zipfile.py in close(self)
1837 if self._seekable:
-> 1838 self.fp.seek(self.start_dir)
1839 self._write_end_record()
OSError: [Errno 95] Operation not supported
During handling of the above exception, another exception occurred:
OSError Traceback (most recent call last)
OSError: [Errno 95] Operation not supported
During handling of the above exception, another exception occurred:
OSError Traceback (most recent call last)
<command-2086761864237851> in <module>
14 print(filename)
15 print(myzip)
---> 16 myzip.write(filename)
/usr/lib/python3.8/zipfile.py in __exit__(self, type, value, traceback)
1310
1311 def __exit__(self, type, value, traceback):
-> 1312 self.close()
1313
1314 def __repr__(self):
/usr/lib/python3.8/zipfile.py in close(self)
1841 fp = self.fp
1842 self.fp = None
-> 1843 self._fpclose(fp)
1844
1845 def _write_end_record(self):
/usr/lib/python3.8/zipfile.py in _fpclose(self, fp)
1951 self._fileRefCnt -= 1
1952 if not self._fileRefCnt and not self._filePassed:
-> 1953 fp.close()
1954
1955
Could anyone help me resolve this issue?
Note: I can zip the files using shutil, but I want to avoid the driver, so I am using the approach above.
Answers:
You didn’t provide details of your mount. It is probably Blob Storage or ADLS Gen2, and apparently it does not allow seeking within a file that is being written.
Check out this simple snippet:
%python
path = '/dbfs/mnt/temp/testfile'
with open(path, "w") as f:
    f.write("test")
    f.seek(1)
    f.write("x")
with open(path, "r") as f:
    print(f.read())
It will throw "Operation not supported" at f.seek(1).
Repeat the same with path = '/tmp/testfile' and you’ll get the correct result ("txst").
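The same check can be wrapped in a small helper so it is easy to run against any directory. This is only a sketch: the function name supports_random_write and the probe file name are made up for illustration, and it assumes that a failure at any of the write, seek, or close steps surfaces as an OSError.

```python
import os
import tempfile
import uuid


def supports_random_write(directory):
    """Return True if a file in `directory` can be written, seeked back, and overwritten."""
    # Hypothetical probe file name; uuid keeps it from clashing with real files.
    probe = os.path.join(directory, f"seek_probe_{uuid.uuid4().hex}")
    try:
        with open(probe, "wb") as f:
            f.write(b"test")
            f.seek(1)      # the step reported to fail on the mount
            f.write(b"x")
        with open(probe, "rb") as f:
            return f.read() == b"txst"
    except OSError:
        return False
    finally:
        # Best-effort cleanup of the probe file.
        try:
            os.remove(probe)
        except OSError:
            pass


if __name__ == "__main__":
    print(supports_random_write(tempfile.gettempdir()))
```

Calling it with '/dbfs/mnt/temp' and with '/tmp' should reproduce the difference described above, assuming the mount behaves as in the snippet.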
The weird thing is that the seek in zipfile.py should not be reached at all. It looks like self._seekable returned an incorrect value; I’m not sure whether that is a problem in the library or in Azure.
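For context, as far as I can tell from the CPython source, ZipFile decides whether the target is seekable once, when the archive is opened, by calling tell() and seek() on it. If both calls succeed there, it later seeks back to patch each entry's header, which is where the traceback above fails. The zipfile documentation says writing to unseekable streams is supported (since Python 3.5), so one thing to try is hiding seek() and tell() from zipfile altogether. The sketch below does that; the WriteOnly class and write_zip_sequentially function are hypothetical names, and I have not verified this against a DBFS mount.

```python
import io
import zipfile


class WriteOnly:
    """Expose only write() and flush(), hiding seek() and tell() from zipfile."""

    def __init__(self, fp):
        self._fp = fp

    def write(self, data):
        return self._fp.write(data)

    def flush(self):
        self._fp.flush()


def write_zip_sequentially(fileobj, members):
    """Write `members` (archive name -> bytes) to `fileobj` without seeking."""
    # With no tell()/seek() available, zipfile falls back to streaming mode
    # and appends sizes and CRCs after each entry instead of seeking back.
    with zipfile.ZipFile(WriteOnly(fileobj), "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in members.items():
            zf.writestr(name, data)


if __name__ == "__main__":
    buf = io.BytesIO()
    write_zip_sequentially(buf, {"a.txt": b"hello", "dir/b.txt": b"world"})
    with zipfile.ZipFile(io.BytesIO(buf.getvalue())) as zf:
        print(zf.namelist())
```

On the mount, fileobj would be something like open(zipPath, 'wb'), and zf.write(filename) could be used in place of writestr. Whether the mount accepts this purely sequential write pattern is an assumption to test, not something the answer confirms.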
Anyway, just create the archive in a local directory and move it to the mount afterwards.
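import os
import shutil
import zipfile
tempPath = '/tmp/demo.zip'
zipPath = '/dbfs/mnt/temp/compressed/demo.zip'
# Build the archive on local disk, where seek works.
# `filenames` is the list built in the question.
with zipfile.ZipFile(tempPath, 'w') as myzip:
    for filename in filenames:
        print(filename)
        print(myzip)
        myzip.write(filename)
# /tmp and the mount are different filesystems, so os.rename would fail with
# "Invalid cross-device link"; shutil.move copies the file and then deletes it.
shutil.move(tempPath, zipPath)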
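A self-contained variant of the same idea, for reuse outside the notebook cell. This is a sketch under assumptions: the function name zip_dir_via_local_temp is made up, the archive is compressed with ZIP_DEFLATED (the original uses the default, which stores files uncompressed), and entries are stored relative to the source directory rather than with the full /dbfs/mnt/... path.

```python
import os
import shutil
import tempfile
import zipfile


def zip_dir_via_local_temp(src_dir, dest_zip):
    """Zip every file under `src_dir` locally, then copy the archive to `dest_zip`."""
    with tempfile.TemporaryDirectory() as tmp:
        local_zip = os.path.join(tmp, "archive.zip")
        with zipfile.ZipFile(local_zip, "w", zipfile.ZIP_DEFLATED) as zf:
            for root, _dirs, files in os.walk(src_dir):
                for name in files:
                    full_path = os.path.join(root, name)
                    # Store paths relative to src_dir, not the full mount path.
                    zf.write(full_path, arcname=os.path.relpath(full_path, src_dir))
        os.makedirs(os.path.dirname(dest_zip) or ".", exist_ok=True)
        # Sequential copy to the destination; the temp directory is removed on exit.
        shutil.copyfile(local_zip, dest_zip)
    return dest_zip
```

With the paths from the question, the call would be zip_dir_via_local_temp('/dbfs/mnt/temp/zip/', '/dbfs/mnt/temp/compressed/demo.zip'). The destination should not sit inside the source directory, or the walk may pick up a previous archive.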