Reading Corrupted/Partial Zip Files in Python
First, a simple script for reading non-corrupted zip files in Python:
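import zipfile
filename = 'foo.zip'
# Open the archive and list each member with its uncompressed size.
z = zipfile.ZipFile(filename)
for i in z.infolist():
    print(i.filename, i.file_size)
# Read one member's contents as bytes.
data = z.read('somefile')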
Next, if opening the archive fails, we run 'zip -FF foo.zip' to repair the zip file and then open it again:
import subprocess
import zipfile
filename = 'foo.zip'
try:
    z = zipfile.ZipFile(filename)
except zipfile.BadZipFile:
    # The archive could not be opened: let zip try to repair it, then retry.
    subprocess.call(['zip', '-FF', filename])
    z = zipfile.ZipFile(filename)
for i in z.infolist():
    print(i.filename, i.file_size)
try:
    z.read('somefile')
except zipfile.BadZipFile:
    # The member is listed, but its data fails the CRC check.
    print('Bad CRC-32')
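The try/except above checks a single named member. As a minimal sketch of checking every member at once, the standard library's ZipFile.testzip() reads each member and returns the name of the first one that fails its check, or None if all pass. The in-memory archive, the payload bytes, and the helper name first_bad_member are illustrative assumptions, not part of the original script.

```python
import io
import zipfile


def first_bad_member(data):
    """Return the name of the first member that fails its CRC check, or None."""
    with zipfile.ZipFile(io.BytesIO(data)) as z:
        return z.testzip()


# Build a small uncompressed archive in memory (illustrative data only).
payload = b"hello zip payload"
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as z:
    z.writestr("somefile", payload)
good = buf.getvalue()

# Flip one byte inside the stored payload to simulate corruption.
pos = good.find(payload)
damaged = bytearray(good)
damaged[pos] ^= 0xFF
bad = bytes(damaged)

print(first_bad_member(good))  # None
print(first_bad_member(bad))   # somefile
```

Because the file list itself is intact in this case, the archive still opens; only reading the damaged member fails.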
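In short: use 'zip -FF file.zip' to repair the file. This rebuilds the file list, so Python's zipfile module can open the archive again. The Info-ZIP zip 3.x manual documents the form 'zip -FF file.zip --out fixed.zip', which writes the repaired archive to a new file.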
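The file list matters because the zip format stores it (the central directory) at the end of the archive, so a partial file usually has none. The sketch below simulates an interrupted download by cutting an in-memory archive in half and shows that zipfile then refuses to open it. The helper names build_archive and can_open and the sample data are assumptions made for illustration.

```python
import io
import zipfile


def build_archive():
    """Return the bytes of a small in-memory zip with two members."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as z:
        z.writestr("a.txt", "first member " * 20)
        z.writestr("b.txt", "second member " * 20)
    return buf.getvalue()


def can_open(data):
    """True if zipfile can open the bytes, False if it raises BadZipFile."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as z:
            z.infolist()
        return True
    except zipfile.BadZipFile:
        return False


full = build_archive()
partial = full[: len(full) // 2]  # keep only the first half of the file

print(can_open(full))     # True
print(can_open(partial))  # False
```

zipfile.is_zipfile() gives the same answer without raising, which can be handy as a quick check before deciding whether a repair is needed.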