Reading Corrupted/Partial Zip Files in Python
First, a simple script for reading non-corrupted zip files in Python:
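    import zipfile

    filename = 'foo.zip'

    z = zipfile.ZipFile(filename)
    # List every entry in the archive with its uncompressed size
    for i in z.infolist():
        print(i.filename, i.file_size)
    # Read the contents of one member as bytes
    z.read('somefile')

Next, we use 'zip -FF foo.zip' to fix the zip file before reading it: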
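    import subprocess
    import zipfile

    filename = 'foo.zip'

    try:
        z = zipfile.ZipFile(filename)
    except zipfile.BadZipFile:
        # The file list could not be read: let 'zip -FF' rebuild it, then retry.
        # Newer Info-ZIP releases (zip 3.0 and later) may require an output
        # name instead, e.g. zip -FF foo.zip --out fixed.zip
        subprocess.run(['zip', '-FF', filename], capture_output=True)
        z = zipfile.ZipFile(filename)

    for i in z.infolist():
        print(i.filename, i.file_size)

    try:
        z.read('somefile')
    except zipfile.BadZipFile:
        # The entry is listed, but its data fails the checksum
        print('Bad CRC-32')

In short: use 'zip -FF file.zip' to fix the file. It will restore the file list.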
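The 'Bad CRC-32' case can also be handled per member instead of for a single file. The following is a minimal sketch for Python 3, not part of the original script: `read_good_members` is a hypothetical helper name, and it assumes the file list itself is readable (for example after 'zip -FF' has restored it). It keeps every member whose data passes the checksum and collects the names of those that fail. The demo archive is built in memory and damaged on purpose.

```python
import io
import zipfile
import zlib


def read_good_members(source):
    """Read every member that passes its CRC check.

    `source` is a path or a binary file-like object. Returns (good, bad):
    a dict mapping member name to bytes, and a list of unreadable names.
    """
    good, bad = {}, []
    with zipfile.ZipFile(source) as z:
        for info in z.infolist():
            try:
                good[info.filename] = z.read(info.filename)
            except (zipfile.BadZipFile, zlib.error):
                # BadZipFile covers a CRC-32 mismatch; zlib.error covers a
                # broken compressed stream.
                bad.append(info.filename)
    return good, bad


# Demo: build a two-member archive in memory, then damage one payload.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as z:
    z.writestr("a.txt", b"first payload")
    z.writestr("b.txt", b"second payload")

raw = bytearray(buf.getvalue())
raw[raw.find(b"second payload")] ^= 0xFF  # flip one byte of b.txt's stored data

good, bad = read_good_members(io.BytesIO(bytes(raw)))
print(sorted(good), bad)
```

Here the damaged member should end up in `bad` while the intact one is still returned, so a partially corrupted archive does not have to be discarded as a whole.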