Python zipfile module

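Introduction

zipfile is the Python standard-library module for compressing and decompressing files in the ZIP format.

zipfile has two particularly important classes, ZipFile and ZipInfo; in the vast majority of cases these two are all we need.

  • ZipFile is the main class, used to create and read ZIP files
  • ZipInfo stores the information about each member of a ZIP file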

Usage

The module defines the following:

exception zipfile.BadZipFile

The error raised for bad (corrupted) ZIP files.
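
As a minimal sketch of how this exception shows up in practice, the snippet below writes a plain-text file and then tries to open it as an archive. The temporary directory and the file name not_a_zip.zip are assumptions made for illustration only.

```python
import os
import tempfile
import zipfile

# Create a file that is not a ZIP archive (hypothetical name, temporary directory).
tmp_dir = tempfile.mkdtemp()
fake_path = os.path.join(tmp_dir, "not_a_zip.zip")
with open(fake_path, "wb") as f:
    f.write(b"this is plain text, not a ZIP archive")

try:
    with zipfile.ZipFile(fake_path) as my_zip:
        my_zip.namelist()
    opened = True
except zipfile.BadZipFile as exc:
    # The constructor fails because no ZIP end-of-archive record is found.
    opened = False
    print("Not a valid ZIP file:", exc)
```

Catching BadZipFile rather than a bare Exception keeps unrelated errors, such as a missing file, visible.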

exception zipfile.BadZipfile

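Alias of BadZipFile, kept for compatibility with older Python versions. Deprecated since version 3.2.

exception zipfile.LargeZipFile

The error raised when a ZIP file would require ZIP64 functionality but that has not been enabled.

class zipfile.ZipFile

The class for reading and writing ZIP files.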

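class zipfile.ZipFile(file, mode='r', compression=ZIP_STORED, allowZip64=True, compresslevel=None)

Purpose: open a ZIP file.
Parameters:
file: a path to a file (a string), a file-like object, or a path-like object.
mode:
'r' reads an existing file;
'w' truncates and writes a new file;
'a' appends to an existing file. If file refers to an existing ZIP file, additional files are added to it; if the file does not exist, it is created;
'x' exclusively creates and writes a new file. If file refers to an existing file, FileExistsError is raised.
If mode is 'r' or 'a', the file should be seekable.
compression: the compression method used when writing the archive. Defaults to ZIP_STORED; may be ZIP_STORED, ZIP_DEFLATED, ZIP_BZIP2 or ZIP_LZMA.
allowZip64: defaults to True. If true, zipfile creates ZIP files that use the ZIP64 extensions when the archive is larger than 4 GiB. If false, zipfile raises an exception when the ZIP file would require ZIP64 extensions.
compresslevel: controls the compression level used when writing files to the archive. It has no effect with ZIP_STORED or ZIP_LZMA. With ZIP_DEFLATED, integers 0 through 9 are accepted. With ZIP_BZIP2, integers 1 through 9 are accepted.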
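
The four modes can be illustrated with a short sketch. It assumes Python 3.7 or later (for compresslevel), an available zlib module, and a temporary archive named demo.zip, which is a hypothetical name chosen for the example.

```python
import os
import tempfile
import zipfile

tmp_dir = tempfile.mkdtemp()
archive_path = os.path.join(tmp_dir, "demo.zip")  # hypothetical file name

# 'w' creates (or truncates) the archive; members are compressed with DEFLATE.
with zipfile.ZipFile(archive_path, mode="w",
                     compression=zipfile.ZIP_DEFLATED, compresslevel=9) as zf:
    zf.writestr("a.txt", "first member")

# 'a' appends to the existing archive and keeps what is already there.
with zipfile.ZipFile(archive_path, mode="a") as zf:
    zf.writestr("b.txt", "second member")

# 'x' refuses to overwrite an existing file.
try:
    zipfile.ZipFile(archive_path, mode="x")
    exclusive_failed = False
except FileExistsError:
    exclusive_failed = True

# 'r' (the default) opens the archive read-only.
with zipfile.ZipFile(archive_path) as zf:
    names = zf.namelist()

print(names)             # ['a.txt', 'b.txt']
print(exclusive_failed)  # True
```

Using ZipFile as a context manager closes the archive even if an exception is raised inside the block.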

ZipFile.namelist()

Return a list of archive members by name.

import zipfile

with zipfile.ZipFile('D:\\Github\\test_comprese.zip') as my_zip:
    name_list = my_zip.namelist()

>>> name_list
['test_comprese/', 'test_comprese/csv/', 'test_comprese/csv/iris.csv', 'test_comprese/tips.csv']

ZipFile.infolist()

Return a list containing a ZipInfo object for each member of the archive. The objects are in the same order as their entries in the actual ZIP file on disk.

import zipfile

with zipfile.ZipFile('D:\\Github\\test_comprese.zip') as my_zip:
    file_info_list = my_zip.infolist()

>>> file_info_list
[<ZipInfo filename='test_comprese/' external_attr=0x10>,
 <ZipInfo filename='test_comprese/csv/' external_attr=0x10>,
 <ZipInfo filename='test_comprese/csv/iris.csv' compress_type=deflate external_attr=0x20 file_size=4600 compress_size=862>,
 <ZipInfo filename='test_comprese/tips.csv' compress_type=deflate external_attr=0x20 file_size=7943 compress_size=1695>]

ZipFile.getinfo(name)

Return a ZipInfo object with information about the archive member name. If name is not in the archive, a KeyError is raised.

import zipfile

with zipfile.ZipFile('D:\\Github\\test_comprese.zip') as my_zip:
    file_info = my_zip.getinfo('test_comprese/')

>>> file_info
<ZipInfo filename='test_comprese/' external_attr=0x10>

>>> file_info.is_dir()
True
>>> file_info.filename
'test_comprese/'
>>> file_info.date_time
(2019, 12, 29, 14, 15, 16)

Note:
For the attributes and methods available on file_info, see the class zipfile.ZipInfo section.

ZipFile.open(name, mode='r', pwd=None, *, force_zip64=False)

Access a member of the archive as a binary file-like object.

name can be either the name of a file within the archive or a ZipInfo object.

mode must be 'r' (the default) or 'w'.

  • With mode='r' the file-like object (ZipExtFile) is read-only and provides the following methods: read(), readline(), readlines(), seek(), tell(), __iter__(), __next__(). These objects can operate independently of the ZipFile.
  • With mode='w', a writable file handle is returned, which supports the write() method. While a writable file handle is open, attempting to read or write other files in the ZIP file will raise a ValueError.

pwd is the password used to decrypt encrypted ZIP files.

force_zip64 applies when writing a file: if the file size is not known in advance but may exceed 2 GiB, pass force_zip64=True to ensure that the header format is capable of supporting large files. If the file size is known in advance, construct a ZipInfo object with file_size set, and use that as the name parameter.

import zipfile

with zipfile.ZipFile('D:\\Github\\test_comprese.zip') as myzip:
    with myzip.open('test_comprese/csv/iris.csv') as myfile:
        print(myfile.read())
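
The write side of open() can be sketched in the same way. The example assumes Python 3.6 or later (where mode='w' is supported) and uses a hypothetical temporary archive and member name.

```python
import os
import tempfile
import zipfile

archive_path = os.path.join(tempfile.mkdtemp(), "open_demo.zip")  # hypothetical

with zipfile.ZipFile(archive_path, mode="w") as zf:
    # mode='w' returns a writable binary handle for a new member.
    with zf.open("notes/hello.txt", mode="w") as member:
        member.write(b"line 1\n")
        member.write(b"line 2\n")

with zipfile.ZipFile(archive_path) as zf:
    # mode='r' returns a read-only binary file-like object (ZipExtFile).
    with zf.open("notes/hello.txt") as member:
        first_line = member.readline()
        rest = member.read()

print(first_line, rest)  # b'line 1\n' b'line 2\n'
```

Both handles work with bytes; wrap the read handle in io.TextIOWrapper if text decoding is needed.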

ZipFile.extract(member, path=None, pwd=None)

Extract a member from the archive to the current working directory.

member must be its full name or a ZipInfo object. Its file information is extracted as accurately as possible.

path specifies a different directory to extract to.

pwd is the password used for encrypted files.

Returns the normalized path created (a directory or new file).

import zipfile

with zipfile.ZipFile('D:\\Github\\test_comprese.zip') as my_zip:
    my_zip.extract('test_comprese/csv/iris.csv', "D:\\my_dir")

ZipFile.extractall(path=None, members=None, pwd=None)

Extract all members from the archive to the current working directory.

path specifies a different directory to extract to.

members is optional and must be a subset of the list returned by namelist().

pwd is the password used for encrypted files.

import zipfile

with zipfile.ZipFile('D:\\Github\\test_comprese.zip') as my_zip:
    my_zip.extractall("D:\\my_dir")

ZipFile.printdir()

Print a table of contents for the archive to sys.stdout.

import zipfile

with zipfile.ZipFile('D:\\Github\\test_comprese.zip') as my_zip:
    my_zip.printdir()

>>>
File Name                                             Modified             Size
test_comprese/                                 2019-12-29 14:15:16            0
test_comprese/csv/                             2019-12-29 14:15:16            0
test_comprese/csv/iris.csv                     2018-07-01 23:41:02         4600
test_comprese/tips.csv                         2018-07-01 23:41:02         7943

ZipFile.setpassword(pwd)

Set pwd as default password to extract encrypted files.

ZipFile.read(name, pwd=None)

Return the bytes of the file name in the archive.

name is the name of the file in the archive, or a ZipInfo object. The archive must be open for read or append.

pwd is the password used for encrypted files and, if specified, it will override the default password set with setpassword(). Calling read() on a ZipFile that uses a compression method other than ZIP_STORED, ZIP_DEFLATED, ZIP_BZIP2 or ZIP_LZMA will raise a NotImplementedError. An error will also be raised if the corresponding compression module is not available.

import zipfile

with zipfile.ZipFile('D:\\Github\\test_comprese.zip') as my_zip:
    content = my_zip.read('test_comprese/csv/iris.csv')

ZipFile.write(filename, arcname=None, compress_type=None, compresslevel=None)

Write the file named filename to the archive, giving it the archive name arcname (by default, this will be the same as filename, but without a drive letter and with leading path separators removed). If given, compress_type overrides the value given for the compression parameter to the constructor for the new entry. Similarly, compresslevel will override the constructor if given. The archive must be open with mode 'w', 'x' or 'a'.
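
A minimal sketch of arcname and the per-entry compress_type override follows. The source file report.txt, the archive name and the member names are hypothetical, and the example assumes the zlib module is available.

```python
import os
import tempfile
import zipfile

tmp_dir = tempfile.mkdtemp()
source_path = os.path.join(tmp_dir, "report.txt")       # hypothetical source file
archive_path = os.path.join(tmp_dir, "write_demo.zip")  # hypothetical archive

with open(source_path, "w", encoding="utf-8") as f:
    f.write("hello " * 1000)  # 6000 bytes of repetitive text

with zipfile.ZipFile(archive_path, mode="w") as zf:
    # arcname sets the member name; without it the drive-less source path is used.
    zf.write(source_path, arcname="docs/report.txt")
    # compress_type overrides the constructor default (ZIP_STORED) for this entry only.
    zf.write(source_path, arcname="docs/report_deflated.txt",
             compress_type=zipfile.ZIP_DEFLATED)

with zipfile.ZipFile(archive_path) as zf:
    stored = zf.getinfo("docs/report.txt")
    deflated = zf.getinfo("docs/report_deflated.txt")

print(stored.file_size, stored.compress_size)      # 6000 6000
print(deflated.compress_size < deflated.file_size)  # True
```

Passing an explicit arcname also avoids leaking local directory layout into the archive.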

ZipFile.writestr(zinfo_or_arcname, data, compress_type=None, compresslevel=None)

Write a file into the archive. The contents is data, which may be either a str or a bytes instance; if it is a str, it is encoded as UTF-8 first. zinfo_or_arcname is either the file name it will be given in the archive, or a ZipInfo instance. If it's an instance, at least the filename, date, and time must be given. If it's a name, the date and time is set to the current date and time. The archive must be opened with mode 'w', 'x' or 'a'.

If given, compress_type overrides the value given for the compression parameter to the constructor for the new entry, or in the zinfo_or_arcname (if that is a ZipInfo instance). Similarly, compresslevel will override the constructor if given.
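
The str versus bytes behaviour described above can be checked with a small sketch. The archive and member names are hypothetical.

```python
import os
import tempfile
import zipfile

archive_path = os.path.join(tempfile.mkdtemp(), "writestr_demo.zip")  # hypothetical

with zipfile.ZipFile(archive_path, mode="w") as zf:
    # A str is encoded as UTF-8 before being stored.
    zf.writestr("greeting.txt", "h\u00e9llo")
    # A bytes object is stored unchanged.
    zf.writestr("raw.bin", b"\x00\x01\x02")

with zipfile.ZipFile(archive_path) as zf:
    greeting = zf.read("greeting.txt")
    raw = zf.read("raw.bin")

print(greeting)  # b'h\xc3\xa9llo'
print(raw)       # b'\x00\x01\x02'
```

writestr() is convenient when the data is generated in memory and never exists as a file on disk.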

ZipFile.testzip()

Read all the files in the archive and check their CRC’s and file headers. Return the name of the first bad file, or else return None.
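
To see both outcomes of testzip(), the sketch below builds a small archive in memory and then flips one byte of the stored data to simulate corruption. The in-memory buffer and the member name ok.txt are assumptions made for the example.

```python
import io
import zipfile

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, mode="w") as zf:  # ZIP_STORED keeps the data verbatim
    zf.writestr("ok.txt", "intact data")

with zipfile.ZipFile(buffer) as zf:
    first_bad_intact = zf.testzip()

# Simulate corruption by flipping one byte of the stored member data.
raw = bytearray(buffer.getvalue())
raw[raw.index(b"intact data")] ^= 0xFF
with zipfile.ZipFile(io.BytesIO(bytes(raw))) as zf:
    first_bad_corrupt = zf.testzip()

print(first_bad_intact)   # None
print(first_bad_corrupt)  # ok.txt
```

Because testzip() reads every member, it can be slow on large archives.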

ZipFile.filename

Name of the ZIP file.

ZipFile.debug

The level of debug output to use. This may be set from 0 (the default, no output) to 3 (the most output). Debugging information is written to sys.stdout.

ZipFile.comment

The comment associated with the ZIP file as a bytes object. If assigning a comment to a ZipFile instance created with mode 'w', 'x' or 'a', it should be no longer than 65535 bytes. Comments longer than this will be truncated.
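
A short sketch of setting and reading back the archive comment, using an in-memory buffer; the comment text is made up for illustration.

```python
import io
import zipfile

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, mode="w") as zf:
    zf.writestr("a.txt", "data")
    zf.comment = b"built by the nightly job"  # must be bytes, not str

with zipfile.ZipFile(buffer) as zf:
    comment = zf.comment

print(comment)  # b'built by the nightly job'
```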

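ZipFile.close()

Close the archive file. You must call close() before exiting your program or essential records will not be written.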

class zipfile.PyZipFile

The class for creating ZIP archives containing Python libraries.

class zipfile.ZipInfo

class zipfile.ZipInfo(filename='NoName', date_time=(1980, 1, 1, 0, 0, 0))

ZipInfo.is_dir()
Return True if this archive member is a directory.
This uses the entry's name: directories should always end with /.

ZipInfo.filename
Name of the file in the archive.

ZipInfo.date_time
The time and date of the last modification to the archive member.

ZipInfo.compress_type
Type of compression for the archive member.

ZipInfo.comment
Comment for the individual archive member as a bytes object.

ZipInfo.extra
Expansion field data. The PKZIP Application Note contains some comments on the internal structure of the data contained in this bytes object.

ZipInfo.create_system
System which created ZIP archive.

ZipInfo.create_version
PKZIP version which created ZIP archive.

ZipInfo.extract_version
PKZIP version needed to extract archive.

ZipInfo.flag_bits
ZIP flag bits.

ZipInfo.volume
Volume number of file header.

ZipInfo.internal_attr
Internal attributes.

ZipInfo.external_attr
External file attributes.

ZipInfo.header_offset
Byte offset to the file header.

ZipInfo.CRC
CRC-32 of the uncompressed file.

ZipInfo.compress_size
Size of the compressed data.

ZipInfo.file_size
Size of the uncompressed file.

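The class used to represent information about a member of an archive. Instances of this class are returned by the getinfo() and infolist() methods of ZipFile objects. Most users of the zipfile module will not need to create them, only use those created by this module. filename should be the full name of the archive member, and date_time should be a tuple of six fields describing the time of the last modification to the file; the fields are described in the ZipInfo Objects section of the official documentation.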
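
For the less common case where a ZipInfo is built by hand, a minimal sketch looks like this. The member name and timestamp are hypothetical, zlib is assumed to be available, and the seconds value is even because the ZIP format stores times with two-second resolution.

```python
import io
import zipfile

# Build a ZipInfo by hand: member name plus a six-field modification time.
info = zipfile.ZipInfo(filename="logs/app.log", date_time=(2020, 1, 2, 3, 4, 6))
info.compress_type = zipfile.ZIP_DEFLATED

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, mode="w") as zf:
    zf.writestr(info, "x" * 500)

with zipfile.ZipFile(buffer) as zf:
    stored = zf.getinfo("logs/app.log")

print(stored.date_time)  # (2020, 1, 2, 3, 4, 6)
print(stored.file_size)  # 500
print(stored.is_dir())   # False
```

Supplying the ZipInfo explicitly is one way to get a fixed timestamp instead of the current time.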

zipfile.is_zipfile(filename)

Returns True if filename is a valid ZIP file based on its magic number, otherwise returns False. filename may also be a file or file-like object.
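
A quick sketch of the check, using in-memory file-like objects rather than paths on disk (an assumption made to keep the example self-contained):

```python
import io
import zipfile

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, mode="w") as zf:
    zf.writestr("a.txt", "data")

looks_like_zip = zipfile.is_zipfile(buffer)                # a real archive
plain_bytes = zipfile.is_zipfile(io.BytesIO(b"hello"))     # arbitrary bytes

print(looks_like_zip, plain_bytes)  # True False
```

The function returns False instead of raising, so it is a cheap pre-check before constructing a ZipFile.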

zipfile.ZIP_STORED

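The numeric constant for an uncompressed archive member.

zipfile.ZIP_DEFLATED

The numeric constant for the usual ZIP compression method. This requires the zlib module.

zipfile.ZIP_BZIP2

The numeric constant for the BZIP2 compression method. This requires the bz2 module.

zipfile.ZIP_LZMA

The numeric constant for the LZMA compression method. This requires the lzma module.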
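
To get a feel for the four methods, the following sketch compresses the same repetitive payload with each constant and records the compressed size. The payload is made-up sample data; a method whose backing module is missing is recorded as None, since zipfile raises RuntimeError in that case.

```python
import io
import zipfile

payload = b"zipfile " * 2000  # highly repetitive sample data (16000 bytes)

sizes = {}
for label, method in [("stored", zipfile.ZIP_STORED),
                      ("deflated", zipfile.ZIP_DEFLATED),
                      ("bzip2", zipfile.ZIP_BZIP2),
                      ("lzma", zipfile.ZIP_LZMA)]:
    buffer = io.BytesIO()
    try:
        with zipfile.ZipFile(buffer, mode="w", compression=method) as zf:
            zf.writestr("payload.bin", payload)
            sizes[label] = zf.getinfo("payload.bin").compress_size
    except RuntimeError:
        # Raised when the required module (zlib, bz2 or lzma) is unavailable.
        sizes[label] = None

for label, size in sizes.items():
    print(label, size)
```

Actual ratios depend heavily on the data, so treat the numbers as illustrative only.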

Command-Line Interface

The zipfile module provides a simple command-line interface to interact with ZIP archives.

If you want to create a new ZIP archive, specify its name after the -c option and then list the filename(s) that should be included:

$ python -m zipfile -c monty.zip spam.txt eggs.txt

Passing a directory is also acceptable:

$ python -m zipfile -c monty.zip life-of-brian_1979/

If you want to extract a ZIP archive into the specified directory, use the -e option:

$ python -m zipfile -e monty.zip target-dir/

For a list of the files in a ZIP archive, use the -l option:

$ python -m zipfile -l monty.zip
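
The same commands can also be driven from a script. The sketch below is one possible way to do so with subprocess, assuming Python 3.7 or later (for capture_output); the temporary directory is an assumption, and the file names simply mirror the shell examples above.

```python
import os
import subprocess
import sys
import tempfile

tmp_dir = tempfile.mkdtemp()
source_path = os.path.join(tmp_dir, "spam.txt")
archive_path = os.path.join(tmp_dir, "monty.zip")
with open(source_path, "w") as f:
    f.write("spam and eggs")

# Equivalent to: python -m zipfile -c monty.zip spam.txt
subprocess.run([sys.executable, "-m", "zipfile", "-c", archive_path, source_path],
               check=True)

# Equivalent to: python -m zipfile -l monty.zip
listing = subprocess.run([sys.executable, "-m", "zipfile", "-l", archive_path],
                         check=True, capture_output=True, text=True).stdout
print(listing)
```

Using sys.executable ensures the child process runs the same interpreter as the calling script.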

Command-line options

  • -l <zipfile>

  • --list <zipfile>

    List files in a zipfile.

  • -c <zipfile> <source1> ... <sourceN>

  • --create <zipfile> <source1> ... <sourceN>

    Create zipfile from source files.

  • -e <zipfile> <output_dir>

  • --extract <zipfile> <output_dir>

    Extract zipfile into target directory.

  • -t <zipfile>

  • --test <zipfile>

    Test whether the zipfile is valid or not.

References

https://docs.python.org/3.7/library/zipfile.html#zipinfo-objects
