In some cases, when I load an existing pickle file, and after that dump it again, the size is almost halved.
I wonder why, and the first suspect is the protocol version.
Can I somehow find out with which protocol version a file was pickled?
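There may be a more elegant way, but to get down to the metal you can use pickletools:
import pickle
import pickletools

s = pickle.dumps('Test')
# genops yields (opcode, arg, pos) tuples, not bare opcode objects
opcode, arg, pos = next(pickletools.genops(s))
# Protocol 2 and later start with a PROTO opcode; protocols 0 and 1 do not
assert opcode.name == 'PROTO'
proto_ver = arg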
To figure out the version required to decode this, you need the maximum protocol version over all opcodes in the pickle:
proto_ver = max(opcode.proto for opcode, arg, pos in pickletools.genops(s))
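The two checks above can be combined into one pass over the opcodes. This is only a sketch: the helper name `pickle_protocols` is made up for illustration, and it assumes the data holds a single pickle. Note that `genops` only parses the opcode stream; it does not unpickle anything.

```python
import pickle
import pickletools


def pickle_protocols(data):
    """Return (declared, required) protocol versions for pickled bytes.

    declared: argument of the leading PROTO opcode, or None if there is
              none (protocols 0 and 1 do not write one).
    required: highest protocol among the opcodes actually used.
    """
    declared = None
    required = 0
    for i, (opcode, arg, pos) in enumerate(pickletools.genops(data)):
        if i == 0 and opcode.name == 'PROTO':
            declared = arg
        required = max(required, opcode.proto)
    return declared, required


# Example: the same object pickled with two different protocols.
old = pickle.dumps(['Test'] * 3, protocol=0)
new = pickle.dumps(['Test'] * 3, protocol=4)
print(pickle_protocols(old))
print(pickle_protocols(new))
```

For a file on disk, read it in binary mode first, e.g. `pickle_protocols(open('filename.pickle', 'rb').read())`.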
A convenient solution from the command line is to run pickletools as a module:
$ python -m pickletools filename.pickle
    0: \x80 PROTO      5
    2: \x95 FRAME      14451
   11: ]    EMPTY_LIST
   12: \x94 MEMOIZE    (as 0)
...
14459: b    BUILD
14460: a    APPEND
14461: .    STOP
highest protocol among opcodes = 5
The first line, with PROTO, shows the pickle protocol version of the file. The last line also gives you information about the protocol: it reports the highest protocol among the opcodes that were actually used.
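If you would rather stay inside Python, the same listing can be captured with `pickletools.dis`, which is what the command line calls under the hood. A minimal sketch, using a small in-memory pickle as a stand-in for the file contents:

```python
import io
import pickle
import pickletools

# Stand-in for the bytes read from a pickle file
data = pickle.dumps(['Test'], protocol=4)

buf = io.StringIO()
pickletools.dis(data, out=buf)  # write the disassembly into buf instead of stdout
lines = buf.getvalue().splitlines()

first_line = lines[0]   # the PROTO line
last_line = lines[-1]   # the "highest protocol among opcodes" summary
print(first_line)
print(last_line)
```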