Q on Python serialization/deserialization

Question:

What chances do I have to instantiate, keep and serialize/deserialize to/from binary data Python classes reflecting this pattern (adopted from RFC 2246 [TLS]):

   enum { apple, orange } VariantTag;
   struct {
       uint16 number;
       opaque string<0..10>; /* variable length */
   } V1;
   struct {
       uint32 number;
       opaque string[10];    /* fixed length */
   } V2;
   struct {
       select (VariantTag) { /* value of selector is implicit */
           case apple: V1;   /* VariantBody, tag = apple */
           case orange: V2;  /* VariantBody, tag = orange */
       } variant_body;       /* optional label on variant */
   } VariantRecord;

Basically I would have to define a (variant) class VariantRecord, which varies depending on the value of VariantTag. That’s not that difficult. The challenge is to find a most generic way to build a class, which serializes/deserializes to and from a byte stream… Pickle, Google protocol buffer, marshal is all not an option.

I made little success with having an explicit “def serialize” in my class, but I’m not very happy with it, because it’s not generic enough.

I hope I could express the problem.

My current solution in case VariantTag = apple would look like this, but I don’t like it too much

import binascii
import struct

class VariantRecord(object):
  def __init__(self, number, opaque):
    self.number = number
    self.opaque = opaque
  def serialize(self):
    out = struct.pack('>HB%ds' % len(self.opaque), self.number, len(self.opaque), self.opaque)
    return out


v = VariantRecord(10, 'Hello')
print binascii.hexlify(v.serialize())

>> 000a0548656c6c6f

Regards

Asked By: neil

||

Answers:

Two suggestions:

  1. For the variable length structure use a fixed format
    and just slice the result.
  2. Use struct.Struct

e.g. If I’ve understood your formats correctly (is the length byte that appeared in your example but wasn’t mentioned originally present in the other variant also?)

>>> import binascii
>>> import struct
>>> V1 = struct.Struct(">H10p")
>>> V2 = struct.Struct(">L10p")
>>> def serialize(variant, n, s):
    if variant:
        return V2.pack(n,s)
    else:
        return V1.pack(n,s)[:len(s)+3]

    
>>> print binascii.hexlify(serialize(False, 10, 'hello')) #V1
000a0568656c6c6f
>>> print binascii.hexlify(serialize(True, 10, 'hello')) #V2
0000000a0568656c6c6f00000000
>>> 
Answered By: Duncan
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.