How to compress data in C# to be decompressed in zlib python

Question:

I have a Python zlib decompressor that uses the default parameters, as follows, where data is a string:

  import zlib
  data_decompressed = zlib.decompress(data)

But I don’t know how to compress a string in C# so that it can be decompressed in Python. I’ve tried the following piece of code, but when I try to decompress the result, an ‘incorrect header check’ exception is thrown.

    static byte[] ZipContent(string entryName)
    {
        // remove whitespace from xml and convert to byte array
        byte[] normalBytes;
        using (StringWriter writer = new StringWriter())
        {
            //xml.Save(writer, SaveOptions.DisableFormatting);
            System.Text.ASCIIEncoding encoding = new System.Text.ASCIIEncoding();
            normalBytes = encoding.GetBytes(writer.ToString());
        }

        // zip into new, zipped, byte array
        using (Stream memOutput = new MemoryStream())
        using (ZipOutputStream zipOutput = new ZipOutputStream(memOutput))
        {
            zipOutput.SetLevel(6);

            ZipEntry entry = new ZipEntry(entryName);
            entry.CompressionMethod = CompressionMethod.Deflated;
            entry.DateTime = DateTime.Now;
            zipOutput.PutNextEntry(entry);

            zipOutput.Write(normalBytes, 0, normalBytes.Length);
            zipOutput.Finish();

            byte[] newBytes = new byte[memOutput.Length];
            memOutput.Seek(0, SeekOrigin.Begin);
            memOutput.Read(newBytes, 0, newBytes.Length);

            zipOutput.Close();

            return newBytes;
        }
    }

Could anyone help me, please?
Thank you.

UPDATE 1:

I’ve tried the Deflate function that Shiraz Bhaiji posted:

    public static byte[] Deflate(byte[] data)
    {
        if (null == data || data.Length < 1) return null;
        byte[] compressedBytes;

        //write into a new memory stream wrapped by a deflate stream
        using (MemoryStream ms = new MemoryStream())
        {
            using (DeflateStream deflateStream = new DeflateStream(ms, CompressionMode.Compress, true))
            {
                //write byte buffer into memorystream
                deflateStream.Write(data, 0, data.Length);
                deflateStream.Close();

                //rewind memory stream and copy its contents into a byte array
                compressedBytes = new byte[ms.Length];
                ms.Seek(0, SeekOrigin.Begin);
                ms.Read(compressedBytes, 0, (int)ms.Length);

            }
        }
        return compressedBytes;
    }

The problem is that, for this to work in the Python code, I have to pass the “-zlib.MAX_WBITS” argument to decompress, as follows:

    data_decompressed = zlib.decompress(data, -zlib.MAX_WBITS)

So, my new question is: is it possible to write a deflate method in C# whose compressed output can be decompressed with a default zlib.decompress(data) call?

Asked By: NEBUC

||

Answers:

As you described in your edit, zlib.decompress(data, -zlib.MAX_WBITS) is the correct way to decompress data produced by C#’s DeflateStream. There are two formats at play here:

  1. deflate – as in specification RFC 1951 – this is what C# is producing
  2. zlib – as in specification RFC 1950 – this is what Python is expecting by default
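
The difference in behaviour can be reproduced entirely on the Python side. This is only a sketch: it assumes that a raw deflate stream built with a negative wbits value is a fair stand-in for what DeflateStream writes, and the helper name default_decompress_fails is made up for illustration.

```python
import zlib

payload = b"hello hello hello hello"

# Raw deflate (RFC 1951): a negative wbits value tells zlib to emit
# no header and no checksum. Used here as a stand-in for DeflateStream output.
compressor = zlib.compressobj(wbits=-zlib.MAX_WBITS)
raw_deflate = compressor.compress(payload) + compressor.flush()

# zlib format (RFC 1950): what zlib.compress writes and what
# zlib.decompress expects when called with default arguments.
zlib_stream = zlib.compress(payload)


def default_decompress_fails(data: bytes) -> bool:
    """Return True if zlib.decompress(data) with default arguments raises."""
    try:
        zlib.decompress(data)
    except zlib.error:
        return True
    return False


# Raw deflate is rejected by the default call but accepted with -MAX_WBITS.
print(default_decompress_fails(raw_deflate))
print(zlib.decompress(raw_deflate, -zlib.MAX_WBITS) == payload)
print(zlib.decompress(zlib_stream) == payload)
```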

What is the difference between the two? It’s small, really:

zlib = [compression flag byte] + [flags byte] + deflate + [adler checksum]

(there are also optional dictionary bytes but we don’t have to worry about them)
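
To make that layout concrete, here is a small Python sketch that takes a stream produced by zlib.compress apart again. It assumes no preset dictionary is in use, so the header is exactly two bytes; the variable names are only illustrative.

```python
import zlib

payload = b"example payload"
stream = zlib.compress(payload)

header = stream[:2]    # compression-method byte followed by the flags byte
body = stream[2:-4]    # the raw deflate data in the middle
trailer = stream[-4:]  # Adler-32 of the uncompressed payload, big-endian

# The two header bytes, read as a 16-bit big-endian number,
# are always a multiple of 31 (RFC 1950 header check).
header_value = header[0] * 256 + header[1]
print(header_value % 31 == 0)

# The middle part decompresses on its own as raw deflate.
print(zlib.decompress(body, -zlib.MAX_WBITS) == payload)

# The trailer matches the checksum of the original data.
print(int.from_bytes(trailer, "big") == zlib.adler32(payload))
```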

Therefore, to get zlib format from deflate, we need to prepend the two flag bytes and append the Adler-32 checksum. Luckily, there is an answer on Stack Overflow for the flags (see What does a zlib header look like?), and implementing Adler-32 is not that hard. So, suppose you have your MemoryStream ms; we would first write the two flag bytes:

ms.Write(new byte[] { 0x78, 0x9c }, 0, 2);

…then we would do exactly what’s in your question:

using (DeflateStream deflateStream = new DeflateStream(ms, CompressionMode.Compress, true))
{
    deflateStream.Write(data, 0, data.Length);
    deflateStream.Close();
}

and, finally, compute the Adler-32 checksum of the uncompressed data and append it to the end of the stream:

uint a = 1; // Adler-32 starts the byte sum at 1, not 0
uint b = 0;
for(int i = 0; i < data.Length; ++i)
{
    a = (a + data[i]) % 65521;
    b = (b + a) % 65521;
}
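
For cross-checking a hand-written checksum, the same loop can be sketched in Python and compared against zlib.adler32. Per RFC 1950 the byte sum a is initialised to 1 and b to 0; the function name adler32 and the constant MOD_ADLER are just illustrative.

```python
import zlib

MOD_ADLER = 65521  # largest prime below 2**16


def adler32(data: bytes) -> int:
    """Adler-32 as defined in RFC 1950, written out byte by byte."""
    a = 1  # running sum of bytes, initialised to 1
    b = 0  # running sum of the successive values of a
    for byte in data:
        a = (a + byte) % MOD_ADLER
        b = (b + a) % MOD_ADLER
    # b goes into the high 16 bits, a into the low 16 bits
    return (b << 16) | a


# The hand-written version should agree with the library implementation.
print(adler32(b"Wikipedia") == zlib.adler32(b"Wikipedia"))
```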

Sadly, I don’t know a pretty way of writing uints into the stream. This is an ugly way:

// the checksum is (b << 16) | a, written most significant byte first
ms.Write(new byte[] { (byte)(b>>8),
                      (byte)b,
                      (byte)(a>>8),
                      (byte)a
}, 0, 4);
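
The whole recipe (header, deflate data, checksum) can be checked from the Python side before porting it. This is a sketch under assumptions: the raw deflate stream is generated in Python as a stand-in for the bytes DeflateStream would produce, and wrap_raw_deflate is a hypothetical helper name.

```python
import struct
import zlib


def wrap_raw_deflate(raw_deflate: bytes, original: bytes) -> bytes:
    """Turn raw deflate data into a zlib stream (RFC 1950 container)."""
    header = b"\x78\x9c"  # the two flag bytes used above
    # Adler-32 of the *uncompressed* data, as a big-endian 32-bit integer
    trailer = struct.pack(">I", zlib.adler32(original))
    return header + raw_deflate + trailer


original = b"some text to compress " * 10

# Stand-in for the C# side: raw deflate without header or checksum.
compressor = zlib.compressobj(wbits=-zlib.MAX_WBITS)
raw = compressor.compress(original) + compressor.flush()

wrapped = wrap_raw_deflate(raw, original)

# The wrapped stream is accepted by a default zlib.decompress call.
print(zlib.decompress(wrapped) == original)
```

If the default call still raises an error on data coming from C#, comparing the first two and the last four bytes against this sketch is a quick way to see which part is off.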
Answered By: mz71