Equivalent of a C union in Python?

Question:

Say I have the following code in C:

union u_type
{
    uint32_t data;
    uint8_t  chunk[4];
} 32bitsdata;

32bitsdata.chunk[0] = some number;
32bitsdata.chunk[1] = some number;
32bitsdata.chunk[2] = some number;
32bitsdata.chunk[3] = some number;

printf("Data in 32 bits: %d\n", 32bitsdata.data);

How could I do a similar thing in Python?

I’m reading a binary file byte by byte (that part already works) and want to combine every 3 bytes into one int. I’ve heard that struct would do the trick, but I’m not sure how.

Best,

Henry

Asked By: shjnlee


Answers:

Here is what you would do. First, let’s create the raw bytes we need. I’ll cheat and use numpy:

>>> import numpy as np
>>> arr = np.array((8,4,2,4,8), dtype=np.uint32)
>>> arr
array([8, 4, 2, 4, 8], dtype=uint32)
>>> raw_bytes = arr.tobytes()
>>> raw_bytes
b'\x08\x00\x00\x00\x04\x00\x00\x00\x02\x00\x00\x00\x04\x00\x00\x00\x08\x00\x00\x00'

These could have easily been read from a file. Now, using the struct module is trivial. We use the unsigned int format character 'I':

>>> import struct
>>> list(struct.iter_unpack('I', raw_bytes))
[(8,), (4,), (2,), (4,), (8,)]

Note that each iteration yields a tuple. Since our struct has only one member, the result is a list of singleton tuples. It is easy to flatten this into a plain Python list:

>>> [t[0] for t in struct.iter_unpack('I', raw_bytes)]
[8, 4, 2, 4, 8]
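
One caveat: a bare 'I' uses the machine's native byte order and size, so the result can differ between platforms. Here is a minimal sketch that pins the byte order explicitly, assuming the data is little-endian as in the output shown above (the byte values are written out by hand here so the snippet does not need numpy):

```python
import struct

# Five little-endian 32-bit values, spelled out byte by byte.
raw_bytes = bytes([8, 0, 0, 0, 4, 0, 0, 0, 2, 0, 0, 0, 4, 0, 0, 0, 8, 0, 0, 0])

# '<' forces little-endian with standard sizes, so 'I' is always 4 bytes.
values = [v for (v,) in struct.iter_unpack('<I', raw_bytes)]
print(values)  # [8, 4, 2, 4, 8]

# The same bytes read as big-endian give different numbers.
swapped = [v for (v,) in struct.iter_unpack('>I', raw_bytes)]
print(hex(swapped[0]))  # 0x8000000
```

If the file was written by a big-endian system, '>I' would be the one to use instead.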

Another alternative is to read them into an array.array:

>>> import array
>>> my_array = array.array('I', raw_bytes)
>>> my_array
array('I', [8, 4, 2, 4, 8])
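
Keep in mind that array.array also uses native byte order, and the 'I' typecode is only guaranteed to be at least 2 bytes wide. The following is a rough sketch of a more defensive version, assuming the input holds little-endian 32-bit values; the shortened byte string is just sample data:

```python
import array
import sys

# Three little-endian 32-bit values (sample data for illustration).
raw_bytes = bytes([8, 0, 0, 0, 4, 0, 0, 0, 2, 0, 0, 0])

# Pick an unsigned typecode that is exactly 4 bytes wide on this platform.
typecode = next(tc for tc in "IL" if array.array(tc).itemsize == 4)

values = array.array(typecode)
values.frombytes(raw_bytes)

# The array interprets bytes in native order, so swap on big-endian machines.
if sys.byteorder == "big":
    values.byteswap()

print(values.tolist())  # [8, 4, 2]
```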
Answered By: juanpa.arrivillaga

What about ctypes?

from ctypes import (
        Union, Array, 
        c_uint8, c_uint32, 
        cdll, CDLL
) 

class uint8_array(Array):
    _type_ = c_uint8
    _length_ = 4

class u_type(Union):
    _fields_ = ("data", c_uint32), ("chunk", uint8_array)

# load the printf function from the dynamically linked library libc.so.6 (I'm using Linux)
libc = CDLL(cdll.LoadLibrary('libc.so.6')._name)
printf = libc.printf

if __name__ == "__main__":
    # initialize union
    _32bitsdata = u_type()
    # set values to chunk
    _32bitsdata.chunk[:] = (1, 2, 3, 4)
    # and print it
    printf(b"Data in 32 bits: %d\n", _32bitsdata.data)
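
If loading libc is not an option (for example on Windows), the same union can be used with Python's own print. This is a minimal sketch of that variation; the class name UType and the sample bytes are only illustrative:

```python
import sys
from ctypes import Union, c_uint8, c_uint32

class UType(Union):
    # Both fields share the same 4 bytes of memory, as in the C union.
    _fields_ = [("data", c_uint32), ("chunk", c_uint8 * 4)]

u = UType()
u.chunk[:] = (1, 2, 3, 4)

# ctypes uses native byte order: 0x04030201 on little-endian machines.
print(f"Data in 32 bits: {u.data} (byte order: {sys.byteorder})")

# A union can also be filled straight from bytes read from a file.
v = UType.from_buffer_copy(b"\x01\x02\x03\x04")
print(v.data == u.data)  # True
```

Because the result depends on the machine's byte order, struct with an explicit '<' or '>' prefix may be the safer choice when the file format fixes the byte order.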
Answered By: Nick Tone

You asked about C union, but if your objective is to group 3 bytes into an int, you could use Python struct.unpack instead.

import struct

chunk = bytearray()
chunk.append(0x00)   # some number
chunk.append(0xc0)   # some number
chunk.append(0xff)   # some number
chunk.append(0xee)   # some number

# Convert to a 32-bit unsigned int.
# You didn't specify the byte-order, so I'm using big-endian.
# If you want little-endian instead, replace the '>' symbol by '<'.
data = struct.unpack('>I', chunk)[0]  # unpack returns a tuple, but we only need the first value

print(hex(data))  # the terminal prints 0xc0ffee
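
struct has no 3-byte format code, which is why the example above pads to 4 bytes. If the data really comes in groups of exactly 3 bytes, the built-in int.from_bytes can handle that without padding. A small sketch, assuming big-endian order and using made-up sample bytes:

```python
# Six sample bytes, i.e. two 3-byte groups (values chosen for illustration).
raw = bytes([0xc0, 0xff, 0xee, 0x00, 0x00, 0x01])

# Slice the data into 3-byte groups and convert each group to an int.
# Use byteorder='little' instead if the file is little-endian.
values = [
    int.from_bytes(raw[i:i + 3], byteorder='big')
    for i in range(0, len(raw), 3)
]

print([hex(v) for v in values])  # ['0xc0ffee', '0x1']
```

Note that if the length of the data is not a multiple of 3, the last slice is shorter and is still converted, so it may be worth checking the length first.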

If you’re doing fancy numerical manipulation, you’d probably want to use the numpy library anyway, so consider the “view” method of numpy’s ndarray type. The original ndarray can be viewed and modified via the view-array.

>>> import numpy as np
>>> a = np.uint32([1234567890])
>>> b = a.view(np.uint8)
>>> print(a)
[1234567890]
>>> print(b)
[210   2 150  73]
>>> b[2] = 10
>>> print(*b)
210 2 10 73
>>> print(*a)
1225392850
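
For bytes that come from a file rather than from an existing array, np.frombuffer gives a similar reinterpretation. A brief sketch, assuming little-endian 32-bit values; the bytes below are sample data matching the number used above:

```python
import numpy as np

# Eight sample bytes: 1234567890 and 8 as little-endian uint32 values.
raw = bytes([210, 2, 150, 73, 8, 0, 0, 0])

# '<u4' means little-endian unsigned 4-byte integers, independent of platform.
values = np.frombuffer(raw, dtype='<u4')
print(values.tolist())  # [1234567890, 8]

# An array built on an immutable bytes object is read-only;
# copy it before modifying individual bytes through a view.
editable = values.copy()
editable.view(np.uint8)[2] = 10
print(editable.tolist())  # [1225392850, 8]
```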
Answered By: Dave Rove