Using Python, how can I read the bits in a byte?

Question:

I have a file where the first byte contains encoded information. In Matlab I can read the byte bit by bit with var = fread(file, 8, 'ubit1'), and then retrieve each bit by var(1), var(2), etc.

Is there an equivalent bit reader in Python?

Asked By: David

Answers:

The smallest unit you’ll be able to work with is a byte. To work at the bit level you need to use bitwise operators.

x = 3
#Check if the 1st bit is set:
x&1 != 0
#Returns True

#Check if the 2nd bit is set:
x&2 != 0
#Returns True

#Check if the 3rd bit is set:
x&4 != 0
#Returns False
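
The three checks above generalize to any bit position by shifting 1 to the left. A minimal sketch; the helper name bit_is_set and the zero-based numbering from the low-order bit are assumptions of this example, not part of the answer itself:

```python
def bit_is_set(value, position):
    """Return True if the bit at zero-based `position` is set (0 = low-order bit)."""
    # 1 << position builds a mask with only that one bit set
    return (value & (1 << position)) != 0

x = 3  # binary 00000011
flags = [bit_is_set(x, n) for n in range(8)]
print(flags)
```

Note that position 0 here corresponds to what the answer calls the "1st bit".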
Answered By: Brian R. Bondy

You won’t be able to read each bit one by one – you have to read it byte by byte. You can easily extract the bits out, though:

# open in binary mode; the with block closes the file afterwards
with open("myfile", 'rb') as f:
    # read one byte
    byte = f.read(1)
# convert the byte to an integer representation
byte = ord(byte)
# now convert to a string of 1s and 0s, zero-padded to 8 characters
byte = bin(byte)[2:].rjust(8, '0')
# now byte contains a string with 0s and 1s
for bit in byte:
    print(bit)
Answered By: Daniel G

There are two possible ways to return the i-th bit of a byte. The “first bit” could refer to the high-order bit or it could refer to the low-order bit.

Here is a function that takes a string and an index as parameters and returns the value of the bit at that location. As written, it treats the low-order bit as the first bit. If you want the high-order bit first, just uncomment the indicated line.

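def bit_from_string(string, index):
    # i: which byte of the string, j: which bit inside that byte
    i, j = divmod(index, 8)

    # Uncomment this if you want the high-order bit first
    # (j runs from 0 to 7, so the mirrored position is 7 - j)
    # j = 7 - j

    if ord(string[i]) & (1 << j):
        return 1
    else:
        return 0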

The indexing starts at 0. If you want the indexing to start at 1, you can adjust index in the function before calling divmod.

Example usage:

>>> for i in range(8):
...     print(i, bit_from_string('\x04', i))
...
0 0
1 0
2 1
3 0
4 0
5 0
6 0
7 0

Now, for how it works:

A string is composed of 8-bit bytes, so first we use divmod() to break the index into two parts:

  • i: the index of the correct byte within the string
  • j: the index of the correct bit within that byte

We use the ord() function to convert the character at string[i] into an integer type. Then, (1 << j) computes the value of the j-th bit by left-shifting 1 by j. Finally, we use bitwise-and to test if that bit is set. If so return 1, otherwise return 0.
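
The same steps can be sketched for Python 3, where indexing a bytes object already yields an integer, so ord() is not needed. The function name bit_at and the msb_first parameter are hypothetical names chosen for this illustration:

```python
def bit_at(data, index, msb_first=False):
    """Return bit number `index` (zero-based) of the bytes object `data`."""
    i, j = divmod(index, 8)   # i: byte offset, j: bit offset within that byte
    if msb_first:
        j = 7 - j             # count from the high-order bit instead
    return (data[i] >> j) & 1

data = b'\x04\x80'
low_first = [bit_at(data, n) for n in range(16)]
high_first = [bit_at(data, n, msb_first=True) for n in range(16)]
print(low_first)
print(high_first)
```

Shifting the byte right and masking with 1 gives the same answer as testing against (1 << j), but returns 0 or 1 directly.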

Answered By: Daniel Stutzbach

Read the bits from a file, low bits first.

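def bits(f):
    # bytearray yields integers in both Python 2 and Python 3
    for b in bytearray(f.read()):
        for i in range(8):
            # shift bit i down to position 0 and mask it off
            yield (b >> i) & 1

# open in binary mode so the bytes are not decoded as text
with open('binary-file.bin', 'rb') as f:
    for b in bits(f):
        print(b)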
Answered By: user97370

This is pretty fast I would think:

import itertools
data = range(10)
to_bits = "{:0>8b}".format  # 8-character binary string, zero-padded on the left
newdata = (n != '0' for n in itertools.chain.from_iterable(map(to_bits, data)))
print(list(newdata))  # prints a long list of True and False values
Answered By: vitiral

With numpy this is easy:

import numpy

# filename is the path of the file to read
Bytes = numpy.fromfile(filename, dtype = "uint8")
Bits = numpy.unpackbits(Bytes)

More info here:
http://docs.scipy.org/doc/numpy/reference/generated/numpy.fromfile.html
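
As a small illustration of the bit order that unpackbits produces, here is a sketch that uses an in-memory buffer instead of a file. The sample bytes are made up for this example, and the bitorder argument assumes NumPy 1.17 or later:

```python
import numpy as np

# Two sample bytes standing in for file contents
raw = np.frombuffer(b'\x88\x01', dtype=np.uint8)

# Default order: high-order bit of each byte first
msb_first = np.unpackbits(raw)
# bitorder='little': low-order bit of each byte first
lsb_first = np.unpackbits(raw, bitorder='little')

print(msb_first.tolist())
print(lsb_first.tolist())
```

The default order matches what bin() would show for each byte; bitorder='little' matches the low-bit-first loops in the other answers.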

Answered By: Mikhail V

Suppose you have a file called bloom_filter.bin which contains an array of bits, and you want to read the entire file and use those bits in an array.

First, create the array where the bits will be stored after reading:

from bitarray import bitarray
a = bitarray(size)  # size: the number of bits in the file

Open the file. Either a plain open or a with block works; I am sticking with open here:

f = open('bloom_filter.bin', 'rb')

Now load all the bits into the array ‘a’ in one shot:

f.readinto(a)

‘a’ is now a bitarray containing all the bits.

Answered By: Tarun

To read a byte from a file: bytestring = open(filename, 'rb').read(1). Note: the file is opened in binary mode.

To get bits, convert the bytestring into an integer: byte = bytestring[0] (Python 3) or byte = ord(bytestring[0]) (Python 2) and extract the desired bit: (byte >> i) & 1:

>>> for i in range(8): (b'a'[0] >> i) & 1
... 
1
0
0
0
0
1
1
0
>>> bin(b'a'[0])
'0b1100001'
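
Putting the two steps together, a minimal Python 3 sketch might look like the following. The function name first_byte_bits and the temporary demo file are assumptions made for this example:

```python
import os
import tempfile

def first_byte_bits(path):
    """Return the 8 bits of the first byte of the file, low-order bit first."""
    with open(path, 'rb') as f:
        bytestring = f.read(1)
    if not bytestring:
        raise ValueError("file is empty")
    byte = bytestring[0]  # Python 3: indexing bytes gives an int
    return [(byte >> i) & 1 for i in range(8)]

# Demo with a throwaway file whose first byte is b'a' (0x61)
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'abc')
demo_bits = first_byte_bits(tmp.name)
os.remove(tmp.name)
print(demo_bits)
```

The result lists the bits in the same low-to-high order as the interactive session above.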
Answered By: jfs

Joining some of the previous answers I would use:

[int(i) for i in "{0:08b}".format(byte)]

Apply this to each byte read from the file. For example, the result for the byte 0x88 is:

>>> [int(i) for i in "{0:08b}".format(0x88)]
[1, 0, 0, 0, 1, 0, 0, 0]

You can assign it to a variable and work with it as in your initial request.
The "08" in "{0:08b}" zero-pads the result so that it always has the full 8-bit length.

Answered By: Francisco

I think this is a more pythonic way:

a = 140
binary = format(a, 'b')

After this block, binary holds the string:

'10001100'

I needed to extract the bit planes of an image, and this function helped me write the following block:

import numpy as np

def img2bitmap(img: np.ndarray) -> list:
    if img.dtype != np.uint8 or img.ndim > 2:
        raise ValueError("Image is not uint8 or gray")
    bit_mat = [np.zeros(img.shape, dtype=np.uint8) for _ in range(8)]
    for row_number in range(img.shape[0]):
        for column_number in range(img.shape[1]):
            binary = format(img[row_number][column_number], 'b')
            for idx, bit in enumerate(reversed(binary)):
                bit_mat[idx][row_number, column_number] = 2 ** idx if int(bit) == 1 else 0
    return bit_mat

With the following block, I was also able to rebuild the original image from the extracted bit planes:

import cv2

img = cv2.imread('test.jpg', cv2.IMREAD_GRAYSCALE)
out = img2bitmap(img)
original_image = np.zeros(img.shape, dtype=np.uint8)
for i in range(original_image.shape[0]):
    for j in range(original_image.shape[1]):
        for data in range(8):
            x = np.array([original_image[i, j]], dtype=np.uint8)
            data = np.array([data], dtype=np.uint8)
            flag = np.array([0 if out[data[0]][i, j] == 0 else 1], dtype=np.uint8)
            mask = flag << data[0]
            x[0] = (x[0] & ~mask) | ((flag[0] << data[0]) & mask)
            original_image[i, j] = x[0]
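
For comparison, the same bit planes can probably be obtained without Python-level loops by shifting and masking the whole array at once. This is only a sketch under the assumption that the image is a uint8 NumPy array; the name bit_planes and the tiny synthetic image (used here in place of cv2.imread) are made up for this example:

```python
import numpy as np

def bit_planes(img):
    """Return a list of 8 arrays; plane k holds 2**k where bit k is set, else 0."""
    img = np.asarray(img, dtype=np.uint8)
    # (img >> k) & 1 isolates bit k; shifting back restores its weight 2**k
    return [((img >> k) & 1) << k for k in range(8)]

# Small synthetic "image" so the example needs no file on disk
img = np.array([[140, 3], [255, 0]], dtype=np.uint8)
planes = bit_planes(img)

# OR-ing the planes together restores the original pixel values
restored = np.bitwise_or.reduce(np.stack(planes), axis=0)
print(np.array_equal(restored, img))
```

Each plane uses the same 2 ** idx weighting as img2bitmap, so the planes should be interchangeable with its output.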
Answered By: AmirMasoud