How to convert string to binary?

Question:

I am in need of a way to get the binary representation of a string in python. e.g.

st = "hello world"
toBinary(st)

Is there a module of some neat way of doing this?

Asked By: user1090614

||

Answers:

You can access the code values for the characters in your string using the ord() built-in function. If you then need to format this in binary, the string.format() method will do the job.

a = "test"
print(' '.join(format(ord(x), 'b') for x in a))

(Thanks to Ashwini Chaudhary for posting that code snippet.)

While the above code works in Python 3, this matter gets more complicated if you’re assuming any encoding other than UTF-8. In Python 2, strings are byte sequences, and ASCII encoding is assumed by default. In Python 3, strings are assumed to be Unicode, and there’s a separate bytes type that acts more like a Python 2 string. If you wish to assume any encoding other than UTF-8, you’ll need to specify the encoding.

In Python 3, then, you can do something like this:

a = "test"
a_bytes = bytes(a, "ascii")
print(' '.join(["{0:b}".format(x) for x in a_bytes]))

The differences between UTF-8 and ascii encoding won’t be obvious for simple alphanumeric strings, but will become important if you’re processing text that includes characters not in the ascii character set.

Answered By: Mark R. Wilkins

Something like this?

>>> st = "hello world"
>>> ' '.join(format(ord(x), 'b') for x in st)
'1101000 1100101 1101100 1101100 1101111 100000 1110111 1101111 1110010 1101100 1100100'

#using `bytearray`
>>> ' '.join(format(x, 'b') for x in bytearray(st, 'utf-8'))
'1101000 1100101 1101100 1101100 1101111 100000 1110111 1101111 1110010 1101100 1100100'
Answered By: Ashwini Chaudhary

If by binary you mean bytes type, you can just use encode method of the string object that encodes your string as a bytes object using the passed encoding type. You just need to make sure you pass a proper encoding to encode function.

In [9]: "hello world".encode('ascii')                                                                                                                                                                       
Out[9]: b'hello world'

In [10]: byte_obj = "hello world".encode('ascii')                                                                                                                                                           

In [11]: byte_obj                                                                                                                                                                                           
Out[11]: b'hello world'

In [12]: byte_obj[0]                                                                                                                                                                                        
Out[12]: 104

Otherwise, if you want them in form of zeros and ones –binary representation– as a more pythonic way you can first convert your string to byte array then use bin function within map :

>>> st = "hello world"
>>> map(bin,bytearray(st))
['0b1101000', '0b1100101', '0b1101100', '0b1101100', '0b1101111', '0b100000', '0b1110111', '0b1101111', '0b1110010', '0b1101100', '0b1100100']
 

Or you can join it:

>>> ' '.join(map(bin,bytearray(st)))
'0b1101000 0b1100101 0b1101100 0b1101100 0b1101111 0b100000 0b1110111 0b1101111 0b1110010 0b1101100 0b1100100'

Note that in python3 you need to specify an encoding for bytearray function :

>>> ' '.join(map(bin,bytearray(st,'utf8')))
'0b1101000 0b1100101 0b1101100 0b1101100 0b1101111 0b100000 0b1110111 0b1101111 0b1110010 0b1101100 0b1100100'

You can also use binascii module in python 2:

>>> import binascii
>>> bin(int(binascii.hexlify(st),16))
'0b110100001100101011011000110110001101111001000000111011101101111011100100110110001100100'

hexlify return the hexadecimal representation of the binary data then you can convert to int by specifying 16 as its base then convert it to binary with bin.

Answered By: Mazdak

This is an update for the existing answers which used bytearray() and can not work that way anymore:

>>> st = "hello world"
>>> map(bin, bytearray(st))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: string argument without an encoding

Because, as explained in the link above, if the source is a string, you must also give the encoding:

>>> map(bin, bytearray(st, encoding='utf-8'))
<map object at 0x7f14dfb1ff28>
Answered By: Billal Begueradj
def method_a(sample_string):
    binary = ' '.join(format(ord(x), 'b') for x in sample_string)

def method_b(sample_string):
    binary = ' '.join(map(bin,bytearray(sample_string,encoding='utf-8')))


if __name__ == '__main__':

    from timeit import timeit

    sample_string = 'Convert this ascii strong to binary.'

    print(
        timeit(f'method_a("{sample_string}")',setup='from __main__ import method_a'),
        timeit(f'method_b("{sample_string}")',setup='from __main__ import method_b')
    )

# 9.564299999998184 2.943955828988692

method_b is substantially more efficient at converting to a byte array because it makes low level function calls instead of manually transforming every character to an integer, and then converting that integer into its binary value.

Answered By: Ben

We just need to encode it.

'string'.encode('ascii')
Answered By: Tao

In Python version 3.6 and above you can use f-string to format result.

str = "hello world"
print(" ".join(f"{ord(i):08b}" for i in str))

01101000 01100101 01101100 01101100 01101111 00100000 01110111 01101111 01110010 01101100 01100100
  • The left side of the colon, ord(i), is the actual object whose value
    will be formatted and inserted into the output. Using ord() gives you
    the base-10 code point for a single str character.

  • The right hand side of the colon is the format specifier. 08 means
    width 8, 0 padded, and the b functions as a sign to output the
    resulting number in base 2 (binary).

Answered By: Vlad Bezden
a = list(input("Enter a stringt: "))
def fun(a):
    c =' '.join(['0'*(8-len(bin(ord(i))[2:]))+(bin(ord(i))[2:]) for i in a])
    return c
print(fun(a))
Answered By: Solo Ship
''.join(format(i, 'b') for i in bytearray(str, encoding='utf-8'))

This works okay since its easy to now revert back to the string as no
zeros will be added to reach the 8 bits to form a byte hence easy to
revert to string to avoid complexity of removing the zeros added.

Answered By: Francis kinyuru
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.