Is there a formatted byte string literal in Python 3.6+?

Question:

I’m looking for a formatted byte string literal. Specifically, something equivalent to

name = "Hello"
bytes(f"Some format string {name}")

Possibly something like fb"Some format string {name}".

Does such a thing exist?

Asked By: Enrico Borba

Answers:

This was one of the bigger changes made from Python 2 to Python 3: the two versions handle Unicode and strings differently.

This is how you’d convert to bytes.

string = "some string format"
encoded = string.encode()  # returns a new bytes object (UTF-8 by default); string itself is unchanged
print(encoded)

This is how you’d decode back to a string.

encoded.decode()

I came to better appreciate how Unicode handling changed between Python 2 and 3 through this Coursera lecture by Charles Severance. You can watch the entire 17-minute video, or skip to around 10:30 for the differences in how Python 2 and 3 handle characters, and Unicode specifically.

I understand your actual question is how you could format a string that has both strings and bytes.

inBytes = b"testing"
inString = 'Hello'
type(inString) #This will yield <class 'str'>
type(inBytes) #this will yield <class 'bytes'>

Here you can see that I have a string variable and a bytes variable.

This is how you would combine bytes and a string into one string. The bytes value has to be decoded first, since bytes objects have no encode() method.

formattedString = inString + ' ' + inBytes.decode()
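As a minimal sketch of why the conversion is needed, the snippet below reuses the two values from this answer (renamed in_bytes and in_string, and assuming UTF-8 as the encoding) and shows that direct concatenation fails, while converting in either direction works:

```python
in_bytes = b"testing"
in_string = "Hello"

# Mixing str and bytes directly is an error in Python 3.
try:
    in_string + " " + in_bytes
    mixing_failed = False
except TypeError:
    mixing_failed = True

# Decode the bytes to get a str result...
as_text = in_string + " " + in_bytes.decode("utf-8")

# ...or encode the str to get a bytes result.
as_bytes = in_string.encode("utf-8") + b" " + in_bytes

print(mixing_failed, as_text, as_bytes)
```

Which direction to convert depends on what the result is for: text shown to a user is usually a str, while data written to a socket or a binary file is usually bytes.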
Answered By: Dom DaFonte

No. The idea is explicitly dismissed in PEP 498, which introduced f-strings:

For the same reason that we don’t support bytes.format(), you may
not combine 'f' with 'b' string literals. The primary problem
is that an object’s __format__() method may return Unicode data
that is not compatible with a bytes string.

Binary f-strings would first require a solution for
bytes.format(). This idea has been proposed in the past, most
recently in PEP 461. The discussions of such a feature usually
suggest either

  • adding a method such as __bformat__() so an object can control how it is converted to bytes, or

  • having bytes.format() not be as general purpose or extensible as str.format().

Both of these remain as options in the future, if such functionality
is desired.
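The restriction is easy to check for yourself. The sketch below uses a small helper, compiles() (a hypothetical name introduced only for this example), to test whether a given literal is accepted by the parser, and also checks whether bytes has a format() method:

```python
def compiles(source):
    """Return True if `source` parses as a Python expression."""
    try:
        # Compiling only parses the source; {name} is never evaluated.
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


plain_f = compiles('f"Some format string {name}"')     # ordinary f-string
raw_bytes = compiles('rb"Some format string"')          # 'r' and 'b' may be combined
fb_prefix = compiles('fb"Some format string {name}"')   # 'f' and 'b' may not
bf_prefix = compiles('bf"Some format string {name}"')   # in either order
has_bytes_format = hasattr(bytes, "format")

print(plain_f, raw_bytes, fb_prefix, bf_prefix, has_bytes_format)
```

On Python 3.6 and later this should report that the f and rb literals parse, that both fb and bf are rejected, and that bytes has no format() method.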

Answered By: jwodder

In Python 3.6.2 (the feature itself dates back to Python 3.5 and PEP 461), this percent formatting for bytes works for some use cases:

print(b"Some stuff %a. Some other stuff" % my_byte_or_unicode_string)

But as AXO commented:

This is not the same. %a (or %r) will give the representation of the string, not the string itself. For example b'%a' % b'bytes' will give b"b'bytes'", not b'bytes'.

This may or may not matter, depending on whether you just need to present the formatted byte_or_unicode_string in a UI or potentially need to manipulate it further.
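If you want the bytes themselves rather than their representation, the %b conversion (with %s as an alias) is probably the closer fit. Here is a short sketch comparing the conversions; name and payload are illustrative variables, and UTF-8 is an assumed encoding:

```python
name = "Hello"
payload = b"bytes"

# %b (alias %s) inserts bytes as-is; a str has to be encoded first.
inserted = b"Some format string %b" % name.encode("utf-8")

# %a inserts the ascii() representation, including the b'' wrapper for bytes.
represented = b"%a" % payload

# Numeric codes such as %d work as they do for str formatting.
numbered = b"Content-Length: %d" % 42

print(inserted, represented, numbered)
```

Note that passing a str straight to %b raises a TypeError, so the explicit encode() call is what decides how the text becomes bytes.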

Answered By: Bob Jordan

In 3.6+ you can do:

>>> a = 123
>>> f'{a}'.encode()
b'123'
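Because the f-string is an ordinary str until encode() is called, format specs work as usual. One thing to watch for is interpolating a value that is already bytes. A small sketch, with value and raw as made-up sample data and ASCII as an assumed encoding:

```python
value = 3.14159
raw = b"abc"

# Format specs behave as usual because the f-string is an ordinary str.
padded = f"{value:08.3f}".encode("ascii")

# Interpolating a bytes object directly embeds its repr, b'' wrapper included.
with_repr = f"data={raw}".encode("ascii")

# Decoding first gives the contents instead.
with_contents = f"data={raw.decode('ascii')}".encode("ascii")

print(padded, with_repr, with_contents)
```

So this approach is convenient when every interpolated value is text or a number, and needs an extra decode() step when some of them are already bytes.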
Answered By: johnson

You were actually super close in your suggestion; if you add an encoding kwarg to your bytes() call, then you get the desired behavior:

>>> name = "Hello"
>>> bytes(f"Some format string {name}", encoding="utf-8")
b'Some format string Hello'

Caveat: This works in 3.8 for me, but the note at the bottom of the Bytes Objects section in the docs seems to suggest that this should work with any method of string formatting in all of 3.x (using str.format() for versions <3.6, since f-strings were only added in 3.6, but the OP specifically asks about 3.6+).
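To make the role of the encoding argument concrete, here is a rough sketch that compares the call from the question with the variants mentioned in this answer (the variable names are only for illustration):

```python
name = "Hello"
text = f"Some format string {name}"

# Without an encoding, bytes() refuses a str argument.
try:
    bytes(text)
    needs_encoding = False
except TypeError:
    needs_encoding = True

# With an encoding, bytes() and str.encode() produce the same result.
via_bytes = bytes(text, encoding="utf-8")
via_encode = text.encode("utf-8")

# str.format() covers versions before 3.6, which lack f-strings.
via_format = bytes("Some format string {}".format(name), encoding="utf-8")

print(needs_encoding, via_bytes == via_encode == via_format)
```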

Answered By: dayofthepenguin