What does a backslash mean in a string literal?

Question:

I made a list of strings like this:

>>> example = ["\0","\1","\2","\3","\4","\5","\6","\7","\8","\9","\10","\11"]

But if I try to print it, I get a very different looking result:

>>> print(example)
['\x00', '\x01', '\x02', '\x03', '\x04', '\x05', '\x06', '\x07', '\\8', '\\9', '\x08', '\t']

Why does this happen? Does the \ character have some special meaning here?

Asked By: tehoo

Answers:

The backslash is used to escape special (unprintable) characters in string literals. \n stands for a newline, \t for a tab, \f for a form feed (rarely used), and several more exist.

When you write the string literal "\0", you denote a string with exactly one character, the (unprintable) NUL character (a 0-byte). You can represent this as \0 in string literals. The same goes for \1 (which is a 1-byte in a string), and so on.

The \8 and \9 are different, because after a backslash you have to give the value of the byte you want in octal notation, i.e. using only the digits 0 to 7. So the backslash before the 8 and before the 9 has no special meaning, and \8 results in two characters: the backslash verbatim and the digit 8 verbatim.
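As a minimal sketch of the octal rule (the variable names are made up for illustration), the snippet below checks how many characters each literal produces. The "\8" literal is evaluated through eval with warnings silenced, on the assumption that your Python version warns about unrecognized escape sequences, as recent versions do.

```python
import warnings

nul = "\0"    # octal escape: one character with value 0
seven = "\7"  # octal escape: one character with value 7

print(len(nul), ord(nul))      # 1 0
print(len(seven), ord(seven))  # 1 7

# 8 is not an octal digit, so "\8" is not an escape sequence.
# Evaluate the literal with warnings silenced to keep the output clean.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    eight = eval(r'"\8"')

print(len(eight))   # 2
print(list(eight))  # ['\\', '8']
```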

When you now print the representation of such a string (e.g. by having it in a list that you print), the Python interpreter recreates a representation of the internal string that is supposed to look like a string literal. This is not the string's contents, but the version of the string as you could write it in a Python program, i.e. enclosed in quotes and using backslashes to escape special characters. The Python interpreter doesn't represent special characters using octal notation, though. It uses hexadecimal notation instead, which introduces each special character with \x followed by exactly two hexadecimal digits.

That means that \0 becomes \x00, \1 becomes \x01, and so on. The \8, as mentioned, is in fact two characters, namely the backslash and the digit 8. The Python interpreter then escapes the backslash as a double backslash \\, and the 8 is appended as a normal character.
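The difference between a string's contents and its representation can be seen with repr, which is what printing a list applies to each element. This is only an illustrative sketch; the variable names are hypothetical, and the backslash-plus-8 string is written with an explicit double backslash.

```python
nul = "\0"
one = "\1"
backslash_eight = "\\8"  # explicit double backslash: a backslash and the digit 8

print(repr(nul))              # '\x00'
print(repr(one))              # '\x01'
print(repr(backslash_eight))  # '\\8'  (the representation, as shown inside a list)
print(backslash_eight)        # \8     (the actual contents)
print([backslash_eight])      # ['\\8']
```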

The input \10 is the character with value 8 (because octal 10 is decimal 8 and also hexadecimal 8; look up octal and hexadecimal numbers to learn about that). So the input \10 becomes \x08. The \11 is the character with value 9, which is a tab character, for which a special notation exists: \t.
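The octal arithmetic for these last two entries can be checked directly. A short sketch, with variable names chosen here for illustration:

```python
backspace = "\10"  # octal 10
tab = "\11"        # octal 11

print(int("10", 8), int("11", 8))  # 8 9
print(ord(backspace), ord(tab))    # 8 9
print(backspace == "\x08")         # True
print(tab == "\t")                 # True
print(repr(backspace), repr(tab))  # '\x08' '\t'
```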

Answered By: Alfe