Why does the following string work without an additional escape?

Question:

In the following:

>>> r'\d+','\d+', '\\d+'
('\\d+', '\\d+', '\\d+')

Why does the backslash in '\d+' not need to be escaped? Why does this give the same result as the other two literals?

Similarly:

>>> r'[a-z]+\1', '[a-z]+\1'
('[a-z]+\\1', '[a-z]+\x01')

Why does the \1 get converted into a hex escape?

Asked By: David542


Answers:

Because \d is not an escape code. So, however you type it, it is interpreted as a literal \ followed by a d.
If you type \\d, then the \\ is interpreted as an escaped \, followed by a d.

The situation is different if you choose a letter that is part of an escape code.

>>> r'\n+','\n+', '\\n+'
('\\n+', '\n+', '\\n+')

The first one (because it is raw) and the last one (because the \ is escaped) are 3-character strings containing a \, an n and a +.
The second one is a 2-character string containing a '\n' (a newline) and a +.
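
The lengths described above can be checked directly. This is a small sketch rather than anything from the original answer; the variable names are only illustrative. Recent Python versions emit a warning for unrecognized escapes such as \d (a DeprecationWarning since 3.6, a SyntaxWarning since 3.12), so that literal is evaluated from source text with warnings silenced.

```python
import warnings

# Recognized escape: all three spellings compile without warnings.
raw = r'\n+'        # backslash, 'n', '+'
escaped = '\\n+'    # escaped backslash, 'n', '+'
newline = '\n+'     # newline character, '+'

print(len(raw), len(escaped), len(newline))   # 3 3 2
print(raw == escaped)                         # True
print(newline == chr(10) + '+')               # True

# '\d' is not a recognized escape, so the backslash is kept as-is.
# The literal is compiled via eval() so the warning can be suppressed.
with warnings.catch_warnings():
    warnings.simplefilter('ignore')
    unrecognized = eval(r"'\d+'")

print(unrecognized == r'\d+' == '\\d+')       # True
print(len(unrecognized))                      # 3
```

Since the unescaped form relies on \d staying unrecognized (and now triggers a warning), r'\d+' or '\\d+' is the safer spelling for regex patterns.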

The second example is even more straightforward. Nothing strange here. r'\1' is a backslash followed by a one. '\1' is the character whose ASCII code is 1, whose canonical representation is '\x01'.
'\1', '\x01' and '\001' are the same thing. Python cannot remember which specific syntax you used to type it. All it knows is that it is the character with code 1. So it displays it in the "canonical" way.

In exactly the same way, 'A', '\x41' and '\101' are the same thing, and would all be printed with the canonical representation, which is 'A'.
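
A quick way to see this for yourself (a minimal sketch; the names `one` and `a` are just for illustration) is to compare the different spellings and look at what repr() returns:

```python
# Three spellings of the character with code 1: octal, hex, padded octal.
one = '\1'
print(one == '\x01' == '\001' == chr(1))   # True
print(repr(one))                           # '\x01'

# The raw version is two characters: a backslash and the digit 1.
print(repr(r'\1'))                         # '\\1'
print(len(r'\1'))                          # 2

# Same idea with a printable character: hex 41 and octal 101 are both 'A'.
a = '\x41'
print(a == '\101' == 'A')                  # True
print(repr(a))                             # 'A'
```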

Answered By: chrslg

The "String and Bytes literals" section of the Python documentation has tables showing which backslash combinations are actually escape sequences with a special meaning. Combinations outside of these tables are not escapes, are not affected by the raw string rules, and are treated as regular characters. "\d" is two characters, as is r"\d". You'll find, for instance, that "\n" (a single newline character) works differently from "\d".

\1 is an \ooo octal escape. When printed, Python shows the same character value as a hex escape. Interestingly, \8 isn't octal, but instead of raising an error, Python just treats it as two characters (because it's not an escape).
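
The octal behaviour can be sketched as follows. The name `not_octal` is hypothetical, and the \8 literal is compiled through eval() with warnings silenced, on the assumption that you are running a Python version that warns about unrecognized escapes.

```python
import warnings

# An octal escape consumes up to three octal digits.
print(len('\1'), ord('\1'))       # 1 1
print('\1011' == 'A' + '1')       # True: '\101' is 'A', the final 1 is literal
print('\18' == chr(1) + '8')      # True: 8 is not an octal digit, so it ends the escape

# '\8' is not an escape at all, so both characters are kept.
with warnings.catch_warnings():
    warnings.simplefilter('ignore')
    not_octal = eval(r"'\8'")

print(len(not_octal))             # 2
print(not_octal == '\\8')         # True
```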

Answered By: tdelaney