What does preceding a string literal with "r" mean?
Question:
I first saw it used in building regular expressions across multiple lines as a method argument to re.compile(), so I assumed that r stands for RegEx.
For example:
import re

regex = re.compile(
    r'^[A-Z]'
    r'[A-Z0-9-]'
    r'[A-Z]$', re.IGNORECASE
)
So what does r mean in this case? Why do we need it?
Answers:
It means that escapes won’t be translated. For example, r'\n' is a string with a backslash followed by the letter n. (Without the r it would be a newline.)
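A quick way to see the difference is to compare the lengths of the two literals. This is only a small sketch; the variable names are made up for illustration:

```python
# Ordinary literal: the escape \n is translated into one newline character.
normal = '\n'
# Raw literal: the backslash is kept, so the string holds two characters.
raw = r'\n'

print(len(normal))    # 1
print(len(raw))       # 2
print(raw == '\\n')   # True: same as escaping the backslash by hand
```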
The b prefix does stand for byte-string and is used in Python 3, where strings are Unicode by default. In Python 2.x strings were byte-strings by default and you’d use u to indicate Unicode.
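As a rough illustration of those prefixes, assuming Python 3.3 or later (where the u prefix is accepted again for compatibility), the names below are hypothetical:

```python
text = 'abc'     # str: Unicode text, the default in Python 3
data = b'abc'    # bytes: a byte-string
legacy = u'abc'  # the u prefix is accepted but redundant in Python 3.3+

print(type(text).__name__)           # str
print(type(data).__name__)           # bytes
print(legacy == text)                # True
print(data.decode('ascii') == text)  # True
```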
The r means that the string is to be treated as a raw string, which means backslash escape sequences are not processed.
For example, '\n' will be treated as a newline character, while r'\n' will be treated as the character \ followed by n.
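This matters for regular expressions because the regex syntax uses backslashes itself. A minimal sketch, with a made-up sample string, of how \b behaves with and without the r prefix:

```python
import re

text = 'a word here'

# Without r, Python turns \b into a backspace character before re sees it,
# so the pattern looks for literal backspaces and finds nothing.
plain = re.search('\bword\b', text)

# With r, the backslashes reach re intact and \b means "word boundary".
raw = re.search(r'\bword\b', text)

print(plain)        # None
print(raw.group())  # word
```

The pattern in the question contains no backslashes, so the r prefix changes nothing there; it is a common habit to use it on every regex so that adding a backslash later does not cause surprises.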
When an 'r' or 'R' prefix is present, a character following a backslash is included in the string without change, and all backslashes are left in the string. For example, the string literal r"\n" consists of two characters: a backslash and a lowercase 'n'. String quotes can be escaped with a backslash, but the backslash remains in the string; for example, r"\"" is a valid string literal consisting of two characters: a backslash and a double quote; r"\" is not a valid string literal (even a raw string cannot end in an odd number of backslashes). Specifically, a raw string cannot end in a single backslash (since the backslash would escape the following quote character). Note also that a single backslash followed by a newline is interpreted as those two characters as part of the string, not as a line continuation.
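The quoting rules in that passage can be checked directly. The following is a sketch only; it uses the built-in compile() to test the invalid literal without breaking the script itself, and the names are illustrative:

```python
# A backslash can "escape" the quote, but it stays in the string.
quoted = r"\""
print(len(quoted))      # 2: a backslash and a double quote
print(quoted == '\\"')  # True

# r"\" is not a valid literal; compiling that source raises SyntaxError.
try:
    compile('r"\\"', '<example>', 'eval')
    single_backslash_ok = True
except SyntaxError:
    single_backslash_ok = False
print(single_backslash_ok)  # False

# One workaround for a trailing backslash: append an ordinary literal.
path = r'C:\temp' '\\'
print(path.endswith('\\'))  # True
```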
Source: Python string literals