How to fix "<string> DeprecationWarning: invalid escape sequence" in Python?
Question:
I’m getting lots of warnings like this in Python:
DeprecationWarning: invalid escape sequence \A
orcid_regex = '\A[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]\Z'
DeprecationWarning: invalid escape sequence \/
AUTH_TOKEN_PATH_PATTERN = '^\/api\/groups'
DeprecationWarning: invalid escape sequence \
"""
DeprecationWarning: invalid escape sequence \.
DOI_PATTERN = re.compile('(https?://(dx\.)?doi\.org/)?10\.[0-9]{4,}[\.0-9]*/.*')
<unknown>:20: DeprecationWarning: invalid escape sequence \(
<unknown>:21: DeprecationWarning: invalid escape sequence \(
What do they mean? And how can I fix them?
Answers:
\ is the escape character in Python string literals.
For example if you want to put a tab character in a string you would do:
>>> print("foo \t bar")
foo      bar
If you want to put a literal \ in a string you have to use \\:
>>> print("foo \\ bar")
foo \ bar
Or use a “raw string”:
>>> print(r"foo \ bar")
foo \ bar
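To see that the escaped and raw spellings produce the same string, you can compare them directly. This is only a small sketch; the variable names are illustrative and not from the original answer.

```python
# Three string literals that involve a backslash.
tab = "foo \t bar"        # \t is a valid escape: one tab character
escaped = "foo \\ bar"    # \\ is a valid escape: one literal backslash
raw = r"foo \ bar"        # raw string: the backslash is kept as typed

print(repr(tab))          # repr makes the tab visible as \t
print(escaped == raw)     # True: both contain exactly one backslash
print(len(raw))           # 9
```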
You can’t just go putting backslashes in string literals whenever you want one. A backslash isn’t valid when not followed by one of the valid escape sequences, and newer versions of Python (3.6 and later) print a deprecation warning. For example, \A isn’t an escape sequence:
$ python3.6 -Wd -c '"\A"'
<string>:1: DeprecationWarning: invalid escape sequence \A
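If you would rather trigger the warning from inside Python than from the command line, something like the following should work. It is a sketch that assumes the standard `warnings` module and the built-in `compile()`; the variable names are made up for illustration. Note that the warning category depends on the interpreter: DeprecationWarning on older versions, SyntaxWarning from Python 3.12 on, and the exact message wording also varies between versions.

```python
import warnings

# Source code for a module whose only statement is the string literal "\A".
# The r prefix keeps the backslash intact inside this *source text*.
source = r'"\A"'

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")   # do not let default filters hide it
    compile(source, "<string>", "exec")

# Show whatever the interpreter reported while compiling the literal.
for w in caught:
    print(w.category.__name__, w.message)
```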
If your backslash sequence does accidentally match one of Python’s escape sequences, but you didn’t mean it to, that’s even worse: Python silently substitutes the escaped character and prints no warning at all.
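A common case of such an accidental match is \b, which is a backspace character in a normal string literal but a word boundary in a regular expression. The snippet below is an illustrative sketch; the sample text and names are not from the original answer.

```python
import re

text = "a foo b"

# In a normal string literal "\b" is the backspace character (U+0008),
# so this pattern looks for backspace, "foo", backspace.
accidental = re.search("\bfoo\b", text)

# In a raw string the backslash reaches the regex engine, where \b
# means "word boundary".
intended = re.search(r"\bfoo\b", text)

print(accidental)          # None
print(intended.group())    # foo
```

No warning is printed for the first pattern, because "\b" is a perfectly valid escape sequence. It just does not mean what the regex author wanted.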
So you should always use raw strings or \\.
It’s important to remember that a string literal is still a string literal even if that string is intended to be used as a regular expression. Python’s regular expression syntax supports lots of special sequences that begin with \. For example, \A matches the start of a string. But \A is not valid in a Python string literal! This is invalid:
my_regex = "\Afoo"
Instead you should do this:
my_regex = r"\Afoo"
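Applied to the ORCID pattern from the question, the fix could look like this. It is a sketch: the sample identifier is just a test input chosen for illustration, and `same_pattern` is a hypothetical name used to show the doubled-backslash alternative.

```python
import re

# The ORCID pattern from the question, written as a raw string so that
# \A (start of string) and \Z (end of string) reach the regex engine.
orcid_regex = re.compile(r"\A[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]\Z")

# Doubling the backslashes produces exactly the same pattern text.
same_pattern = "\\A[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]\\Z"

print(orcid_regex.pattern == same_pattern)                      # True
print(bool(orcid_regex.match("0000-0002-1825-0097")))           # True
print(bool(orcid_regex.match("id: 0000-0002-1825-0097")))       # False
```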
Docstrings are another one to remember: docstrings are string literals too, and invalid \ sequences are invalid in docstrings too! Use raw strings (r"""...""") for docstrings if they contain \’s.
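For instance, a docstring that mentions a regex needs the r prefix to keep its backslashes. The function below is a hypothetical example written for illustration, not code from the question.

```python
import re


def count_digits(text):
    r"""Return how many characters in ``text`` match the regex \d.

    The r prefix keeps the backslash in \d exactly as typed.
    """
    return len(re.findall(r"\d", text))


print(count_digits.__doc__)
print(count_digits("a1b22"))   # 3
```

Without the r prefix, the \d in the docstring would trigger the same invalid escape sequence warning as any other string literal.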