DeprecationWarning: invalid escape sequence – what to use instead of \d?

Question:

I’ve run into a problem with the re module in Python 3.6.5.
I have this pattern in my regular expression:

'\nRevision: (\d+)\n'

But when I run it, I’m getting a DeprecationWarning.

I searched for the problem on SO and haven’t found an answer. What should I use instead of \d+? Just [0-9]+, or maybe something else?

Asked By: mchfrnc


Answers:

Python 3 interprets string literals as Unicode strings and processes backslash escapes in them, so your \d is treated as an escape sequence, and an unrecognized one.

Declare your RegEx pattern as a raw string instead by prepending r, as below:

r'\nRevision: (\d+)\n'

This also means that \n is no longer converted by Python: it reaches re as a backslash followed by n, and re itself parses that as a newline character, so the pattern still matches the same text.
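To see what the raw string actually contains, here is a minimal sketch. The sample text and variable names are made up for illustration and are not from the question:

```python
import re

pattern = r'\nRevision: (\d+)\n'

# In a raw string, backslash-n stays as two separate characters;
# it is the re module that later interprets them as "match a newline".
assert pattern[0] == '\\' and pattern[1] == 'n'

# Hypothetical input, assumed to resemble the text being parsed.
sample = 'Path: trunk\nRevision: 1234\nNode Kind: directory\n'

match = re.search(pattern, sample)
revision = match.group(1)
print(revision)  # 1234
```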

Answered By: ACascarino

You get a DeprecationWarning for

'\nRevision: (\d+)\n'

because Python treats \d as an invalid escape sequence. As it stands, Python leaves that substring unchanged, but it has warned about it since version 3.6:

Unlike Standard C, all unrecognized escape sequences are left in the string unchanged, i.e., the backslash is left in the result. (This behavior is useful when debugging: if an escape sequence is mistyped, the resulting output is more easily recognized as broken.) It is also important to note that the escape sequences only recognized in string literals fall into the category of unrecognized escapes for bytes literals.

Changed in version 3.6: Unrecognized escape sequences produce a DeprecationWarning. In a future Python version they will be a SyntaxWarning and eventually a SyntaxError.

(source: Python language reference, “String and Bytes literals”)
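The behaviour described in the quote can be observed with a small sketch like the one below. It compiles the problematic literal from source text and records any warnings. The names source, caught and namespace are illustrative. On Python 3.6 to 3.11 the recorded category should be DeprecationWarning; on 3.12 and later it is SyntaxWarning. If a future version turns this into a SyntaxError, compile() would raise instead.

```python
import warnings

# Source text of a tiny module containing the problematic literal. It is kept
# in a raw string so that this example itself compiles without warnings.
source = r"pattern = '\nRevision: (\d+)\n'"

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure nothing is filtered out
    code = compile(source, "<example>", "exec")

for w in caught:
    print(w.category.__name__, w.message)

# The unrecognized escape is left in the string: the backslash before d survives.
namespace = {}
exec(code, namespace)
print(repr(namespace["pattern"]))  # '\nRevision: (\\d+)\n'
```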


Thus, you can fix this warning either by escaping that backslash properly or by using a raw string.

That means, escape more:

'\nRevision: (\\d+)\n'

Or, use a raw string literal (where \ doesn’t start an escape sequence):

r'\nRevision: (\d+)\n'
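
As a quick check that both spellings behave the same as regular expressions, here is a small sketch. The sample text is invented for illustration:

```python
import re

escaped = '\nRevision: (\\d+)\n'  # real newlines, doubled backslash before d
raw = r'\nRevision: (\d+)\n'      # backslashes passed through to re untouched

# The two literals are different strings...
print(escaped == raw)  # False

# ...but re treats both a real newline and backslash-n as "match a newline".
sample = 'Author: someone\nRevision: 42\nDate: unknown\n'  # hypothetical input
results = [re.search(p, sample).group(1) for p in (escaped, raw)]
print(results)  # ['42', '42']
```
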
Answered By: maxschlepzig