Python raw strings and trailing backslash

Question:

I ran across something once upon a time and wondered if it was a Python “bug” or at least a misfeature. I’m curious if anyone knows of any justifications for this behavior. I thought of it just now reading “Code Like a Pythonista,” which has been enjoyable so far. I’m only familiar with the 2.x line of Python.

Raw strings are strings that are prefixed with an r. This is great because I can use backslashes in regular expressions and I don’t need to double everything everywhere. It’s also handy for writing throwaway scripts on Windows, so I can use backslashes there also. (I know I can also use forward slashes, but throwaway scripts often contain content cut&pasted from elsewhere in Windows.)

So great! Unless, of course, you really want your string to end with a backslash. There’s no way to do that in a ‘raw’ string.

In [9]: r'\n'
Out[9]: '\\n'

In [10]: r'abc\n'
Out[10]: 'abc\\n'

In [11]: r'abc\'
------------------------------------------------
   File "<ipython console>", line 1
     r'abc\'
           ^
SyntaxError: EOL while scanning string literal


In [12]: r'abc\\'
Out[12]: 'abc\\\\'

So one backslash before the closing quote is an error, but two backslashes give you two backslashes! Certainly I’m not the only one who is bothered by this?

Thoughts on why ‘raw’ strings are ‘raw, except for backslash-quote’? I mean, if I wanted to embed a single quote in there I’d just use double quotes around the string, and vice versa. If I wanted both, I’d just triple quote. If I really wanted three quotes in a row in a raw string, well, I guess I’d have to deal, but is this considered “proper behavior”?

This is particularly problematic with folder names in Windows, where the backslash is the path delimiter.

Asked By: dash-tom-bang

Answers:

It’s a FAQ.

And in response to “you really want your string to end with a backslash. There’s no way to do that in a ‘raw’ string.”: the FAQ shows how to work around it.

>>> r'ab\c' '\\' == 'ab\\c\\'
True
>>>
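
The same trick in a script, as a minimal sketch. The folder path is made up for illustration; the only thing relied on is that adjacent string literals are concatenated.

```python
# Adjacent string literals are joined into one string, so a raw literal
# can be followed by a normal literal that holds the trailing backslash.
folder = r'C:\temp\logs' '\\'   # hypothetical path, for illustration only

# The result is identical to the fully escaped, non-raw spelling.
assert folder == 'C:\\temp\\logs\\'
print(folder)
```

The raw part stays readable, and only the final backslash needs doubling.
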
Answered By: John Machin

Raw strings are meant mostly for readably writing the patterns for regular expressions, which never need a trailing backslash; it’s an accident that they may come in handy for Windows (where you could use forward slashes in most cases anyway, since the Microsoft C library which underlies Python accepts either form!). It’s not considered acceptable to make it (nearly) impossible to write a regular expression pattern containing both single and double quotes, just to reinforce the accident in question.

(“Nearly” because triple-quoting would almost always help… but it could be a bit of a pain sometimes.)

So, yes, raw strings were designed to behave that way (forbidding odd numbers of trailing backslashes), and it is considered perfectly “proper behavior” for them to respect the design decisions Guido made when he invented them;-).
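
To illustrate the regex case, here is a small sketch of a raw pattern that contains both quote characters. The pattern and the sample text are made up for this example.

```python
import re

# In a raw literal, \' does not end the string. Both the backslash and the
# quote stay in the value, and the regex engine reads \' as a plain quote.
quote_pattern = re.compile(r'["\']')

# The raw literal r'\'' is two characters long: a backslash and a quote.
assert len(r'\'') == 2

# Find every single or double quote in some sample text.
found = quote_pattern.findall('say "hi" and \'bye\'')
print(found)
```
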

Answered By: Alex Martelli

“Thoughts on why ‘raw’ strings are ‘raw, except for backslash-quote’? I mean, if I wanted to embed a single quote in there I’d just use double quotes around the string, and vice versa.”

But that would then raise the question as to why raw strings are ‘raw, except for embedded quotes?’

You have to have some escape mechanism, otherwise you can never use the outer quote characters inside the string at all. And then you need an escape mechanism for the escape mechanism.
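
A rough way to see the rule in action is to hand a few literals to compile() and check which ones parse. The helper name is_valid_literal is invented for this sketch, and the source text is built with chr(92) (a backslash) so the example itself stays free of escaping puzzles.

```python
def is_valid_literal(source):
    """Return True if `source` compiles as a Python expression.

    Hypothetical helper, written only for this illustration.
    """
    try:
        compile(source, '<example>', 'eval')
    except SyntaxError:
        return False
    return True


bs = chr(92)  # a single backslash character

one_trailing = "r'abc" + bs + "'"        # source text: r'abc\'
two_trailing = "r'abc" + bs * 2 + "'"    # source text: r'abc\\'
three_trailing = "r'abc" + bs * 3 + "'"  # source text: r'abc\\\'

# An odd number of trailing backslashes leaves the last one escaping the
# closing quote, so the literal never terminates.
results = [is_valid_literal(s) for s in (one_trailing, two_trailing, three_trailing)]
print(results)

# When the literal does parse, both backslashes are kept in the value.
value = eval(two_trailing)
print(len(value))
```
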

Answered By: user207421

Another way to work around this is:

 >>> print(r"Raw with trailing backslash\ "[:-1])
 Raw with trailing backslash\

Updated for Python 3 and removed unnecessary slash at the end which implied an escape.

Personally, I doubt I would use the above, except perhaps for a huge string containing more than just a path. For a simple path I’d prefer a non-raw string with the backslashes doubled.
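
For comparison, a short sketch showing that the slicing trick and the doubled-backslash spelling produce the same string. The path is hypothetical. The ntpath.join line is an extra alternative not mentioned above: ntpath is the standard library's Windows flavour of os.path and can be imported on any platform.

```python
import ntpath  # Windows path rules, usable on any operating system

# Raw literal with a sacrificial trailing space that is sliced off.
sliced = r"C:\some\folder\ "[:-1]

# Non-raw literal with every backslash doubled.
doubled = "C:\\some\\folder\\"

# Joining with an empty final component appends the separator.
joined = ntpath.join(r"C:\some\folder", "")

assert sliced == doubled == joined
print(sliced)
```
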

Answered By: GravityWell