How to derive a string for the newline characters in a platform-independent way and use it in a regular expression pattern?

Question:

I have a question about how to represent the newline characters as a string in Python. I thought I could use the built-in function repr to achieve this, so I tried to verify this approach by running the following code:

import os

lineBreakAsStr = repr(os.linesep)
print(f'lineBreakAsStr = {lineBreakAsStr}') # line 4
print(lineBreakAsStr == '\r\n')           # line 5

I expected line 5 to print "True" if repr had converted the value of os.linesep to a string successfully. But on my Windows 7 PC, the output of line 4 is "lineBreakAsStr = '\r\n'" and the output of line 5 is "False".

Can anyone explain to me why?
And how should I get the string that stands for the newline characters from the value of os.linesep and put it in a regular expression pattern, instead of using a fixed string like '\r\n'?

Below is a code snippet that demonstrates what I want to do. (I would prefer the code on line 13 to the code on line 14, but line 13 does not work as written; it has to be modified somehow to find the substring I am looking for.)

import os, re

def f(pattern, data):
  p = re.compile(pattern)
  m = p.search(data)
  if m is not None:
    print(m.group())
  else:
    print('Not match.')

dataSniffedInConsole = ('procd: - init -\\r\\nPlease press Enter '
                        'to activate this console.\\r\\n')
lineBreakAsStr = repr(os.linesep)   # line 13
# lineBreakAsStr = '\\\\r\\\\n' # line 14

pattern = rf'Please press Enter to activate this console.{lineBreakAsStr}'

f(pattern, dataSniffedInConsole)
Asked By: thomas_chang


Answers:

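repr returns the printable representation of the string, not the string itself: it puts quotes around the value and writes the CR and LF characters as the escape sequences \r and \n. The quotes and the escaped backslashes are why your comparison fails. (The output below is from Windows, where os.linesep is '\r\n'.)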

>>> import os
>>> newline = repr(os.linesep)
>>> print(newline)
'\r\n'
>>> newline == "'\\r\\n'"
True
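The mismatch is easier to see by comparing lengths. This is only a small sketch; it uses a hard-coded '\r\n' (the name linesep is mine) as a stand-in for the value of os.linesep on Windows, so that the result is the same on every platform:

```python
# Stand-in for os.linesep on Windows (assumption; on Linux/macOS it is '\n').
linesep = '\r\n'

as_repr = repr(linesep)

# The original value is two characters: CR and LF.
print(len(linesep))        # 2
# repr() yields six characters: quote, backslash, r, backslash, n, quote.
print(len(as_repr))        # 6
print(list(as_repr))
# The comparison from the question therefore fails.
print(as_repr == '\r\n')   # False
```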

A quick fix to your problem is to remove the quotes:

>>> newline = repr(os.linesep).strip("'")
>>> print(newline)
\r\n
>>> newline == "'\\r\\n'"
False
>>> newline == "\\r\\n"
True
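Note that removing the quotes is probably not enough for the pattern in the question. The data there contains the literal text backslash-r-backslash-n, while a regular expression reads \r and \n as real CR and LF characters. One way around this, sketched below, is to pass the printable form through re.escape. The variable names are mine, and '\r\n' is hard-coded as a stand-in for os.linesep on Windows:

```python
import re

# Stand-in for os.linesep on Windows (assumption, keeps the result platform-independent).
linesep = '\r\n'

# Data as in the question: literal backslash-r-backslash-n text, not real CR/LF.
data = ('procd: - init -\\r\\nPlease press Enter '
        'to activate this console.\\r\\n')

# Printable form without the surrounding quotes: the four characters \ r \ n.
printable = repr(linesep)[1:-1]

prefix = r'Please press Enter to activate this console\.'

# Used directly, the regex engine reads \r and \n as real CR and LF: no match.
unescaped = re.compile(prefix + printable)
print(unescaped.search(data))   # None

# re.escape() makes the backslashes literal, so the escaped text is matched.
escaped = re.compile(prefix + re.escape(printable))
print(escaped.search(data).group())
```

Slicing with [1:-1] removes exactly one quote from each end, whereas strip("'") removes every leading and trailing quote character; for a line separator the two give the same result.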

I recommend you find a way to read the raw data from the console rather than a representation of it. The raw data will be much easier to process.
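With raw data, repr is not needed at all. A minimal sketch, assuming a hypothetical capture (raw_data) that contains real CR/LF characters, and again using a hard-coded '\r\n' in place of os.linesep so the example behaves the same everywhere:

```python
import re

# Hypothetical raw capture containing real CR/LF characters (assumption).
raw_data = 'procd: - init -\r\nPlease press Enter to activate this console.\r\n'

# Stand-in for os.linesep on Windows; with real data one could pass os.linesep itself.
linesep = '\r\n'

prefix = r'Please press Enter to activate this console\.'

# re.escape() is applied to the separator itself, not to its repr().
pattern = re.compile(prefix + re.escape(linesep))
match = pattern.search(raw_data)
print(repr(match.group()))

# Alternative that does not depend on the host platform: accept CRLF or LF.
portable = re.compile(prefix + r'\r?\n')
print(portable.search(raw_data) is not None)   # True
```

The second pattern may be the safer choice here, since the line ending sent by the device is not necessarily the one os.linesep reports for the machine running the script.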

Answered By: BirdLogic