How to derive a string for the newline characters in a platform-independent way and use it in a regular expression pattern?
Question:
I have a question about how to represent the newline characters as a string in Python. I thought I could use the built-in function repr to achieve this, so I tried to verify the approach by running the following code:
import os
lineBreakAsStr = repr(os.linesep)
print(f'lineBreakAsStr = {lineBreakAsStr}') # line 4
print(lineBreakAsStr == '\r\n') # line 5
I expected line 5 to print True if the function repr can convert the value of os.linesep to a string successfully. But on my Windows 7 PC, line 4 prints lineBreakAsStr = '\r\n' and line 5 prints False.
Can anyone explain to me why?
And how should I get the string that stands for the newline characters from the value of os.linesep and put it in a regular expression pattern, instead of using a fixed string like '\r\n'?
Below is a code snippet that demonstrates what I want to do. (I would prefer the code in line 13 to the code in line 14, but the code in line 13 does not work. It has to be modified in some way to find the substring I am looking for.)
import os, re

def f(pattern, data):
    p = re.compile(pattern)
    m = p.search(data)
    if m is not None:
        print(m.group())
    else:
        print('Not match.')

dataSniffedInConsole = ('procd: - init -\\r\\nPlease press Enter '
                        'to activate this console.\\r\\n')
lineBreakAsStr = repr(os.linesep) # line 13
# lineBreakAsStr = '\\\\r\\\\n' # line 14
pattern = rf'Please press Enter to activate this console.{lineBreakAsStr}'
f(pattern, dataSniffedInConsole)
Answers:
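Using repr returns the text of a Python literal: it puts quotes around the string and writes the carriage return and line feed as the escape sequences \r and \n. The quotes are probably causing your issue.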
>>> newline = repr(os.linesep)
>>> print(newline)
'\r\n'
>>> newline == "'\\r\\n'"
True
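To see why the comparison in the question prints False, it can help to list the characters that repr actually returns. This is only a small sketch: '\r\n' is hard-coded as a stand-in for the value of os.linesep on Windows, and the variable names are made up for illustration.

```python
# Stand-in for os.linesep on Windows (hard-coded so the sketch behaves the
# same on every platform).
windows_linesep = '\r\n'

# repr() returns the text of a Python literal, quotes and backslashes included.
as_repr = repr(windows_linesep)

print(len(windows_linesep))        # two characters: CR and LF
print(len(as_repr))                # six characters: ' \ r \ n '
print(list(as_repr))               # each character of the repr text
print(as_repr == windows_linesep)  # the two strings are not equal
```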
A quick fix to your problem is to remove the quotes:
>>> newline = repr(os.linesep).strip("'")
>>> print(newline)
\r\n
>>> newline == "'\\r\\n'"
False
>>> newline == "\\r\\n"
True
True
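Note that the stripped text (backslash, r, backslash, n) is read by the re module as "carriage return, line feed" when placed in a pattern, so it would still not match the escaped text in the sniffed data from the question. One possible way around this, sketched below, is to pass the text through re.escape so the backslashes are matched literally. The helper name linesep_repr_pattern is hypothetical, and '\r\n' is passed explicitly rather than relying on os.linesep so that the sketch gives the same result on any platform.

```python
import os
import re


def linesep_repr_pattern(linesep=os.linesep):
    # Hypothetical helper: returns a regex fragment that matches the
    # *escaped text* of linesep (backslash, r, backslash, n for CR LF).
    # repr() adds surrounding quotes; slicing them off leaves the escape text.
    escaped_text = repr(linesep)[1:-1]
    # re.escape() doubles the backslashes so the regex treats them literally.
    return re.escape(escaped_text)


# Same data as in the question: it contains literal backslashes, not CR/LF.
data = ('procd: - init -\\r\\nPlease press Enter '
        'to activate this console.\\r\\n')

# The fixed text is escaped too, so its final '.' only matches a real dot.
pattern = (re.escape('Please press Enter to activate this console.')
           + linesep_repr_pattern('\r\n'))

match = re.search(pattern, data)
print(match.group() if match else 'Not match.')
```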
I recommend you find a way to read the raw data from the console rather than a representation of it. Using the raw data will be much easier to process.
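If the raw data can be captured, no repr step is needed at all. The sketch below assumes raw_data holds real carriage return and line feed characters (the sample text is reused from the question, and find_prompt is a hypothetical helper). It also assumes you know which line ending the console sends; that is not necessarily the same as os.linesep on the machine running the script, which is why it is passed explicitly here.

```python
import os
import re

# Assumption: raw_data holds real CR and LF characters, not their escaped text.
raw_data = ('procd: - init -\r\nPlease press Enter '
            'to activate this console.\r\n')


def find_prompt(data, linesep=os.linesep):
    # Hypothetical helper: escape both parts so every character is matched
    # literally, whatever line separator is passed in.
    pattern = (re.escape('Please press Enter to activate this console.')
               + re.escape(linesep))
    return re.search(pattern, data)


# '\r\n' is passed explicitly; os.linesep is '\n' on Linux and macOS.
match = find_prompt(raw_data, '\r\n')
print(repr(match.group()) if match else 'Not match.')
```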