End-line characters from lines read from text file, using Python
Question:
When reading lines from a text file using Python, the end-line character often needs to be truncated before processing the text, as in the following example:
f = open("myFile.txt", "r")
for line in f:
    line = line[:-1]
    # do something with line
Is there an elegant way or idiom for retrieving text lines without the end-line character?
Answers:
What’s wrong with your code? I find it to be quite elegant and simple. The only problem is that if the file doesn’t end in a newline, the last line returned won’t have a '\n' as the last character, and therefore doing line = line[:-1] would incorrectly strip off the last character of the line.
The most elegant way to solve this problem would be to define a generator which takes the lines of the file and removes the last character from each line only if that character is a newline:
def strip_trailing_newlines(file):
    for line in file:
        if line[-1] == '\n':
            yield line[:-1]
        else:
            yield line

f = open("myFile.txt", "r")
for line in strip_trailing_newlines(f):
    pass  # do something with line
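To see the difference concretely, here is a small sketch that runs both approaches on an in-memory file whose last line has no newline. io.StringIO stands in for a real file, and the names naive_strip and guarded_strip are made up for this illustration.

```python
import io

def naive_strip(lines):
    # Always drops the last character, newline or not.
    return [line[:-1] for line in lines]

def guarded_strip(lines):
    # Drops the last character only when it is a newline.
    return [line[:-1] if line.endswith('\n') else line for line in lines]

text = "first\nsecond\nlast"   # note: no newline after "last"

naive = naive_strip(io.StringIO(text))
guarded = guarded_strip(io.StringIO(text))

print(naive)    # ['first', 'second', 'las']
print(guarded)  # ['first', 'second', 'last']
```

The naive version silently loses the final "t", which is why the check on the last character matters.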
Simple. Use splitlines()
L = open("myFile.txt", "r").read().splitlines()
for line in L:
    process(line)  # this 'line' will not have a '\n' character at the end
You may also consider using line.rstrip() to remove any whitespace at the end of your line.
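One caveat, shown here as a rough sketch: rstrip() with no argument removes all trailing whitespace, not only the line ending, so it behaves differently from rstrip('\n') on lines that end in spaces or tabs. The sample strings are invented for the demonstration.

```python
sample = "name\tvalue  \t\n"

stripped_all = sample.rstrip()          # removes trailing spaces, tabs and the newline
stripped_newline = sample.rstrip('\n')  # removes only trailing newline characters

print(repr(stripped_all))      # 'name\tvalue'
print(repr(stripped_newline))  # 'name\tvalue  \t'

# splitlines() also leaves other trailing whitespace alone.
lines = "a \nb\n".splitlines()
print(lines)                   # ['a ', 'b']
```

Which one is right depends on whether trailing spaces are meaningful in the data being read.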
The idiomatic way to do this in Python is to use rstrip('\n'):
for line in open('myfile.txt'):  # opened in text mode; all EOLs are converted to '\n'
    line = line.rstrip('\n')
    process(line)
Each of the other alternatives has a gotcha:
- open('…').read().splitlines() has to load the whole file into memory at once.
- line = line[:-1] will silently drop a real character if the last line has no EOL.
A long time ago, there was dear, clean, old BASIC code that could run on machines with 16 KB of core memory, like this:
if (not open(1,"file.txt")) error "Could not open 'file.txt' for reading"
while(not eof(1))
    line input #1 a$
    print a$
wend
close
Now, to read a file line by line, with far better hardware and software (Python), we must reinvent the wheel:
def line_input(file):
    for line in file:
        if line[-1] == '\n':
            yield line[:-1]
        else:
            yield line

f = open("myFile.txt", "r")
for line in line_input(f):
    pass  # do something with line
I am led to think that something has gone wrong somewhere…
What do you think about this approach?
with open(filename) as data:
    datalines = (line.rstrip('\r\n') for line in data)
    for line in datalines:
        ...  # do something awesome
The generator expression avoids loading the whole file into memory, and the with statement ensures the file is closed.
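If this pattern is needed in several places, it could be wrapped in a small generator function. This is only a sketch: the name read_lines is hypothetical, and it assumes the file is text that open() can decode with its default settings. Note that rstrip('\r\n') removes any run of trailing '\r' and '\n' characters, not just a single line ending.

```python
import os
import tempfile

def read_lines(path):
    """Yield each line of the file at `path` without its line ending."""
    with open(path) as data:
        for line in data:
            yield line.rstrip('\r\n')

# Write a throwaway file whose last line has no newline.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("alpha\nbeta\ngamma")
    tmp_path = tmp.name

result = list(read_lines(tmp_path))
os.remove(tmp_path)
print(result)  # ['alpha', 'beta', 'gamma']
```

Because the with block lives inside the generator, the file stays open only while lines are being consumed and is closed once the loop finishes.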