Writelines writes lines without newline, Just fills the file
Question:
I have a program that writes a list to a file.
The list is a list of pipe delimited lines and the lines should be written to the file like this:
123|GSV|Weather_Mean|hello|joe|43.45
122|GEV|temp_Mean|hello|joe|23.45
124|GSI|Weather_Mean|hello|Mike|47.45
BUT it wrote them like this ahhhh:
123|GSV|Weather_Mean|hello|joe|43.45122|GEV|temp_Mean|hello|joe|23.45124|GSI|Weather_Mean|hello|Mike|47.45
This program wrote all the lines into one line without any line breaks. This hurts me a lot and I gotta figure out how to reverse this, but anyway, where is my program wrong here? I thought writelines should write lines down the file rather than just write everything to one line.
    fr = open(sys.argv[1], 'r') # source file
    fw = open(sys.argv[2]+"/masked_"+sys.argv[1], 'w') # Target Directory Location
    for line in fr:
        line = line.strip()
        if line == "":
            continue
        columns = line.strip().split('|')
        if columns[0].find("@") > 1:
            looking_for = columns[0] # this is what we need to search
        else:
            looking_for = "[email protected]"
        if looking_for in d:
            # by default, iterating over a dictionary will return keys
            new_line = d[looking_for]+'|'+'|'.join(columns[1:])
            line_list.append(new_line)
        else:
            new_idx = str(len(d)+1)
            d[looking_for] = new_idx
            kv = open(sys.argv[3], 'a')
            kv.write(looking_for+" "+new_idx+'\n')
            kv.close()
            new_line = d[looking_for]+'|'+'|'.join(columns[1:])
            line_list.append(new_line)
    fw.writelines(line_list)
Answers:
The documentation for writelines() states:
    writelines() does not add line separators
So you’ll need to add them yourself. For example:
    line_list.append(new_line + "\n")
whenever you append a new item to line_list.
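To see the behaviour in isolation, here is a minimal sketch that uses an in-memory io.StringIO in place of a real file. The helper names write_without_newlines and write_with_newlines are made up for this illustration.

```python
import io

def write_without_newlines(lines):
    # writelines() writes each string exactly as given, with no separator
    buf = io.StringIO()
    buf.writelines(lines)
    return buf.getvalue()

def write_with_newlines(lines):
    # Append the newline yourself before handing the strings to writelines()
    buf = io.StringIO()
    buf.writelines(line + "\n" for line in lines)
    return buf.getvalue()

rows = ["123|GSV|Weather_Mean", "122|GEV|temp_Mean"]
print(repr(write_without_newlines(rows)))
print(repr(write_with_newlines(rows)))
```

The first call runs both rows together on one line, which is the symptom in the question; the second ends every row with a line break.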
This is actually a pretty common problem for newcomers to Python, especially since, across the standard library and popular third-party libraries, some reading functions strip out newlines, but almost no writing functions (except the log-related stuff) add them.
So, there’s a lot of Python code out there that does things like:
    fw.write('\n'.join(line_list) + '\n')
or
    fw.writelines(line + '\n' for line in line_list)
Either one is correct, and of course you could even write your own writelinesWithNewlines function that wraps it up…
But you should only do this if you can’t avoid it.
It’s better if you can create/keep the newlines in the first place—as in Greg Hewgill’s suggestions:
    line_list.append(new_line + "\n")
And it’s even better if you can work at a higher level than raw lines of text, e.g., by using the csv module in the standard library, as esuaro suggests.
For example, right after defining fw, you might do this:
    cw = csv.writer(fw, delimiter='|')
Then, instead of this:
    new_line = d[looking_for]+'|'+'|'.join(columns[1:])
    line_list.append(new_line)
You do this:
    # d[looking_for] is a string, so wrap it in a list before concatenating
    row_list.append([d[looking_for]] + columns[1:])
And at the end, instead of this:
    fw.writelines(line_list)
You do this:
    cw.writerows(row_list)
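Put together, the csv approach might look like the sketch below. It writes to an io.StringIO so it runs on its own; the function name write_masked_rows and the sample rows are invented for the example, and lineterminator="\n" is my own choice (csv.writer defaults to "\r\n").

```python
import csv
import io

def write_masked_rows(out_file, rows):
    # Each row is a list of column values; the writer joins them with '|'
    # and terminates every row itself, so no manual newline handling is needed.
    writer = csv.writer(out_file, delimiter="|", lineterminator="\n")
    writer.writerows(rows)

buf = io.StringIO()
write_masked_rows(buf, [["1", "GSV", "Weather_Mean"], ["2", "GEV", "temp_Mean"]])
print(buf.getvalue())
```

With a real file, the csv documentation recommends opening it with newline='' so the writer controls the line endings.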
Finally, your design is “open a file, then build up a list of lines to add to the file, then write them all at once”. If you’re going to open the file up top, why not just write the lines one by one? Whether you’re using simple writes or a csv.writer, it’ll make your life simpler, and your code easier to read. (Sometimes there can be simplicity, efficiency, or correctness reasons to write a file all at once, but once you’ve moved the open all the way to the opposite end of the program from the write, you’ve pretty much lost any benefits of all-at-once.)
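A rough sketch of the write-as-you-go version follows. It is deliberately simplified: it leaves out the "@" check and the key/value file from the question, and mask_lines, ids and the sample data are hypothetical names used only here.

```python
import io

def mask_lines(source, target, ids):
    # Read, transform and write one line at a time; no intermediate list
    for line in source:
        line = line.strip()
        if not line:
            continue
        columns = line.split("|")
        key = columns[0]
        if key not in ids:
            # Assign the next index to a key seen for the first time
            ids[key] = str(len(ids) + 1)
        target.write("|".join([ids[key]] + columns[1:]) + "\n")

source = io.StringIO("alice|GSV|43.45\n\nbob|GEV|23.45\nalice|GSI|47.45\n")
target = io.StringIO()
mask_lines(source, target, {})
print(target.getvalue())
```

Because each output line is written with its own "\n", the missing-newline problem never comes up.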
writelines() does not add line separators. You can alter the list of strings by using map() to add a new \n (line break) at the end of each string.
    items = ['abc', '123', '!@#']
    items = map(lambda x: x + '\n', items)
    w.writelines(items)
As others have mentioned, and counter to what the method name would imply, writelines does not add line separators. This is a textbook case for a generator. Here is a contrived example:
    def item_generator(things):
        for item in things:
            yield item
            yield '\n'

    def write_things_to_file(things):
        # text mode ('w'), since the generator yields str, not bytes
        with open('path_to_file.txt', 'w') as f:
            f.writelines(item_generator(things))
Benefits: adds newlines explicitly without modifying the input or output values or doing any messy string concatenation. And, critically, does not create any new data structures in memory. IO (writing to a file) is when that kind of thing tends to actually matter. Hope this helps someone!
As others have noted, writelines is a misnomer (it ridiculously does not add newlines to the end of each line).
To do that, explicitly add it to each line:
    with open(dst_filename, 'w') as f:
        f.writelines(s + '\n' for s in lines)
As we have well established here, writelines does not append the newlines for you. But what everyone seems to be missing is that it doesn’t have to when it is used as a direct “counterpart” for readlines() and the initial read preserved the newlines!
When you open a file for reading in binary mode (via 'rb'), then use readlines() to fetch the file contents into memory, split by line, the newlines remain attached to the end of your lines! So, if you subsequently write them back, you likely don’t want writelines to append anything!
So if you do something like:
    with open('test.txt','rb') as f: lines=f.readlines()
    with open('test.txt','wb') as f: f.writelines(lines)
You should end up with the same file content you started with.
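The round trip can be checked with a temporary file. This is only a sketch; round_trip is a made-up helper, and it uses a temp file instead of test.txt so it does not touch anything in the working directory.

```python
import os
import tempfile

def round_trip(content):
    # Create a scratch file, then read it with readlines() and write it
    # back with writelines(), all in binary mode.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as f:
            f.write(content)
        with open(path, "rb") as f:
            lines = f.readlines()      # each element keeps its trailing b"\n"
        with open(path, "wb") as f:
            f.writelines(lines)        # nothing is added, nothing is lost
        with open(path, "rb") as f:
            return lines, f.read()
    finally:
        os.remove(path)

lines, result = round_trip(b"a|1\nb|2\n")
print(lines, result)
```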
As we only want to separate lines, and the writelines function in Python does not support adding a separator between lines, I have written the simple code below, which suits this problem:
    sep = "\n" # defining the separator
    new_lines = sep.join(lines) # lines is an iterable of line strings
and finally:
    with open("file_name", 'w') as file:
        file.write(new_lines) # new_lines is a single string, so write() is enough
and you are done.
Credits to Brent Faust.
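One detail worth knowing about the join approach: it puts the separator between lines only, so the last line gets no trailing newline. A small comparison, with variable names chosen just for this sketch:

```python
lines = ["a", "b", "c"]

# Separator between items only: no newline after the last line
joined = "\n".join(lines)

# Newline after every item, including the last one
terminated = "".join(line + "\n" for line in lines)

print(repr(joined))
print(repr(terminated))
```

Which one you want depends on whether the consumer of the file expects the final line to be terminated.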
Python >= 3.6 with format string:
    with open(dst_filename, 'w') as f:
        f.writelines(f'{s}\n' for s in lines)
lines can be a set.
If you are oldschool (like me) you may add f.write('\n') below the second line.