Iterating over lines in a file in Python

Question:

I have seen these two ways to process a file:

file = open("file.txt")
for line in file:
    #do something

file = open("file.txt")
contents = file.read()
for line in contents:
    # do something

I know that in the first case, the file will act like a list, so the for loop iterates over the file as if it were a list. What exactly happens in the second case, where we read the file and then iterate over the contents? What are the consequences of taking each approach, and how should I choose between them?

Asked By: Homap


Answers:

In the first one, you are iterating over the file object line by line. The entire file is not read into memory at once; lines are fetched as the loop needs them (apart from some internal buffering). This is useful for very large files, and it is the robust choice when you don’t know how large the file will be.

In the second one, file.read() reads the complete file into memory and returns it as a single string. When you iterate over that string, you are iterating over the file’s data character by character, not line by line.

Here’s an example to show this behavior.

The file a.txt contains:

Hello
Bye

Code:

>>> f = open('a.txt','r')
>>> for l in f:
...     print(l)
...
Hello

Bye


>>> f = open('a.txt','r')
>>> r = f.read()
>>> print(repr(r))
'Hello\nBye'
>>> for c in r:
...     print(c)
...
H
e
l
l
o


B
y
e
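
The same comparison can be sketched as a script. This is only an illustration: it assumes a temporary file standing in for a.txt, and the names lines_seen and chars_seen are made up for the example. It also uses a with block so the file is closed automatically.

```python
import os
import tempfile

# Create a small sample file so the sketch is self-contained
# (the temporary path is an assumption, standing in for a.txt).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("Hello\nBye")
    path = tmp.name

# Approach 1: iterating over the file object yields one line at a time.
# The lines are collected into a list here only so they can be printed.
with open(path, "r") as f:
    lines_seen = [line.rstrip("\n") for line in f]

# Approach 2: read() returns one string; iterating it yields characters.
with open(path, "r") as f:
    chars_seen = [c for c in f.read()]

os.remove(path)  # clean up the temporary file

print(lines_seen)  # ['Hello', 'Bye']
print(chars_seen)  # ['H', 'e', 'l', 'l', 'o', '\n', 'B', 'y', 'e']
```

In real code that processes a large file, you would normally handle each line inside the loop rather than collecting the lines into a list.
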
Answered By: Anand S Kumar

The second case reads in the contents of the file into one big string. If you iterate over a string, you get each character in turn. If you want to get each line in turn, you can do this:

for line in contents.split('\n'):
    # do something

Or you can read in the contents as a list of lines using readlines() instead of read().

with open('file.txt','r') as fin:
    lines = fin.readlines()
for line in lines:
    # do something
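
The two options above differ slightly in what they return. The following is a minimal sketch, assuming a sample string that ends with a newline and using io.StringIO as an in-memory stand-in for an open file; str.splitlines() is included as a third option that is not mentioned in the answer.

```python
import io

text = "Hello\nBye\n"  # sample contents, ending with a newline (assumption)

# split('\n') leaves a trailing empty string when the text ends with a newline.
by_split = text.split("\n")
print(by_split)        # ['Hello', 'Bye', '']

# splitlines() drops the line endings and produces no trailing empty string.
by_splitlines = text.splitlines()
print(by_splitlines)   # ['Hello', 'Bye']

# readlines() keeps the newline at the end of each line.
fin = io.StringIO(text)  # in-memory stand-in for an open text file
by_readlines = fin.readlines()
print(by_readlines)    # ['Hello\n', 'Bye\n']
```

All three still hold the whole file in memory, so for large files iterating over the file object directly remains the lighter choice.
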
Answered By: khelwood