Reading a Line From File Without Advancing [Pythonic Approach]

Question:

What’s a pythonic approach for reading a line from a file but not advancing where you are in the file?

For example, if you have a file of

cat1
cat2
cat3

and you call file.readline(), you get cat1\n. The next file.readline() returns cat2\n.

Is there some functionality like file.some_function_here_nextline() that returns cat1\n, so that a later file.readline() also returns cat1\n?

Asked By: user1431282


Answers:

You could wrap the file with itertools.tee and get back two iterators, bearing in mind the caveats stated in the documentation (in particular, once tee has been called, the original file iterator should not be used directly).

For example

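from contextlib import closing
from io import StringIO
from itertools import tee

# No leading newline, so the first line is 'cat1\n' rather than an empty line.
s = '''cat1
cat2
cat3
'''

with closing(StringIO(s)) as f:
    handle1, handle2 = tee(f)
    print(next(handle1), end='')  # each handle advances independently
    print(next(handle2), end='')

Output:

cat1
cat1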
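
To connect tee back to the original question, here is a minimal sketch of a peek built on it. The helper name peek_with_tee is hypothetical, and the sketch assumes the caller switches to the returned iterator afterwards, since the original should not be used once tee has been called.

```python
from io import StringIO
from itertools import tee


def peek_with_tee(lines):
    """Return (next_item, replacement_iterator); hypothetical helper.

    The caller must use the returned iterator from now on, not `lines`.
    """
    main, lookahead = tee(lines)
    # Advancing `lookahead` buffers the item so `main` still yields it.
    return next(lookahead, None), main


f = StringIO("cat1\ncat2\ncat3\n")
peeked, lines = peek_with_tee(f)
first = next(lines)
second = next(lines)
print(repr(peeked), repr(first), repr(second))
```

The peeked line and the first line read afterwards are the same; only the second read moves on to cat2.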
Answered By: iruvar

As far as I know, there’s no builtin functionality for this, but such a function is easy to write, since most Python file objects support seek and tell methods for jumping around within a file. So, the process is very simple:

  • Find the current position within the file using tell.
  • Perform a read (or write) operation of some kind.
  • seek back to the previous file pointer.

This allows you to do nice things like read a chunk of data from the file, analyze it, and then potentially overwrite it with different data. A simple wrapper for the functionality might look like:

def peek_line(f):
    pos = f.tell()
    line = f.readline()
    f.seek(pos)
    return line

# f is an open, seekable file whose first line is cat1
print(peek_line(f))  # cat1
print(peek_line(f))  # cat1
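
Since this relies on seek and tell, it may be worth failing early on streams that cannot seek (pipes, terminals). The following is a sketch only: peek_line_checked and the NonSeekable stand-in class are hypothetical names used for illustration, and io.StringIO stands in for a real file.

```python
import io


def peek_line_checked(f):
    """Hypothetical variant of peek_line that refuses non-seekable streams."""
    if not f.seekable():
        raise io.UnsupportedOperation("stream is not seekable")
    pos = f.tell()
    line = f.readline()
    f.seek(pos)  # rewind so the next read sees the same line
    return line


class NonSeekable(io.StringIO):
    """Stand-in for a pipe or terminal: reports itself as non-seekable."""

    def seekable(self):
        return False


f = io.StringIO("cat1\ncat2\n")
first_peek = peek_line_checked(f)
then_read = f.readline()  # same line again, because the peek rewound

try:
    peek_line_checked(NonSeekable("cat1\n"))
    rejected = False
except io.UnsupportedOperation:
    rejected = True
```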

You could implement the same thing for other read methods just as easily. For instance, implementing the same thing for file.read:

def peek(f, length=1):
    pos = f.tell()
    data = f.read(length) # Might try/except this line, and finally: f.seek(pos)
    f.seek(pos)
    return data

print(peek(f, 4))  # cat1
print(peek(f, 4))  # cat1
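
The try/finally idea mentioned in the comment above could look roughly like this. It is a sketch under assumptions: peek_safely and FlakyReader are hypothetical names, and FlakyReader only simulates a read that fails after the position has already moved.

```python
import io


def peek_safely(f, length=1):
    """Read `length` characters and always restore the position."""
    pos = f.tell()
    try:
        return f.read(length)
    finally:
        f.seek(pos)  # runs even if read() raises


class FlakyReader(io.StringIO):
    """Simulated stream whose read() advances and then fails."""

    def read(self, size=-1):
        super().read(size)
        raise OSError("simulated read failure")


f = io.StringIO("cat1\ncat2\n")
a = peek_safely(f, 4)
b = peek_safely(f, 4)

flaky = FlakyReader("cat1\ncat2\n")
try:
    peek_safely(flaky, 4)
except OSError:
    pass
position_after_failure = flaky.tell()
```

Even when the read fails, the position is back where it started, so later reads are unaffected.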
Answered By: Henry Keiter

Manually doing it is not that hard:

f = open('file.txt', 'rb')  # binary mode: relative seeks need byte offsets
line = f.readline()
print(line)  # b'cat1\n' (assuming Unix line endings)
# readline() already includes the trailing \n, so len(line) is the number of
# bytes consumed; the second argument (1) seeks relative to the current position
f.seek(-len(line), 1)
line = f.readline()
print(line)  # b'cat1\n'

You can wrap this in a method like this:

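def lookahead_line(file):
    # file must be opened in binary mode ('rb') for relative seeks to work
    line = file.readline()
    # readline() includes the trailing newline, so len(line) is the exact
    # number of bytes to rewind
    file.seek(-len(line), 1)
    return file, line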

And use it like this:

f = open('file.txt', 'rb')
f, line = lookahead_line(f)
print(line)
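
For a version that can be run without an existing file.txt, here is a small sketch using a temporary file. It assumes binary mode, because Python 3 text-mode files reject nonzero relative seeks; rewind_after_readline is a hypothetical name, and it returns only the line rather than the file as well.

```python
import os
import tempfile


def rewind_after_readline(f):
    """Read one line from a binary file, then seek back over it."""
    line = f.readline()
    # len(line) already counts the newline byte(s) that were read.
    f.seek(-len(line), os.SEEK_CUR)
    return line


with tempfile.TemporaryFile() as f:  # opened in binary mode ('w+b') by default
    f.write(b"cat1\ncat2\n")
    f.seek(0)
    peeked = rewind_after_readline(f)
    first = f.readline()
    second = f.readline()
```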

Hope this helps!

Answered By: Paulo Bu

The more_itertools library offers a peekable class that allows you to peek() ahead without advancing an iterable.

import more_itertools as mit

with open("file.txt", "r") as f:
    p = mit.peekable(f.readlines())

p.peek()
# 'cat1\n'

next(p)
# 'cat1\n'

peek() shows the next line without consuming it; next() then advances the iterable p. Calling peek() again now shows the following line.

p.peek()
# 'cat2\n'

See also the more_itertools docs, as peekable allows you to prepend() items to an iterable before advancing as well.

Answered By: pylang

Solutions with tell()/seek() will not work with stdin and other non-seekable iterators. A more generic implementation can be as simple as this:

class lookahead_iterator(object):
    """Iterator wrapper; each peek() call looks one item further ahead."""
    __slots__ = ["_buffer", "_iterator", "_next"]
    def __init__(self, iterable):
        self._buffer = []  # items peeked at but not yet consumed
        self._iterator = iter(iterable)
        self._next = self._iterator.__next__
    def __iter__(self):
        return self
    def _next_peeked(self):
        # Serve buffered items first, then switch back to the raw iterator.
        v = self._buffer.pop(0)
        if 0 == len(self._buffer):
            self._next = self._iterator.__next__
        return v
    def __next__(self):
        return self._next()
    def peek(self):
        v = next(self._iterator)
        self._buffer.append(v)
        self._next = self._next_peeked
        return v

Usage:

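# Wrapped in a function (parse_source is an illustrative name) so that
# return is valid; parse_bash, parse_c and parse_cpp are your own parsers.
def parse_source(path="source.txt"):
    with open(path, "r") as lines:
        lines = lookahead_iterator(lines)
        magic = lines.peek()
        if magic.startswith("#"):
            return parse_bash(lines)
        if magic.startswith("/*"):
            return parse_c(lines)
        if magic.startswith("//"):
            return parse_cpp(lines)
        raise ValueError("Unrecognized file")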
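
When only a single line of lookahead is needed, a lighter alternative to the class above is to pull the first item and chain it back onto the iterator. This is a different technique, not the class-based approach, and peek_first is a hypothetical helper name; io.StringIO stands in for stdin or a file.

```python
import io
from itertools import chain


def peek_first(lines):
    """Return (first_item, iterator_that_still_yields_it)."""
    it = iter(lines)
    try:
        first = next(it)
    except StopIteration:
        return None, it  # empty input: nothing to peek at
    # Put the consumed item back in front of the remaining items.
    return first, chain([first], it)


stream = io.StringIO("#!/bin/sh\necho hi\n")
magic, lines = peek_first(stream)
rest = list(lines)
print(repr(magic), rest)
```

Because nothing here calls seek or tell, this works on any iterable, including non-seekable streams.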
Answered By: wonder.mice