In Python, what is a functional and memory-efficient way to read standard input line by line?

Question:

I have this answer from another post, https://stackoverflow.com/a/1450396/1810962, which almost achieves it:

import sys
data = sys.stdin.readlines()
preProcessed = map(lambda line: line.rstrip(), data)

I can now operate on the lines in data in a functional way by applying filter, map, etc. However, readlines() loads the entire standard input into memory. Is there a lazy way to build a stream of lines?

Asked By: joseph


Answers:

Just iterate over sys.stdin; it yields the input one line at a time.

Then you can stack generator expressions, or use map and filter if you prefer (in Python 3, both return lazy iterators). Each line that comes in goes through the whole pipeline on its own, and no list is built in the process.

Here are examples of each:

import sys

stripped_lines = (line.strip() for line in sys.stdin)
lines_with_prompt = ('--> ' + line for line in stripped_lines)
uppercase_lines = map(lambda line: line.upper(), lines_with_prompt)
lines_without_dots = filter(lambda line: '.' not in line, uppercase_lines)

for line in lines_without_dots:
    print(line)

And in action, in the terminal:

thierry@amd:~$ ./test.py 
My first line
--> MY FIRST LINE 
goes through the pipeline
--> GOES THROUGH THE PIPELINE
but not this one, filtered because of the dot. 
This last one will go through
--> THIS LAST ONE WILL GO THROUGH
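
To check that the pipeline really is lazy, here is a minimal sketch that wraps the same steps in a function and feeds it from a generator instead of sys.stdin. The names pipeline, fake_stdin and consumed are made up for this illustration and are not part of the original answer; fake_stdin records each line as it is pulled, so you can see how much input has been read at any point.

```python
def pipeline(lines):
    """Lazily strip, prefix, uppercase and filter any iterable of lines."""
    stripped = (line.strip() for line in lines)
    with_prompt = ('--> ' + line for line in stripped)
    uppercase = map(str.upper, with_prompt)
    return filter(lambda line: '.' not in line, uppercase)


consumed = []  # hypothetical log of the input lines read so far


def fake_stdin():
    # Stand-in for sys.stdin (an assumption for this demo): yields lines
    # one at a time and records each one as it is pulled.
    for line in ['My first line\n', 'has a dot.\n', 'This last one\n']:
        consumed.append(line)
        yield line


result = pipeline(fake_stdin())
consumed_before = len(consumed)       # nothing has been read yet

first = next(result)                  # pulls exactly one line through
consumed_after_first = len(consumed)

rest = list(result)                   # drains the remaining input

print(first)
print(consumed_before, consumed_after_first, len(consumed))
print(rest)
```

With real input you would call pipeline(sys.stdin) and loop over the result, exactly as in the script above.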

A shorter example with map only, where map iterates directly over the lines of stdin. Each line keeps its trailing newline here, so print adds a blank line after it:

import sys

uppercase_lines = map(lambda line: line.upper(), sys.stdin)

for line in uppercase_lines:
    print(line)

In action:

thierry@amd:~$ ./test2.py 
this line will turn
THIS LINE WILL TURN

to uppercase
TO UPPERCASE
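
The blank lines in that output appear because each line from stdin still ends with a newline and print adds a second one. If you do not want them, one option is to pass end='' to print (stripping the newline with rstrip('\n') works too). A small sketch, using io.StringIO objects as stand-ins for sys.stdin and sys.stdout so it can run without a terminal; the names uppercase, fake_stdin and out are assumptions for this example only:

```python
import io


def uppercase(lines):
    """Lazily uppercase each line of any iterable of lines."""
    return map(str.upper, lines)


# Stand-ins for sys.stdin and sys.stdout (assumptions for this demo).
fake_stdin = io.StringIO('this line will turn\nto uppercase\n')
out = io.StringIO()

for line in uppercase(fake_stdin):
    # The line already ends with '\n', so suppress print's own newline.
    print(line, end='', file=out)

print(out.getvalue(), end='')
```

In a real script you would iterate over uppercase(sys.stdin) and drop the file=out argument.
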
Answered By: Thierry Lathuille