taking multiline input with sys.stdin

Question:

I have the following function:

def getInput():
    # define buffer (list of lines)
    buffer = []
    run = True
    while run:
        # loop through each line of user input, adding it to buffer
        for line in sys.stdin.readlines():
            if line == 'quit\n':
                run = False
            else:
                buffer.append(line.replace('\n',''))
    # return list of lines
    return buffer

which is called in my function takeCommands(), which is called to actually run my program.

However, this doesn’t do anything. I’m hoping to add each line to an array and stop taking user input once a line equals ‘quit’. I’ve tried both for line in sys.stdin.readlines() and for line in sys.stdin, but neither of them registers any of my input (I’m running it in Windows Command Prompt). Any ideas? Thanks

Asked By: Jakemmarsh


Answers:

I’d use itertools.takewhile for this:

import sys
import itertools
print list(itertools.takewhile(lambda x: x.strip() != 'quit', sys.stdin))

Another way to do this would be to use the 2-argument iter form:

print list(iter(raw_input,'quit'))

This has the advantage that raw_input takes care of all of the line-buffering issues and strips the newlines for you. However, it keeps reading until it sees a line that is exactly quit; if the input ends without one, raw_input raises EOFError instead of returning a list.

Both of these pass the test:

python test.py <<EOF
foo
bar
baz
quit
cat
dog
cow
EOF
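
For readers on Python 3, here is a minimal sketch of the takewhile approach. It is an adaptation, not the code above: read_until_quit is a hypothetical helper name, and io.StringIO stands in for sys.stdin so the example runs without a terminal.

```python
import io
import itertools


def read_until_quit(stream):
    """Collect lines from stream, newline removed, until a line equal to 'quit'."""
    not_quit = lambda line: line.strip() != "quit"
    return [line.rstrip("\n") for line in itertools.takewhile(not_quit, stream)]


# io.StringIO is an assumption for the demo; pass sys.stdin for real input.
fake_stdin = io.StringIO("foo\nbar\nbaz\nquit\ncat\ndog\ncow\n")
result = read_until_quit(fake_stdin)
print(result)  # ['foo', 'bar', 'baz']
```

Because takewhile pulls lines lazily, nothing after the quit line is collected.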
Answered By: mgilson

So, took your code out of the function and ran some tests.

import sys
buffer = []
run = True
while run:
    # readline() returns one line at a time, newline included
    line = sys.stdin.readline().rstrip('\n')
    if line == 'quit':
        run = False
    else:
        buffer.append(line)

print buffer

Changes:

  • Removed the ‘for’ loop
  • Using ‘readline’ instead of ‘readlines’
  • Stripped the ‘\n’ from each line after reading it, so all processing afterwards is much easier.

Another way:

import sys
buffer = []
while True:
    line = sys.stdin.readline().rstrip('\n')
    if line == 'quit':
        break
    else:
        buffer.append(line)
print buffer

Takes out the ‘run’ variable, as it is not really needed.
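
One thing to watch with a readline loop: at end of input, readline() returns an empty string rather than raising, so a loop that only checks for quit would keep appending empty strings forever. Below is a hedged Python 3 sketch that also stops at end of input. read_lines_until_quit is a hypothetical name, and io.StringIO stands in for sys.stdin.

```python
import io


def read_lines_until_quit(stream):
    """Read lines one at a time until 'quit' or end of input."""
    buffer = []
    while True:
        raw = stream.readline()
        if raw == "":
            # readline() returns '' only at end of input;
            # a blank line typed by the user arrives as '\n'.
            break
        line = raw.rstrip("\n")
        if line == "quit":
            break
        buffer.append(line)
    return buffer


print(read_lines_until_quit(io.StringIO("foo\nbar\n")))          # ['foo', 'bar']
print(read_lines_until_quit(io.StringIO("foo\n\nquit\nbaz\n")))  # ['foo', '']
```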

Answered By: Wing Tang Wong

There are multiple separate problems with this code:

while run:
    # loop through each line of user input, adding it to buffer
    for line in sys.stdin.readlines():
        if line == 'quit':
            run = False

First, you have an inner loop that won’t finish until all lines have been processed, even if you type "quit" at some point. Setting run = False doesn’t break out of that loop. Instead of quitting as soon as you type "quit", it will keep going until it’s looked at all of the lines, and then quit if you typed "quit" at any point.

You can fix this one pretty easily by adding a break after the run = False.
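
A small sketch of the difference, using a plain list in place of the lines read from stdin (the list contents are made up for illustration):

```python
lines = ["foo\n", "quit\n", "bar\n"]

# Setting a flag does not stop the for loop: "bar" is still processed.
seen_with_flag = []
run = True
for line in lines:
    if line == "quit\n":
        run = False
    else:
        seen_with_flag.append(line)

# break leaves the loop immediately, so nothing after "quit" is seen.
seen_with_break = []
for line in lines:
    if line == "quit\n":
        break
    seen_with_break.append(line)

print(seen_with_flag)   # ['foo\n', 'bar\n']
print(seen_with_break)  # ['foo\n']
```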


But, with or without that fix, if you didn’t type "quit" during that first time through the outer loop, since you’ve already read all input, there’s nothing else to read, so you’ll just keep running an empty inner loop over and over forever that you can never exit.

You have a loop that means "read and process all the input". You want to do that exactly once. So, what should the outer loop be? There shouldn’t be one at all; the way to do something once is to not use a loop. So, to fix this one, get rid of run and the while run: loop; just use the inner loop.
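
To see why a second pass through the outer loop has nothing to do, here is a quick sketch with io.StringIO standing in for an input stream that has already ended (an assumption for the demo):

```python
import io

stream = io.StringIO("foo\nbar\n")

first = stream.readlines()   # consumes everything up to end of input
second = stream.readlines()  # nothing left, so this is an empty list

print(first)   # ['foo\n', 'bar\n']
print(second)  # []
```

Every later call returns the same empty list, so the inner for loop body never runs again and run can never be set to False.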


Then, if you type "quit", line will actually be "quit\n", because readlines does not strip newlines.

You fix this one by either testing for "quit\n", or stripping the lines.
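
Both options in a few lines, with a literal string standing in for a line as it comes back from readlines:

```python
line = "quit\n"  # what a typed "quit" looks like, trailing newline included

matches_bare = line == "quit"                   # False: the newline is still there
matches_newline = line == "quit\n"              # True: compare with the newline
matches_stripped = line.rstrip("\n") == "quit"  # True: strip first, then compare

print(matches_bare, matches_newline, matches_stripped)  # False True True
```

Stripping is usually the more convenient of the two, since the stored lines then have no trailing newline either.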


Finally, even if you fix all of these problems, you’re still waiting forever before doing anything. readlines returns a list of lines. The only way it can possibly do that is by reading all of the lines that will ever be on stdin. You can’t even start looping until you’ve read all those lines.

When standard input is a file, that happens when the file ends, so it’s not too terrible. But when standard input is the Windows command prompt, the command prompt never ends.* So, this takes forever. You don’t get to start processing the list of lines, because it takes forever to wait for the list of lines.

The solution is to not use readlines(). Really, there is never a good reason to call readlines() on anything, stdin or not. Anything that readlines works on is already an iterable full of lines, just like the list that readlines would give you, except that it’s "lazy": it can give you the lines one at a time, instead of waiting and giving you all of them at once. (And even if you really need the list, just do list(f) instead of f.readlines().)
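
A short sketch of that laziness, assuming io.StringIO as a stand-in for any file-like object:

```python
import io

text = "a\nb\nc\n"

# A file-like object is its own iterator: each step yields one line.
f = io.StringIO(text)
first = next(f)   # only the first line is consumed here
rest = list(f)    # list(f) collects whatever is left

# list(f) and f.readlines() build the same list.
same = list(io.StringIO(text)) == io.StringIO(text).readlines()

print(repr(first))  # 'a\n'
print(rest)         # ['b\n', 'c\n']
print(same)         # True
```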

So, instead of for line in sys.stdin.readlines():, just do for line in sys.stdin: (Or, better, replace the explicit loop completely and use a sequence of iterator transformations, as in mgilson’s answer.)
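
Putting the four fixes together, one possible Python 3 version of the function looks like this. It is a sketch, not the only way to write it: get_input and its optional stream parameter are hypothetical, added so the example can be tried with io.StringIO instead of a live terminal.

```python
import io
import sys


def get_input(stream=None):
    """Return the lines read from stream (default: sys.stdin) up to 'quit'."""
    if stream is None:
        stream = sys.stdin
    buffer = []
    for line in stream:            # lazy: one line at a time, no readlines()
        line = line.rstrip("\n")   # drop the trailing newline
        if line == "quit":
            break                  # stop as soon as quit is seen
        buffer.append(line)
    return buffer                  # no outer while loop needed


print(get_input(io.StringIO("foo\nbar\nquit\nbaz\n")))  # ['foo', 'bar']
```

Called with no argument, it reads from sys.stdin and returns as soon as a quit line arrives or input ends.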


The fixes JBernardo, Wing Tang Wong, etc. proposed are all correct, and necessary. The reason none of them fixed your problems is that if you have 4 bugs and fix 1, your code still doesn’t work. That’s exactly why "doesn’t work" isn’t a useful measure of anything in programming, and you have to debug what’s actually going wrong to know whether you’re making progress.


* I lied a bit about stdin never being finished. If you type a control-Z (you may or may not need to follow it with a return), then stdin is finished. But if your assignment is to make it quit as soon as the user types "quit", turning in something that only quits when the user types "quit" and then return, control-Z, return again probably won’t be considered successful.

Answered By: abarnert

The following works (on Linux, at least).

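import pathlib

# Blocks until end of input, then returns everything typed as one string.
text = pathlib.Path("/proc/self/fd/0").read_text()
print(text.splitlines())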

Above, /proc/self/fd/0 means "stdin".

Press Ctrl + D to end the multi-line input.
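
A more portable way to get the same "read everything until end of input" behaviour is sys.stdin.read(), which does not depend on /proc. This is a sketch under that assumption: read_all_lines is a hypothetical helper, and io.StringIO stands in for stdin in the demo.

```python
import io
import sys


def read_all_lines(stream=None):
    """Read stream (default: sys.stdin) to end of input and split it into lines."""
    if stream is None:
        stream = sys.stdin
    return stream.read().splitlines()


print(read_all_lines(io.StringIO("foo\nbar\n")))  # ['foo', 'bar']
```

Like the pathlib version, this returns nothing until input ends, so it does not help if the program has to react to a line as soon as it is typed.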

Answered By: turdus-merula