How do I iterate over all lines of files passed on the command line?

Question:

I usually do this in Perl:

whatever.pl

while(<>) {
    #do whatever;
}

then cat foo.txt | whatever.pl

Now I want to do the same thing in Python. I tried sys.stdin, but I can't work out how to get the behavior I have in Perl. How can I read the input?

Asked By: Tg.


Answers:

Try this:

import fileinput
for line in fileinput.input():
    process(line)
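
As a rough illustration of what fileinput does with several files, here is a minimal sketch. The helper name numbered_lines and the two temporary files are made up for the demo and are not part of the answer above. It passes the file list explicitly through files= so that it does not depend on sys.argv.

```python
import fileinput
import os
import tempfile


def numbered_lines(files):
    """Yield (filename, line number within that file, text) for every line."""
    with fileinput.input(files=files) as stream:
        for line in stream:
            yield stream.filename(), stream.filelineno(), line.rstrip("\n")


# Demo with two throwaway files (hypothetical names and contents).
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for name, text in [("a.txt", "one\ntwo\n"), ("b.txt", "three\n")]:
        path = os.path.join(tmp, name)
        with open(path, "w") as handle:
            handle.write(text)
        paths.append(path)

    result = [(os.path.basename(f), n, t) for f, n, t in numbered_lines(paths)]
    print(result)
```

The files are read one after another as a single stream of lines, which is roughly what Perl's while (<>) does.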
Answered By: Don Werve

import sys

def main():
    for line in sys.stdin:
        # Each line keeps its trailing newline, so suppress print's own.
        print(line, end='')

if __name__ == '__main__':
    sys.exit(main())
Answered By: Mark Roddy

Something like this:

import sys

for line in sys.stdin:
    pass  # replace with whatever you want to do with the line
Answered By: David Z

import sys

for line in sys.stdin:
    pass  # do stuff w/line
Answered By: Can Berk Güder

I hate to beat a dead horse, but may I suggest passing the input stream in as a parameter?

import sys

def main(stdin):
  for line in stdin:
    print("You said: " + line.strip())

if __name__ == "__main__":
  main(sys.stdin)

This approach is nice because main depends only on the stream passed to it, so you can unit test it with anything that behaves like a line-delimited input stream, such as an io.StringIO object or a plain list of strings.
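
To make the testing claim concrete, here is a small sketch of such a test. It repeats main so that it runs on its own, uses io.StringIO as a stand-in for sys.stdin, and captures the printed output with contextlib.redirect_stdout. The sample input is invented for the example.

```python
import io
from contextlib import redirect_stdout


def main(stdin):
    # Same function as in the answer: works on any iterable of lines.
    for line in stdin:
        print("You said: " + line.strip())


# io.StringIO plays the role of sys.stdin; no real terminal input is needed.
fake_stdin = io.StringIO("hello\nworld\n")
captured = io.StringIO()
with redirect_stdout(captured):
    main(fake_stdin)

output = captured.getvalue()
assert output == "You said: hello\nYou said: world\n"
```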

Don Werve’s answer of:

import fileinput
for line in fileinput.input():
    process(line)

is excellent, and probably exactly what you’re looking for. However, be aware that if you use it as-is you can encounter snags when using it with the argparse module, or if you specify any command-line switches when running your script.

For example, if you run:

./my_script.py --verbose file1.txt file2.txt file3.txt

you’ll get an error message saying that there is no such file named --verbose. So what do you do if you’re using command-line switches?

What you have to do is isolate the input files in a list, and then pass them into the files argument of fileinput.input(). If you’re using the argparse module, you can extract the input_files like this:

import argparse
parser = argparse.ArgumentParser()
parser.add_argument('-v', '--verbose', action='store_true')
parser.add_argument('input_files', nargs='*')

# Extract out command-line information:
args = parser.parse_args()
verbose = args.verbose
input_files = args.input_files

From here, we can pass our input_files into fileinput.input() with the files= argument:

import fileinput
for line in fileinput.input(files=input_files):
    process(line)

What’s nice about this is that if no input files are specified when calling your script, then input_files will be an empty list. And when you pass in an empty list as the files= argument, then fileinput.input() will iterate over sys.stdin.

That’s very convenient, as it behaves very similarly to Perl’s while (<>) { ... } construct.
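
Putting the two pieces together, a minimal sketch might look like the following. The function name collect_lines, the temporary file, and the way --verbose prefixes each line with its filename are assumptions made for the demo; the argparse and fileinput calls are the same ones shown above.

```python
import argparse
import fileinput
import os
import tempfile


def collect_lines(argv):
    """Parse argv, then read every line from the named files (stdin if none)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-v", "--verbose", action="store_true")
    parser.add_argument("input_files", nargs="*")
    args = parser.parse_args(argv)

    lines = []
    with fileinput.input(files=args.input_files) as stream:
        for line in stream:
            text = line.rstrip("\n")
            if args.verbose:
                # Illustrative use of the switch: tag each line with its file.
                text = f"{stream.filename()}: {text}"
            lines.append(text)
    return lines


# Demo with a throwaway file, so the switch and the file list can be mixed.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file1.txt")
    with open(path, "w") as handle:
        handle.write("alpha\nbeta\n")

    plain = collect_lines([path])
    tagged = collect_lines(["--verbose", path])
    expected_tagged = [f"{path}: alpha", f"{path}: beta"]
    print(plain, tagged)
```

In a real script you would call collect_lines(sys.argv[1:]). With no file arguments, input_files is empty and the loop reads from sys.stdin instead.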


Of course, you only need to consider this if there are arguments on your command line that do not represent files to be read in. That is, if every single one of your arguments is always treated as a file to be read in, then the following typical solution listed at the top of pydoc fileinput works perfectly fine:

import fileinput
for line in fileinput.input():
    process(line)
Answered By: J-L