Read from File, or STDIN
Question:
I’ve written a command line utility that uses getopt for parsing arguments given on the command line. I would also like a filename to be an optional argument, as it is in other utilities such as grep, cut, etc. So, I would like it to have the following usage:
tool -d character -f integer [filename]
How can I implement the following?
- if a filename is given, read from the file.
- if a filename is not given, read from STDIN.
Answers:
The fileinput module may do what you want. Assuming the non-option arguments are in args, then:
import fileinput
for line in fileinput.input(args):
    print(line, end='')  # each line already ends with its own newline
If args is empty then fileinput.input() will read from stdin; otherwise it reads from each file in turn, in a similar manner to Perl’s while(<>).
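Since the question uses getopt, here is a minimal sketch of how the leftover arguments from getopt could be handed to fileinput. The function names parse_args and cut_fields, and the default delimiter and field, are assumptions made for illustration and do not come from the question:

```python
import fileinput
import getopt


def parse_args(argv):
    """Split argv into (delimiter, field, filenames) using getopt.

    The -d and -f options mirror the usage line in the question; the
    defaults below are assumptions made for this sketch.
    """
    opts, args = getopt.getopt(argv, "d:f:")
    delimiter, field = "\t", 1
    for opt, value in opts:
        if opt == "-d":
            delimiter = value
        elif opt == "-f":
            field = int(value)
    return delimiter, field, args


def cut_fields(argv):
    """Yield the requested field of every input line.

    Lines come from the files left over after option parsing, or from
    stdin when no filename was given.
    """
    delimiter, field, files = parse_args(argv)
    with fileinput.input(files) as lines:
        for line in lines:
            parts = line.rstrip("\n").split(delimiter)
            if field <= len(parts):
                yield parts[field - 1]
```

A script would typically call it as for value in cut_fields(sys.argv[1:]): print(value). Note that fileinput.input() keeps module-level state, so the generator should be consumed fully before it is called again.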
In the simplest terms:
import sys
# parse command line
if file_name_given:
    inf = open(file_name_given)
else:
    inf = sys.stdin
At this point you would use inf to read the input. Depending on whether a filename was given, this would read from the given file or from stdin.
When you need to close the file, you can do this:
if inf is not sys.stdin:
    inf.close()
However, in most cases it will be harmless to close sys.stdin if you’re done with it.
Something like:
import sys

if input_from_file:
    f = open(file_name, "rt")
else:
    f = sys.stdin
inL = f.readline()
while inL:  # readline() returns an empty string at end of input
    print(inL.rstrip())
    inL = f.readline()
To make use of Python’s with statement, one can use the following code:
import sys
with open(sys.argv[1], 'r') if len(sys.argv) > 1 else sys.stdin as f:
    # read data using f
    # ......
    pass
I prefer to use "-" as an indicator that you should read from stdin; it’s more explicit:
import sys
# assumes an argument is always given (a filename or "-")
with open(sys.argv[1], 'r') if sys.argv[1] != "-" else sys.stdin as f:
    pass # do something here
I like the general idiom of using a context manager, but the (too) trivial solution ends up closing sys.stdin when you are out of the with statement, which I want to avoid.
Borrowing from this answer, here is a workaround:
import sys
import contextlib

@contextlib.contextmanager
def _smart_open(filename, mode='r'):
    # The legacy 'U' mode flag is not accepted by Python 3.11 and later.
    if filename == '-':
        if mode is None or mode == '' or 'r' in mode:
            fh = sys.stdin
        else:
            fh = sys.stdout
    else:
        fh = open(filename, mode)
    try:
        yield fh
    finally:
        # Only close handles we opened ourselves, never stdin/stdout.
        if filename != '-':
            fh.close()

if __name__ == '__main__':
    args = sys.argv[1:]
    if args == []:
        args = ['-']
    for filearg in args:
        with _smart_open(filearg) as handle:
            do_stuff(handle)  # do_stuff stands in for your own processing
I suppose you could achieve something similar with os.dup(), but the code I cooked up to do that turned out to be more complex and more magical, whereas the above is somewhat clunky but very straightforward.
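This is not the os.dup() route, but on Python 3.7 and later the same "don't close stdin" behaviour can be sketched with contextlib.nullcontext, which wraps an object in a context manager that does nothing on exit. The helper name open_input is made up for this example:

```python
import contextlib
import sys


def open_input(filename=None):
    """Return a context manager that yields a readable text stream.

    A missing filename or "-" selects sys.stdin, wrapped in nullcontext
    so that leaving the with block does not close it. Anything else is
    opened as a regular file, which is closed on exit as usual.
    """
    if filename is None or filename == "-":
        return contextlib.nullcontext(sys.stdin)
    return open(filename)
```

Usage looks the same for both cases: with open_input(name) as fh: followed by whatever reads from fh.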
Not a direct answer, but related.
Normally when you write a Python script you could use the argparse module. If this is the case you can use:
import argparse, sys
parser = argparse.ArgumentParser()
parser.add_argument('infile', nargs='?', type=argparse.FileType('r'), default=sys.stdin)
args = parser.parse_args()
The documentation describes nargs='?' as follows: "One argument will be consumed from the command line if possible, and produced as a single item. If no command-line argument is present, the value from default will be produced."
Here we set default to sys.stdin, so if a file is given it will be read, and if not the input is taken from stdin. Note that we are using a positional argument in the example above.
For more, visit: https://docs.python.org/2/library/argparse.html#nargs
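To tie this back to the usage line in the question, here is a rough sketch of what the whole command line might look like with argparse instead of getopt. The function name build_parser, the dest names, and the defaults for -d and -f are illustrative assumptions:

```python
import argparse
import sys


def build_parser():
    """Build a parser for: tool -d character -f integer [filename]."""
    parser = argparse.ArgumentParser(prog="tool")
    parser.add_argument("-d", dest="delimiter", default="\t",
                        help="field delimiter (assumed default: tab)")
    parser.add_argument("-f", dest="field", type=int, default=1,
                        help="1-based field number (assumed default: 1)")
    # Optional positional: an opened file if given, otherwise stdin.
    parser.add_argument("infile", nargs="?",
                        type=argparse.FileType("r"), default=sys.stdin)
    return parser
```

Calling build_parser().parse_args() then gives an object whose infile attribute can be iterated line by line regardless of where the input comes from.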
Switch to argparse (it’s also part of the standard library) and use an argparse.FileType with a default value of stdin:
import argparse, sys

p = argparse.ArgumentParser()
p.add_argument('input', nargs='?',
               type=argparse.FileType(), default=sys.stdin)
args = p.parse_args()
print(args.input.readlines())
This will not let you specify encoding and other parameters for stdin, however; if you want to do that you need to make the argument non-optional and let FileType do its thing with stdin when - is given as an argument:
p.add_argument('input', type=argparse.FileType(encoding='UTF-8'))
Take heed that this latter case will not honour binary mode ('b') I/O. If you need only that, you can use the default argument technique above, but extract the binary I/O object, e.g., default=sys.stdout.buffer for stdout. However, this will still break if the user specifies - anyway. (With -, stdin/stdout is always wrapped in a TextIOWrapper.)
If you want it to work with -, or have any other arguments you need to provide when opening the file, you can fix the argument if it got wrapped wrong:
p.add_argument('output', type=argparse.FileType('wb'))
args = p.parse_args()
if hasattr(args.output, 'buffer'):
    # If the argument was '-', FileType('wb') ignores the 'b' when
    # wrapping stdout. Fix that by grabbing the underlying binary writer.
    args.output = args.output.buffer
(Hat tip to medhat for mentioning add_argument()’s type parameter.)
A KISS solution is:
import sys

if file == "-":
    content = sys.stdin.read()
else:
    with open(file) as f:
        content = f.read()
print(content) # Or whatever you want to do with the content of the file.