Python3 subprocess output

Question:

I want to run the Linux word count utility wc to determine the number of lines currently in /var/log/syslog, so that I can detect that it’s growing. I’ve tried various tests, and while I get results back from wc, they include both the line count and the file name (e.g., /var/log/syslog).

So it’s returning:
1338 /var/log/syslog
But I only want the line count, so I want to strip off the /var/log/syslog portion, and just keep 1338.

I have tried converting it from a byte string to a string and then stripping the result, but no joy. Same story for decoding and then stripping – everything fails to produce the output I’m looking for.

These are some examples of what I get, with 1338 lines in syslog:

  • b'1338 /var/log/syslog\n'
  • 1338 /var/log/syslog

Here’s some test code I’ve written to try and crack this nut, but no solution:

import subprocess

#check_output returns byte string
stdoutdata = subprocess.check_output("wc --lines /var/log/syslog", shell=True)
print("2A stdoutdata: " + str(stdoutdata))
stdoutdata = stdoutdata.decode("utf-8")
print("2B stdoutdata: " + str(stdoutdata))    
stdoutdata=stdoutdata.strip()
print("2C stdoutdata: " + str(stdoutdata))    

The output from this is:

  • 2A stdoutdata: b'1338 /var/log/syslog\n'

  • 2B stdoutdata: 1338 /var/log/syslog

  • 2C stdoutdata: 1338 /var/log/syslog

  • 2D stdoutdata: 1338 /var/log/syslog

Asked By: user2565677


Answers:

I suggest that you use subprocess.getoutput() as it does exactly what you want—run a command in a shell and get its string output (as opposed to byte string output). Then you can split on whitespace and grab the first element from the returned list of strings.

Try this:

import subprocess
stdoutdata = subprocess.getoutput("wc --lines /var/log/syslog")
print("stdoutdata: " + stdoutdata.split()[0])
Answered By: Joseph Dunn

To avoid invoking a shell and decoding filenames that might be an arbitrary byte sequence (except '\0') on *nix, you could pass the file as stdin:

import subprocess

with open(b'/var/log/syslog', 'rb') as file:
    nlines = int(subprocess.check_output(['wc', '-l'], stdin=file))
print(nlines)
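The reason int() can be applied directly is that wc has no filename to echo when it reads from stdin, so it prints only the number. A minimal sketch wrapping the same idea in a function follows; the name count_lines_via_wc is hypothetical, and it assumes a wc binary is on the PATH:

```python
import subprocess


def count_lines_via_wc(path):
    """Return the line count that `wc -l` reports for path (str or bytes)."""
    # wc reads the file from stdin, so it never sees a filename and
    # prints only the count (possibly padded with spaces).
    with open(path, "rb") as file:
        output = subprocess.check_output(["wc", "-l"], stdin=file)
    # int() accepts bytes and ignores surrounding whitespace and the newline.
    return int(output)
```

Calling count_lines_via_wc(b'/var/log/syslog') should then give the same integer as the snippet above.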

Or you could ignore any decoding errors:

import subprocess

stdoutdata = subprocess.check_output(['wc', '-l', '/var/log/syslog'])
nlines = int(stdoutdata.decode('ascii', 'ignore').partition(' ')[0])
print(nlines)
Answered By: jfs

Since Python 3.6 you can make check_output() return a str instead of bytes by giving it an encoding parameter:

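from subprocess import check_output

# Without shell=True the command must be given as a list of arguments.
check_output(['wc', '--lines', '/var/log/syslog'], encoding='UTF-8')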

But since you just want the count, and both split() and int() are usable with bytes, you don’t need to bother with the encoding:

linecount = int(check_output(['wc', '-l', '/var/log/syslog']).split()[0])
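To see the bytes handling in isolation, here is a small sketch that uses a hard-coded sample shaped like the wc output in the question (not captured from a live run), so it works without calling wc:

```python
# Sample bytes shaped like the wc output shown in the question (an assumption,
# not captured live).
raw = b"1338 /var/log/syslog\n"

# bytes.split() with no arguments splits on ASCII whitespace,
# which also discards the trailing newline.
fields = raw.split()

# int() accepts a bytes object that contains an integer literal.
linecount = int(fields[0])
print(linecount)
```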

While some things might be easier with an external program (e.g., counting log line entries printed by journalctl), in this particular case you don’t need to use an external program. The simplest Python-only solution is:

with open('/var/log/syslog', 'rt') as f:
    linecount = len(f.readlines())

This has the disadvantage of reading the entire file into memory. For a huge file, initialize linecount = 0 before opening the file and replace readlines() with a for line in f: linecount += 1 loop, so that only a small part of the file is in memory at any time.
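A minimal sketch of that streaming variant, written as a function (the name count_lines is hypothetical). Opening the file in binary mode is my assumption rather than part of the answer; it avoids decoding errors if the log contains bytes that are not valid text:

```python
def count_lines(path):
    """Count lines by iterating over the file, one line in memory at a time."""
    linecount = 0
    # Binary mode (an assumption) sidesteps any text decoding problems.
    with open(path, "rb") as f:
        for _ in f:
            linecount += 1
    return linecount
```

One difference worth knowing: wc -l counts newline characters, so for a file whose last line has no trailing newline this function reports one more than wc does.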

Answered By: cjs

This one is equivalent to Curt J. Sampson’s (cjs) answer and also returns a string:

subprocess.check_output('wc -l /path/to/your/file | cut -d " " -f1', universal_newlines=True, shell=True)

From the docs:

If encoding or errors are specified, or text is true, file objects for
stdin, stdout and stderr are opened in text mode using the specified
encoding and errors or the io.TextIOWrapper default. The
universal_newlines argument is equivalent to text and is provided for
backwards compatibility. By default, file objects are opened in binary
mode.

Something similar, but a bit more complex using subprocess.run():

subprocess.run(command, shell=True, check=True, universal_newlines=True, stdout=subprocess.PIPE).stdout

since subprocess.check_output() is equivalent to calling subprocess.run() with check=True and stdout=subprocess.PIPE and then taking the stdout attribute of the result.
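A small sketch comparing the two calls side by side. To keep it runnable on machines without wc, the command here is a hypothetical stand-in that uses the current Python interpreter to print a wc-style line; shell=True is left out because the command is given as a list:

```python
import subprocess
import sys

# Hypothetical stand-in for wc: prints a line shaped like its output.
command = [sys.executable, "-c", "print('1338 /var/log/syslog')"]

# check_output() returns the captured stdout directly.
via_check_output = subprocess.check_output(command, universal_newlines=True)

# run() returns a CompletedProcess; the captured text is in its stdout attribute.
via_run = subprocess.run(
    command, check=True, universal_newlines=True, stdout=subprocess.PIPE
).stdout

# Both are str, so the count can be split off the same way.
line_count = int(via_run.split()[0])
print(line_count)
```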

Answered By: Catalin B.

getoutput (and the closer replacement getstatusoutput) are not direct replacements for check_output – there are security changes in 3.x that prevent some previous commands from working that way (my script was attempting to work with iptables and failing with the new commands). It is better to adapt to the new Python 3 output and add the argument universal_newlines=True:

check_output(command, universal_newlines=True)

This call behaves as you would expect check_output to, but returns string output instead of bytes. It’s a direct replacement.

Answered By: tk421storm