python: read file continuously, even after it has been logrotated
Question:
I have a simple python script, where I read logfile continuosly (same as tail -f
)
while True:
line = f.readline()
if line:
print line,
else:
time.sleep(0.1)
How can I make sure that I can still read the logfile, after it has been rotated by logrotate?
i.e. I need to do the same what tail -F
would do.
I am using python 2.7
Answers:
You can do it by keeping track of where you are in the file and reopening it when you want to read. When the log file rotates, you notice that the file is smaller and since you reopen, you handle any unlinking too.
import time
cur = 0
while True:
try:
with open('myfile') as f:
f.seek(0,2)
if f.tell() < cur:
f.seek(0,0)
else:
f.seek(cur,0)
for line in f:
print line.strip()
cur = f.tell()
except IOError, e:
pass
time.sleep(1)
This example hides errors like file not found because I’m not sure of logrotate details such as small periods of time where the file is not available.
NOTE: In python 3, things are different. A regular open
translates bytes
to str
and the interim buffer used for that conversion means that seek
and tell
don’t operate properly (except when seeking to 0 or the end of file). Instead, open in binary mode ("rb") and do the decode manually line by line. You’ll have to know the file encoding and what that encoding’s newline looks like. For utf-8, its b"n"
(one of the reasons utf-8 is superior to utf-16, btw).
As long as you only plan to do this on Unix, the most robust way is probably to check so that the open file still refers to the same i-node as the name, and reopen it when that is no longer the case. You can get the i-number of the file from os.stat
and os.fstat
, in the st_ino
field.
It could look like this:
import os, sys, time
name = "logfile"
current = open(name, "r")
curino = os.fstat(current.fileno()).st_ino
while True:
while True:
buf = current.read(1024)
if buf == "":
break
sys.stdout.write(buf)
try:
if os.stat(name).st_ino != curino:
new = open(name, "r")
current.close()
current = new
curino = os.fstat(current.fileno()).st_ino
continue
except IOError:
pass
time.sleep(1)
I doubt this works on Windows, but since you’re speaking in terms of tail
, I’m guessing that’s not a problem. 🙂
Thanks to @tdelaney and @Dolda2000’s answers, I ended up with what follows. It should work on both Linux and Windows, and also handle logrotate’s copytruncate
or create
options (respectively copy then truncate size to 0 and move then recreate file).
file_name = 'my_log_file'
seek_end = True
while True: # handle moved/truncated files by allowing to reopen
with open(file_name) as f:
if seek_end: # reopened files must not seek end
f.seek(0, 2)
while True: # line reading loop
line = f.readline()
if not line:
try:
if f.tell() > os.path.getsize(file_name):
# rotation occurred (copytruncate/create)
f.close()
seek_end = False
break
except FileNotFoundError:
# rotation occurred but new file still not created
pass # wait 1 second and retry
time.sleep(1)
do_stuff_with(line)
A limitation when using copytruncate
option is that if lines are appended to the file while time-sleeping, and rotation occurs before wake-up, the last lines will be “lost” (they will still be in the now “old” log file, but I cannot see a decent way to “follow” that file to finish reading it). This limitation is not relevant with “move and create” create
option because f
descriptor will still point to the renamed file and therefore last lines will be read before the descriptor is closed and opened again.
I made a variation of the awesome above one by @pawamoy into a generator function one for my log monitoring and following needs.
def tail_file(file):
"""generator function that yields new lines in a file
:param file:File Path as a string
:type file: str
:rtype: collections.Iterable
"""
seek_end = True
while True: # handle moved/truncated files by allowing to reopen
with open(file) as f:
if seek_end: # reopened files must not seek end
f.seek(0, 2)
while True: # line reading loop
line = f.readline()
if not line:
try:
if f.tell() > os.path.getsize(file):
# rotation occurred (copytruncate/create)
f.close()
seek_end = False
break
except FileNotFoundError:
# rotation occurred but new file still not created
pass # wait 1 second and retry
time.sleep(1)
yield line
Which can be easily used like the below
import os, time
access_logfile = '/var/log/syslog'
loglines = tail_file(access_logfile)
for line in loglines:
print(line)
Using ‘tail -F
-F same as –follow=name –retr
-f, –follow[={name|descriptor}] output appended data as the file grows;
–retry keep trying to open a file if it is inaccessible
-F option will follow the name of the file not descriptor.
So when logrotate happens, it will follow the new file.
import subprocess
def tail(filename: str) -> Generator[str, None, None]:
proc = subprocess.Popen(["tail", "-F", filename], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
while True:
line = proc.stdout.readline()
if line:
yield line.decode("utf-8")
else:
break
for line in tail("/config/logs/openssh/current"):
print(line.strip())
I have a simple python script, where I read logfile continuosly (same as tail -f
)
while True:
line = f.readline()
if line:
print line,
else:
time.sleep(0.1)
How can I make sure that I can still read the logfile, after it has been rotated by logrotate?
i.e. I need to do the same what tail -F
would do.
I am using python 2.7
You can do it by keeping track of where you are in the file and reopening it when you want to read. When the log file rotates, you notice that the file is smaller and since you reopen, you handle any unlinking too.
import time
cur = 0
while True:
try:
with open('myfile') as f:
f.seek(0,2)
if f.tell() < cur:
f.seek(0,0)
else:
f.seek(cur,0)
for line in f:
print line.strip()
cur = f.tell()
except IOError, e:
pass
time.sleep(1)
This example hides errors like file not found because I’m not sure of logrotate details such as small periods of time where the file is not available.
NOTE: In python 3, things are different. A regular open
translates bytes
to str
and the interim buffer used for that conversion means that seek
and tell
don’t operate properly (except when seeking to 0 or the end of file). Instead, open in binary mode ("rb") and do the decode manually line by line. You’ll have to know the file encoding and what that encoding’s newline looks like. For utf-8, its b"n"
(one of the reasons utf-8 is superior to utf-16, btw).
As long as you only plan to do this on Unix, the most robust way is probably to check so that the open file still refers to the same i-node as the name, and reopen it when that is no longer the case. You can get the i-number of the file from os.stat
and os.fstat
, in the st_ino
field.
It could look like this:
import os, sys, time
name = "logfile"
current = open(name, "r")
curino = os.fstat(current.fileno()).st_ino
while True:
while True:
buf = current.read(1024)
if buf == "":
break
sys.stdout.write(buf)
try:
if os.stat(name).st_ino != curino:
new = open(name, "r")
current.close()
current = new
curino = os.fstat(current.fileno()).st_ino
continue
except IOError:
pass
time.sleep(1)
I doubt this works on Windows, but since you’re speaking in terms of tail
, I’m guessing that’s not a problem. 🙂
Thanks to @tdelaney and @Dolda2000’s answers, I ended up with what follows. It should work on both Linux and Windows, and also handle logrotate’s copytruncate
or create
options (respectively copy then truncate size to 0 and move then recreate file).
file_name = 'my_log_file'
seek_end = True
while True: # handle moved/truncated files by allowing to reopen
with open(file_name) as f:
if seek_end: # reopened files must not seek end
f.seek(0, 2)
while True: # line reading loop
line = f.readline()
if not line:
try:
if f.tell() > os.path.getsize(file_name):
# rotation occurred (copytruncate/create)
f.close()
seek_end = False
break
except FileNotFoundError:
# rotation occurred but new file still not created
pass # wait 1 second and retry
time.sleep(1)
do_stuff_with(line)
A limitation when using copytruncate
option is that if lines are appended to the file while time-sleeping, and rotation occurs before wake-up, the last lines will be “lost” (they will still be in the now “old” log file, but I cannot see a decent way to “follow” that file to finish reading it). This limitation is not relevant with “move and create” create
option because f
descriptor will still point to the renamed file and therefore last lines will be read before the descriptor is closed and opened again.
I made a variation of the awesome above one by @pawamoy into a generator function one for my log monitoring and following needs.
def tail_file(file):
"""generator function that yields new lines in a file
:param file:File Path as a string
:type file: str
:rtype: collections.Iterable
"""
seek_end = True
while True: # handle moved/truncated files by allowing to reopen
with open(file) as f:
if seek_end: # reopened files must not seek end
f.seek(0, 2)
while True: # line reading loop
line = f.readline()
if not line:
try:
if f.tell() > os.path.getsize(file):
# rotation occurred (copytruncate/create)
f.close()
seek_end = False
break
except FileNotFoundError:
# rotation occurred but new file still not created
pass # wait 1 second and retry
time.sleep(1)
yield line
Which can be easily used like the below
import os, time
access_logfile = '/var/log/syslog'
loglines = tail_file(access_logfile)
for line in loglines:
print(line)
Using ‘tail -F
-F same as –follow=name –retr
-f, –follow[={name|descriptor}] output appended data as the file grows;
–retry keep trying to open a file if it is inaccessible
-F option will follow the name of the file not descriptor.
So when logrotate happens, it will follow the new file.
import subprocess
def tail(filename: str) -> Generator[str, None, None]:
proc = subprocess.Popen(["tail", "-F", filename], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
while True:
line = proc.stdout.readline()
if line:
yield line.decode("utf-8")
else:
break
for line in tail("/config/logs/openssh/current"):
print(line.strip())