Parsing Nginx logs

Question:

I am trying to use Python to parse the Nginx error log file to determine whether a particular error occurred in the last 15 minutes, and then take some action based on that.

I have nothing significant to show yet because I am totally unsure how to do this. The rest of my script is done, in the sense that it does everything I need except parse the log file. I have searched Google and SO but have not found anything that helps. I figured out how to check the last, say, 10 lines, but that doesn’t solve the time issue.

I’m hoping someone can give me some direction, or an example of how to parse a log file that picks out the time and the error message (shown below as <error I need to find>).

I know there is no code here and SO normally wants code, but I have nothing to show for this part of the script and don’t think there is a better Stack Exchange site for this question. It seems a bit basic for the software engineering one.

This is an example of the log file entry that I need to find:

2019/03/15 14:22:59 [error] 14064#0: <error I need to find>, client: XXX.XXX.XXX.XXX, server: example.com, request: "POST /hello", host: "example.com"
Asked By: user9753902

||

Answers:

You can use a regular expression pattern to find the different parts of the logs that you are interested in. You can isolate the different parts using round brackets, ( and ), into "groups". For example, if you are interested in the date and the error message of a line in the log file you could use Python’s re module like this:

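import re

# Group 1 captures the timestamp, group 2 captures the error message.
pattern = r'^(\d+/\d+/\d+ \d+:\d+:\d+)\s+\S+\s+\S+\s+(.+), client'
match = re.search(pattern, line)  # where line is a single line in the log
if match:  # re.search returns None when the line does not match
    date_time = match.group(1)  # group(0) would be the entire matched text
    error_message = match.group(2)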

You can see what each part of the pattern I used is for and play around with it here.
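As a quick check, here is a minimal sketch that applies the pattern from this answer (with its backslashes written out) to the sample line from the question. The names LOG_PATTERN and parse_line are only illustrative, and the error text is the question's own placeholder.

```python
import re

# Illustrative names; the pattern is the one from the answer, precompiled.
LOG_PATTERN = re.compile(r'^(\d+/\d+/\d+ \d+:\d+:\d+)\s+\S+\s+\S+\s+(.+), client')

def parse_line(line):
    """Return (timestamp, message) for a matching line, or None otherwise."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    # group(1) is the timestamp, group(2) is the error message
    return match.group(1), match.group(2)

sample = ('2019/03/15 14:22:59 [error] 14064#0: <error I need to find>, '
          'client: XXX.XXX.XXX.XXX, server: example.com, '
          'request: "POST /hello", host: "example.com"')

print(parse_line(sample))
# ('2019/03/15 14:22:59', '<error I need to find>')
```

Returning None for lines that do not match lets the caller skip them instead of hitting an AttributeError on match.group.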

Since you’re only interested in the logs of the last 15 minutes, you could either use another regular expression or Python’s datetime module to parse the date and compare it to the current time. You could also do a combination of the two and write a less complex pattern that would eliminate obviously old logs before converting the date to a datetime object.
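One way to do the cheap pre-filtering is sketched below. It uses a plain string prefix check rather than a second regular expression, and it assumes every line starts with a date in the YYYY/MM/DD form shown in the question. The function names are hypothetical.

```python
from datetime import datetime, timedelta

def recent_date_prefixes(now=None, window=timedelta(minutes=15)):
    """Return the date prefixes a line from the last `window` could start with."""
    now = now or datetime.now()
    start = now - window
    # A 15-minute window can straddle midnight, so up to two dates qualify.
    return tuple({start.strftime("%Y/%m/%d"), now.strftime("%Y/%m/%d")})

def might_be_recent(line, prefixes):
    """Cheap test: discard lines whose date is obviously too old."""
    # str.startswith accepts a tuple of candidate prefixes
    return line.startswith(prefixes)
```

Lines that pass this check still need the full timestamp comparison; the prefix test only avoids calling strptime on entries from earlier days.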

To create a datetime object from the date string that you got above, you can use the datetime.strptime method. It parses a string into a datetime object given a format. You can specify the format using the directives listed here. You could write a method like this to check if the date string is within the past 15 minutes:

from datetime import datetime, timedelta

MAX_DIFF = timedelta(minutes=15)
DATE_FORMAT = "%Y/%m/%d %H:%M:%S"  # matches e.g. 2019/03/15 14:22:59

def is_recent_date(date_string):
    # datetime.now() is naive local time; this assumes the log uses the same time zone
    current_time = datetime.now()
    date_object = datetime.strptime(date_string, DATE_FORMAT)
    diff = current_time - date_object
    return diff < MAX_DIFF
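
Putting the two pieces together might look roughly like the sketch below. The function recent_errors and its parameters are hypothetical, and the log path in the usage comment is an assumption about a typical install, so adjust it to your setup.

```python
import re
from datetime import datetime, timedelta

MAX_DIFF = timedelta(minutes=15)
DATE_FORMAT = "%Y/%m/%d %H:%M:%S"
LOG_PATTERN = re.compile(r'^(\d+/\d+/\d+ \d+:\d+:\d+)\s+\S+\s+\S+\s+(.+), client')

def recent_errors(lines, wanted, now=None):
    """Yield (timestamp, message) for recent lines whose message contains `wanted`."""
    now = now or datetime.now()
    for line in lines:
        match = LOG_PATTERN.search(line)
        if match is None:
            continue  # not in the expected format
        try:
            when = datetime.strptime(match.group(1), DATE_FORMAT)
        except ValueError:
            continue  # digits matched the pattern but are not a valid date
        if now - when < MAX_DIFF and wanted in match.group(2):
            yield when, match.group(2)

# Example usage (the path is an assumption, not taken from the question):
# with open("/var/log/nginx/error.log") as log_file:
#     hits = list(recent_errors(log_file, "<error I need to find>"))
#     if hits:
#         ...  # do whatever the rest of the script needs
```

Taking any iterable of lines rather than a file name keeps the function easy to try out on a handful of sample strings.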
Answered By: D Malan