How to filter logs from gunicorn?
Question:
I have a Flask API with gunicorn. Gunicorn logs all the requests to my API, e.g.
172.17.0.1 - - [19/Sep/2018:13:50:58 +0000] "GET /api/v1/myview HTTP/1.1" 200 16 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36"
However, I want to filter the logs to exclude a certain endpoint which is called by another service every few seconds.
I wrote a filter to exclude this endpoint from being logged:
class NoReadyFilter(logging.Filter):
    def filter(self, record):
        return record.getMessage().find('/api/v1/ready') == -1
and if I add this filter to the werkzeug logger and use the Flask development server, the filter works. Requests to /api/v1/ready won’t appear in the log files. However, I can’t seem to add the filter to the gunicorn logger. With the following code, requests to /api/v1/ready still appear:
if __name__ != '__main__':
    gunicorn_logger = logging.getLogger('gunicorn.glogging.Logger')
    gunicorn_logger.setLevel(logging.INFO)
    gunicorn_logger.addFilter(NoReadyFilter())
How can you add a filter to the gunicorn logger? I tried adding it to the gunicorn.error logger as suggested here, but it didn’t help.
Answers:
I finally found a way of doing it by creating a subclass that inherits from gunicorn.glogging.Logger:
import logging
from gunicorn import glogging

class CustomGunicornLogger(glogging.Logger):
    def setup(self, cfg):
        super().setup(cfg)
        # Add filters to the Gunicorn access logger
        logger = logging.getLogger("gunicorn.access")
        logger.addFilter(NoReadyFilter())
You can then provide this class as a parameter for gunicorn, e.g.
gunicorn --logger-class "myproject.CustomGunicornLogger" app
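The same class can also be selected from the configuration file rather than the command line, through Gunicorn's logger_class setting. A minimal fragment, assuming the class lives at the myproject.CustomGunicornLogger path used in the command above:

```python
# gunicorn.conf.py
# Same effect as passing --logger-class on the command line.
# The dotted path is taken from the command above; adjust it to your project.
logger_class = "myproject.CustomGunicornLogger"
```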
It’s an old question, but what you did does not work because you fetch the wrong gunicorn logger. The access log is not on the gunicorn.error logger but on the gunicorn.access logger (cf. https://github.com/benoitc/gunicorn/blob/b2dc0364630c26cc315ee417f9c20ce05ad01211/gunicorn/glogging.py#L61)
Define your class like you did:
class NoReadyFilter(logging.Filter):
    def filter(self, record):
        return record.getMessage().find('/api/v1/ready') == -1
Then in the main entrypoint of your app:
if __name__ != "__main__":
    gunicorn_logger = logging.getLogger("gunicorn.access")
    gunicorn_logger.addFilter(NoReadyFilter())
gunicorn run command: gunicorn --access-logfile=- --log-file=- -b 0.0.0.0:5000 entrypoint:app
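Why the logger name matters can be checked with the standard library alone: a filter attached to a logger only sees records logged through that same logger. The sketch below does not start Gunicorn; it logs a fake access line through "gunicorn.access" and attaches the filter to different names. ListHandler and emit_ready_request are hypothetical helpers introduced for the illustration.

```python
import logging


class NoReadyFilter(logging.Filter):
    def filter(self, record):
        return "/api/v1/ready" not in record.getMessage()


class ListHandler(logging.Handler):
    """Collects messages in memory so the result is easy to inspect."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


def emit_ready_request(filtered_logger_name):
    """Attach the filter to one logger, then log via "gunicorn.access"."""
    access = logging.getLogger("gunicorn.access")
    access.setLevel(logging.INFO)
    access.propagate = False  # keep the demo output out of the root logger
    handler = ListHandler()
    access.addHandler(handler)

    target = logging.getLogger(filtered_logger_name)
    flt = NoReadyFilter()
    target.addFilter(flt)
    try:
        access.info('"GET /api/v1/ready HTTP/1.1" 200')
        return handler.messages
    finally:
        # Undo the changes so repeated calls start from a clean state.
        target.removeFilter(flt)
        access.removeHandler(handler)


# Filter on the wrong names: the request is still logged.
print(emit_ready_request("gunicorn.glogging.Logger"))
print(emit_ready_request("gunicorn.error"))
# Filter on the access logger: the request is dropped.
print(emit_ready_request("gunicorn.access"))
```

Only the last call returns an empty list, which matches the behaviour described in the question.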
While a custom logging class would work, it is probably overkill for a simple access log filter. Instead, I would use Gunicorn’s on_starting() server hook to add a filter to the access logger.
The hook can be added in the settings file (default gunicorn.conf.py), so all gunicorn configuration stays in one place.
import logging
import re

wsgi_app = 'myapp.wsgi'
bind = '0.0.0.0:9000'
workers = 5
accesslog = '-'


class RequestPathFilter(logging.Filter):
    def __init__(self, *args, path_re, **kwargs):
        super().__init__(*args, **kwargs)
        self.path_filter = re.compile(path_re)

    def filter(self, record):
        req_path = record.args['U']
        if not self.path_filter.match(req_path):
            return True  # log this entry
        # ... additional conditions can be added here ...
        return False  # do not log this entry


def on_starting(server):
    server.log.access_log.addFilter(RequestPathFilter(path_re=r'^/api/v1/ready$'))
Some notes on this sample implementation:
- RequestPathFilter can also be nested in on_starting() to hide it from external modules.
- Filtering is applied on record.args. This contains the raw values used to construct the logging message.
- Applying filtering on the results of record.getMessage() instead of the raw values is bad because:
  - Gunicorn will already have done the work of constructing the message.
  - The filtering mechanism can be manipulated by the client. This would allow e.g. an attacker to hide their activities by setting their user agent to Wget/1.20.1/api/v1/ready (linux-gnu).
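The user-agent trick from the last point can be reproduced without running Gunicorn. The sketch below builds log records by hand with a dict of values as args, similar in spirit to what the access logger passes. ACCESS_FORMAT, make_record, and the two filter classes are simplified stand-ins made up for this illustration, not Gunicorn's real format or code.

```python
import logging
import re

# Simplified stand-in for an access log format (assumption, not Gunicorn's default).
ACCESS_FORMAT = '%(h)s "%(r)s" %(s)s "%(a)s"'


class SubstringFilter(logging.Filter):
    """Filters on the formatted message, like NoReadyFilter in the question."""

    def filter(self, record):
        return "/api/v1/ready" not in record.getMessage()


class PathFilter(logging.Filter):
    """Filters on the raw request path stored under the 'U' key."""

    def __init__(self, path_re):
        super().__init__()
        self.path_re = re.compile(path_re)

    def filter(self, record):
        return not self.path_re.match(record.args["U"])


def make_record(path, user_agent):
    """Build a record whose args is a dict of raw values (hypothetical helper)."""
    atoms = {
        "h": "172.17.0.1",
        "r": f"GET {path} HTTP/1.1",
        "s": "200",
        "U": path,
        "a": user_agent,
    }
    # A single mapping wrapped in a tuple becomes record.args itself.
    return logging.LogRecord(
        "gunicorn.access", logging.INFO, "example.py", 0, ACCESS_FORMAT, (atoms,), None
    )


spoofed = make_record("/api/v1/admin", "Wget/1.20.1/api/v1/ready (linux-gnu)")
ready = make_record("/api/v1/ready", "curl/7.64.0")
path_filter = PathFilter(r"^/api/v1/ready$")

print(SubstringFilter().filter(spoofed))  # False: the spoofed request is hidden
print(path_filter.filter(spoofed))        # True: the request is still logged
print(path_filter.filter(ready))          # False: only the real endpoint is dropped
```

Matching on the path value alone means nothing the client puts in other headers can influence whether the entry is logged.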