Python logging into file as a dictionary or JSON

Question:

I am trying to set up logging where I can log in both stdout and on to a file. This i have accomplished using the following code:

logging.basicConfig(
        level=logging.DEBUG, format='%(asctime)-15s %(levelname)-8s %(message)s',
        datefmt='%a, %d %b %Y %H:%M:%S', handlers=[logging.FileHandler(path), logging.StreamHandler()])

The output of this something like this:

2018-05-02 18:43:33,295 DEBUG    Starting new HTTPS connection (1): google.com
2018-05-02 18:43:33,385 DEBUG    https://google.com:443 "GET / HTTP/1.1" 301 220
2018-05-02 18:43:33,389 DEBUG    Starting new HTTPS connection (1): www.google.com
2018-05-02 18:43:33,490 DEBUG    https://www.google.com:443 "GET / HTTP/1.1" 200 None

What I am trying to accomplish is logging this output to a file not as it is printing to stdout, but as a dictionary or JSON object similar to something like this (while keeping the stdout as it is at the moment):

[{'time': '2018-05-02 18:43:33,295', 'level': 'DEBUG', 'message': 'Starting new HTTPS connection (1): google.com'}, {...}, {...}]

Is this doable? I understand that I can post process this log file after my process is finished, but I am looking for a more elegant solution because certain things i am logging are quite big objects themselves.

Asked By: securisec

||

Answers:

So based on @abarnert, i found this Link which provided a good path to making this concept work for the most part. The code as it stands is:

logger=logging.getLogger()
logger.setLevel(logging.DEBUG)

file_handler=logging.FileHandler('foo.log')
stream_handler=logging.StreamHandler()

stream_formatter=logging.Formatter(
    '%(asctime)-15s %(levelname)-8s %(message)s')
file_formatter=logging.Formatter(
    "{'time':'%(asctime)s', 'name': '%(name)s', 
    'level': '%(levelname)s', 'message': '%(message)s'}"
)

file_handler.setFormatter(file_formatter)
stream_handler.setFormatter(stream_formatter)

logger.addHandler(file_handler)
logger.addHandler(stream_handler)

Although it does not fully meet the requirement, it doesnt require any pre processing, and allows me to create two log handlers.

Afterwards, i can use something like:

with open('foo.log') as f:
    logs = f.read().splitlines()
for l in logs:
    for key, value in eval(l):
        do something ...

to pull dict objects instead of fighting with improperly formatted JSON to accomplish what i had set out to accomplish.

Still am hoping for a more elegant solution.

Answered By: securisec

With this code you can add the full traceback, timestamp and level to a json file of choice.

import json
import traceback
from datetime import datetime

def addLogging(logDict:dict):
    loggingsFile = 'loggings.json'

    with open(loggingsFile) as f:
        data = json.load(f)

    data.append(logDict)

    with open(loggingsFile, 'w') as f:
        json.dump(data, f)

def currentTimeUTC():
    return datetime.now().strftime('%d/%m/%Y %H:%M:%S')

try:
    print(5 / 0)
except ZeroDivisionError:
    fullTraceback = str(traceback.format_exc())
    addLogging({'timestamp': currentTimeUTC(), 'level': 'error', 'traceback': fullTraceback})

Output:

[
    {
        "timestamp": "09/06/2020 17:38:00",
        "level": "error",
        "traceback": "Traceback (most recent call last):n  File "d:testFullTraceback.py", line 19, in <module>n    print(5/0)nZeroDivisionError: division by zeron"
    }
]
Answered By: Kobe Janssens

If you don’t mind pip installing a module to help, there is the json_log_formatter module. The json output does have more attributes than requested. The repo mentions customizing the attributes outputted, but I have not yet integrated that into a working example just yet.

json_log_formatter

import logging
import json_log_formatter

# Set Basic Logging 
self.loggers = logging.getLogger(__name__)
self.loggers.setLevel(logging.DEBUG)
self.formatter = logging.Formatter(fmt='%(asctime)-15s %(levelname)-8s %(message)s', datefmt = '%a, %d %b %Y %H:%M:%S')


# Config for JSON File Handler
self.logFileHandler = logging.FileHandler(SOME-PATH, mode='a')
self.fileFormatter = json_log_formatter.VerboseJSONFormatter()
self.logFileHandler.setFormatter(self.fileFormatter)
self.logFileHandler.setLevel(logging.INFO)
self.loggers.addHandler(self.logFileHandler)


# Config for Stream Handler
self.logStreamHandler = logging.StreamHandler()
self.logStreamHandler.setFormatter(self.formatter)
self.logStreamHandler.setLevel(logging.INFO)
self.loggers.addHandler(self.logStreamHandler)
Answered By: deesolie

I too dealt with this and I personally believe that an external library might be an overkill for something like this.

I studied a bit the code behind logging.Formatter and came up with a subclass which in my case does the trick (my goal was to have a JSON file that Filebeat can read to further log into ElasticSearch).

Class:

import logging
import json


class JsonFormatter(logging.Formatter):
    """
    Formatter that outputs JSON strings after parsing the LogRecord.

    @param dict fmt_dict: Key: logging format attribute pairs. Defaults to {"message": "message"}.
    @param str time_format: time.strftime() format string. Default: "%Y-%m-%dT%H:%M:%S"
    @param str msec_format: Microsecond formatting. Appended at the end. Default: "%s.%03dZ"
    """
    def __init__(self, fmt_dict: dict = None, time_format: str = "%Y-%m-%dT%H:%M:%S", msec_format: str = "%s.%03dZ"):
        self.fmt_dict = fmt_dict if fmt_dict is not None else {"message": "message"}
        self.default_time_format = time_format
        self.default_msec_format = msec_format
        self.datefmt = None

    def usesTime(self) -> bool:
        """
        Overwritten to look for the attribute in the format dict values instead of the fmt string.
        """
        return "asctime" in self.fmt_dict.values()

    def formatMessage(self, record) -> dict:
        """
        Overwritten to return a dictionary of the relevant LogRecord attributes instead of a string. 
        KeyError is raised if an unknown attribute is provided in the fmt_dict. 
        """
        return {fmt_key: record.__dict__[fmt_val] for fmt_key, fmt_val in self.fmt_dict.items()}

    def format(self, record) -> str:
        """
        Mostly the same as the parent's class method, the difference being that a dict is manipulated and dumped as JSON
        instead of a string.
        """
        record.message = record.getMessage()
        
        if self.usesTime():
            record.asctime = self.formatTime(record, self.datefmt)

        message_dict = self.formatMessage(record)

        if record.exc_info:
            # Cache the traceback text to avoid converting it multiple times
            # (it's constant anyway)
            if not record.exc_text:
                record.exc_text = self.formatException(record.exc_info)

        if record.exc_text:
            message_dict["exc_info"] = record.exc_text

        if record.stack_info:
            message_dict["stack_info"] = self.formatStack(record.stack_info)

        return json.dumps(message_dict, default=str)

Usage:

The formatter must simply be passed to the logging handler.

    json_handler = FileHandler("foo.json")
    json_formatter = JsonFormatter({"level": "levelname", 
                                    "message": "message", 
                                    "loggerName": "name", 
                                    "processName": "processName",
                                    "processID": "process", 
                                    "threadName": "threadName", 
                                    "threadID": "thread",
                                    "timestamp": "asctime"})
    json_handler.setFormatter(json_formatter)

Explanation :

While the logging.Formatter takes a string which it interpolates to output the formatted log record, the JsonFormatter takes a dictionary where the key will be the key of the logged value in the JSON string and the value is a string corresponding to an attribute of the LogRecord that can be logged.
(List available in the docs here)

Main "problem" would be parsing dates and timestamps, and the default formatter implementation has these class attributes, default_time_format and default_msec_format.

default_msec_format gets passed to time.strftime() and default_msec_format is interpolated to append miliseconds as time.strftime() doesn’t provide formatting options for those.

The principle is that those are now instance attributes which can be provided in the form of time_format and msec_format to customize how the parent’s class (unchanged, as it’s not overwritten) formatTime() method behaves.

You could technically override it if you want to customize time formatting, but I personally found that using something else would either be redundant or limit the actual formatting options. But feel free to adjust as to your needs.

Output:

An example JSON record logged by the formatting options above, with the default time formatting options set in the class, would be:

{"level": "INFO", "message": "Starting service...", "loggerName": "root", "processName": "MainProcess", "processID": 25103, "threadName": "MainThread", "threadID": 4721200640, "timestamp": "2021-12-04T08:25:07.610Z"}
Answered By: Bogdan Mircea

I wanted to save full LogRecord object, so I can later inspect my log with maximum of integration with module. So I inspected object like that:

class Handler_json(Handler):

def emit(self, record: LogRecord) -> None:
    json_data = {}
    for attr in filter(lambda attr: not attr.endswith("__"), dir(record)):
        json_data[attr] = record.__getattribute__(attr)
    del json_data["getMessage"]
    print(json_data)

this is subclass of Handler, emit is rewritten method that is been called with every LogRecord. Dir returns all attributes and methods of the object. I am excluding special methods, and also deleting getMessage method, whitch is not needed for json object representation.

This can be nicely integrated in logging like that:

logger = getLogger(__name__)
logger.setLevel(DEBUG)
handle_json = Handler_json()
logger.addHandler(handle_json)
logger.info("my info")

result looks like this:

{
'args': (),
'created': 1639925351.0648422,
'exc_info': None,
'exc_text': None,
'filename': 'my_logging.py',
'funcName': 'restore_log_from_disk',
'levelname': 'INFO',
'levelno': 20,
'lineno': 142,
'module': 'my_logging',
'msecs': 64.84222412109375,
'msg': 'my info',
'name': '__main__',
'pathname': 
'/home/jindrich/PycharmProjects/my_logging.py',
'process': 146331,
'processName': 'MainProcess',
'relativeCreated': 1.6417503356933594,
'stack_info': None,
'thread': 140347192436544,
'threadName': 'MainThread'
}

then you can load data from disk and recreat objects after some digging in doc.

Answered By: jamburg

I was able to create it using python-json-logger lib, it was simple and really easily to use.

Django


from pythonjsonlogger import jsonlogger

##This is to add custom keys
class CustomJsonFormatter(jsonlogger.JsonFormatter):
    def add_fields(self, log_record, record, message_dict):
        super(CustomJsonFormatter, self).add_fields(log_record, record, message_dict)
        log_record['level'] = record.levelname


# Logging
LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'formatters': {
        'json': {
            '()': CustomJsonFormatter, # if you want to use custom logs class defined above
            # '()': jsonlogger.JsonFormatter, # without custom logs
            'format': '%(level)s %(levelname)s %(asctime)s %(module)s %(process)d %(thread)d %(message)s %(threadName)s %(special)s %(run)s
                %(name)s %(created)s %(processName)s %(relativeCreated)d %(funcName)s %(levelno)d %(msecs)d %(pathname)s %(lineno)d %(filename)s'
        },
    },
    'handlers': {
        'null': {
            'class': 'logging.NullHandler',
        },
        'console': {
            'level': 'DEBUG',
            'class': 'logging.StreamHandler',
            'formatter': 'json' 
        },
    },
    .....
}

Flask

from logging.config import dictConfig
from pythonjsonlogger import jsonlogger
import os

# This will set global root logging config across all the modules using in the app

##This is to add custom keys
class CustomJsonFormatter(jsonlogger.JsonFormatter):
    def add_fields(self, log_record, record, message_dict):
        super(CustomJsonFormatter, self).add_fields(log_record, record, message_dict)
        log_record['level'] = record.levelname


def setup():
    LOG_FILE = '/tmp/app/app.json'
    if not os.path.exists(os.path.dirname(LOG_FILE)):  # if LOG_FILE dir doesn't exist, creates it.
        os.makedirs(os.path.dirname(LOG_FILE))
    dictConfig({
        'version': 1,
        'formatters': {
            'default': {
                'format': '%(asctime)s - %(module)s - %(levelname)s - %(message)s',
            },
            'json': {
            '()': CustomJsonFormatter, # if you want to use custom logs class defined above
            # '()': jsonlogger.JsonFormatter, # without custom logs
            'format': '%(level)s %(levelname)s %(asctime)s %(module)s %(process)d %(thread)d %(message)s %(threadName)s %(special)s %(run)s
                %(name)s %(created)s %(processName)s %(relativeCreated)d %(funcName)s %(levelno)d %(msecs)d %(pathname)s %(lineno)d %(filename)s'
        },
        },
        'handlers': {'file': {
            'class': 'logging.handlers.RotatingFileHandler',
            'filename': LOG_FILE,
            'maxBytes': 10485760,
            'backupCount': 5,
            'formatter': 'json'
        },
        'console':{
            'class':'logging.StreamHandler',
            'formatter': 'json'
        }},
        'root': {
            'level': 'INFO',
            'handlers': ['file', 'console']
        }
    })

Hope this helps easy way setup.

Answered By: Avi

I could achieve this result using this custom formatter:

import json
import logging


class CustomJsonFormatter(logging.Formatter):
    def format(self, record: logging.LogRecord) -> str:
        super(CustomJsonFormatter, self).format(record)
        output = {k: str(v) for k, v in record.__dict__.items()}
        return json.dumps(output)


cf = CustomJsonFormatter()
sh = logging.StreamHandler()
sh.setFormatter(cf)

logger = logging.getLogger("my.module")
logger.addHandler(sh)

# simple json output
logger.warning("This is a great %s!", "log")
# enrich json output
logger.warning("This is an even greater %s!", "log", extra={'foo': 'bar'})

Output:

{"name": "my.module", "msg": "This is a great %s!", "args": "('log',)", "levelname": "WARNING", "levelno": "30", "pathname": "/Users/olivier/test.py", "filename": "test.py", "module": "test", "exc_info": "None", "exc_text": "None", "stack_info": "None", "lineno": "20", "funcName": "<module>", "created": "1661868378.5048351", "msecs": "504.8351287841797", "relativeCreated": "1.3060569763183594", "thread": "4640826880", "threadName": "MainThread", "processName": "MainProcess", "process": "81360", "message": "This is a great log!"}
{"name": "my.module", "msg": "This is an even greater %s!", "args": "('log',)", "levelname": "WARNING", "levelno": "30", "pathname": "/Users/olivier/test.py", "filename": "test.py", "module": "test", "exc_info": "None", "exc_text": "None", "stack_info": "None", "lineno": "22", "funcName": "<module>", "created": "1661868378.504962", "msecs": "504.9619674682617", "relativeCreated": "1.4328956604003906", "thread": "4640826880", "threadName": "MainThread", "processName": "MainProcess", "process": "81360", "foo": "bar", "message": "This is an even greater log!"}
Answered By: Olivier Desnoë

I wanted to have JSON output, so that I could handle it nicer in Promtail and Loki. I just updated the formatter in my logging.json, which I use as a dictConfig

"formatters": {
    "normalFormatter": {
        "format": "{"time": "%(asctime)s", "name": "[%(name)s]", "levelname": "%(levelname)s", "message": "%(message)s"}"
    }
}

Load the config and get the root logger like:

import json
import logging


# setup logger
with open("logging.json") as f:
    config_dict = json.load(f)
    logging.config.dictConfig(config_dict)

# get root logger
logger = logging.getLogger(__name__)

If you want to use dictConfig, please make sure you gave all necessary fields.

https://docs.python.org/3/library/logging.config.html

You may want to define a formatter, a handler (which is using the formatter) and the logger (which is using the handler)

e.g. logging.json

{
"version": 1,
"disable_existing_loggers": false,
"formatters": {
    "normalFormatter": {
        "format": "{"time": "%(asctime)s", "name": "[%(name)s]", "levelname": "%(levelname)s", "message": "%(message)s"}"
    }
},
"handlers": {
    "demohandler": {
        "level": "INFO",
        "formatter": "normalFormatter",
        "class": "logging.handlers.TimedRotatingFileHandler",
        "filename": "./files/logs/YourLogFile.log",
        "when": "d",
        "interval": 30,
        "backupCount": 4,
        "utc": true
    }
},
"loggers": {
    "root": {
        "handlers": ["demohandler"],
        "level": "INFO"
    },
    "someModule": {
        "handlers": ["demohandler"],
        "level": "INFO",            
        "propagate": 0
    }        
}
}
Answered By: David G.
Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.