Python execution log

Question:

I’d like to create a log for a Python script execution. For example:

import pandas as pd
data = pd.read_excel('example.xlsx')
data.head()

How can I create a log for this script in order to know who ran it, when it was executed, and when it finished? Also, supposing I take a sample of the df, how can I set a seed so that I can share it with another person, who can then run the script and get the same result?

Asked By: Rfl


Answers:

You could use the logging module from Python's standard library.
You'll have to add a few extra lines of code to configure it to log the information you require (the time of execution and the user executing the script) and to specify the name of the file where the log messages should be stored.

As for recording who ran the script, it depends on how you want to differentiate users. If your script is intended to run on a server, you might want to differentiate users by their IP addresses. Another option is to use the getpass module, as I did in the example below.

Finally, when generating a sample from the data, you can pass an integer seed to the random_state parameter so that the sample always contains the same rows, as long as the data itself doesn't change.

Here’s a modified version of your script with the previously mentioned changes:

# == Necessary Imports =========================================================
import logging
import pandas as pd
import getpass


# == Script Configuration ======================================================
# Set a seed to enable reproducibility
SEED = 1

# Get the username of the person who is running the script.
USERNAME = getpass.getuser()

# Set a format to the logs.
LOG_FORMAT = '[%(levelname)s | ' + USERNAME + ' | %(asctime)s] - %(message)s'

# Name of the file to store the logs.
LOG_FILENAME = 'script_execution.log'

# Minimum level at which messages are logged. The logging module defines the
# following levels by default, in increasing order of severity:
# 1. DEBUG: detailed information, useful only when diagnosing a problem.
# 2. INFO: message that confirms that everything is working as it should.
# 3. WARNING: message with information that requires user attention.
# 4. ERROR: an error has occurred and script is unable to perform some function.
# 5. CRITICAL: serious error occurred and script may stop running properly.
LOG_LEVEL = logging.INFO
# When you set the level, all messages with a higher level of severity are also
# logged. For example, when you set the log level to `INFO`, all `WARNING`,
# `ERROR` and `CRITICAL` messages are also logged, but `DEBUG` messages are not.


# == Set up logging ============================================================
logging.basicConfig(
    level=LOG_LEVEL,
    format=LOG_FORMAT,
    force=True,
    datefmt="%Y-%m-%d %H:%M:%S",
    handlers=[logging.FileHandler(LOG_FILENAME, "a", "utf-8"),
              logging.StreamHandler()]
)


# == Script Start ==============================================================
# Log the script execution start
logging.info('Script started execution!')

# Read data from the Excel file
data = pd.read_excel('example.xlsx')

# Retrieve a sample with 50% of the rows from `data`.
# When a `random_state` is set, `pd.DataFrame.sample` will always return
# the same dataframe, given that `data` doesn't change.
sample_data = data.sample(frac=0.5, random_state=SEED)

# Other stuff
# ...

# Log when the script finishes execution
logging.info('Script finished execution!')


Running the above code prints to the console the following messages:

[INFO | erikingwersen | 2023-02-13 23:17:14] - Script started execution!
[INFO | erikingwersen | 2023-02-13 23:17:14] - Script finished execution!

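It also creates, or appends to, a file named 'script_execution.log' containing the same information that gets printed to the console. Because the file name is a relative path, the file is created in the current working directory, which is the script's directory only if you run the script from there.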
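
One limitation of the script above is that the "finished" message is never written if an exception is raised halfway through. A minimal sketch of one way around that, assuming the real work is wrapped in a function (the names `run_logged` and `main` are hypothetical and not part of the original script), is to log the end of the run in a `finally` block and include the elapsed time:

```python
import logging
import time

logging.basicConfig(
    level=logging.INFO,
    format="[%(levelname)s | %(asctime)s] - %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
)


def run_logged(task):
    """Run `task()` and log its start, its end, and the elapsed time."""
    logging.info("Script started execution!")
    start = time.perf_counter()
    try:
        return task()
    except Exception:
        # logging.exception logs at ERROR level and appends the traceback.
        logging.exception("Script failed!")
        raise
    finally:
        # Runs whether `task` returned normally or raised.
        elapsed = time.perf_counter() - start
        logging.info("Script finished execution after %.2f seconds.", elapsed)


def main():
    # Hypothetical placeholder for the real work (reading the file, sampling, ...).
    return sum(range(10))


result = run_logged(main)
```

The exception is re-raised after being logged, so the script still exits with an error; only the log gains the extra lines.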

Answered By: Ingwersen_erik

1. To create a log

You could use Python's standard logging module.

Logging HOWTO — Python 3.11.2 documentation

import logging
# Note: the `encoding` argument of basicConfig requires Python 3.9 or later.
logging.basicConfig(filename='example.log', encoding='utf-8', level=logging.DEBUG)
logging.debug('This message should go to the log file')
logging.info('So should this')
logging.warning('And this, too')
logging.error('And non-ASCII stuff, too, like Øresund and Malmö')

1.1 To know who ran the script

import getpass
username = getpass.getuser()  # login name of the user running the script

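1.2 To know when it ran

import logging

# %(asctime)s adds the timestamp of each message. %(clientip)s and %(user)s
# are custom fields that must be supplied through `extra` on every call.
FORMAT = '%(asctime)s %(clientip)-15s %(user)-8s %(message)s'
logging.basicConfig(format=FORMAT)
d = {'clientip': '192.168.0.1', 'user': 'fbloggs'}
logger = logging.getLogger('tcpserver')
logger.warning('Protocol problem: %s', 'connection reset', extra=d)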
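
The format in the snippet above takes the user from `extra`, which has to be passed on every logging call. One possible alternative, sketched here under my own assumptions (the names `UserFilter`, `current_user` and `execution_log` are hypothetical), is a `logging.Filter` that stamps the name returned by `getpass.getuser()` on every record, so the timestamp and the user both appear without any extra arguments:

```python
import getpass
import io
import logging


def current_user():
    """Return the login name, or 'unknown' if it cannot be determined."""
    try:
        return getpass.getuser()
    except Exception:
        # The exception type raised by getuser() varies by platform and version.
        return "unknown"


class UserFilter(logging.Filter):
    """Attach the current username to every record as `record.user`."""

    def __init__(self):
        super().__init__()
        self.user = current_user()

    def filter(self, record):
        record.user = self.user
        return True  # never drop the record, only annotate it


# An in-memory stream is used here so the result can be inspected;
# a logging.FileHandler would be set up the same way.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(asctime)s %(user)-8s %(message)s"))
handler.addFilter(UserFilter())

logger = logging.getLogger("execution_log")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the record from also reaching the root logger

logger.info("Script started")
logged_line = stream.getvalue()
```

Here `logged_line` holds one line made of the timestamp, the username and the message.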

2. Create a seed so you can share it with another person to execute it and have the same result

You can use the random_state parameter of sample:

df['one_col'].sample(n=10, random_state=1)
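
As a quick check of that claim, here is a small sketch that samples the same column twice with the same seed and compares the results. The DataFrame is a made-up stand-in for the real data, and the constant `SEED` is just an illustrative name. The result should match for anyone running it on identical data, with the usual caveat that different pandas or NumPy versions are not guaranteed to pick the same rows.

```python
import pandas as pd

SEED = 1  # share this value together with the script

# Hypothetical stand-in for the data loaded from the Excel file.
df = pd.DataFrame({"one_col": range(100)})

first = df["one_col"].sample(n=10, random_state=SEED)
second = df["one_col"].sample(n=10, random_state=SEED)

# Same seed and same data: the same rows come back in the same order.
same_seed_matches = first.equals(second)

# A different seed will generally select different rows.
other = df["one_col"].sample(n=10, random_state=SEED + 1)
```

If the other person's file has different rows, or the same rows in a different order, the sample will differ even with the same seed.
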
Answered By: Dellon