How can I profile a multithreaded program in Python?

Question:

I’m developing an inherently multithreaded module in Python, and I’d like to find out where it’s spending its time. cProfile only seems to profile the main thread. Is there any way of profiling all threads involved in the calculation?

Asked By: rog

Answers:

I don’t know of any profiling application that supports this for Python, but you could write a Trace class that writes log files recording when each operation started, when it ended, and how much time it consumed.

It’s a simple and quick solution for your problem.
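A minimal sketch of such a class, assuming the standard logging module is acceptable for the log files. The names Trace, configure_trace_log and demo are hypothetical and only illustrate the idea; the thread name comes from logging's built-in %(threadName)s field, so each record shows which thread did the work.

```python
import logging
import threading
import time


def configure_trace_log(path, logger_name="trace"):
    """Send trace records to a log file, tagging each line with the thread name."""
    logger = logging.getLogger(logger_name)
    handler = logging.FileHandler(path)
    handler.setFormatter(
        logging.Formatter("%(asctime)s [%(threadName)s] %(message)s")
    )
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


class Trace:
    """Context manager that logs when an operation starts and ends, and its duration."""

    def __init__(self, operation, logger=None):
        self.operation = operation
        self.logger = logger if logger is not None else logging.getLogger("trace")
        self.elapsed = None

    def __enter__(self):
        self.start = time.perf_counter()
        self.logger.info("start %s", self.operation)
        return self

    def __exit__(self, exc_type, exc, tb):
        self.elapsed = time.perf_counter() - self.start
        self.logger.info("end %s elapsed=%.6fs", self.operation, self.elapsed)
        return False  # never swallow exceptions raised by the traced block


def demo(logger):
    """Run two short jobs in separate threads, each wrapped in a Trace."""

    def job(n):
        with Trace(f"job-{n}", logger):
            time.sleep(0.01 * n)

    threads = [
        threading.Thread(target=job, args=(n,), name=f"worker-{n}") for n in (1, 2)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Calling demo(configure_trace_log("trace.log")) would write one start line and one end line per worker thread to trace.log (the file name is just an example). This only measures the blocks you wrap by hand, so it is coarser than a real profiler.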

Answered By: Gambrinus

Instead of running one cProfile, you could run a separate cProfile instance in each thread, then combine the stats. pstats.Stats.add() merges them for you.

Answered By: vartec

If you’re okay with doing a bit of extra work, you can write your own profiling class that implements profile(self, frame, event, arg). That method gets called whenever a function is called or returns, and you can fairly easily set up a structure to gather statistics from that.

You can then use threading.setprofile to register that function on every thread started afterwards. When the function is called, you can use threading.current_thread() (spelled threading.currentThread() in older code) to see which thread it’s running on. More information (and a ready-to-run recipe) here:

http://code.activestate.com/recipes/465831/

http://docs.python.org/library/threading.html#threading.setprofile
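As a rough illustration of this approach (not the linked recipe), here is a minimal sketch. ThreadProfiler, run_profiled and work are hypothetical names; the only library calls assumed are threading.setprofile, threading.current_thread and time.perf_counter. It records, per thread and per Python function, the number of calls and the cumulative wall time, and ignores C-level events. Note that threading.setprofile only affects threads started after it is called; the main thread would need sys.setprofile as well.

```python
import threading
import time
from collections import defaultdict


class ThreadProfiler:
    """Hypothetical helper: per-thread call counts and cumulative wall time."""

    def __init__(self):
        self._lock = threading.Lock()
        self._local = threading.local()
        # (thread name, function name) -> [number of calls, total seconds]
        self.stats = defaultdict(lambda: [0, 0.0])

    def profile(self, frame, event, arg):
        """Profile hook; the interpreter calls it with (frame, event, arg)."""
        local = self._local
        if event == "call":
            if not hasattr(local, "starts"):
                # First event seen in this thread: remember its name once.
                local.name = threading.current_thread().name
                local.starts = {}
            local.starts[frame] = time.perf_counter()
        elif event == "return":
            # Only count frames whose "call" event we actually saw.
            starts = getattr(local, "starts", None)
            started = starts.pop(frame, None) if starts else None
            if started is None:
                return
            elapsed = time.perf_counter() - started
            key = (local.name, frame.f_code.co_name)
            with self._lock:
                entry = self.stats[key]
                entry[0] += 1
                entry[1] += elapsed
        # c_call / c_return / c_exception events are ignored in this sketch


def run_profiled(targets):
    """Run each (thread name, callable) pair in its own thread under the hook."""
    profiler = ThreadProfiler()
    threading.setprofile(profiler.profile)  # applies to threads started from now on
    try:
        threads = [threading.Thread(target=fn, name=name) for name, fn in targets]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    finally:
        threading.setprofile(None)  # stop installing the hook in new threads
    return profiler.stats


def work(delay):
    """Stand-in for real work."""
    time.sleep(delay)
```

For example, run_profiled([("T1", lambda: work(0.05)), ("T2", lambda: work(0.1))]) returns a dict whose keys include ("T1", "work") and ("T2", "work"). The measured times include the overhead of the hook itself, so treat them as approximate.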

Answered By: DNS

Given that your different threads’ main functions differ, you can use the very helpful profile_func() decorator from here.

Answered By: Walter

Please see yappi (Yet Another Python Profiler).

Answered By: sumercip

From 2019: I liked vartec’s suggestion but would have really liked a code example. Therefore I built one. It is not crazy hard to implement, but you do need to take a few things into account. Here’s a working sample (Python 3.6):

The combined results take into account the time spent in the thread_func() calls made by thread_1 and thread_2.

The only change you need in your code is to subclass threading.Thread and override its run() method. That is a minimal change for an easy way to profile threads.

import cProfile
import sys
import threading
from pstats import Stats
from time import sleep

# using different times to ensure the results reflect all threads
SHORT = 0.5
MED = 0.715874
T1_SLEEP = 1.37897
T2_SLEEP = 2.05746
ITER = 1
ITER_T = 4

class MyThreading(threading.Thread):
    """ Subclass to arrange for the profiler to run in the thread """
    def run(self):
        """ Wrap the call to self._target (the callable passed as
            MyThreading(target=...)) so that cProfile runs it for us and is
            thus able to profile it.
            Since this executes inside each thread instance, we can run an
            arbitrary number of threads and profile all of them.
        """
        try:
            if self._target:
                # using the name attr. of our thread to ensure unique profile filenames
                cProfile.runctx(
                    'self._target(*self._args, **self._kwargs)',
                    globals=globals(),
                    locals=locals(),
                    filename=f'full_server_thread_{self.name}',
                )
        finally:
            # Avoid a refcycle if the thread is running a function with
            # an argument that has a member that points to the thread.
            del self._target, self._args, self._kwargs

def main(args):
    """ Main func. """
    thread1_done = threading.Event()    # a new Event starts out cleared
    thread2_done = threading.Event()

    print("Main thread start.... ")
    t1 = MyThreading(target=thread_1, args=(thread1_done,), name="T1" )
    t2 = MyThreading(target=thread_2, args=(thread2_done,), name="T2" )
    print("Subthreads instances.... launching.")

    t1.start()          # start() will call our overridden run() method
    t2.start()

    for i in range(0,ITER):
        print(f"MAIN iteration: {i}")
        main_func_SHORT()
        main_func_MED()

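    if thread1_done.wait() and thread2_done.wait():
        print("Threads are done now... ")
        # join() so that each run() has returned and written its profile
        # file before the caller loads it with Stats.add()
        t1.join()
        t2.join()
        return True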

def main_func_SHORT():
    """ Func. called by the main T """
    sleep(SHORT)
    return True

def main_func_MED():
    sleep(MED)
    return True


def thread_1(done_flag):
    print("subthread target func 1 ")
    for i in range(0,ITER_T):
        thread_func(T1_SLEEP)
    done_flag.set()

def thread_func(SLEEP):
    print("Thread func")
    sleep(SLEEP)

def thread_2(done_flag):
    print("subthread target func 2 ")
    for i in range(0,ITER_T):
        thread_func(T2_SLEEP)
    done_flag.set()

if __name__ == '__main__':
    args = sys.argv[1:]
    cProfile.run('main(args)', 'full_server_profile')
    stats = Stats('full_server_profile')
    stats.add('full_server_thread_T1')
    stats.add('full_server_thread_T2')
    stats.sort_stats('filename').print_stats()

Answered By: logicOnAbstractions

Check out mtprof from the Dask project:

https://github.com/dask/mtprof

It’s a drop-in replacement for cProfile that, if your threads are launched in the usual way and complete before your main thread, will roll up their stats into the same report. Worked like a charm for me.

Answered By: gojomo