How do I profile a Python script?


Project Euler and other coding contests often have a maximum time to run or people boast of how fast their particular solution runs. With Python, sometimes the approaches are somewhat kludgey – i.e., adding timing code to __main__.

What is a good way to profile how long a Python program takes to run?

Asked By: Chris Lawlor



Python includes a profiler called cProfile. It not only gives the total running time, but also times each function separately, and tells you how many times each function was called, making it easy to determine where you should make optimizations.

You can call it from within your code, or from the interpreter, like this:

import cProfile'foo()')

Even more usefully, you can invoke cProfile when running a script:

python -m cProfile

To make it even easier, I made a little batch file called ‘profile.bat’:

python -m cProfile %1

So all I have to do is run:

profile
And I get this:

1007 function calls in 0.061 CPU seconds

Ordered by: standard name
ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    1    0.000    0.000    0.061    0.061 <string>:1(<module>)
 1000    0.051    0.000    0.051    0.000<lambda>)
    1    0.005    0.005    0.061    0.061<module>)
    1    0.000    0.000    0.061    0.061 {execfile}
    1    0.002    0.002    0.053    0.053 {map}
    1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
    1    0.000    0.000    0.000    0.000 {range}
    1    0.003    0.003    0.003    0.003 {sum}

EDIT: Updated link to a good video resource from PyCon 2013 titled Python Profiling, also available via YouTube.

Answered By: Chris Lawlor

In Virtaal’s source there’s a very useful class and decorator that can make profiling (even for specific methods/functions) very easy. The output can then be viewed very comfortably in KCacheGrind.
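
Not Virtaal's actual code, but a minimal sketch of the same idea, assuming pyprof2calltree is available to write the KCacheGrind-readable file (the decorator name here is made up):

import cProfile
import functools

from pyprof2calltree import convert  # pip install pyprof2calltree

def profile_for_kcachegrind(func):
    """Profile a single call and dump a callgrind file that KCacheGrind can open."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        profiler = cProfile.Profile()
        try:
            return profiler.runcall(func, *args, **kwargs)
        finally:
            convert(profiler.getstats(), 'callgrind.out.%s' % func.__name__)
    return wrapper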

Answered By: Walter

It’s worth pointing out that using the profiler only works (by default) on the main thread, and you won’t get any information from other threads if you use them. This can be a bit of a gotcha as it is completely unmentioned in the profiler documentation.

If you also want to profile threads, you’ll want to look at the threading.setprofile() function in the docs.

You could also create your own threading.Thread subclass to do it:

class ProfiledThread(threading.Thread):
    # Overrides
    def run(self):
        profiler = cProfile.Profile()
            return profiler.runcall(, self)
            profiler.dump_stats('myprofile-%d.profile' % (self.ident,))

and use that ProfiledThread class instead of the standard one. It might give you more flexibility, but I’m not sure it’s worth it, especially if you are using third-party code which wouldn’t use your class.

Answered By: Joe Shaw

The Python wiki is a great page for profiling resources, as are the Python docs.

As shown by Chris Lawlor, cProfile is a great tool and can easily be used to print to the screen:

python -m cProfile -s time <args>

or to file:

python -m cProfile -o output.file <args>

PS> If you are using Ubuntu, make sure to install python-profiler:

apt-get install python-profiler 

If you output to file you can get nice visualizations using the following tools

PyCallGraph: a tool to create call graph images

 pip install pycallgraph

 pycallgraph <args>

 gimp pycallgraph.png

You can use whatever you like to view the PNG file; I used GIMP.
Unfortunately I often get

dot: graph is too large for cairo-renderer bitmaps. Scaling by 0.257079 to fit

which makes my images unusably small. So I generally create svg files:

pycallgraph -f svg -o pycallgraph.svg <args>

PS> make sure to install Graphviz (which provides the dot program). Note that pip install graphviz only installs the Python bindings; the dot binary itself comes from your system package manager, e.g.:

apt-get install graphviz

Alternative Graphing using gprof2dot via @maxy / @quodlibetor :

pip install gprof2dot
python -m cProfile -o profile.pstats
gprof2dot -f pstats profile.pstats | dot -Tsvg -o mine.svg
Answered By: brent.payne

A nice profiling module is line_profiler (invoked using the kernprof script). It can be downloaded here.

My understanding is that cProfile only gives information about total time spent in each function, so individual lines of code are not timed. This is an issue in scientific computing, since often one single line can take a lot of time. Also, as I remember, cProfile didn't catch the time I was spending in certain calls.
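
For illustration, typical line_profiler usage looks roughly like this (the function here is made up; the @profile name is injected by kernprof at run time, so no import is needed):

@profile
def my_numeric_kernel(data):   # hypothetical function to profile line by line
    total = 0
    for x in data:
        total += x * x
    return total

Then run the script under kernprof to collect and print the per-line timings:

kernprof -l -v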

Answered By: Ian Langmore

Following Joe Shaw's answer about multi-threaded code not working as expected, I figured that the runcall method in cProfile is merely doing self.enable() and self.disable() calls around the profiled function call, so you can simply do that yourself and have whatever code you want in between, with minimal interference with existing code.
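
For illustration, a minimal sketch of that manual enable/disable approach (the profiled call is a placeholder):

import cProfile
import pstats

profiler = cProfile.Profile()
profiler.enable()
do_interesting_work()   # placeholder: the code you want to measure
profiler.disable()
profiler.dump_stats('worker.profile')   # inspect later, or print right away:
pstats.Stats(profiler).sort_stats('cumulative').print_stats(10)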

Answered By: PypeBros

A while ago I made pycallgraph which generates a visualisation from your Python code. Edit: I’ve updated the example to work with 3.3, the latest release as of this writing.

After a pip install pycallgraph and installing GraphViz you can run it from the command line:

pycallgraph graphviz -- ./

Or, you can profile particular parts of your code:

from pycallgraph import PyCallGraph
from pycallgraph.output import GraphvizOutput

with PyCallGraph(output=GraphvizOutput()):
    code_to_profile()  # placeholder: call the code you want to profile

Either of these will generate a pycallgraph.png file similar to the image below:

[example pycallgraph.png output image]

Answered By: gak

Ever want to know what the hell that python script is doing? Enter the
Inspect Shell. Inspect Shell lets you print/alter globals and run
functions without interrupting the running script. Now with
auto-complete and command history (only on linux).

Inspect Shell is not a pdb-style debugger.

You could use that (and your wristwatch).

Answered By: Colonel Panic

@Maxy’s comment on this answer helped me out enough that I think it deserves its own answer: I already had cProfile-generated .pstats files and I didn’t want to re-run things with pycallgraph, so I used gprof2dot, and got pretty svgs:

$ sudo apt-get install graphviz
$ git clone
$ ln -s "$PWD"/gprof2dot/ ~/bin
$ -f pstats profile.pstats | dot -Tsvg -o callgraph.svg

and BLAM!

It uses dot (the same thing that pycallgraph uses) so output looks similar. I get the impression that gprof2dot loses less information though:

gprof2dot example output

Answered By: quodlibetor

My way is to use yappi. It's especially useful combined with an RPC server where (even just for debugging) you register methods to start, stop and print profiling information, e.g. in this way:

import yappi

def startProfiler():
    yappi.start()

def stopProfiler():
    yappi.stop()

def printProfiler():
    stats = yappi.get_stats(yappi.SORTTYPE_TTOT, yappi.SORTORDER_DESC, 20)
    statPrint = '\n'
    namesArr = [len(str(stat[0])) for stat in stats.func_stats]
    log.debug("namesArr %s", str(namesArr))
    maxNameLen = max(namesArr)
    log.debug("maxNameLen: %s", maxNameLen)

    for stat in stats.func_stats:
        nameAppendSpaces = [' ' for i in range(maxNameLen - len(stat[0]))]
        log.debug('nameAppendSpaces: %s', nameAppendSpaces)
        blankSpace = ''
        for space in nameAppendSpaces:
            blankSpace += space

        log.debug("adding spaces: %s", len(nameAppendSpaces))
        statPrint = statPrint + str(stat[0]) + blankSpace + " " + str(stat[1]).ljust(8) + "\t" + str(
            round(stat[2], 2)).ljust(8 - len(str(stat[2]))) + "\t" + str(round(stat[3], 2)) + "\n"

    log.log(1000, "nname" + ''.ljust(maxNameLen - 4) + " ncall tttot ttsub")
    log.log(1000, statPrint)

Then, while your program is running, you can start the profiler at any time by calling the startProfiler RPC method and dump profiling information to a log file by calling printProfiler (or modify the RPC method to return it to the caller), and get output like this:

2014-02-19 16:32:24,128-|SVR-MAIN  |-(Thread-3   )-Level 1000: 
name                                          ncall     ttot    tsub
2014-02-19 16:32:24,128-|SVR-MAIN  |-(Thread-3   )-Level 1000: 
...                                           22        0.11    0.05
...                                           22        0.11    0.0
<string>.__new__:8                            220       0.0     0.0
<string>.__new__:8                            4         0.0     0.0
...                                           1         0.0     0.0

It may not be very useful for short scripts, but it helps to optimize server-type processes, especially given that the printProfiler method can be called multiple times over time to profile and compare, e.g., different program usage scenarios.

In newer versions of yappi, the following code will work:

def printProfile():
    yappi.get_func_stats().print_all()
Answered By: Mr. Girgitt

Also worth mentioning is the GUI cProfile dump viewer RunSnakeRun. It allows you to sort and select, thereby zooming in on the relevant parts of the program. The sizes of the rectangles in the picture are proportional to the time taken. If you mouse over a rectangle, it highlights that call in the table and everywhere on the map. When you double-click on a rectangle, it zooms in on that portion. It will show you who calls that portion and what that portion calls.

The descriptive information is very helpful. It shows you the code for that bit, which can be helpful when you are dealing with built-in library calls. It tells you what file and what line to find the code.
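
As far as I remember the command name, the typical workflow is just to dump a stats file with cProfile and open it in the GUI (the file name is arbitrary):

python -m cProfile -o script.profile
runsnake script.profile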

Also want to point out that the OP said ‘profiling’ but it appears he meant ‘timing’. Keep in mind programs will run slower when profiled.

[RunSnakeRun screenshot]

Answered By: Pete


line_profiler (already presented here) also inspired pprofile, which is described as:

Line-granularity, thread-aware deterministic and statistic pure-python profiler

It provides line-granularity like line_profiler, is pure Python, can be used as a standalone command or a module, and can even generate callgrind-format files that can be easily analyzed with [k|q]cachegrind.
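
For illustration, the standalone command can be as simple as the line below (the script name is a placeholder; see pprofile --help for the callgrind output options):

pprofile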


There is also vprof, a Python package described as:

[…] providing rich and interactive visualizations for various Python program characteristics such as running time and memory usage.


Answered By: BenC

To add on to the other answers:

I wrote this module that allows you to use cProfile and view its output easily. More here:

$ python -m cprofilev /your/python/program
# Go to http://localhost:4000 to view collected statistics.

Also see: how to make sense of the collected statistics.

Answered By: michael

cProfile is great for quick profiling, but most of the time it ended for me with errors. The function runctx solves this problem by correctly initializing the environment and variables; hope it can be useful for someone:

import cProfile
cProfile.runctx('foo()', None, locals())
Answered By: Datageek

A new tool to handle profiling in Python is PyVmMonitor:

It has some unique features such as

  • Attach profiler to a running (CPython) program
  • On demand profiling with Yappi integration
  • Profile on a different machine
  • Multiple processes support (multiprocessing, django…)
  • Live sampling/CPU view (with time range selection)
  • Deterministic profiling through cProfile/profile integration
  • Analyze existing PStats results
  • Open DOT files
  • Programmatic API access
  • Group samples by method or line
  • PyDev integration
  • PyCharm integration

Note: it’s commercial, but free for open source.

Answered By: Fabio Zadrozny

There are a lot of great answers, but they either use the command line or some external program for profiling and/or sorting the results.

I really missed some way I could use in my IDE (Eclipse PyDev) without touching the command line or installing anything. So here it is.

Profiling without command line

def count():
    from math import sqrt
    for x in range(10**5):
        sqrt(x)

if __name__ == '__main__':
    import cProfile, pstats"count()", "{}.profile".format(__file__))
    s = pstats.Stats("{}.profile".format(__file__))
    s.strip_dirs().sort_stats("time").print_stats(10)

See docs or other answers for more info.

Answered By: David Mašek

There’s also a statistical profiler called statprof. It’s a sampling profiler, so it adds minimal overhead to your code and gives line-based (not just function-based) timings. It’s more suited to soft real-time applications like games, but may have less precision than cProfile.

The version in PyPI is a bit old, so you can install it with pip by specifying the git repository:

pip install git+git://

You can run it like this:

import statprof

with statprof.profile():
    my_questionable_function()  # placeholder: the code you want to profile

See also

Answered By: z0r

cProfile is great for profiling, while kcachegrind is great for visualising the results; pyprof2calltree handles the file conversion in between.

python -m cProfile -o script.profile
pyprof2calltree -i script.profile -o script.calltree
kcachegrind script.calltree

Required system packages:

  • kcachegrind (Linux), qcachegrind (macOS)

Setup on Ubuntu:

apt-get install kcachegrind 
pip install pyprof2calltree

The result:

Screenshot of the result

Answered By: Federico

I ran into a handy tool called SnakeViz when researching this topic. SnakeViz is a web-based profiling visualization tool. It is very easy to install and use. The usual way I use it is to generate a stat file with %prun and then do analysis in SnakeViz.

The main viz technique used is a sunburst chart, as shown below, in which the hierarchy of function calls is arranged as layers of arcs, with time info encoded in their angular widths.

The best thing is you can interact with the chart. For example, to zoom in, one can click on an arc, and the arc and its descendants will be enlarged as a new sunburst to display more details.
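
For illustration, that workflow can look like this (the function name is a placeholder; %prun -D dumps the stats to a file that snakeviz can open):

# in an IPython/Jupyter cell
%prun -D program.prof my_function()

# then, from a shell
snakeviz program.prof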

[SnakeViz sunburst chart screenshot]

Answered By: zaxliu

When I’m not root on the server, I run my program like this:

python -o callgrind.1

Then I can open the report with any callgrind-compatible software, like qcachegrind

Answered By: Vincent Fenet

It would depend on what you want to see out of profiling. Simple time
metrics can be given by time (bash):

time python

Even '/usr/bin/time' can output detailed metrics by using the '--verbose' flag.

To check the time metrics given by each function and to better understand how much time is spent in functions, you can use the built-in cProfile in Python.

Going into more detailed metrics like performance, time is not the only metric. You can worry about memory, threads, etc.
Profiling options:
1. line_profiler is another profiler commonly used to find out timing metrics line-by-line.
2. memory_profiler is a tool to profile memory usage (see the sketch after this list).
3. heapy (from the Guppy project) profiles how objects in the heap are used.
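
For illustration, a small memory_profiler sketch (the function is made up; running the script under python -m memory_profiler prints per-line memory usage):

from memory_profiler import profile

@profile
def build_big_list():  # hypothetical example
    data = [x * 2 for x in range(10**6)]
    return sum(data)

if __name__ == '__main__':
    build_big_list()

python -m memory_profiler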

These are some of the common ones I tend to use. But if you want to find out more, try reading this book.
It is a pretty good book on starting out with performance in mind. You can move on to advanced topics on using Cython and JIT (just-in-time) compiled Python.

Answered By: user7891524

The simplest and quickest way to find where all the time is going:

1. pip install snakeviz

2. python -m cProfile -o temp.dat <PROGRAM>.py

3. snakeviz temp.dat

Draws a pie chart in a browser. Biggest piece is the problem function. Very simple.

Answered By: CodeCabbie

I recently created tuna for visualizing Python runtime and import profiles; this may be helpful here.

[tuna screenshot]

Install with

pip install tuna

Create a runtime profile

python3 -m cProfile -o program.prof

or an import profile (Python 3.7+ required)

python3 -X importtime 2> import.log

Then just run tuna on the output file:

tuna program.prof

Answered By: Nico Schlömer


Magic function for gprof2dot to profile any Python statement as a DOT graph in JupyterLab or Jupyter Notebook.

[gprof2dot_magic example graph]

GitHub repo:


Make sure you have the Python package gprof2dot_magic installed:

pip install gprof2dot_magic

Its dependencies gprof2dot and graphviz will be installed as well.


To enable the magic function, first load the gprof2dot_magic module

%load_ext gprof2dot_magic

and then profile any line statement as a DOT graph as such:

%gprof2dot print('hello world')

[DOT graph for the profiled statement]

Answered By: Mattijn

The terminal-only (and simplest) solution, in case all those fancy UIs fail to install or to run:
ignore cProfile completely and replace it with pyinstrument, which will collect and display the tree of calls right after execution.


$ pip install pyinstrument

Profile and display result:

$ python -m pyinstrument ./

Works with Python 2 and 3.

The documentation of the API, for profiling only a part of the code, can be found here.
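
Roughly, the in-code API looks like this (based on pyinstrument's documented Profiler class; the profiled call is a placeholder):

from pyinstrument import Profiler

profiler = Profiler()
profiler.start()
do_slow_thing()  # placeholder: the code you want to profile
profiler.stop()
print(profiler.output_text(unicode=True, color=True))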

Answered By: Francois

If you want to make a cumulative profiler, meaning to run the function several times in a row and watch the sum of the results, you can use this cumulative_profiler decorator.

It's Python >= 3.6 specific, but you can remove nonlocal for it to work on older versions.

import cProfile, pstats

class _ProfileFunc:
    def __init__(self, func, sort_stats_by):
        self.func =  func
        self.profile_runs = []
        self.sort_stats_by = sort_stats_by

    def __call__(self, *args, **kwargs):
        pr = cProfile.Profile()
        pr.enable()  # this is the profiling section
        retval = self.func(*args, **kwargs)
        pr.disable()  # profiling ends here
        self.profile_runs.append(pr)

        ps = pstats.Stats(*self.profile_runs).sort_stats(self.sort_stats_by)
        return retval, ps

def cumulative_profiler(amount_of_times, sort_stats_by='time'):
    def real_decorator(function):
        def wrapper(*args, **kwargs):
            nonlocal function, amount_of_times, sort_stats_by  # for python 2.x remove this row

            profiled_func = _ProfileFunc(function, sort_stats_by)
            for i in range(amount_of_times):
                retval, ps = profiled_func(*args, **kwargs)
            ps.print_stats()
            return retval  # returns the results of the function
        return wrapper

    if callable(amount_of_times):  # incase you don't want to specify the amount of times
        func = amount_of_times  # amount_of_times is the function in here
        amount_of_times = 5  # the default amount
        return real_decorator(func)
    return real_decorator


profiling the function baz

import time

@cumulative_profiler
def baz():
    time.sleep(1)
    time.sleep(2)
    return 1

baz()


baz ran 5 times and printed this:

         20 function calls in 15.003 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       10   15.003    1.500   15.003    1.500 {built-in method time.sleep}
        5    0.000    0.000   15.003    3.001 <ipython-input-9-c89afe010372>:3(baz)
        5    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

specifying the amount of times

@cumulative_profiler(3)  # profile over 3 runs instead of the default 5
def baz():
    time.sleep(1)
    time.sleep(2)
    return 1

baz()
Answered By: moshevi

I just developed my own profiler, inspired by pypref_time:

Update Version 2


pip install auto_profiler

Quick Start:

from auto_profiler import Profiler

with Profiler():
    your_code_here()  # placeholder: the code you want to profile

Using it in Jupyter lets you have a real-time view of elapsed times.

Real Time view of auto profiler in jupyter

Update Version 1

By adding a decorator it will show a tree of time-consuming functions


Install by: pip install auto_profiler


import time # line number 1
import random

from auto_profiler import Profiler, Tree

def f1():
    mysleep(1 + random.random())  # illustrative body: sleep for a random amount

def mysleep(t):
    time.sleep(t)

def fact(i):
    f1()
    if i == 1:
        return 1
    return i*fact(i-1)

def main():
    for i in range(5):
        f1()
    fact(3)

with Profiler(depth=4):
    main()

Example Output

Time   [Hits * PerHit] Function name [Called from] [function location]
8.974s [1 * 8.974]  main  [auto-profiler/]  [/test/]
├── 5.954s [5 * 1.191]  f1  [/test/]  [/test/]
│   └── 5.954s [5 * 1.191]  mysleep  [/test/]  [/test/]
│       └── 5.954s [5 * 1.191]  <time.sleep>
|   # The rest is for the example recursive function call fact
└── 3.020s [1 * 3.020]  fact  [/test/]  [/test/]
    ├── 0.849s [1 * 0.849]  f1  [/test/]  [/test/]
    │   └── 0.849s [1 * 0.849]  mysleep  [/test/]  [/test/]
    │       └── 0.849s [1 * 0.849]  <time.sleep>
    └── 2.171s [1 * 2.171]  fact  [/test/]  [/test/]
        ├── 1.552s [1 * 1.552]  f1  [/test/]  [/test/]
        │   └── 1.552s [1 * 1.552]  mysleep  [/test/]  [/test/]
        └── 0.619s [1 * 0.619]  fact  [/test/]  [/test/]
            └── 0.619s [1 * 0.619]  f1  [/test/]  [/test/]
Answered By: Ali

With a statistical profiler like austin, no instrumentation is required, meaning that you can get profiling data out of a Python application simply with

austin python3

The raw output isn’t very useful, but you can pipe it to a flame graph tool (e.g. Brendan Gregg’s flamegraph.pl)
to get a flame graph representation of that data that gives you a breakdown of where the time (measured in microseconds of real time) is being spent.

austin python3 | flamegraph.pl > my_script_profile.svg

Alternatively, you can also use a web application for quick visualisation of the collected samples. If you have pprof installed, you can also get austin-python (with e.g. pipx install austin-python) and use austin2pprof to convert to the pprof format.

However, if you have VS Code installed you could use the Austin extension for a more interactive experience, with source code heat maps, top functions and collected call stacks

Austin VS Code extension

If you’d rather use the terminal, you can also use the TUI, that also has a live graph mode:

Austin TUI graph mode

Answered By: Phoenix87

For getting quick profile stats in an IPython notebook, one can embed line_profiler and memory_profiler straight into the notebook.

Another useful package is Pympler. It is a powerful profiling package that's capable of tracking classes, objects, functions, memory leaks, etc. Examples below; docs attached.

Get it!

!pip install line_profiler
!pip install memory_profiler
!pip install pympler

Load it!

%load_ext line_profiler
%load_ext memory_profiler

Use it!


%time print('Outputs CPU time,Wall Clock time') 
#CPU times: user 2 µs, sys: 0 ns, total: 2 µs Wall time: 5.96 µs


  • CPU times: CPU level execution time
  • sys times: system level execution time
  • total: CPU time + system time
  • Wall time: Wall Clock Time


%timeit -r 7 -n 1000 print('Outputs execution time of the snippet') 
#1000 loops, best of 7: 7.46 ns per loop
  • Gives the best time out of the given number of runs (r), looping (n) times per run.
  • Outputs details on system caching:
    • When code snippets are executed multiple times, the system caches a few operations and doesn't execute them again, which may hamper the accuracy of profile reports.


%prun -s cumulative 'Code to profile' 


  • number of function calls (ncalls)
  • has one entry per distinct function call
  • time taken per call (percall)
  • time elapsed till that function call (cumtime)
  • name of the func/module called, etc.

Cumulative profile


%memit 'Code to profile'
#peak memory: 199.45 MiB, increment: 0.00 MiB


  • Memory usage


#Example function
def fun():
  for i in range(10):
    print(i)  # placeholder body

#Usage: %lprun -f <function_name> <statement that calls it>
%lprun -f fun fun()


  • Line wise stats



import sys
sys.getsizeof('code to profile')
# 64 bytes

Returns the size of an object in bytes.

asizeof() from pympler

from pympler import asizeof
obj = [1,2,("hey","ha"),3]
print(asizeof.asizeof(obj))

pympler.asizeof can be used to investigate how much memory certain Python objects consume.
In contrast to sys.getsizeof, asizeof sizes objects recursively


tracker from pympler

from pympler import tracker
tr = tracker.SummaryTracker()
def fun():
  li = [1,2,3]
  di = {"ha":"haha","duh":"Umm"}

Tracks the lifetime of a function.

tracker output

The Pympler package consists of a huge number of high-utility functions to profile code, not all of which can be covered here. See the attached documentation for verbose profile implementations.

Pympler doc

Answered By: Aditya Patnaik

Recently I created a plugin for PyCharm with which you can easily analyse and visualise the results of line_profiler in the PyCharm editor.

line_profiler has been mentioned in other answers as well and is a great tool to analyse exactly how much time is spent by the python interpreter in certain lines.

The PyCharm plugin I’ve created can be found here:

It needs a helper package in your Python environment called line-profiler-pycharm, which can be installed with pip or by the plugin itself.

After installing the plugin in PyCharm:

  1. Decorate any function you want to profile with the line_profiler_pycharm.profile decorator (see the sketch after this list)
  2. Run with the ‘Profile Lines’ runner
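
For illustration, step 1 looks roughly like this (the function is made up; the import follows the helper package's documented usage):

from line_profiler_pycharm import profile

@profile
def compute_something():  # hypothetical function to profile
    return sum(i * i for i in range(10**5))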

Screenshot of results:
Line Profiler Pycharm results

Answered By: jusx

I found cProfile and other resources to be more for optimization purposes rather than debugging.

I made my own testing module instead for simple Python script speed testing. (In my case, a 1K+ line .py file was tested using ScriptProfilerPy, which helped speed up the code by 10x in minutes afterwards.)

The ScriptProfilerPy module will run your code, adding timestamps to it.
I put the module here:


from speed_testpy import ScriptProfilerPy


Output of the code after testing

Answered By: AlphaSeekness

I find this function is quick and easy to use if you do not want a command line option.

To use just add @profile above each function to be profiled.

def profile(fnc):
    """Profiles any function just by adding @profile above the function."""
    import cProfile, pstats, io
    def inner(*args, **kwargs):
        pr = cProfile.Profile()
        pr.enable()
        retval = fnc(*args, **kwargs)
        pr.disable()
        s = io.StringIO()
        sortby = 'cumulative'   # Ordered
        ps = pstats.Stats(pr, stream=s).strip_dirs().sort_stats(sortby)
        n = 10                  # reduced the list to be monitored
        ps.print_stats(n)
        print(s.getvalue())     # show the profile for this call
        return retval
    return inner

The output for each function looks like this:

   Ordered by: cumulative time
   List reduced from 38 to 10 due to restriction <10>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.002    0.002
        1    0.000    0.000    0.002    0.002
        1    0.001    0.001    0.001    0.001 {built-in method fitz._fitz.new_Document}
        1    0.000    0.000    0.000    0.000
        1    0.000    0.000    0.000    0.000 {built-in method fitz._fitz.delete_Document}
        1    0.000    0.000    0.000    0.000
        1    0.000    0.000    0.000    0.000
        1    0.000    0.000    0.000    0.000<listcomp>)
       11    0.000    0.000    0.000    0.000
        1    0.000    0.000    0.000    0.000
Answered By: Cam

Scalene is a new Python profiler that covers many use cases and has a minimal performance impact:

It can profile CPU, GPU and memory utilisation at a very granular level. It also notably supports multi-threaded / parallelized python code.
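
Basic usage is a single command (the script name is a placeholder):

scalene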

Answered By: user667489

Lots of great answers.

As a human being, my day-to-day measure of performance pretty much translates to "how long does it take to execute this function in seconds?" Often this is influenced by many factors, like what other processes are running, the machine's CPU architecture, etc.

My lazy, unscientific but relatively useful (to me) approach is to run cProfile multiple times, capture the execution time in seconds, and return the median value.

from io import StringIO
import cProfile
import pstats
import re
import statistics

def profile(cmd, n=100):
    def _profile(cmd):, 'statsfile')
        stream = StringIO()
        stats = pstats.Stats('statsfile', stream=stream)
        stats.print_stats()  # write the report into the stream
        stream = stream.getvalue()
        values = re.findall(r'[\d.]+', stream.splitlines()[2])
        return float(values[-1])
    vals = []
    for i in range(n):
        vals.append(_profile(cmd))
    return statistics.median(vals)

# get the median time in seconds of 100 executions of foo()
profile('foo()', n=100)
Answered By: Fnord