Profiling python C extensions


I have developed a python C-extension that receives data from python and compute some cpu intensive calculations.
It’s possible to profile the C-extension?

The problem here is that writing a sample test in C to be profiled would be challenging because the code rely on particular inputs and data structures (generated by python control code).

Do you have any suggestions?

Asked By: pygabriel



With gprof, you can profile any program that was properly compiled and linked (gcc -pg etc, in gprof‘s case). If you’re using a Python version not built with gcc (e.g., the Windows precompiled version the PSF distributes), you’ll need to research what equivalent tools exist for that platform and toolchain (in the Windows PSF case, maybe mingw can help). There may be “irrelevant” data there (internal C functions in the Python runtime), and, if so, the percentages shown by gprof may not be applicable — but the absolute numbers (of calls, and durations thereof) are still valid, and you can post-process gprof‘s output (e.g., with a little Python script;-) to exclude the irrelevant data and compute the percentages you want.

Answered By: Alex Martelli

I’ve found my way using google-perftools. The trick was to wrap the functions StartProfiler and StopProfiler in python (throught cython in my case).

To profile the C extension is sufficient to wrap the python code inside the StartProfiler and StopProfiler calls.

from google_perftools_wrapped import StartProfiler, StopProfiler
import c_extension # extension to profile

... calling the interesting functions from the C extension module ...

Then to analyze for example you can export in callgrind format and see the result in kcachegrind:

pprof --callgrind > output.callgrind 
kcachegrind output.callgrind
Answered By: pygabriel

After the comment by pygabriel I decided to upload a package to pypi that implements a profiler for python extensions using the cpu-profiler from google-perftools:

Answered By: Fabian Pedregosa

One of my colleague told me ltrace(1). It helped me on the same situation quite a lot.

Assume the shared object name of your C extention is and you want to execute, then

ltrace -x -c python

Its output is like

% time     seconds  usecs/call     calls      function
------ ----------- ----------- --------- --------------------
 24.88   30.202126     7550531         4 ldap_result
 12.46   15.117625     7558812         2 l_ldap_result4
 12.41   15.059652     5019884         3 ldap_chase_v3referrals
 12.41   15.057678     3764419         4 ldap_new_connection
 12.40   15.050310     3762577         4 ldap_int_open_connection
 12.39   15.042360     3008472         5 ldap_send_server_request
 12.38   15.029055     3757263         4 ldap_connect_to_host
  0.05    0.057890       28945         2 ldap_get_option
  0.04    0.052182       26091         2 ldap_sasl_bind
  0.03    0.030760       30760         1 l_ldap_get_option
  0.03    0.030635       30635         1 LDAP_get_option
  0.02    0.029960       14980         2 ldap_initialize
  0.02    0.027988       27988         1 ldap_int_initialize
  0.02    0.026722       26722         1 l_ldap_simple_bind
  0.02    0.026386       13193         2 ldap_send_initial_request
  0.02    0.025810       12905         2 ldap_int_select

Special care is needed if your shared object has - or + in its file name. These characters aren’t treated as is (see man 1 ltrace for details).

The wildcard * can be a workaround such as -x @myext* in place of -x

Answered By: somay

I found py-spy very easy to use. See this blog post for an explanation of its native extension support.


  • pip installable
  • cpu sampling based
  • no compiler flags required
  • executes your program or attaches to a running process
  • multiple output formats (I recommend --format speedscope)
  • configurable sampling rate
Answered By: Hammer
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.