## Question:

I have tested the timing of the built-in `np.gradient` function against a hard-coded function that computes the gradient using the same central difference scheme. Profiling both with the `cProfile` module, I found that `np.gradient` is significantly faster. This suggests that my implementation of the gradient computation in Python is extremely naive. I would like to understand how `np.gradient` is implemented, so that I can learn how to properly implement math operations in Python. (I am aware that the hard-coded function does not compute exactly the same thing as `np.gradient`, since the boundary values are not included, but this should not affect the results.)

Here is the python code I have used for the test:

```python
import numpy as np
import cProfile

# Create ndarray
N = 1000
f = np.empty((N, N))

# Function to test the built-in gradient function of numpy
def test_grad_np(T, n):
    for s in range(n):
        dTdy, dTdx = np.gradient(T)

# Hard-coded gradient function (central differences, interior points only)
def hc_grad(T):
    Ny, Nx = np.shape(T)
    dTdx = np.zeros((Ny, Nx))
    dTdy = np.zeros((Ny, Nx))
    for j in range(1, Ny-1):
        for i in range(1, Nx-1):
            dTdx[j,i] = (T[j,i+1] - T[j,i-1])/2.
            dTdy[j,i] = (T[j+1,i] - T[j-1,i])/2.
    return dTdx, dTdy

# Function to test the hard-coded gradient function
def test_hc_grad(T, n):
    for s in range(n):
        dTdx, dTdy = hc_grad(T)

cProfile.run('test_hc_grad(f, 20)')
cProfile.run('test_grad_np(f, 20)')
```

and this is the output on my machine:

```
        144 function calls in 16.818 seconds

Ordered by: standard name

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
20    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(shape)
1    0.001    0.001   16.818   16.818 <string>:1(<module>)
20    0.000    0.000    0.000    0.000 fromnumeric.py:1922(_shape_dispatcher)
20    0.000    0.000    0.000    0.000 fromnumeric.py:1926(shape)
20   16.803    0.840   16.804    0.840 test_speed.py:20(hc_grad)
1    0.013    0.013   16.817   16.817 test_speed.py:30(test_hc_grad)
1    0.000    0.000   16.818   16.818 {built-in method builtins.exec}
20    0.000    0.000    0.000    0.000 {built-in method numpy.core._multiarray_umath.implement_array_function}
40    0.001    0.000    0.001    0.000 {built-in method numpy.zeros}
1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

704 function calls (624 primitive calls) in 0.262 seconds

Ordered by: standard name

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
40    0.000    0.000    0.001    0.000 <__array_function__ internals>:2(empty_like)
20    0.000    0.000    0.248    0.012 <__array_function__ internals>:2(gradient)
40    0.000    0.000    0.001    0.000 <__array_function__ internals>:2(ndim)
1    0.001    0.001    0.262    0.262 <string>:1(<module>)
20    0.000    0.000    0.000    0.000 _asarray.py:110(asanyarray)
40    0.000    0.000    0.000    0.000 _asarray.py:23(asarray)
40    0.000    0.000    0.000    0.000 fromnumeric.py:3102(_ndim_dispatcher)
40    0.000    0.000    0.001    0.000 fromnumeric.py:3106(ndim)
40    0.000    0.000    0.000    0.000 function_base.py:798(_gradient_dispatcher)
20    0.245    0.012    0.247    0.012 function_base.py:803(gradient)
40    0.000    0.000    0.000    0.000 multiarray.py:75(empty_like)
40    0.000    0.000    0.000    0.000 numerictypes.py:285(issubclass_)
20    0.000    0.000    0.000    0.000 numerictypes.py:359(issubdtype)
1    0.014    0.014    0.261    0.261 test_speed.py:15(test_grad_np)
1    0.000    0.000    0.262    0.262 {built-in method builtins.exec}
60    0.000    0.000    0.000    0.000 {built-in method builtins.issubclass}
40    0.000    0.000    0.000    0.000 {built-in method builtins.len}
60    0.000    0.000    0.000    0.000 {built-in method numpy.array}
100/20    0.001    0.000    0.248    0.012 {built-in method numpy.core._multiarray_umath.implement_array_function}
40    0.000    0.000    0.000    0.000 {method 'append' of 'list' objects}
1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
``````

If you’re really curious, why not have a look at the `np.gradient` source code?
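
In case it helps, the profile output above already locates the implementation (`function_base.py:803(gradient)`), and you can pull the source up directly from an interactive session. A small sketch (this relies on `inspect` following the `__wrapped__` attribute set by NumPy's dispatch wrapper, which holds for recent NumPy versions):

```python
import inspect
import numpy as np

# Fetch the Python source of np.gradient; the profile above points
# at the same file (numpy/lib/function_base.py).
src = inspect.getsource(np.gradient)
print(src[:300])  # first few lines of the implementation
```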

I’m not an expert on `np.gradient`, but just from playing around with your current implementation, a lot can be improved by using vectorized code. I believe the following should be equivalent:

```python
def hc_grad(T):
    dTdx = np.zeros_like(T)
    dTdy = np.zeros_like(T)

    dTdx[1:-1, 1:-1] = (T[1:-1, 2:] - T[1:-1, :-2]) / 2
    dTdy[1:-1, 1:-1] = (T[2:, 1:-1] - T[:-2, 1:-1]) / 2
    return dTdx, dTdy
```

which already runs a lot faster on my computer.
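
As a quick sanity check (a sketch, not from the original post), the vectorized version can be compared against `np.gradient` on the interior points, where the two should agree; note that `np.gradient` returns the axis-0 (row) derivative first:

```python
import numpy as np

def hc_grad(T):
    # Vectorized central differences; boundary rows/columns stay zero
    dTdx = np.zeros_like(T)
    dTdy = np.zeros_like(T)
    dTdx[1:-1, 1:-1] = (T[1:-1, 2:] - T[1:-1, :-2]) / 2
    dTdy[1:-1, 1:-1] = (T[2:, 1:-1] - T[:-2, 1:-1]) / 2
    return dTdx, dTdy

rng = np.random.default_rng(0)
T = rng.random((100, 100))

dTdx, dTdy = hc_grad(T)
dTdy_np, dTdx_np = np.gradient(T)  # axis 0 first, then axis 1

# Interior points should match
assert np.allclose(dTdx[1:-1, 1:-1], dTdx_np[1:-1, 1:-1])
assert np.allclose(dTdy[1:-1, 1:-1], dTdy_np[1:-1, 1:-1])
```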
