Recurrence with numpy

Question:

Is there any way to implement a recurrence in numpy without using for loops?

Using np.add with the out keyword does the trick with an integer dtype:

import numpy as np
N = 100

fib = np.zeros(N, dtype=int)
fib[:2] = 1
np.add(fib[:-2], fib[1:-1], out=fib[2:])

print(fib[:10])

output: [ 1 1 2 3 5 8 13 21 34 55]

However, if the dtype is changed to float:

import numpy as np
N = 100

fib = np.zeros(N, dtype=float)
fib[:2] = 1.
np.add(fib[:-2], fib[1:-1], out=fib[2:])

print(fib[:10])

output: [ 1. 1. 2. 1. 3. 1. 4. 1. 5. 1.]

Could someone tell me why? Or is there another way to implement this recurrence?

Asked By: pablo2303


Answers:

One maybe not super-efficient but fun solution would be to abuse scipy.linalg.solve_banded, like so:

import numpy as np
from scipy import linalg

N = 50
# Banded form for (l, u) = (2, 0): row 0 is the main diagonal (all 1),
# rows 1 and 2 are the first and second subdiagonals (all -1).
a = np.zeros((N,)) + np.array([[1, -1, -1]]).T
b = np.zeros((N,))
b[0] = 1  # encodes the initial condition F_0 = 1
linalg.solve_banded((2, 0), a, b)
# array([  1.00000000e+00,   1.00000000e+00,   2.00000000e+00,
#          3.00000000e+00,   5.00000000e+00,   8.00000000e+00,
#          1.30000000e+01,   2.10000000e+01,   3.40000000e+01,
#          5.50000000e+01,   8.90000000e+01,   1.44000000e+02,
#          2.33000000e+02,   3.77000000e+02,   6.10000000e+02,
#          9.87000000e+02,   1.59700000e+03,   2.58400000e+03,
#          4.18100000e+03,   6.76500000e+03,   1.09460000e+04,
#          1.77110000e+04,   2.86570000e+04,   4.63680000e+04,
#          7.50250000e+04,   1.21393000e+05,   1.96418000e+05,
#          3.17811000e+05,   5.14229000e+05,   8.32040000e+05,
#          1.34626900e+06,   2.17830900e+06,   3.52457800e+06,
#          5.70288700e+06,   9.22746500e+06,   1.49303520e+07,
#          2.41578170e+07,   3.90881690e+07,   6.32459860e+07,
#          1.02334155e+08,   1.65580141e+08,   2.67914296e+08,
#          4.33494437e+08,   7.01408733e+08,   1.13490317e+09,
#          1.83631190e+09,   2.97121507e+09,   4.80752698e+09,
#          7.77874205e+09,   1.25862690e+10])

How this works is that we write F_0, F_1, F_2, … as one vector F and the identities -F_{i-1} - F_i + F_{i+1} = 0 as a matrix which is nicely banded, and then solve for F. Note that this can be adapted to other similar recurrences, as sketched below.
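For instance, here is a minimal sketch (my addition, reusing the same banded-matrix trick) for the Tribonacci recurrence T_{i+1} = T_i + T_{i-1} + T_{i-2} with T_0 = 0, T_1 = T_2 = 1:

import numpy as np
from scipy import linalg

N = 10
# Banded form for (l, u) = (3, 0): row 0 is the main diagonal,
# rows 1-3 are the first, second and third subdiagonals.
ab = np.zeros((4, N)) + np.array([[1, -1, -1, -1]]).T
b = np.zeros(N)
b[1] = 1  # together with b[0] = 0 this encodes T_0 = 0, T_1 = 1
linalg.solve_banded((3, 0), ab, b)
# array([ 0.,  1.,  1.,  2.,  4.,  7., 13., 24., 44., 81.])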

Answered By: Paul Panzer

Here is a reasonably efficient method using a scipy filter:

import numpy as np
from scipy import signal

def scipy_fib(n):
    # Impulse input: the filter's impulse response is the Fibonacci sequence.
    x = np.zeros(n, dtype=np.uint8)
    x[0] = 1

    # Transfer function 1 / (1 - z**-1 - z**-2), i.e. y[n] = x[n] + y[n-1] + y[n-2],
    # converted to second-order sections form.
    sos = signal.tf2sos([1], [1, -1, -1])
    return signal.sosfilt(sos, x)

(Note that we can’t use lfilter because in terms of signal processing this is an unstable filter which makes the Python interpreter crash, so we have to convert to second-order sections form.)
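As a quick sanity check (my addition), the impulse response of this filter should reproduce the sequence, up to floating-point rounding:

print(scipy_fib(10))
# [ 1.  1.  2.  3.  5.  8. 13. 21. 34. 55.]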

The problem with the filtering approach is that I’m not sure if it can be generalized to the actual problem of solving an ODE.

Another solution is to simply write the loop in Python and accelerate it with a JIT compiler:

import numpy as np
import numba as nb

@nb.jit(nopython=True)
def numba_fib(n):
    y = np.empty(n)
    y[:2] = 1

    for i in range(2, n):
        y[i] = y[i-1] + y[i-2]

    return y

The advantage of the JIT approach is that you can implement all kinds of fancy stuff, but it works best if you stick to simple loops and branches without calling any (non-JITted) Python functions.
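For example, here is a sketch (my generalization, not from the original answer) of the same JITted loop for an arbitrary linear recurrence given its coefficients and initial values:

import numpy as np
import numba as nb

@nb.jit(nopython=True)
def numba_linrec(n, coeffs, init):
    # y[i] = coeffs[0]*y[i-1] + coeffs[1]*y[i-2] + ...
    y = np.empty(n)
    y[:len(init)] = init
    for i in range(len(init), n):
        acc = 0.0
        for k in range(len(coeffs)):
            acc += coeffs[k] * y[i - 1 - k]
        y[i] = acc
    return y

numba_linrec(10, np.array([1.0, 1.0]), np.array([1.0, 1.0]))
# array([ 1.,  1.,  2.,  3.,  5.,  8., 13., 21., 34., 55.])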

Timing results:

# Baseline: the pure-Python loop (numba_fib without the @nb.jit decorator):
%timeit numba_fib(10000)  # 100 loops, best of 3: 5.47 ms per loop

%timeit scipy_fib(10000)  # 1000 loops, best of 3: 719 µs per loop
%timeit numba_fib(10000)  # 10000 loops, best of 3: 33.8 µs per loop

# For fun, this is how Paul Panzer's solve_banded approach compares:
%timeit banded_fib(10000)  # 1000 loops, best of 3: 1.33 ms per loop

The scipy filter approach is faster than the pure Python loop but slower than the JITted loop. I guess the filter function is relatively generic and does stuff we don’t need here, while the JITted loop compiles down to a very small loop.

Answered By: MB-F

Since linear recurrences have analytic solutions (Binet's formula, in the Fibonacci case), another fast numpy method is:

import math

import numpy as np

def fib_scipy2(N):
    # Binet's formula: F_n = (phi**n - psi**n) / sqrt(5), where psi = -1/phi.
    r5 = math.sqrt(5)
    phi = (1 + r5) / 2
    # a[k] = (-phi)**(k+1), so np.abs(a) gives phi**n and 1/a gives psi**n.
    a = (-phi * np.ones(N)).cumprod()
    return (np.abs(a) - 1 / a) / r5

Runs:

In [413]: fib_scipy2(8)
Out[413]: array([  1.,   1.,   2.,   3.,   5.,   8.,  13.,  21.])
In [414]: %timeit fib_scipy2(10**4)
103 µs ± 888 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

You can tune it to any linear recurrence, as sketched below.
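For instance, here is a rough sketch (my generalization, not from the answer above) of that idea: take the roots of the characteristic polynomial, fit the mixture weights to the initial values, and evaluate the closed form:

import numpy as np

def linrec_closed_form(coeffs, init, N):
    # Closed form for y[i] = coeffs[0]*y[i-1] + ... + coeffs[k-1]*y[i-k]
    # (assumes the characteristic roots are distinct).
    k = len(coeffs)
    # Roots of r**k - coeffs[0]*r**(k-1) - ... - coeffs[k-1] = 0.
    r = np.roots([1.0] + [-c for c in coeffs])
    # Fit weights w so that sum_j w[j]*r[j]**i matches the initial values.
    V = np.vander(r, k, increasing=True).T    # V[i, j] = r[j]**i
    w = np.linalg.solve(V, np.asarray(init, dtype=complex))
    n = np.arange(N)
    return ((r[None, :] ** n[:, None]) @ w).real

linrec_closed_form([1.0, 1.0], [1.0, 1.0], 10)
# approximately array([ 1.,  1.,  2.,  3.,  5.,  8., 13., 21., 34., 55.])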

Answered By: B. M.

Your add works for this calculation, but has to be applied repeatedly, so that the nonzero values propagate. I don’t see how your calculation generated [ 1. 1. 2. 1. 3. 1. 4. 1. 5. 1.].

In [619]: fib=np.zeros(10,int);fib[:2]=1
In [620]: for _ in range(10):
     ...:     np.add(fib[:-2], fib[1:-1], out=fib[2:])
     ...:     print(fib)
     ...: 
[1 1 2 1 0 0 0 0 0 0]   # **
[1 1 2 3 3 1 0 0 0 0]
[1 1 2 3 5 6 4 1 0 0]
[ 1  1  2  3  5  8 11 10  5  1]
[ 1  1  2  3  5  8 13 19 21 15]
 ...

(Edit: note that the first np.add acts as though it is fully buffered. Compare the result at ** with both your int and float arrays. I'm using version 1.13.1 on Py3.)

Or, to highlight the good values at each stage:

In [623]: fib=np.zeros(20,int);fib[:2]=1
In [624]: for i in range(10):
     ...:     np.add(fib[:-2], fib[1:-1], out=fib[2:])
     ...:     print(fib[:(i+2)])
     ...: 
[1 1]
[1 1 2]
[1 1 2 3]
[1 1 2 3 5]
[1 1 2 3 5 8]
[ 1  1  2  3  5  8 13]
[ 1  1  2  3  5  8 13 21]
[ 1  1  2  3  5  8 13 21 34]
[ 1  1  2  3  5  8 13 21 34 55]
[ 1  1  2  3  5  8 13 21 34 55 89]

fib[2:] = fib[:-2]+fib[1:-1] does the same thing.

As discussed in the documentation for ufunc.at, operations like np.add use buffering, even with the out parameter. So while they do iterate in C-level code, they don't accumulate; that is, results from the i-th step aren't used at the (i+1)-th step.

add.at can be used to perform an unbuffered a[i] += b. That's handy when the indexes contain duplicates, but it doesn't allow for feedback from the changed a values to b, so it isn't useful here.

There is also an add.accumulate (aka np.cumsum), which can be handy for certain iterative definitions but is hard to apply in general cases.
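For example (my addition), a recurrence that accumulate does handle directly is the plain running sum y[i] = y[i-1] + x[i]:

import numpy as np

x = np.arange(5)
np.add.accumulate(x)    # same as np.cumsum(x)
# array([ 0,  1,  3,  6, 10])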

cumprod (multiply accumulate) can work with the qmatrix approach, as in:

Numpy's matrix_power function giving wrong results for large exponents

Use np.matrix to define the qmatrix, so that * means matrix product (as opposed to elementwise multiplication):

In [647]: qmatrix = np.matrix([[1, 1], [1, 0]])

Populate an object array with references (pointers, not copies) to this matrix.

In [648]: M = np.empty(10, object)
In [649]: M[:] = [qmatrix for _ in range(10)]

Apply cumprod, and extract one element from each matrix.

In [650]: m1=np.cumprod(M)
In [651]: m1
Out[651]: 
array([matrix([[1, 1],
        [1, 0]]), matrix([[2, 1],
        [1, 1]]),
       matrix([[3, 2],
        [2, 1]]), matrix([[5, 3],
        [3, 2]]),
       matrix([[8, 5],
        [5, 3]]),
       matrix([[13,  8],
        [ 8,  5]]),
       matrix([[21, 13],
        [13,  8]]),
       matrix([[34, 21],
        [21, 13]]),
       matrix([[55, 34],
        [34, 21]]),
       matrix([[89, 55],
        [55, 34]])], dtype=object)
In [652]: [x[0,1] for x in m1]
Out[652]: [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]

The linked answer uses numpy.linalg.matrix_power which also does repeated matrix products. For a single power, matrix_power is faster, but for a whole range of powers, the cumprod is faster.
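For comparison (my addition), a single Fibonacci number via matrix_power looks like this:

import numpy as np

np.linalg.matrix_power(np.array([[1, 1], [1, 0]]), 10)[0, 1]
# 55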

Answered By: hpaulj