Is there a numpy function for generating sequences similar to R's seq function?

Question

In R, you can create a sequence by specifying the start point, end point, and desired length of the output:

seq(1, 1.5, length.out=10)
# [1] 1.000000 1.055556 1.111111 1.166667 1.222222 1.277778 1.333333 1.388889 1.444444 1.500000

In Python, you can use the numpy arange function in a similar way, but there’s no easy way to specify the output length. The best I can come up with:

np.append(np.arange(1, 1.5, step = (1.5-1)/9), 1.5)
# array([ 1.        ,  1.05555556,  1.11111111,  1.16666667,  1.22222222, 1.27777778,  1.33333333,  1.38888889,  1.44444444,  1.5       ])

Is there a cleaner way to perform this operation?

Asked By: C_Z_

Answer 1

Yes! An easy way to do this is to use numpy.linspace.

Numpy Docs

numpy.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None)

Return evenly spaced numbers over a specified interval.
Returns num evenly spaced samples, calculated over the interval [start, stop].
The endpoint of the interval can optionally be excluded.

Example:

[In 1] np.linspace(start=0, stop=50, num=5)

[Out 1] array([  0. ,  12.5,  25. ,  37.5,  50. ])

Notice that the values are evenly spaced: with num=5 and the endpoint included, the interval from start to stop is divided into num - 1 = 4 equal steps of 12.5.
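Applied to the example from the question, a minimal sketch could look like the following. The variable names (values, step, half_open) are only illustrative; retstep and endpoint are the parameters from the signature quoted above.

```python
import numpy as np

# Counterpart of R's seq(1, 1.5, length.out=10).
# retstep=True additionally returns the spacing between samples.
values, step = np.linspace(1, 1.5, num=10, retstep=True)
print(values)
print(step)  # spacing is (1.5 - 1) / (10 - 1)

# endpoint=False leaves out the stop value, so the interval is
# divided into num steps instead of num - 1.
half_open = np.linspace(1, 1.5, num=10, endpoint=False)
print(half_open)
```

With the endpoint included, the first and last samples are exactly start and stop, which is what length.out gives in R.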

For those having problems installing numpy (a problem less common these days), you might look into using anaconda (or miniconda), or some other similar distribution.

Answered By: PaulG

Answer 2

@PaulG’s answer works very well for generating a series of floating point numbers. If you are looking for the R equivalent of 1:5 to create a numpy vector containing 5 integer elements, use:

a = np.array(range(0,5))
a
# array([0, 1, 2, 3, 4])

a.dtype
# dtype('int64')

In contrast to R vectors, Python lists and numpy arrays are zero-indexed. In general you will use np.array(range(n)), or equivalently np.arange(n), which returns the values from 0 to n-1.
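If the values themselves should match R's 1:5 (1 through 5 rather than 0 through 4), np.arange with explicit bounds is one option. This is a small sketch; the variable names are only illustrative, and the stop value is exclusive in numpy.

```python
import numpy as np

# R: 1:5 (numpy excludes the stop value, so pass 6).
one_to_five = np.arange(1, 6)

# R: seq(2, 10, by=2)
evens = np.arange(2, 11, 2)

# Same values as np.array(range(0, 5)).
zero_based = np.arange(5)

print(one_to_five, evens, zero_based)
```

For non-integer steps, np.linspace is usually the safer choice, because the length of an np.arange result can be affected by floating point round-off.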

Answered By: Paul Rougieux

Answer 3

As an alternative (and for those interested), if one wanted the functionality of seq(start, end, by, length.out) from R, the following function provides the full functionality.

def seq(start, end, by = None, length_out = None):
    # At least one of by / length_out is required.
    if by is None and length_out is None:
        raise ValueError('At least by or length_out must be provided')
    eps = 1e-14
    if by is not None:
        if abs(by) < eps:
            raise ValueError('by must be non-zero.')
        # Swap start and end if their order contradicts the sign of by.
        if (start > end and by > 0) or (start < end and by < 0):
            start, end = end, start
        # Compute the width after the swap so it matches the final direction.
        width = abs(end - start)
        absby = abs(by)
        if absby - width < eps:
            # Number of whole steps that fit, plus one for the start value.
            # eps guards against round-off such as 0.3 / 0.1 = 2.9999999999999996.
            length_out = int(width / absby + eps) + 1
        else:
            # by is too great, we assume by is actually length_out
            length_out = int(by)
            by = (end - start) / (length_out - 1)
    else:
        length_out = int(length_out)
        by = (end - start) / (length_out - 1)
    return [float(start) + by * i for i in range(length_out)]

This function is slower than numpy.linspace (which is roughly 4x-5x faster), but with numba we can obtain a function that is about 2x as fast as np.linspace while keeping the syntax from R.

from numba import jit
@jit(nopython = True, fastmath = True)
def seq(start, end, by = None, length_out = None):
    [function body]

And we can execute this just like we would in R.

seq(3, 5, 0.3)
#out (rounded): [3.0, 3.3, 3.6, 3.9, 4.2, 4.5, 4.8]

The implementation above also tolerates (somewhat) a swap between ‘by’ and ‘length_out’: when by is larger than the interval, it is treated as length_out.

seq(0, 5, 10)
#out: [0.0,
 0.5555555555555556,
 1.1111111111111112,
 1.6666666666666667,
 2.2222222222222223,
 2.7777777777777777,
 3.3333333333333335,
 3.8888888888888893,
 4.444444444444445,
 5.0]
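For comparison, here is a hedged sketch of the same idea built directly on numpy instead of a Python loop. The name seq_np and the 1e-10 tolerance are assumptions made for this illustration; they are not part of numpy or of the function above, and the by/length_out swap is deliberately left out.

```python
import numpy as np

def seq_np(start, end, by=None, length_out=None):
    """Hypothetical R-like seq helper built on numpy (illustrative only)."""
    if length_out is not None:
        # Fixed number of points, both endpoints included.
        return np.linspace(start, end, num=int(length_out))
    if by is None:
        raise ValueError("At least by or length_out must be provided")
    if by == 0:
        raise ValueError("by must be non-zero.")
    # Whole steps that fit between start and end; the small tolerance
    # (an assumption) guards against floating point round-off.
    n_steps = int(np.floor((end - start) / by + 1e-10))
    if n_steps < 0:
        raise ValueError("sign of by does not match the direction start -> end")
    return start + by * np.arange(n_steps + 1)

print(seq_np(1, 1.5, length_out=10))
print(seq_np(0, 5, 0.3))
print(seq_np(5, 0, -1))
```

Unlike the function above, this sketch raises an error when the sign of by contradicts the direction from start to end, which is closer to what R itself does.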

Benchmarks:

%timeit -r 100 py_seq(0.5, 1, 1000) #Python no jit
133 µs ± 20.9 µs per loop (mean ± std. dev. of 100 runs, 1000 loops each)

%timeit -r 100 seq(0.5, 1, 1000) #adding @jit(nopython = True, fastmath = True) prior to function definition
20.1 µs ± 2 µs per loop (mean ± std. dev. of 100 runs, 10000 loops each)

%timeit -r 100 linspace(0.5, 1, 1000)
46.2 µs ± 6.11 µs per loop (mean ± std. dev. of 100 runs, 10000 loops each)

Answered By: Oliver

Answer 4

You can find more examples here; the page covers many R functions and their counterparts in the numpy package.

Answered By: Walid Bousseta
