Replicating MATLAB's `randperm` in NumPy

Question:

I want to replicate MATLAB’s randperm() with NumPy.

Currently, to get randperm(n, k) I use np.random.permutation(n)[:k]. The problem is that it allocates an array of size n and then keeps only k entries from it.

Is there a more memory efficient way to directly create the array?

Asked By: Royi

Answers:

I recommend np.random.choice(n, k, replace=False), although I am not sure about its memory efficiency.
Please refer to the docs.

Answered By: TaQ

Based on @TaQ's answer:

np.random.choice(n, k, replace=False)

is the equivalent of MATLAB’s randperm(n, k), except that it returns values from 0 to n-1 rather than from 1 to n.
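As a minimal sketch of the index-base difference (the sizes and the variable names idx0 and idx1 are illustrative, not from the original post), the result can be shifted by one when MATLAB-style values are wanted:

```python
import numpy as np

n, k = 10, 4  # illustrative sizes only

# k distinct integers drawn from 0 .. n-1 (NumPy counts from zero)
idx0 = np.random.choice(n, k, replace=False)

# Shift by one to get MATLAB-style values in 1 .. n
idx1 = idx0 + 1

print(idx0, idx1)
```

When the values are used to index a NumPy array, the unshifted idx0 is the one to use.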

Update: I will update his answer as well to mark it.

Answered By: Royi

numpy.random.choice(n, k, replace=False) is no more memory efficient than numpy.random.permutation(n)[:k]. It too creates an n-item temporary array, shuffles it, and takes k items from it.
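One rough way to check this claim is to record peak allocations with the standard-library tracemalloc module. The sketch below assumes that NumPy reports its array buffers to tracemalloc (which, as far as I know, it has done since NumPy 1.13). The helper peak_bytes and the sizes are hypothetical choices for illustration.

```python
import tracemalloc

import numpy as np


def peak_bytes(func):
    """Return the peak number of bytes tracemalloc recorded while func() ran."""
    tracemalloc.start()
    try:
        func()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


n, k = 1_000_000, 10  # illustrative sizes only

# Both legacy calls are expected to allocate on the order of n integers
peak_perm = peak_bytes(lambda: np.random.permutation(n)[:k])
peak_choice = peak_bytes(lambda: np.random.choice(n, k, replace=False))

print(f"permutation(n)[:k]:          {peak_perm:,} bytes at peak")
print(f"choice(n, k, replace=False): {peak_choice:,} bytes at peak")
```

If the assumption holds, both figures should come out at several megabytes for these sizes, even though only k values are kept.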

However, numpy.random.* functions, such as numpy.random.choice and numpy.random.permutation, have become legacy functions as of NumPy 1.17, and their algorithms — inefficiencies and all — are expected to remain as they are for backward compatibility reasons (see the recent RNG policy for NumPy).

Fortunately, since version 1.17 NumPy has an alternative, numpy.random.Generator.choice, which uses a much more efficient implementation, as the timings below show:

In [227]: timeit np.random.choice(4000000, 48, replace = False)
163 ms ± 19.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [228]: timeit np.random.permutation(4000000)[:48]
178 ms ± 22.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [229]: r=numpy.random.default_rng()

In [230]: timeit r.choice(4000000,48,replace=False)
14.5 µs ± 28.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
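For readers without IPython, a comparison along the same lines can be sketched with the standard-library timeit module. The helper best_time and its repeat counts are hypothetical choices, and absolute numbers will differ from machine to machine.

```python
import timeit

import numpy as np

n, k = 4_000_000, 48  # same sizes as in the session above
rng = np.random.default_rng()


def best_time(func, repeat=3, number=1):
    """Best wall-clock time in seconds per call over a few repeats."""
    return min(timeit.repeat(func, repeat=repeat, number=number)) / number


t_legacy = best_time(lambda: np.random.choice(n, k, replace=False))
t_new = best_time(lambda: rng.choice(n, k, replace=False))

print(f"legacy np.random.choice: {t_legacy * 1e3:.3f} ms")
print(f"Generator.choice:        {t_new * 1e3:.3f} ms")
```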

If you use NumPy 1.17 or later, new applications should use the pseudorandom number generation system introduced in that version, including numpy.random.Generator.
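A minimal sketch of what that might look like for both forms of randperm follows. The seed is arbitrary and only there to make the run repeatable, and the variable names are illustrative.

```python
import numpy as np

# Create one Generator and reuse it; the seed value is an arbitrary example
rng = np.random.default_rng(12345)

n, k = 4_000_000, 48

# Counterpart of randperm(n, k): k distinct values from 0 .. n-1
sample = rng.choice(n, k, replace=False)

# Counterpart of randperm(n): a full shuffle of 0 .. n-1 (small n here)
full = rng.permutation(8)

print(sample)
print(full)
```

Both results are 0-based, so add 1 if MATLAB-style values from 1 to n are needed.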

Answered By: Peter O.