Check if a numpy array is sorted

Question:

I have a numpy array and I like to check if it is sorted.

>>> a = np.array([1,2,3,4,5])
array([1, 2, 3, 4, 5])
Asked By: luca

||

Answers:

np.all(a[:-1] <= a[1:])

Examples:

is_sorted = lambda a: np.all(a[:-1] <= a[1:])

>>> a = np.array([1,2,3,4,9])
>>> is_sorted(a)
True

>>> a = np.array([1,2,3,4,3])
>>> is_sorted(a)
False
Answered By: luca

With NumPy tools:

np.all(np.diff(a) >= 0)

but numpy solutions are all O(n).

If you want quick code and very quick conclusion on unsorted arrays:

import numba
@numba.jit
def is_sorted(a):
    for i in range(a.size-1):
         if a[i+1] < a[i] :
               return False
    return True
          

which is Θ(1) on random arrays.

Answered By: B. M.

The inefficient but easy-to-type solution:

(a == np.sort(a)).all()
Answered By: 1''

For completeness, the O(log n) iterative solution is found below. The recursive version is slower and crashes with big vector sizes. However, it is still slower than the native numpy using np.all(a[:-1] <= a[1:]) most likely due to modern CPU optimizations. The only case where the O(log n) is faster is on the “average” random case or if it is “almost” sorted. If you suspect your array is already fully sorted then np.all will be faster.

def is_sorted(a):
    idx = [(0, a.size - 1)]
    while idx:
        i, j = idx.pop(0) # Breadth-First will find almost-sorted in O(log N)
        if i >= j:
            continue
        elif a[i] > a[j]:
            return False
        elif i + 1 == j:
            continue
        else:
            mid = (i + j) >> 1 # Division by 2 with floor
            idx.append((i, mid))
            idx.append((mid, j))
    return True

is_sorted2 = lambda a: np.all(a[:-1] <= a[1:])

Here are the results:

# Already sorted array - np.all is super fast
sorted_array = np.sort(np.random.rand(1000000))

%timeit is_sorted(sorted_array)
659 ms ± 3.28 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit is_sorted2(sorted_array)
431 µs ± 35.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
# Here I included the random in each command so we need to substract it's timing
%timeit np.random.rand(1000000)
6.08 ms ± 17.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit is_sorted(np.random.rand(1000000))
6.11 ms ± 58.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
# Without random part, it took 6.11 ms - 6.08 ms = 30µs per loop

%timeit is_sorted2(np.random.rand(1000000))
6.83 ms ± 75.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
# Without random part, it took 6.83 ms - 6.08 ms = 750µs per loop

Net, a O(n) vector optimized code is better than an O(log n) algorithm, unless you will run >100 million element arrays.

Answered By: Dorian B.
Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.