How to check whether a sequence is strictly monotonic, or has exactly one turning point with both sides strictly monotonic?

Question:

Input

l1=[1,3,5,6,7]
l2=[1,2,2,3,4]
l3=[5,4,3,2,1]
l4=[5,5,3,2,1]
l5=[1,2,3,4.1,3,2]
l6=[3,2,1,0.4,1,2,3]
l7=[1,2,10,4,8,9,2]
l8=[1,2,3,4,4,3,2,1]
l9=[-0.05701686,  0.57707936, -0.34602634, -0.02599778]
l10=[ 0.13556905,  0.45859   , -0.34602634, -0.09178798,  0.03044908]
l11=[-0.38643975, -0.09178798,  0.57707936, -0.05701686,  0.00649252]

Notice: The values in the sequences are floats.

Expected

  • Write a function find_targeted_seq that returns whether a sequence is strictly monotonic, or has exactly one turning point with both sides strictly monotonic. For example, l1, l3, l5 and l6 are expected.

Try

Asked By: Jack


Answers:

I use a combination of np.sign and np.diff to check for ascending/descending parts of the sequence. From your examples l2 and l4, repeated elements (where the diff is 0) count as neither ascending nor descending; these are rejected in the first if-else clause. Otherwise, np.sign(np.diff(x)) consists only of -1’s and 1’s, corresponding to descending/ascending parts. We apply np.diff a second time to count the turning points and return True/False accordingly.

See the attached code 🙂

import numpy as np
  
def legal_seq(seq):
    arr = np.array(seq)
    diffs = np.diff(arr) 
    sgn_diff = np.sign(diffs)
    if np.isin(0, sgn_diff): # if the difference is 0, we reject the seq
        return False
    else:
        sgn_diff2 = np.diff(sgn_diff) # sgn_diff contains only 1's and -1's; applying np.diff again marks where the sign changes
        num_turning_points = len(np.where(sgn_diff2)[0]) # np.where finds the non-zero elements; each non-zero element of sgn_diff2 is a turning point
        if num_turning_points < 2: # 0 turning points: monotonic; 1 turning point: still valid; otherwise reject
            return True
        else:
            return False

## TESTING ##
l1=[1,3,5,6,7]
l2=[1,2,2,3,4]
l3=[5,4,3,2,1]
l4=[5,5,3,2,1]
l5=[1,2,3,4.1,3,2]
l6=[3,2,1,0.4,1,2,3]
l7=[1,2,10,4,8,9,2]
l8=[1,2,3,4,4,3,2,1]
l9=[-0.05701686,  0.57707936, -0.34602634, -0.02599778]
l10=[ 0.13556905,  0.45859   , -0.34602634, -0.09178798,  0.03044908]
l11=[-0.38643975, -0.09178798,  0.57707936, -0.05701686,  0.00649252]

ls = [l1, l2, l3, l4, l5, l6, l7, l8, l9, l10, l11]
for i,l in enumerate(ls):
    print(i + 1, legal_seq(l))

This outputs:

1 True
2 False
3 True
4 False
5 True
6 True
7 False
8 False
9 False
10 False
11 False
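
To make the counting concrete, here is a quick sketch of the intermediate arrays for l5 (the expected values are shown in the comments):

```python
import numpy as np

diffs = np.diff([1, 2, 3, 4.1, 3, 2])   # consecutive differences
sgn_diff = np.sign(diffs)               # [ 1.  1.  1. -1. -1.]
sgn_diff2 = np.diff(sgn_diff)           # [ 0.  0. -2.  0.]
num_turning_points = len(np.where(sgn_diff2)[0])  # one non-zero entry
print(num_turning_points)               # 1
```

A single turning point, so l5 is accepted.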
Answered By: yonatansc97

IIUC, you have 3 cases:

  • Only a strictly increasing monotonic sequence that covers the whole sequence
  • Only a strictly decreasing monotonic sequence that covers the whole sequence
  • Both a strictly increasing and a strictly decreasing monotonic part, whose union covers the whole sequence

So you could do the following:

from scipy.signal import argrelmin, argrelmax
import pandas as pd


def is_strictly_monotonic_increasing(s):
    return s.is_unique and s.is_monotonic_increasing


def is_strictly_monotonic_decreasing(s):
    return s.is_unique and s.is_monotonic_decreasing


def find_targeted_seq(lst):
    ser = pd.Series(lst)

    if is_strictly_monotonic_increasing(ser) or is_strictly_monotonic_decreasing(ser):
        return True

    minima, *_ = argrelmin(ser.values)
    if len(minima) == 1:  # only one minimum turning point
        idx = minima[0]
        return is_strictly_monotonic_decreasing(ser[:idx]) and is_strictly_monotonic_increasing(ser[idx:])

    maxima, *_ = argrelmax(ser.values)
    if len(maxima) == 1:  # only one maximum turning point
        idx = maxima[0]
        return is_strictly_monotonic_increasing(ser[:idx]) and is_strictly_monotonic_decreasing(ser[idx:])

    return False


data = [[1, 3, 5, 6, 7],  # l1
        [1, 2, 2, 3, 4],  # l2
        [5, 4, 3, 2, 1],  # l3
        [5, 5, 3, 2, 1],  # l4
        [1, 2, 3, 4.1, 3, 2],  # l5
        [3, 2, 1, 0.5, 1, 2],  # this sequence was added in addition to the original examples
        [3, 2, 1, 0.4, 1, 2, 3],  # l6
        [1, 2, 10, 4, 8, 9, 2],  # l7
        [1, 2, 3, 4, 4, 3, 2, 1],  # l8
        [-0.05701686, 0.57707936, -0.34602634, -0.02599778],  # l9
        [0.13556905, 0.45859, -0.34602634, -0.09178798, 0.03044908],  # l10
        [-0.38643975, -0.09178798, 0.57707936, -0.05701686, 0.00649252]]  # l11

for datum in data:
    print(datum, find_targeted_seq(datum))

Output

[1, 3, 5, 6, 7] True
[1, 2, 2, 3, 4] False
[5, 4, 3, 2, 1] True
[5, 5, 3, 2, 1] False
[1, 2, 3, 4.1, 3, 2] True
[3, 2, 1, 0.5, 1, 2] True
[3, 2, 1, 0.4, 1, 2, 3] True
[1, 2, 10, 4, 8, 9, 2] False
[1, 2, 3, 4, 4, 3, 2, 1] False
[-0.05701686, 0.57707936, -0.34602634, -0.02599778] False
[0.13556905, 0.45859, -0.34602634, -0.09178798, 0.03044908] False
[-0.38643975, -0.09178798, 0.57707936, -0.05701686, 0.00649252] False
Answered By: Dani Mesejo

Neither Python’s standard library nor NumPy has a specific primitive for this task.
However, the traditional way in NumPy is to look into differences using np.diff().

To investigate turning points, you could use np.argmin() and np.argmax(), respectively.

The strict monotonicity condition corresponds to: np.all(arr[1:] > arr[:-1]) (increasing) or np.all(arr[1:] < arr[:-1]) (decreasing).
The requirement of one turning point (pivot) is equivalent to finding that turning point and checking that each side of the sequence is separately strictly monotonic.
For multiple consecutive minima or maxima, it is enough to take the first minimum or maximum encountered and check the left branch up to (but excluding) it: a second consecutive minimum or maximum then remains in the right branch, where it violates strict monotonicity and is correctly rejected.

Hence, a simple implementation follows:

import numpy as np


def find_targeted_seq_np(seq):
    arr = np.asarray(seq)
    incr = arr[1:] > arr[:-1]
    decr = arr[1:] < arr[:-1]
    if np.all(incr) or np.all(decr):
        return True
    maximum = np.argmax(arr)
    if np.all(incr[:maximum]) and np.all(decr[maximum:]):
        return True
    minimum = np.argmin(arr)
    if np.all(decr[:minimum]) and np.all(incr[minimum:]):
        return True
    return False

(This is fundamentally the same idea as in @DaniMesejo’s answer).
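
As a quick check of the consecutive-extrema handling, here is a sketch of the intermediate arrays (the plateau sequence is my own example, not from the question):

```python
import numpy as np

arr = np.array([1, 2, 3, 3, 2, 1])  # plateau at the maximum
incr = arr[1:] > arr[:-1]           # [ True  True False False False]
decr = arr[1:] < arr[:-1]           # [False False False  True  True]
m = np.argmax(arr)                  # 2: the first of the two maxima
# the plateau step stays in the right branch, where it breaks strict decrease
print(bool(np.all(incr[:m]) and np.all(decr[m:])))  # False
```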


Another option would be to use a combination of np.diff(), np.sign() and np.count_nonzero() to count the number of times the monotonicity changes. If this is 0 or 1, the sequence is valid. Rejecting repeated elements is built into the counting of the sign changes, except when the repeats are at the beginning or the end of the sequence, a situation that must be checked explicitly.
This leads to a very concise solution:

import numpy as np


def find_targeted_seq_np2(seq):
    diffs = np.diff(seq)
    return (
        diffs[0] != 0 and diffs[-1] != 0
        and np.count_nonzero(np.diff(np.sign(diffs))) < 2)

(This is fundamentally the same idea as in @yonatansc97’s answer, but without using np.isin() as suggested in the comments by @DaniMesejo).
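
To see why the explicit boundary check is needed, consider a sequence with a repeated element at the end (my own example):

```python
import numpy as np

seq = [3, 2, 1, 1]                   # trailing repeated element
diffs = np.diff(seq)                 # [-1 -1  0]
# only 1 sign change, so the change count alone would accept the sequence
changes = np.count_nonzero(np.diff(np.sign(diffs)))
print(bool(diffs[0] != 0 and diffs[-1] != 0 and changes < 2))  # False
```

An interior repeat produces two sign changes around the zero and is rejected by the count, but a boundary repeat produces only one, so it must be caught by the `diffs[0] != 0 and diffs[-1] != 0` test.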


Alternatively, one can consider using explicit looping.
This has the advantage of being considerably more memory efficient and has much better short-circuiting properties:

def find_targeted_seq(seq):
    n = len(seq)
    changes = 0
    x = seq[1]
    last_x = seq[0]
    if x > last_x:
        monotonic = 1
    elif x < last_x:
        monotonic = -1
    else:  # x == last_x
        return False
    for i in range(1, n):
        x = seq[i]
        if x == last_x:
            return False
        elif (x > last_x and monotonic == -1) or (x < last_x and monotonic == 1):
            changes += 1
            monotonic = -monotonic
        if changes > 1:
            return False
        last_x = x
    return True

Additionally, if the type stability of the elements of the sequence can be guaranteed, then it can be easily accelerated via Numba:

import numpy as np
import numba as nb


_find_targeted_seq_nb = nb.njit(find_targeted_seq)


def find_targeted_seq_nb(seq):
    return _find_targeted_seq_nb(np.array(seq))

For comparison, here is an implementation using pandas (which provides some primitives for monotonicity checks) and scipy.signal.argrelmin()/scipy.signal.argrelmax() for finding turning points (this code is substantially the same as in @DaniMesejo’s answer):

from scipy.signal import argrelmin, argrelmax
import pandas as pd


def is_strictly_monotonic_increasing(s):
    return s.is_unique and s.is_monotonic_increasing


def is_strictly_monotonic_decreasing(s):
    return s.is_unique and s.is_monotonic_decreasing


def find_targeted_seq_pd(lst):
    ser = pd.Series(lst)
    if is_strictly_monotonic_increasing(ser) or is_strictly_monotonic_decreasing(ser):
        return True
    minima, *_ = argrelmin(ser.values)
    if len(minima) == 1:  # only one minimum turning point
        idx = minima[0]
        return is_strictly_monotonic_decreasing(ser[:idx]) and is_strictly_monotonic_increasing(ser[idx:])
    maxima, *_ = argrelmax(ser.values)
    if len(maxima) == 1:  # only one maximum turning point
        idx = maxima[0]
        return is_strictly_monotonic_increasing(ser[:idx]) and is_strictly_monotonic_decreasing(ser[idx:])
    return False

These solutions applied to the given input all do give the correct results:

data = (
    ((1, 3, 5, 6, 7), True),  # l1
    ((1, 2, 2, 3, 4), False),  # l2
    ((5, 4, 3, 2, 1), True),  # l3
    ((5, 5, 3, 2, 1), False),  # l4
    ((1, 2, 3, 4.1, 3, 2), True),  # l5
    ((3, 2, 1, 0.5, 1, 2), True),  # this sequence was added in addition to the original examples
    ((3, 2, 1, 0.4, 1, 2, 3), True),  # l6
    ((1, 2, 10, 4, 8, 9, 2), False),  # l7
    ((1, 2, 3, 4, 4, 3, 2, 1), False),  # l8
    ((-0.05701686, 0.57707936, -0.34602634, -0.02599778), False),  # l9
    ((0.13556905, 0.45859, -0.34602634, -0.09178798, 0.03044908), False),  # l10
    ((-0.38643975, -0.09178798, 0.57707936, -0.05701686, 0.00649252), False),  # l11
)


funcs = find_targeted_seq_np, find_targeted_seq_np2, find_targeted_seq_pd, find_targeted_seq, find_targeted_seq_nb

for func in funcs:
    print(func.__name__, all(func(seq) == result for seq, result in data))
# find_targeted_seq_np True
# find_targeted_seq_np2 True
# find_targeted_seq_pd True
# find_targeted_seq True
# find_targeted_seq_nb True

Time-wise, some simple benchmarks on the proposed data clearly indicate that direct looping (with or without Numba acceleration) is the fastest.
The second NumPy approach is significantly faster than the first, while the pandas-based approach is the slowest:

for func in funcs:
    print(func.__name__)
    %timeit [func(seq) == result for seq, result in data]
    print()

# find_targeted_seq_np
# 1000 loops, best of 3: 530 µs per loop

# find_targeted_seq_np2
# 10000 loops, best of 3: 187 µs per loop

# find_targeted_seq_pd
# 100 loops, best of 3: 4.68 ms per loop

# find_targeted_seq
# 100000 loops, best of 3: 14.6 µs per loop

# find_targeted_seq_nb
# 10000 loops, best of 3: 19.9 µs per loop

While direct looping is faster than the NumPy-based approaches on the given input, the latter should scale better with larger input sizes. The Numba approach is likely to be faster than the NumPy approaches at all scales.

Answered By: norok2