Python/NumPy: Split non-consecutive values into discrete subset arrays

Question:

How can I slice arrays such as this into n-many subsets, where one subset consists of consecutive values?

arr = np.array((0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 39, 40,
       41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 66, 67, 68, 69, 70, 71))
# tells me where they are consecutive
np.where(np.diff(arr) == 1)[0]
# where the break points are
cut_points = np.where(np.diff(arr) != 1)[0] + 1
# wont generalize well with n-many situations
arr[:cut_points[0] ]
arr[cut_points[0] : cut_points[1] ]
arr[cut_points[1] :, ]
Asked By: John Stud

||

Answers:

You can use np.split, and just pass in cut_points as the 2nd argument.

eg.

split_arr = np.split(arr, cut_points)

# split_arr looks like:
# [array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14]),
# array([39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55]),
# array([66, 67, 68, 69, 70, 71])]

full solution:

import numpy as np
arr = np.array((0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 39, 40,
       41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 66, 67, 68, 69, 70, 71))
cut_points = np.where(np.diff(arr) != 1)[0] + 1
split_arr = np.split(arr, split_points)
split_arr
# outputs:
[array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14]),
 array([39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55]),
 array([66, 67, 68, 69, 70, 71])]
Answered By: Haleemur Ali

Just as an alternative way, with no pandas/numpy.

If you don’t care about the order of the input/ouput, you start at the end and do something like:

l = (0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 66, 67, 68, 69, 70, 71)
i = 0
current_index = -1
prev_value = None
result = []
for k in l[::-1]:
    current_value = k + i
    if prev_value != current_value:
        prev_value = current_value
        current_index += 1
        result.append([])
    result[current_index].append(k)
    i += 1
print(result)

Then result will contain:

[
    [71, 70, 69, 68, 67, 66],
    [55, 54, 53, 52, 51, 50, 49, 48, 47, 46, 45, 44, 43, 42, 41, 40, 39],
    [14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
]
Answered By: ChatterOne
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.