# Re-number disjoint sections of an array, by order of appearance

## Question:

Consider an array of contiguous "sections":

``````
import numpy as np

x = np.asarray([
    1, 1, 1, 1,
    9, 9, 9,
    3, 3, 3, 3, 3,
    5, 5, 5,
])
``````

I don’t care about the actual values in the array. I only care that they demarcate disjoint sections of the array. I would like to renumber them so that the first section is all `0`, the second section is all `1`, and so on:

``````
desired = np.asarray([
    0, 0, 0, 0,
    1, 1, 1,
    2, 2, 2, 2, 2,
    3, 3, 3,
])
``````

What is an elegant way to perform this operation? I don’t expect there to be a single best answer, but I think this question could provide interesting opportunities to show off applications of various NumPy and other Python features.

Assume for the sake of this question that the array is 1-dimensional and non-empty.

Here is a naïve but linear-time implementation using `nditer`:

``````
def renumber(arr):
    assert arr.ndim == 1

    val_prev = None  # Arbitrary placeholder
    section_number = 0
    result = np.empty_like(arr, dtype=int)
    with np.nditer(
        [arr, result],
        flags=['c_index'],
        # Operands default to read-only, so mark `result` as writable
        op_flags=[['readonly'], ['writeonly']],
    ) as it:
        for val_curr, res in it:
            # A change of value marks the start of a new section
            if it.index > 0 and val_curr != val_prev:
                section_number += 1
            res[...] = section_number
            val_prev = val_curr
    return result
``````

There are certainly fancier ways to do this, but this implementation should serve as a sensible baseline:

``````
x = np.asarray([1, 1, 1, 1, 9, 9, 9, 3, 3, 3, 3, 3, 5, 5, 5])
desired = np.array([0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3])
np.testing.assert_array_equal(renumber(x), desired)
``````
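As a cross-check for any implementation, a pure-Python reference can be sketched with `itertools.groupby`, which groups runs of consecutive equal items. This is only an illustration, not part of the baseline above; the name `renumber_reference` is made up here.

```python
from itertools import groupby

import numpy as np


def renumber_reference(arr):
    """Label each run of consecutive equal values with 0, 1, 2, ..."""
    arr = np.asarray(arr)
    assert arr.ndim == 1
    labels = []
    # groupby yields one group per run of consecutive equal items
    for section_number, (_, run) in enumerate(groupby(arr.tolist())):
        labels.extend(section_number for _ in run)
    return np.asarray(labels, dtype=int)


x = np.asarray([1, 1, 1, 1, 9, 9, 9, 3, 3, 3, 3, 3, 5, 5, 5])
print(renumber_reference(x).tolist())
# [0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3]
```

It is slower than a vectorized version, but it is handy for testing edge cases such as a single-element array or an array whose first two values differ.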

Note: There is a nicer equivalent of this in another answer.

My other answer essentially consists of comparing every value to the value before it, and incrementing a counter whenever the two differ. This can be implemented in vectorized fashion by taking advantage of the fact that boolean `True` corresponds to integer `1`, and `False` corresponds to `0`.

``````
def renumber(arr):
    assert arr.ndim == 1
    # True wherever a value differs from its predecessor. The first value has
    # no predecessor, so insert False to keep it in section 0.
    return np.cumsum(np.insert(arr[1:] != arr[:-1], 0, False))

x = np.asarray([1, 1, 1, 1, 9, 9, 9, 3, 3, 3, 3, 3, 5, 5, 5])
desired = np.array([0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3])
np.testing.assert_array_equal(renumber(x), desired)
``````
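To make the boolean-to-integer trick concrete, here is a rough step-by-step breakdown on a shorter array. The intermediate names `changed` and `starts` are just for illustration, and the first position is padded with `False` so that it always belongs to section 0.

```python
import numpy as np

x = np.asarray([1, 1, 9, 9, 9, 3])

# True wherever an element differs from its predecessor (length len(x) - 1)
changed = x[1:] != x[:-1]
print(changed.tolist())  # [False, True, False, False, True]

# The first element has no predecessor, so pad with False at position 0
starts = np.insert(changed, 0, False)
print(starts.tolist())  # [False, False, True, False, False, True]

# cumsum treats True as 1, so the running total is the section number
labels = np.cumsum(starts)
print(labels.tolist())  # [0, 0, 1, 1, 1, 2]
```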

Note that this is a little clunky due to the need to `np.insert` the first value. I would be very interested to know if there is a more elegant way to achieve this.

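You can do this by combining `np.cumsum` with `np.diff`: a nonzero difference between neighbours marks the start of a new section, and the cumulative sum counts those starts.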

``````
a = np.cumsum(np.diff(x, prepend=x[0]) != 0)
``````
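Here `prepend=x[0]` makes the first difference zero, so the first element always lands in section 0. Because `np.diff` subtracts neighbours, this form is aimed at numeric arrays. For other dtypes, such as strings, a comparison-based variant should give the same labels. A small sketch, where the function names `renumber_diff` and `renumber_any` are hypothetical:

```python
import numpy as np


def renumber_diff(x):
    # prepend=x[0] makes the first difference 0, so counting starts at 0
    return np.cumsum(np.diff(x, prepend=x[0]) != 0)


def renumber_any(x):
    # Same idea using != directly, so no subtraction is needed
    return np.cumsum(np.concatenate(([False], x[1:] != x[:-1])))


x = np.asarray([1, 1, 1, 1, 9, 9, 9, 3, 3, 3, 3, 3, 5, 5, 5])
desired = np.array([0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3])
np.testing.assert_array_equal(renumber_diff(x), desired)
np.testing.assert_array_equal(renumber_any(x), desired)

names = np.asarray(["a", "a", "b", "c", "c"])
print(renumber_any(names).tolist())  # [0, 0, 1, 2, 2]
```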