numpy get indexes of connected array values
Question:
I have a 1d numpy array that looks like this:
a = np.array([1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1])
Is there a way to get the start and end indexes of each cluster of values? Basically, I would like to get this:
[
# clusters with value 1 (cluster with values 0 aren't needed)
[
# start and end of each cluster
[0, 2],
[8, 11],
[13, 14],
],
]
I’m not very skilled with numpy. I know there are lots of cool functions, but I have no idea which ones to use. Googling this problem didn’t turn up anything either, since people usually have pretty specific problems that are different from mine. I know that, for example, np.split won’t be enough here.
Please help me if you can. I can provide more examples or details if needed, and I’ll try to respond as quickly as possible. Thank you for your time.
Answers:
Maybe this is what you want? Try it and see if it helps:
import numpy as np
a = np.array([1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1])
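# pad with zeros so clusters touching either edge are detected (assumes a contains only 0s and 1s)
d = np.diff(np.concatenate(([0], a, [0])))
# find the start of each cluster (0 -> 1 transition)
starts = np.where(d == 1)[0]
# find the end of each cluster (1 -> 0 transition), inclusive
ends = np.where(d == -1)[0] - 1
# combine start and end indexes into a list of (start, end) tuples
clusters = list(zip(starts.tolist(), ends.tolist()))
print(clusters)  # [(0, 2), (8, 11), (13, 14)]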
2nd version as requested:
# find the indexes where the value of a changes
change_idxs = np.flatnonzero(np.diff(a))
# add the start and end indexes of the array as boundaries
boundaries = np.concatenate(([0], change_idxs + 1, [len(a)])).tolist()
# runs alternate between 0s and 1s, so take every second run, starting at the first run of 1s
# (assumes a is non-empty and contains only 0s and 1s)
first = 0 if a[0] == 1 else 1
clusters = [(boundaries[i], boundaries[i + 1] - 1) for i in range(first, len(boundaries) - 1, 2)]
print(clusters)  # [(0, 2), (8, 11), (13, 14)]
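If you need this in more than one place, the same pad-and-diff idea can be wrapped in a small helper. This is only a sketch under a few assumptions: the name find_runs and its value parameter are made up for illustration (they are not part of NumPy), the input is one-dimensional, and the end index is inclusive, as in the question.

```python
import numpy as np

def find_runs(a, value=1):
    """Return an (n, 2) array of inclusive [start, end] indexes for each run of `value`.

    Hypothetical helper for illustration; not a NumPy function.
    """
    mask = np.asarray(a) == value
    # Pad with False on both sides so runs touching either edge still produce a transition.
    padded = np.concatenate(([False], mask, [False]))
    # Cast to a signed integer type so diff yields +1 / -1 at the run boundaries.
    edges = np.diff(padded.astype(np.int8))
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1) - 1
    return np.column_stack((starts, ends))

a = np.array([1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1])
print(find_runs(a).tolist())  # [[0, 2], [8, 11], [13, 14]]
```

Comparing against value first means the array does not have to be strictly 0/1, and an array with no matching values gives back an empty result of shape (0, 2) instead of raising an error.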