Group numbers into bins based on offset with Python

Question:

I have a list like this:

ls = [0, 1, 2, 4, 6, 7]  # it won't have duplicates and it is sorted

Now I want to group this list into bins based on an offset (in this example offset=1), which should return this:

[[0, 1, 2], [4], [6, 7]]
# Note:
# 0 and 2 differ by more than the offset of 1,
# but each value is within 1 of its neighbour (0 and 1, 1 and 2), and this is what I want

Is there a high-level function in numpy, scipy, pandas, etc. which will provide my desired result?

Note: The returned data structure doesn't have to be a list; any type is welcome.

Asked By: jamesB

||

Answers:

Using pure python:

ls = [0, 1, 2, 4, 6, 7]

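def group(l, offset=1):
    if not l:  # an empty input has no groups (and l[0] would fail)
        return []
    out = []
    tmp = []
    prev = l[0]
    for val in l:
        # start a new bin when the gap to the previous value exceeds offset
        if val-prev > offset:
            out.append(tmp)
            tmp = []
        tmp.append(val)
        prev = val
    out.append(tmp)  # flush the last bin
    return out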

group(ls)
# [[0, 1, 2], [4], [6, 7]]

With pandas:

import pandas as pd

offset = 1

s = pd.Series(ls)
s.groupby(s.diff().gt(offset).cumsum()).agg(list)

output:

0    [0, 1, 2]
1          [4]
2       [6, 7]
dtype: object
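
To see why the one-liner works, it may help to break it into its intermediate steps. This is only an illustrative sketch: the names gaps, is_break and group_id are made up here and are not part of the answer above.

```python
import pandas as pd

ls = [0, 1, 2, 4, 6, 7]
offset = 1

s = pd.Series(ls)

# difference to the previous element; the first entry is NaN
gaps = s.diff()

# True wherever a new bin must start (NaN compares as False)
is_break = gaps.gt(offset)

# running count of breaks gives each element its bin label: 0, 0, 0, 1, 2, 2
group_id = is_break.cumsum()

# collect the values of each bin into a plain list of lists
groups = s.groupby(group_id).agg(list).tolist()
```

The trailing .tolist() is optional; it just converts the resulting Series into a nested Python list like the one in the question.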

With numpy:

import numpy as np

offset = 1

a = np.split(ls, np.nonzero(np.diff(ls)>offset)[0]+1)
# [array([0, 1, 2]), array([4]), array([6, 7])]
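
If plain lists are preferred over arrays, the same idea can be spelled out in two steps. This is a rough sketch; the names breaks and groups are illustrative only.

```python
import numpy as np

ls = [0, 1, 2, 4, 6, 7]
offset = 1

arr = np.asarray(ls)

# positions where the gap to the previous element exceeds offset;
# +1 because np.diff(arr)[i] compares arr[i+1] with arr[i]
breaks = np.nonzero(np.diff(arr) > offset)[0] + 1

# split at those positions and convert each chunk back to a list
groups = [chunk.tolist() for chunk in np.split(arr, breaks)]
```

Here breaks holds the indices at which each new bin begins, which is exactly what np.split expects as its second argument.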
Answered By: mozway