Group data by a tolerance

Question:

I have an ordered list

L = [301.148986835,
301.148986835,
301.148986835,
301.161562835,
301.161562835,
301.16156333500004,
301.167179835,
301.167179835,
301.167179835,
301.167179835,
301.167179835,
301.179755835,
301.179755835,
301.179755835,
301.646611835,
301.659187335,
301.659187335,
301.659187335,
301.659187335,
302.138619335,
302.142316335,
302.151194835,
302.1568118349999,
302.15681183500004,
302.15681183500004,
302.15681183500004,
302.156812335,
302.156812335,
302.156812335,
302.169387835,
302.169387835,
302.169387835,
302.169387835,
302.169387835,
302.169388335,
302.636243335,
302.636243835,
302.648819835,
302.648819835,
303.137565335,
303.140827335,
303.140827335,
303.146443835,
303.146443835,
303.146444335,
303.159019835,
303.159019835,
303.15901983500004,
303.159020335,
303.159020335,
303.15902033500004,
303.63283533500004,
303.638451335,
304.130459335,
304.130459335,
304.14370483499994,
304.14370483499994,
304.14370483499994,
304.148651835,
304.148652335,
304.148652335]

I want to group its values so that each group stays within a margin of ±0.5.

The expected output is:

 R = [[301.148986835,
  301.148986835,
  301.148986835,
  301.161562835,
  301.161562835,
  301.16156333500004,
  301.167179835,
  301.167179835,
  301.167179835,
  301.167179835,
  301.167179835,
  301.179755835,
  301.179755835,
  301.179755835,
  301.646611835,
  301.659187335,
  301.659187335,
  301.659187335,
  301.659187335,
  302.138619335],[302.142316335,
  302.151194835,
  302.1568118349999,
  302.15681183500004,
  302.15681183500004,
  302.15681183500004,
  302.156812335,
  302.156812335,
  302.156812335,
  302.169387835,
  302.169387835,
  302.169387835,
  302.169387835,
  302.169387835,
  302.169388335,
  302.636243335,
  302.636243835,
  302.648819835,
  302.648819835,
  303.137565335,
  303.140827335,
  303.140827335,
  303.146443835,
  303.146443835,
  303.146444335,
  303.159019835,
  303.159019835,
  303.15901983500004,
  303.159020335,
  303.159020335,
  303.15902033500004],
[303.63283533500004,
  303.638451335,
  304.130459335,
  304.130459335,
  304.14370483499994,
  304.14370483499994,
  304.14370483499994],[304.148651835,
  304.148652335,
  304.148652335]]

When I use this code (my question is not a duplicate):

def grouper(iterable):
    prev = None
    group = []
    for item in iterable:
        if prev is None or item - prev <= 1:
            group.append(item)
        else:
            yield group
            group = [item]
        prev = item
    if group:
        yield group

I get the whole list back as a single group.

calculate within a tolerance

Asked By: ZAHR

Answers:

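You update prev on every iteration, so each item is only ever compared with its immediate predecessor. Consecutive values in your list are never more than 1 apart, so the condition is always true and everything ends up in a single group. You want to update prev only when you start a new group.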
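
To make the chaining effect concrete, here is a small sketch that runs the question's logic on a made-up list (the name grouper_prev and the sample values are mine, not from the question). Neighbouring values are only 0.6 apart, yet the list as a whole spans 2.4:

```python
def grouper_prev(iterable):
    """The question's approach: compare each item with its immediate predecessor."""
    prev = None
    group = []
    for item in iterable:
        if prev is None or item - prev <= 1:
            group.append(item)
        else:
            yield group
            group = [item]
        prev = item  # updated on every iteration, so the reference point keeps sliding
    if group:
        yield group


# Hypothetical sample: neighbours are 0.6 apart, but the whole list spans 2.4.
sample = [1.0, 1.6, 2.2, 2.8, 3.4]
chained = list(grouper_prev(sample))
print(chained)  # [[1.0, 1.6, 2.2, 2.8, 3.4]]
```

Because no single step exceeds 1, the else branch never runs and the generator yields one group containing every value.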

Better yet, get rid of prev altogether and always compare against the first element of the group.

I’d also suggest including a tol argument so that the function is more flexible:

def grouper(iterable, tol=0.5):
    tol = abs(tol*2) # Since we're counting from the start of the group, multiply tol by 2
    group = []
    for item in iterable:
        if not group or item - group[0] <= tol:
            group.append(item)
        else:
            yield group
            group = [item]
    if group:
        yield group
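
As a quick check, here is the same function run on a short made-up list (the sample values are an assumption for illustration, not the question's data). The function is repeated, with the doubled tolerance stored under the illustrative name width, so that the snippet runs on its own:

```python
def grouper(iterable, tol=0.5):
    """Group sorted values so each group spans at most 2 * tol from its first item."""
    width = abs(tol * 2)  # compared against the group's first element
    group = []
    for item in iterable:
        if not group or item - group[0] <= width:
            group.append(item)
        else:
            yield group
            group = [item]
    if group:
        yield group


# Hypothetical sample, already sorted in ascending order.
sample = [1.0, 1.2, 1.9, 2.1, 2.5, 3.4, 3.5]
groups = list(grouper(sample, tol=0.5))
print(groups)  # [[1.0, 1.2, 1.9], [2.1, 2.5], [3.4, 3.5]]
```

Note that this relies on the input being sorted in ascending order, as the list in the question is. Where each group boundary falls depends on the first value of the group before it, so different starting values can shift the splits.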

Answered By: Pranav Hosangadi