Slicing a list into n nearly-equal-length partitions

Question:

I’m looking for a fast, clean, pythonic way to divide a list into exactly n nearly-equal partitions.

partition([1,2,3,4,5],5)->[[1],[2],[3],[4],[5]]
partition([1,2,3,4,5],2)->[[1,2],[3,4,5]] (or [[1,2,3],[4,5]])
partition([1,2,3,4,5],3)->[[1,2],[3,4],[5]] (there are other ways to slice this one too)

There are several answers in here Iteration over list slices that run very close to what I want, except they are focused on the size of the list, and I care about the number of the lists (some of them also pad with None). These are trivially converted, obviously, but I’m looking for a best practice.

Similarly, people have pointed out great solutions here How do you split a list into evenly sized chunks? for a very similar problem, but I’m more interested in the number of partitions than the specific size, as long as it’s within 1. Again, this is trivially convertible, but I’m looking for a best practice.

Asked By: Drew

||

Answers:

def partition(lst, n):
    division = len(lst) / float(n)
    return [ lst[int(round(division * i)): int(round(division * (i + 1)))] for i in xrange(n) ]

>>> partition([1,2,3,4,5],5)
[[1], [2], [3], [4], [5]]
>>> partition([1,2,3,4,5],2)
[[1, 2, 3], [4, 5]]
>>> partition([1,2,3,4,5],3)
[[1, 2], [3, 4], [5]]
>>> partition(range(105), 10)
[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10], [11, 12, 13, 14, 15, 16, 17, 18, 19, 20], [21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31], [32, 33, 34, 35, 36, 37, 38, 39, 40, 41], [42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52], [53, 54, 55, 56, 57, 58, 59, 60, 61, 62], [63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73], [74, 75, 76, 77, 78, 79, 80, 81, 82, 83], [84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94], [95, 96, 97, 98, 99, 100, 101, 102, 103, 104]]

Python 3 version:

def partition(lst, n):
    division = len(lst) / n
    return [lst[round(division * i):round(division * (i + 1))] for i in range(n)]
Answered By: João Silva

Below is one way.

def partition(lst, n):
    increment = len(lst) / float(n)
    last = 0
    i = 1
    results = []
    while last < len(lst):
        idx = int(round(increment * i))
        results.append(lst[last:idx])
        last = idx
        i += 1
    return results

If len(lst) cannot be evenly divided by n, this version will distribute the extra items at roughly equal intervals. For example:

>>> print [len(x) for x in partition(range(105), 10)]
[11, 10, 11, 10, 11, 10, 11, 10, 11, 10]

The code could be simpler if you don’t mind all of the 11s being at the beginning or the end.

Answered By: Daniel Stutzbach

Here’s a version that’s similar to Daniel’s: it divides as evenly as possible, but puts all the larger partitions at the start:

def partition(lst, n):
    q, r = divmod(len(lst), n)
    indices = [q*i + min(i, r) for i in xrange(n+1)]
    return [lst[indices[i]:indices[i+1]] for i in xrange(n)]

It also avoids the use of float arithmetic, since that always makes me uncomfortable. 🙂

Edit: an example, just to show the contrast with Daniel Stutzbach’s solution

>>> print [len(x) for x in partition(range(105), 10)]
[11, 11, 11, 11, 11, 10, 10, 10, 10, 10]
Answered By: Mark Dickinson

Just a different take, that only works if [[1,3,5],[2,4]] is an acceptable partition, in your example.

def partition ( lst, n ):
    return [ lst[i::n] for i in xrange(n) ]

This satisfies the example mentioned in @Daniel Stutzbach’s example:

partition(range(105),10)
# [[0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100],
# [1, 11, 21, 31, 41, 51, 61, 71, 81, 91, 101],
# [2, 12, 22, 32, 42, 52, 62, 72, 82, 92, 102],
# [3, 13, 23, 33, 43, 53, 63, 73, 83, 93, 103],
# [4, 14, 24, 34, 44, 54, 64, 74, 84, 94, 104],
# [5, 15, 25, 35, 45, 55, 65, 75, 85, 95],
# [6, 16, 26, 36, 46, 56, 66, 76, 86, 96],
# [7, 17, 27, 37, 47, 57, 67, 77, 87, 97],
# [8, 18, 28, 38, 48, 58, 68, 78, 88, 98],
# [9, 19, 29, 39, 49, 59, 69, 79, 89, 99]]
Answered By: Manuel Zubieta

This answer provides a function split(list_, n, max_ratio), for people
who want to split their list into n pieces with at most max_ratio
ratio in piece length. It allows for more variation than the
questioner’s ‘at most 1 difference in piece length’.

It works by sampling n piece lengths within the desired ratio range
[1 , max_ratio), placing them after each other to form a ‘broken
stick’ with the right distances between the ‘break points’ but the wrong
total length. Scaling the broken stick to the desired length gives us
the approximate positions of the break points we want. To get integer
break points requires subsequent rounding.

Unfortunately, the roundings can conspire to make pieces just too short,
and let you exceed the max_ratio. See the bottom of this answer for an
example.

import random

def splitting_points(length, n, max_ratio):
    """n+1 slice points [0, ..., length] for n random-sized slices.

    max_ratio is the largest allowable ratio between the largest and the
    smallest part.
    """
    ratios = [random.uniform(1, max_ratio) for _ in range(n)]
    normalized_ratios = [r / sum(ratios) for r in ratios]
    cumulative_ratios = [
        sum(normalized_ratios[0:i])
        for i in range(n+1)
    ]
    scaled_distances = [
        int(round(r * length))
        for r in cumulative_ratios
    ]

    return scaled_distances


def split(list_, n, max_ratio):
    """Slice a list into n randomly-sized parts.

    max_ratio is the largest allowable ratio between the largest and the
    smallest part.
    """

    points = splitting_points(len(list_), n, ratio)

    return [
        list_[ points[i] : points[i+1] ]
        for i in range(n)
    ]

You can try it out like so:

for _ in range(10):
    parts = split('abcdefghijklmnopqrstuvwxyz', 4, 2)
    print([(len(part), part) for part in parts])

Example of a bad result:

parts = split('abcdefghijklmnopqrstuvwxyz', 10, 2)

# lengths range from 1 to 4, not 2 to 4
[(3, 'abc'),  (3, 'def'), (1, 'g'),
 (4, 'hijk'), (3, 'lmn'), (2, 'op'),
 (2, 'qr'),  (3, 'stu'),  (2, 'vw'),
 (3, 'xyz')]
Answered By: Esteis
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.