Speed up set partition generation by skipping partitions with subsets smaller or larger than X, or with more or fewer than Y subsets

Question:

There is a great answer regarding the generation of set partitions:

Set partitions in Python

(a similar answer in Ruby is found here: Calculate set partitions with specific subset sizes, which also has a picture)

The problem is that it quickly gets slow as N increases; the slowdown becomes noticeable around N = 12 or 13.
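(For a sense of scale: the total number of set partitions of N items is the Bell number B(N), and B(12) is already over four million. A quick way to see the growth:)

def bell_numbers(n):
    """Yield B(0) .. B(n) using the Bell triangle."""
    row = [1]
    yield row[0]
    for _ in range(n):
        new_row = [row[-1]]  # next row starts with the previous row's last entry
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        yield row[0]

print(list(bell_numbers(13)))  # ends with B(12) = 4213597, B(13) = 27644437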

CONTEXT:

I need set partitions to fit a collection of images onto either a
portrait or landscape sheet of paper, reducing blank space as much as
possible.

I tried bin packing algorithms like answered here
How is 2D bin packing achieved programmatically?

but it generates bad solutions (I posted a comment explaining why, with a screenshot). Other bin
packing solutions are too complex.

A good solution I found is to scale sets of images so they have the
same height, put them in a row, and assemble several rows on a landscape or portrait layout.
Description here:

[illustration from the original post: images scaled to equal height, arranged in rows on the sheet]
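Roughly, the row-assembly step looks like this (a simplified sketch of the idea; the function names and grouping are illustrative, not my actual code):

def row_height(row_images, sheet_width):
    """Common height that makes a row of images exactly fill `sheet_width`.
    `row_images` is a list of (width, height) pairs."""
    # At a common height h, the row's width is h * sum(w / h0 for each image),
    # so solve h * unit_width = sheet_width for h.
    unit_width = sum(w / h for (w, h) in row_images)
    return sheet_width / unit_width

def sheet_height_used(partition, sheet_width):
    """Total height consumed when each subset of the partition becomes one row."""
    return sum(row_height(row, sheet_width) for row in partition)

# A candidate partition wastes little space if sheet_height_used() is close to
# (but not above) the sheet's height, and no row forces extreme up/down scaling.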

I filter out candidates that:

  • scale up or down images too much, since I want each image to have almost equal surface allocated

  • waste too much space

END OF CONTEXT

This technique works well for up to 10 or 11 images, but at 12 and beyond it’s too slow.

The set_partitions() function from more-itertools also works, but it has the same speed problem.
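For reference, what I do now amounts to generating every partition and filtering afterwards, roughly like this (X and Y below are placeholder thresholds):

from more_itertools import set_partitions

images = list(range(12))   # stand-ins for 12 images
X, Y = 2, 3                # placeholder: minimum subset size, minimum subset count

candidates = (p for p in set_partitions(images)
              if len(p) >= Y and all(len(subset) >= X for subset in p))

# The filter is cheap, but set_partitions still enumerates every partition
# before anything is discarded, which is why it gets slow around N = 12.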

A lot of partitions have subsets of size 1, or have too few large subsets. As you can see in the image, I don’t need singletons (subsets of size 1), and I need partitions with a minimum number of subsets (it’s not exactly the square root of N; it depends on whether the layout is landscape or portrait).

My question is this:

Is it possible to speed up set partition generation and directly skip:

  • set partitions with subsets of size SMALLER than X

AND/OR

  • set partitions with fewer than Y subsets

and eventually:

  • set partitions with subsets of size LARGER than X

or eventually

  • set partitions with MORE than Y subsets

Those X and Y can vary since I want to generate those set partitions for different values of N.

EDIT: I also found this: https://codereview.stackexchange.com/questions/1526/finding-all-k-subset-partitions

This seems to be somewhat fast.

Asked By: jokoon


Answers:

Just use recursion.

Note, the arrays returned each time are modified in place for efficiency. If you’re going to do anything interesting with them, you probably want to make your own copies.

def split_partition (elements, min_size, max_size, partition=None, rest=None, i = 0):
    if partition is None:
        partition = []
    if rest is None:
        rest = []

    # The first element always goes in.
    if i == 0:
        partition.append(elements[i])
        yield from split_partition(elements, min_size, max_size, partition, rest, i+1)
    elif len(partition) + len(elements) - i < min_size:
        pass # Not enough elements remain to bring the partition up to min_size.
    elif max_size == len(partition):
        for j in range(i, len(elements)):
            rest.append(elements[j])
        yield (partition, rest) # Just the one solution.
        for j in range(i, len(elements)):
            rest.pop()
    elif i == len(elements):
        yield (partition, rest) # Just the one solution.
    else:
        partition.append(elements[i])
        yield from split_partition(elements, min_size, max_size, partition, rest, i+1)
        rest.append(partition.pop())
        yield from split_partition(elements, min_size, max_size, partition, rest, i+1)
        rest.pop()

def desired_partitions (elements, min_size, max_size, min_count, max_count):
    if len(elements) < min_size * min_count:
        pass # No solutions possible.
    elif max_size * max_count < len(elements):
        pass # No solutions possible.
    else:
        for partition, rest in split_partition(elements, min_size, max_size):
            if 0 == len(rest):
                yield [partition]
            else:
                for split in desired_partitions(rest, min_size, max_size, min_count - 1, max_count - 1):
                    split.append(partition)
                    yield split
                    split.pop()


for split in desired_partitions([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 3, 5, 2, 3):
    print(split)
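For example, to make independent copies of each yielded split (as the note above suggests), something like this works:

splits = [[subset[:] for subset in split]
          for split in desired_partitions([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 3, 5, 2, 3)]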
Answered By: btilly

Here is a faster approach. It is still using recursion to figure out the sizes of the partitions. But for the heavy lifting of producing the partitions it avoids recursive calls, unnecessary checks, and excessive copying of data. In my testing that makes it about 3x as fast.

I’m not sure that a factor of 3 will help too much with the combinatorial explosion though.

def desired_partition_sizes (element_count, min_size, max_size, min_count, max_count):
    if max_size * max_count < element_count:
        pass # No solutions.
    elif element_count < min_size * min_count:
        pass # No solutions.
    elif max_count == 1 and 0 < element_count:
        if min_size <= element_count:
            yield [element_count]
    elif 0 == element_count and 0 == min_count:
        yield []
    else:
        for i in range(min_size, max_size+1):
            for solution in desired_partition_sizes(
                    element_count-i, min_size, max_size,
                    max(min_count-1, 0), max_count-1):
                solution.append(i)
                yield solution

# The first partition contains the first element and has the first size
# The second partition contains the first element not in the first
#     partition and has the second size.
# The third partition contains the first element not in the first 2
#     partitions and has the third size.
# etc
def partitions_fitting_desired_sizes (elements, sizes):
    partitions = [[] for _ in sizes]
    assigned = []
    while True:
        # Attempt to assign.
        i = len(assigned)
        j = 0
        while i < len(elements):
            while sizes[j] <= len(partitions[j]):
                j += 1

            partitions[j].append(elements[i])
            assigned.append(j)
            i += 1

        yield partitions

        # Unassign the last element.
        partitions[ assigned.pop() ].pop()

        # Unassign until an element can be moved to a later partition.
        while 0 < len(assigned):
            k = assigned.pop()
            partitions[k].pop()
            # Remember: the k'th partition starts with the first element not in any earlier partition.
            found = False
            if 0 < len(partitions[k]):
                for j in range(k+1, len(sizes)):
                    if len(partitions[j]) < sizes[j]:
                        found = True
                        partitions[j].append(elements[len(assigned)])
                        assigned.append(j)
                        break # Inner loop.
            if found: # Exit the unassign loop here (we moved an element to a later partition).
                break
        if 0 == len(assigned):
            break # All done.

def desired_partitions (elements, min_size, max_size, min_count, max_count):
    for sizes in desired_partition_sizes(len(elements), min_size, max_size, min_count, max_count):
        yield from partitions_fitting_desired_sizes(elements, sizes)

for partitions in desired_partitions([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 3, 5, 2, 3):
    print(partitions)
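As a sanity check against brute force (assuming more-itertools is available), you can compare order-independent representations of the two outputs:

from more_itertools import set_partitions

def canon(partition):
    # Order-independent form of one set partition.
    return frozenset(frozenset(subset) for subset in partition)

elements = list(range(1, 11))
fast = {canon(p) for p in desired_partitions(elements, 3, 5, 2, 3)}
slow = {canon(p) for p in set_partitions(elements)
        if 2 <= len(p) <= 3 and all(3 <= len(s) <= 5 for s in p)}
print(len(fast), len(slow), fast == slow)  # counts and contents should agree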
Answered By: btilly

This is an adaptation of my answer to the basic partition problem here. The idea here is that if all subsets should have a minimum size of 2, recursive partitions of a smaller set should include the sets of size 1, since these can be brought up to size as the recursion unwinds. The number of missing elements that can be forgiven is the number of elements still to be added.

So we can avoid repetition and also trim at every level, which lets the number of cases grow less steeply (but still exponentially, of course). It runs pretty fast, I’d say. If you do any timing comparisons, please report what you find out.

def partition_filtered(collection, minsize=1, forgive=0):
    """
    Generate partitions that contain at least `minsize` elements per set;
    allow `forgive` missing elements, which can get added in subsequent steps
    """
    if len(collection) == 1:
        yield [ collection ]
        return

    first = collection[0]
    for smaller in partition_filtered(collection[1:], minsize, forgive=forgive+1):
        # insert `first` in each of the subpartition's subsets
        for n, subset in enumerate(smaller):
            candidate = smaller[:n] + [[ first ] + subset]  + smaller[n+1:]
            if conforms(candidate, minsize, forgive):
                yield candidate

        # put `first` in its own subset
        candidate = [ [ first ] ] + smaller
        if conforms(candidate, minsize, forgive):
            yield candidate

def conforms(candidate, minsize, forgive):
    """
    Check if partition `candidate` is at most `forgive` additions from making
    all its elements conform to having minimum size `minsize`
    """
    deficit = 0
    for p in candidate:
        need = minsize - len(p)
        if need > 0:
            deficit += need

    # Is the deficit small enough?
    return (deficit <= forgive)

something = list(range(1,13)) # [1, ... 12]
for n, p in enumerate(partition_filtered(something, minsize=2), 1):
    print(n, sorted(p))

# Alternately, save the list and check it for correctness
# clean_partitions = list(partition_filtered(something, minsize=2))
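This generator only enforces a minimum subset size; if you also need to bound the number of subsets (the Y in the question), filtering its output is straightforward, e.g.:

# Placeholder bounds on the number of subsets.
Y_min, Y_max = 3, 6
bounded = (p for p in partition_filtered(something, minsize=2)
           if Y_min <= len(p) <= Y_max)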
Answered By: alexis