Find all possible sums of the combinations of integers from a set, efficiently

Question:

Given an integer n, and an array a of x random positive integers, I would like to find all possible sums of the combinations with replacement (n out of x) that can be drawn from this array.

For example:

n = 2, a = [2, 3, 4, 6]
all combinations = [2+2, 2+3, 2+4, 2+6, 3+3, 3+4, 3+6, 4+4, 4+6, 6+6]
all unique sums of these combinations = {4, 5, 6, 7, 8, 9, 10, 12}

This can of course easily be solved by enumerating and summing all possible combinations, for example in Python:

from itertools import combinations_with_replacement

n = 2
a = [2, 3, 4, 6]

# Enumerate every combination with replacement of size n and collect the distinct sums.
print({sum(comb) for comb in combinations_with_replacement(a, n)})

Is there a more efficient way to do this? I have to do this for n up to 4 and arrays of up to 1000 values, which gives about 4e10 combinations. The number of unique sums will be several orders of magnitude smaller for arrays whose integers aren’t too far apart, so I would guess there must be a more efficient way.

For example, when n=3 and a is the set of the first 1000 even numbers, there will be only 2998 unique sums out of about 1.67E8 possible combinations.
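As a sanity check on these figures, here is a minimal sketch using math.comb (Python 3.8+). The helper name multiset_count is only illustrative, and the count of unique sums assumes that every even number between the smallest and largest possible sum is reachable, which holds for the first 1000 even numbers.

```python
from math import comb

def multiset_count(x, n):
    """Number of size-n combinations with replacement from x distinct items."""
    return comb(x + n - 1, n)

print(multiset_count(1000, 4))  # n = 4 with 1000 values, roughly 4e10
print(multiset_count(1000, 3))  # n = 3 with 1000 values, roughly 1.67e8

# Sums of three of the first 1000 even numbers are the even numbers 6..6000.
smallest, largest = 3 * 2, 3 * 2000
unique_sums = (largest - smallest) // 2 + 1
print(unique_sums)  # 2998
```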

** Original question was updated to state that integers are only positive

Asked By: jonas87


Answers:

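# Start from the empty sum, then extend every partial sum by one more element, n times.
# The set removes duplicate partial sums after each round.
sums = {0}
for _ in range(n):
    sums = {s + x for s in sums for x in a}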
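To try the set-based loop on the example from the question, it can be wrapped in a function and compared against the brute-force enumeration. This is just a sketch; the function name all_sums is made up for illustration.

```python
from itertools import combinations_with_replacement

def all_sums(a, n):
    """All distinct sums of n elements of a, repetition allowed."""
    sums = {0}
    for _ in range(n):
        # Only distinct partial sums survive each round, which keeps the work small.
        sums = {s + x for s in sums for x in a}
    return sums

a = [2, 3, 4, 6]
brute_force = {sum(c) for c in combinations_with_replacement(a, 2)}
assert all_sums(a, 2) == brute_force
print(sorted(all_sums(a, 2)))  # [4, 5, 6, 7, 8, 9, 10, 12]
```

Each round costs about len(sums) * len(a) additions, so the total work depends on the number of distinct partial sums rather than on the number of combinations.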

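Or using a bitset, where bit i of a Python int is set exactly when i is a reachable sum (assumes your numbers are non-negative):

sums = 1  # only bit 0 is set: the empty sum
for _ in range(n):
    new = 0
    for x in a:
        new |= sums << x  # shifting by x adds x to every sum so far
    sums = new
# Read off the set bits; the characters of the '0b' prefix are never '1', so they are skipped.
sums = {i for i, bit in enumerate(reversed(bin(sums))) if bit == '1'}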

Which one is faster depends on the density of your numbers.
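To make the bitset encoding more concrete, here is a small sketch on the question's example. The helper names bitset_sums and bits_to_set are made up for illustration, and bits_to_set is a deliberately simple decoder rather than a fast one.

```python
def bitset_sums(a, n):
    """Return an int whose bit i is set iff i is a sum of n elements of a."""
    sums = 1  # bit 0 set: the empty sum
    for _ in range(n):
        new = 0
        for x in a:
            new |= sums << x
        sums = new
    return sums

def bits_to_set(bits):
    """Collect the positions of the set bits of a non-negative int."""
    result = set()
    i = 0
    while bits:
        if bits & 1:
            result.add(i)
        bits >>= 1
        i += 1
    return result

one_round = bitset_sums([2, 3, 4, 6], 1)
print(bin(one_round))  # 0b1011100, i.e. bits 2, 3, 4 and 6 are set

two_rounds = bitset_sums([2, 3, 4, 6], 2)
print(sorted(bits_to_set(two_rounds)))  # [4, 5, 6, 7, 8, 9, 10, 12]
```

The integer is as long as the largest sum, so this pays off when many of the values up to that maximum really occur as sums.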

A further optimization of the second solution, which can also handle negative numbers (it translates all values so that the smallest becomes 0, which also keeps the integers shorter):

sums = 1
minimum = min(a)
b = [x - minimum for x in a]  # shifted values, all >= 0
for _ in range(n):
    new = 0
    for x in b:
        new |= sums << x
    sums = new
# Each sum of n shifted values is n*minimum below the real sum, so add that back.
sums = {
    i + n*minimum
    for i, bit in enumerate(reversed(bin(sums)))
    if bit == '1'
}
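A quick check of the translated version on input that includes negative numbers, compared against the brute-force enumeration. This is only a sketch; the function name shifted_bitset_sums and the sample values are made up for illustration.

```python
from itertools import combinations_with_replacement

def shifted_bitset_sums(a, n):
    """Bitset approach after shifting all values so the smallest becomes 0."""
    minimum = min(a)
    sums = 1
    for _ in range(n):
        new = 0
        for x in a:
            new |= sums << (x - minimum)  # shift amount is never negative
        sums = new
    # Undo the shift: n elements were each lowered by minimum.
    return {
        i + n * minimum
        for i, bit in enumerate(reversed(bin(sums)))
        if bit == '1'
    }

a = [-3, 0, 2, 5]
n = 3
brute = {sum(c) for c in combinations_with_replacement(a, n)}
assert shifted_bitset_sums(a, n) == brute
print(min(brute), max(brute))  # -9 15
```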

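Yet another, intended for dense sets. It tests every candidate sum in the possible range and stops at the first element that confirms it:

minimum = min(a)
maximum = max(a)
sums = set(a) if n else {0}
for i in range(2, n + 1):
    new = set()
    # Every sum of i elements lies between minimum*i and maximum*i.
    for s in range(minimum * i, maximum * i + 1):
        for x in a:
            # s is reachable if removing some x leaves a sum of i-1 elements.
            if s - x in sums:
                new.add(s)
                break
    sums = new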
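Here is the same idea as a function, checked against the brute-force enumeration on a small, mostly even input. It is only a sketch: the name dense_sums and the sample data are made up, and any() is used in place of the inner loop with break (it also stops at the first hit).

```python
from itertools import combinations_with_replacement

def dense_sums(a, n):
    """Test each candidate sum in range instead of generating sums."""
    minimum, maximum = min(a), max(a)
    sums = set(a) if n else {0}
    for i in range(2, n + 1):
        new = set()
        for s in range(minimum * i, maximum * i + 1):
            # any() short-circuits, like the break in the loop version.
            if any(s - x in sums for x in a):
                new.add(s)
        sums = new
    return sums

a = list(range(2, 41, 2)) + [7, 33]  # 20 even numbers plus two odd ones
n = 3
brute = {sum(c) for c in combinations_with_replacement(a, n)}
assert dense_sums(a, n) == brute
print(len(brute))
```

When most candidates are reachable, the inner search tends to stop early; for sparse inputs it scans all of a for every unreachable candidate, which is why this variant is aimed at dense sets.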

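One for large n, working like exponentiation by squaring (note that it overwrites both n and a):

sums = {0}
while n:
    if n % 2:
        # The current bit of n is set: add one element of the current a to every sum.
        sums = {s + x for s in sums for x in a}
    n //= 2
    if n:
        # Square: a now holds all sums of twice as many original elements.
        a = {x + y for x in a for y in a}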
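A sketch of the squaring variant as a function that leaves the caller's list untouched, checked against the simple iterative version for several n. The names sums_by_squaring, sums_iterative and base are made up for illustration.

```python
def sums_by_squaring(a, n):
    """All sums of n elements of a, following the binary digits of n."""
    base = set(a)  # all sums of 2**k original elements, starting with k = 0
    sums = {0}
    while n:
        if n % 2:
            sums = {s + x for s in sums for x in base}
        n //= 2
        if n:
            base = {x + y for x in base for y in base}
    return sums

def sums_iterative(a, n):
    """Reference version: add one element per round."""
    sums = {0}
    for _ in range(n):
        sums = {s + x for s in sums for x in a}
    return sums

a = [2, 3, 4, 6]
for n in range(7):
    assert sums_by_squaring(a, n) == sums_iterative(a, n)
print(sorted(sums_by_squaring(a, 2)))  # [4, 5, 6, 7, 8, 9, 10, 12]
```

The number of rounds grows with the number of binary digits of n instead of n itself, but each squaring step pairs up every element of an already enlarged set, so it only helps once n is large.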

Benchmark based on your description "let’s say that the first thousand positive even numbers is realistic. Though there will be some uneven mixed in as well" (I used 42 odd numbers for "some"), with n=4 (since you said "I don’t think I’ll ever need more than n=4"):

   2.14 ±  0.03 ms  Kelly2
   2.44 ±  0.04 ms  Kelly3
  45.10 ±  4.05 ms  Kelly4b
 208.19 ±  2.22 ms  Kelly4
 807.97 ±  4.79 ms  Kelly1
1216.60 ± 19.22 ms  Kelly5

Benchmark code (Attempt This Online!):

def Kelly1(a, n):
    sums = {0}
    for _ in range(n):
        sums = {s + x for s in sums for x in a}
    return sums

def Kelly2(a, n):
    sums = 1
    for _ in range(n):
        new = 0
        for x in a:
            new |= sums << x
        sums = new
    return {i for i, bit in enumerate(reversed(bin(sums))) if bit == '1'}

def Kelly3(a, n):
    sums = 1
    minimum = min(a)
    b = [x - minimum for x in a]
    for _ in range(n):
        new = 0
        for x in b:
            new |= sums << x
        sums = new
    return {
        i + n*minimum
        for i, bit in enumerate(reversed(bin(sums)))
        if bit == '1'
    }

def Kelly4(a, n):
    minimum = min(a)
    maximum = max(a)
    sums = set(a) if n else {0}
    for i in range(2, n + 1):
        new = set()
        for s in range(minimum * i, maximum * i + 1):
            for x in a:
                if s - x in sums:
                    new.add(s)
                    break
        sums = new
    return sums

# optimization: try subtracting odd numbers first
def Kelly4b(a, n):
    minimum = min(a)
    maximum = max(a)
    sums = set(a) if n else {0}
    a = sorted(a, key=lambda x: -(x % 2))
    for i in range(2, n + 1):
        new = set()
        for s in range(minimum * i, maximum * i + 1):
            for x in a:
                if s - x in sums:
                    new.add(s)
                    break
        sums = new
    return sums

def Kelly5(a, n):
    sums = {0}
    while n:
        if n % 2:
            sums = {s + x for s in sums for x in a}
        n //= 2
        if n:
            a = {x + y for x in a for y in a}
    return sums

funcs = Kelly1, Kelly2, Kelly3, Kelly4, Kelly4b, Kelly5

from random import sample
from statistics import mean, stdev
from time import perf_counter as time

# Correctness
for n in range(11):
    a = sample(range(1, 10**4), 10)
    expect = funcs[0](a, n)
    for f in funcs:
        result = f(a, n)
        assert result == expect

# Speed
times = {f: [] for f in funcs}
def stats(f):
    ts = [t * 1e3 for t in sorted(times[f])[:5]]
    return f'{mean(ts):7.2f} ± {stdev(ts):5.2f} ms '

for _ in range(15):
    evens = list(range(2, 2001, 2))
    odds = sample(range(1, 2000, 2), 42)
    a = sorted(evens + odds)
    n = 4
    for f in funcs:
        t0 = time()
        result = f(a, n)
        times[f].append(time() - t0)
        del result

for f in sorted(funcs, key=stats):
    print(stats(f), f.__name__)
Answered By: Kelly Bundy