Is there an efficient way to determine if a sum of floats will be order invariant?
Question:
Due to precision limitations in floating point numbers, the order in which numbers are summed can affect the result.
>>> 0.3 + 0.4 + 2.8
3.5
>>> 2.8 + 0.4 + 0.3
3.4999999999999996
This small error can become a bigger problem if the results are then rounded.
>>> round(0.3 + 0.4 + 2.8)
4
>>> round(2.8 + 0.4 + 0.3)
3
I would like to generate a list of random floats such that their rounded sum does not depend on the order in which the numbers are summed. My current brute force approach is O(n!). Is there a more efficient method?
import random
import itertools
import math

def gen_sum_safe_seq(func, length: int, precision: int) -> list[float]:
    """
    Return a list of floats that has the same sum when rounded to the given
    precision regardless of the order in which its values are summed.
    """
    invalid = True
    while invalid:
        invalid = False
        nums = [func() for _ in range(length)]
        first_sum = round(sum(nums), precision)
        for p in itertools.permutations(nums):
            if round(sum(p), precision) != first_sum:
                invalid = True
                print(f"rejected {nums}")
                break
    return nums

for _ in range(3):
    nums = gen_sum_safe_seq(
        func=lambda: round(random.gauss(3, 0.5), 3),
        length=10,
        precision=2,
    )
    print(f"{nums} sum={sum(nums)}")
For context, as part of a programming exercise I’m providing ~1000 entry-level programming students with a list of floats that models a measured value over time. They will sum the values in a variety of ways. Provided that their code is correct, I’d like for them all to get the same result, to simplify checking their code. I do not want to introduce the complexities of floating-point representation to students at this level.
Answers:
You are asking for a numeric error analysis; there is a rich literature on this. In your example you found the relative error was unacceptably large. Plus, you’re cramming infinitely repeating binary fractions into a 53-bit mantissa, with predictable truncation issues. Adding numbers of different magnitudes tends to cause trouble. Here, 2.8 is more than 8x 0.3, so we risk losing three bits of precision.
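A quick way to see those three bits (a sketch using math.frexp, which splits a float into its mantissa and binary exponent):

import math

# 2.8 = 0.7 * 2**2 and 0.3 = 0.6 * 2**-1: the binary exponents differ
# by 3, so aligning the operands shifts three low-order bits away.
print(math.frexp(2.8))  # (0.7, 2)
print(math.frexp(0.3))  # (0.6, -1)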
You’re making this problem much too hard. Simply use decimal values. Or do the equivalent: scale your random floats by some large number, perhaps 1e9, and truncate to integer. Now you’re summing integers, with no repeating fractional digits, so we’re back to being able to rely on commutativity and associativity. Remember to scale sums appropriately when reporting the results.

What you want is "math". What you’ll get is a "machine representation". So choose a representation that is a good fit for your use case.
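For concreteness, a minimal sketch of that scale-and-truncate idea, reusing the question’s Gaussian generator (the 1e9 factor is just the example suggested above):

import random

SCALE = 10 ** 9  # large enough to preserve every digit we care about

# Truncate to integers once, up front; integer addition is exact,
# so every summation order produces the same total.
nums = [int(random.gauss(3, 0.5) * SCALE) for _ in range(10)]
total = sum(nums) / SCALE  # scale back only when reporting
print(total)

The single division at the end incurs at most one rounding, which is identical no matter how the integers were summed.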
EDIT
Oh, the educational context is illuminating. Avoiding negative round-off errors is crucial. Simply add epsilon = 2 ** -50 to all those round(..., 3) figures being summed. With epsilon big enough to turn even the negative rounding errors into positive terms, for N numbers with an average of mean, your FP sum will be approximately N * mean + N * epsilon, and then the final rounding operation trims the accumulated error. We exploit the facts that input values are in a defined range and N is small, so lots of zero bits separate those two terms.

The naive sum of three-digit quantities is d1 + err1 + d2 + err2 + ... + dN + errN, where the errors are + or - rounding errors that come from truncating a repeating fraction at 53 bits. Separating them gives d1 + ... + dN + N * random_var_with_zero_mean. I am proposing d1 + err1 + eps + d2 + err2 + eps + ..., which is d1 + ... + dN + N * eps + small_random_error. In particular, eps ensures that we only add positive errors as we accumulate, and by "small" I mean small_random_error < N * eps.
from itertools import permutations
import random

eps = 2 ** -50  # -52 suffices, but -50 is more obvious during debugging
nums = [round(random.gauss(3, 0.5), 3) + eps for _ in range(10)]
print(expected := round(sum(nums), 3))
for perm in permutations(nums):
    assert round(sum(perm), 3) == expected
Assume positive values, e.g. uniform draws from the unit interval. Then standard Numeric Analysis advice is to first order the values from smallest to largest, and then sum them. If your students will distribute the summation over K hosts, they should walk the sorted values with stride of K. If we don’t need very tight error bounds, then histogram estimates can save a big-Oh log N factor, or can even let you begin computations after a "taste the prefix" operation which takes constant time.
Not that I know of, but a practical approach is to use math.fsum() instead. While some platforms are perverse nearly beyond repair, on most platforms fsum() returns the infinitely-precise result subject to a single rounding error at the end. Which means the final result is independent of the order in which elements are given. For example,
>>> from math import fsum
>>> from itertools import permutations
>>> for p in permutations([0.3, 0.4, 2.8]):
... print(p, fsum(p))
(0.3, 0.4, 2.8) 3.5
(0.3, 2.8, 0.4) 3.5
(0.4, 0.3, 2.8) 3.5
(0.4, 2.8, 0.3) 3.5
(2.8, 0.3, 0.4) 3.5
(2.8, 0.4, 0.3) 3.5
Python’s fsum() docs go on to point to slower ways that are more robust against perverse platform quirks.
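One fully portable (if slower) route in that spirit, though not necessarily the one the docs have in mind, is exact rational arithmetic with fractions.Fraction; a minimal sketch:

from fractions import Fraction
from itertools import permutations

nums = [0.3, 0.4, 2.8]
# Every float converts to a Fraction exactly, and rational addition is
# exact, so every summation order gives the identical result.
for p in permutations(nums):
    assert sum(map(Fraction, p)) == sum(map(Fraction, nums))
print(float(sum(map(Fraction, nums))))  # one rounding, at the very end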
Arguably silly
Here’s another approach: fiddle the numbers you generate, clearing enough low-order bits so that no rounding of any kind is ever needed no matter how an addition tree is arranged. I haven’t thought hard about this – it’s not worth the effort 😉 For a start, I haven’t thought about negative inputs at all.
def crunch(xs):
    from math import floor, ulp, ldexp
    if any(x < 0.0 for x in xs):
        raise ValueError("all elements must be >= 0.0")
    target_ulp = ldexp(ulp(max(xs)), len(xs).bit_length())
    return [floor(x / target_ulp) * target_ulp
            for x in xs]
Then, e.g.,
>>> xs = crunch([0.3, 0.4, 2.8])
>>> for x in xs:
... print(x, x.hex())
0.29999999999999893 0x1.3333333333320p-2
0.3999999999999986 0x1.9999999999980p-2
2.799999999999999 0x1.6666666666664p+1
The decimal values look like a mess, but from the hex values you can see that the binary values reliably have enough low-order 0 bits to absorb any shifts that may be needed during a sum. The order of summation makes no difference then:
>>> for p in permutations(xs):
... print(p, sum(p))
(0.29999999999999893, 0.3999999999999986, 2.799999999999999) 3.4999999999999964
(0.29999999999999893, 2.799999999999999, 0.3999999999999986) 3.4999999999999964
(0.3999999999999986, 0.29999999999999893, 2.799999999999999) 3.4999999999999964
(0.3999999999999986, 2.799999999999999, 0.29999999999999893) 3.4999999999999964
(2.799999999999999, 0.29999999999999893, 0.3999999999999986) 3.4999999999999964
(2.799999999999999, 0.3999999999999986, 0.29999999999999893) 3.4999999999999964
and
>>> import random, math
>>> xs = [random.random() * 1e3 for i in range(100_000)]
>>> sum(xs)
49872035.43787267
>>> math.fsum(xs) # different
49872035.43787304
>>> sum(sorted(xs, reverse=True)) # and different again
49872035.43787266
>>> ys = crunch(xs) # now fiddle the numbers
>>> sum(ys) # and all three ways are the same
49872035.43712826
>>> math.fsum(ys)
49872035.43712826
>>> sum(sorted(ys, reverse=True))
49872035.43712826
The good news is that this is obviously linear-time in the number of inputs. The bad news is that the higher the dynamic range across the inputs, and the more inputs there are, the more trailing bits have to be thrown away.
Faster (0.3 seconds instead of your 8 seconds for length 10, and 3.4 seconds for length 12), and it considers more ways to sum (not just linear like ((a+b)+c)+d, but also divide&conquer summation like (a+b)+(c+d)).

The core part is the sums function, which computes all possible sums. First it enumerates the numbers, so it can use sets without losing duplicate numbers. Then its inner helper sums does the actual work. It tries all possible splits of the given numbers into a left subset and a right subset, computes all possible sums for each, and combines them.
import random
import itertools
import math
import functools

def sums(nums):
    @functools.cache
    def sums(nums):
        if len(nums) == 1:
            [num] = nums
            return {num[1]}
        result = set()
        for k in range(1, len(nums)):
            for left in map(frozenset, itertools.combinations(nums, k)):
                right = nums - left
                left_sums = sums(left)
                right_sums = sums(right)
                for L in left_sums:
                    for R in right_sums:
                        result.add(L + R)
        return result
    return sums(frozenset(enumerate(nums)))

def gen_sum_safe_seq(func, length: int, precision: int) -> list[float]:
    """
    Return a list of floats that has the same sum when rounded to the given
    precision regardless of the order in which its values are summed.
    """
    while True:
        nums = [func() for _ in range(length)]
        rounded_sums = {
            round(s, precision)
            for s in sums(nums)
        }
        if len(rounded_sums) == 1:
            return nums
        print(f"rejected {nums}")

for _ in range(3):
    nums = gen_sum_safe_seq(
        func=lambda: round(random.gauss(3, 0.5), 3),
        length=10,
        precision=2,
    )
    print(f"{nums} sum={sum(nums)}")
The easiest way is to create random integers, and then divide (or multiply) them all by the same power of 2.
As long as the sum of the absolute values of the original integers fits into 52 bits, then you can add the resulting floats without any rounding errors.
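A minimal sketch of this, with the 2 ** 20 divisor and the 12-bit integer range as arbitrary illustrative choices:

import random

SCALE = 2 ** 20  # dividing by a power of 2 only shifts the exponent

# Ten 12-bit integers sum to far fewer than 52 bits, so every
# intermediate float addition below is exact.
ints = [random.randrange(1, 4096) for _ in range(10)]
nums = [i / SCALE for i in ints]

# All summation orders yield bit-identical results.
assert sum(nums) == sum(reversed(nums)) == sum(sorted(nums))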
Faster variant of my first answer, this one only checking linear summation like ((a+b)+c)+d like you do yourself, not also divide&conquer summation like (a+b)+(c+d). Takes me 0.03 seconds for length 10 and 0.12 seconds for length 12.
import random
import itertools
import math
import functools

def sums(nums):
    @functools.cache
    def sums(nums):
        if len(nums) == 1:
            [num] = nums
            return {num[1]}
        result = set()
        for last in nums:
            before = nums - {last}
            before_sums = sums(before)
            _, R = last
            for L in before_sums:
                result.add(L + R)
        return result
    return sums(frozenset(enumerate(nums)))

def gen_sum_safe_seq(func, length: int, precision: int) -> list[float]:
    """
    Return a list of floats that has the same sum when rounded to the given
    precision regardless of the order in which its values are summed.
    """
    while True:
        nums = [func() for _ in range(length)]
        rounded_sums = {
            round(s, precision)
            for s in sums(nums)
        }
        if len(rounded_sums) == 1:
            return nums
        print(f"rejected {nums}")

for _ in range(3):
    nums = gen_sum_safe_seq(
        func=lambda: round(random.gauss(3, 0.5), 3),
        length=10,
        precision=2,
    )
    print(f"{nums} sum={sum(nums)}")