Can I naively check if a/b == c/d?

Question:

I was doing leetcode when I had to do some arithmetic with rational numbers (both numerator and denominator integers).

I needed to count slopes in a list. In Python,

collections.Counter( [ x/y if y != 0 else "inf" for (x,y) in points ] )

did the job, and I passed all the tests with it. (Edit: it was pointed out in the comments that the numbers in that exercise were much smaller, not general 32-bit integers.)

I wonder if this is correct, that is, whether Python correctly recognizes if a/b == c/d as rationals, for 32-bit integers a, b, c, d. I am also interested in the case for C++, and in any additional facts that may be useful (footguns, best practices, the theory behind it if not too long, etc.).

Also, this question seems frequent and useful, but I can’t really find anything about it (give me the duplicates!). Maybe I am missing some important keywords?

Asked By: bmacho


Answers:

It’s not safe, and I’ve seen at least one LeetCode problem where you’d fail with that (maybe Max Points on a Line). Example:

a = 94911150
b = 94911151
c = 94911151
d = 94911152
print(a/b == c/d)
print(a/b)
print(c/d)

Both a/b and c/d are the same float value even though the slopes actually differ:

True
0.9999999894638303
0.9999999894638303

You could use fractions.Fraction(x, y) or the tuple (x//g, y//g) after g = math.gcd(x, y) (if I remember correctly, this is more lightweight/efficient than the Fraction class).
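A minimal sketch of both suggestions, assuming points is a list of integer (x, y) pairs as in the question. The helper name slope_key and the sample data are made up for illustration. The sign is moved into the numerator so that (1, -2) and (-1, 2) get the same key, and a zero denominator maps to "inf" as in the question's code:

```python
from collections import Counter
from fractions import Fraction
from math import gcd


def slope_key(x, y):
    """Exact, hashable key for the slope x/y (hypothetical helper)."""
    if y == 0:
        return "inf"  # same convention as the question's code
    g = gcd(x, y)  # positive here, because y != 0
    if y < 0:
        g = -g  # dividing by a negative g moves the sign into the numerator
    return (x // g, y // g)  # exact divisions, so // loses nothing


# Illustrative data: the first two slopes differ but collide as floats.
points = [(94911150, 94911151), (94911151, 94911152), (2, 4), (-1, -2), (3, 0)]

by_float = Counter(x / y if y != 0 else "inf" for x, y in points)
by_tuple = Counter(slope_key(x, y) for x, y in points)
by_fraction = Counter(Fraction(x, y) if y != 0 else "inf" for x, y in points)

print(len(by_float), len(by_tuple), len(by_fraction))
```

With this data the float-based counter merges the first two slopes into one key, while the tuple and Fraction counters keep them apart.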

Answered By: Kelly Bundy

Assuming you don’t want to allow for the effects of integer division, check the equivalent ad == bc instead.

This is more numerically stable. In C++ you can write

1LL * a * d == 1LL * b * c

to prevent overflow.
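For comparison, here is a sketch of the same cross-multiplication check in Python, where integers have arbitrary precision and no widening is needed. The function name is illustrative, and the sketch assumes b and d are non-zero:

```python
def same_fraction(a, b, c, d):
    """Return True if a/b and c/d are equal as rationals (b and d non-zero)."""
    # For non-zero b and d, a/b == c/d exactly when a*d == b*c.
    return a * d == b * c


print(same_fraction(1, 2, 2, 4))
print(same_fraction(94911150, 94911151, 94911151, 94911152))
```

For 32-bit inputs each product has magnitude at most 2**62, which is why widening to a 64-bit type is enough in the C++ version.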

Answered By: Bathsheba

tl;dr: If max(|a|, |b|, |c|, |d|) ≤ 67114657, then you’re safe: under that restriction, if a/b and c/d are equal as IEEE 754 binary64 floats, then they’re equal as fractions.

In detail, we have the following theorem, giving a precise bound under which the mapping from fractions to IEEE 754 double-precision binary floats is injective.

Theorem. Suppose that a/b and c/d are unequal fractions such that when both are converted to the nearest IEEE 754 binary64 float they become equal. Then max(|a|, |b|, |c|, |d|) > 67114657.

Note that our bound 67114657 is just a little larger than 2**26 = 67108864. Below we give a direct mathematical proof that max(|a|, |b|, |c|, |d|) ≥ 67108864, and then augment that with an exhaustive search to show that the smallest case where distinct fractions a/b and c/d coincide as floats has max(|a|, |b|, |c|, |d|) = 67114658.

By symmetry, it’s enough to consider positive fractions. (If both a/b and c/d are negative, apply the theorem to -a/b and -c/d. If the signs of a/b and c/d differ or either fraction is zero, it’s easy to establish that absent underflow or overflow, they can’t both map to the same float. And the only way for underflow or overflow to be involved is when max(|a|, |b|, |c|, |d|) is huge (at least 2**1022).) So from this point on we assume that a, b, c and d are all positive.

The proof of the theorem is divided into two main cases, with the first case (which is the more interesting one) further subdivided. (Spoiler: Case 1d is the only really interesting one, and it’s the one where we need to perform the exhaustive search.)

Case 1: a/b and c/d live in the same "binade"

The main case we consider is the case in which there’s a closed interval of the form [2**e, 2**(e+1)] for some integer e that contains both of a/b and c/d. Within that interval, consecutive floats are spaced at distance 2**(e-52) from one another, so if a/b and c/d map to the same float then |a/b - c/d| ≤ 2**(e-52). Rearranging, we know that
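The spacing claim can be checked numerically. This is a small sketch, assuming Python 3.9 or later for math.ulp; the sample point 1.5 * 2**e is an arbitrary choice of a float inside the binade:

```python
import math

for e in (-2, -1, 0, 1, 2):
    x = 1.5 * 2.0**e  # a float strictly inside [2**e, 2**(e+1)]
    # Gap from x to the next float up, expected to be 2**(e-52).
    assert math.ulp(x) == 2.0 ** (e - 52)

print(math.ulp(1.0) == 2.0**-52)
```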

2**(52-e) ≤ b*d / |a*d - b*c|.

Note that since a/b and c/d are distinct, |a*d - b*c| ≥ 1.

Now we divide case 1 into 4 subcases.

Case 1a: e ≥ 1

In this case, 2**e ≤ a/b and 2**e ≤ c/d imply that b ≤ 2**-e * a and d ≤ 2**-e * c, hence that b*d ≤ 2**(-2*e) * a*c. So from the inequality highlighted above,

2**(-2*e) * a*c ≥ b*d ≥ b*d / |a*d - b*c| ≥ 2**(52-e)

Simplifying gives a*c ≥ 2**(52 + e) ≥ 2**53. So at least one of a or c must be at least √(2**53), so max(a, b, c, d) ≥ √(2**53) > 67114657 and we’re done.

Case 1b: e ≤ -1

In this case,

b*d ≥ b*d / |a*d - b*c| ≥ 2**(52-e) ≥ 2**53

so now either b or d (or both) is larger than √(2**53), and again we’re done.

Case 1c: e = 0, |a*d - b*c| ≥ 2

In this case, our first inequality above gives:

b*d / 2 ≥ b*d / |a*d - b*c| ≥ 2**(52-e) = 2**52

So b*d ≥ 2**53, and as with cases 1a and 1b, we’re done.

Case 1d: e = 0, |a*d - b*c| = 1

This one’s the interesting case. Now our main inequality gives

b*d = b*d / |a*d - b*c| ≥ 2**(52-e) = 2**52

So at least one of b and d is at least 2**26 = 67108864, so we have max(a, b, c, d) ≥ 67108864. But we need a bit more: we need max(a, b, c, d) > 67114657.

At this point we can do an exhaustive search. Before diving into that, we need a bit of work to reduce the search space to something feasible.

First note that either a < c or c < a: they can’t be equal (except in trivial cases), since ad - bc = ±1 is not divisible by a (unless a = 1). Now if a < c then b < d, and if c < a then d < b. Let’s swap if necessary so that a/b is the fraction with larger numerator and denominator: c < a and d < b.

Now for a tiny bit of elementary number theory: given a positive fraction a/b (written in lowest terms), there’s a unique pair c and d of integers such that a*d - b*c = 1, 0 ≤ c < a and 0 < d ≤ b, and a unique pair c and d of integers such that a*d - b*c = -1, 0 < c ≤ a and 0 ≤ d < b. So given a/b, there are at most two possible choices for c/d, and we can find both those choices using the extended Euclidean algorithm. (For mathematicians: there’s a hint of the theory of continued fractions here: the two choices for c/d are the parents of a/b in the Stern-Brocot tree.)
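To make the two candidates concrete, here is a sketch that finds them with Python's built-in modular inverse (pow with exponent -1, Python 3.8+) instead of a hand-written extended Euclidean algorithm. The function name is made up, and the sketch assumes gcd(a, b) == 1 and b > 1:

```python
def det_one_neighbours(a, b):
    """
    For coprime a, b with b > 1, return ((c1, d1), (c2, d2)) such that
    a*d1 - b*c1 == 1 and a*d2 - b*c2 == -1, with 0 < d1, d2 < b.
    """
    d1 = pow(a, -1, b)        # a*d1 ≡ 1 (mod b), with 0 < d1 < b
    c1 = (a * d1 - 1) // b    # exact division by construction
    d2 = b - d1               # a*d2 ≡ -1 (mod b)
    c2 = (a * d2 + 1) // b    # exact division by construction
    return (c1, d1), (c2, d2)


print(det_one_neighbours(7, 5))
print(det_one_neighbours(67114658, 67114657))
```

For 7/5 this gives 4/3 and 3/2, which are its two parents in the Stern-Brocot tree.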

For further reductions: in this case we know that b*d ≥ 2**52, and that b > d, so we have that b ≥ 2**26. Moreover, since e = 0 we have 1 ≤ a/b ≤ 2, so b ≤ a. And since 0 < c < a and 0 < d < b, max(a, b, c, d) = a.

So we can restrict ourselves to searching pairs (a, b) of relatively prime integers satisfying 2**26 ≤ b ≤ a, then for each of those pairs generate the two possibilities for c/d using the extended Euclidean algorithm. Here’s some Python code that does exactly that:

from math import gcd


def sb_parents(m, n):
    """
    Given a positive fraction m/n, return its parents in the Stern-Brocot tree.
    """
    a, b, p, q, r, s = n, m % n, 1, 0, m // n, 1
    while b:
        x = a//b
        a, b, p, q, r, s = b, a - x * b, r, s, p + x * r, q + x * s
    return p, q, r - p, s - q


for a in range(2**26, 2**27):
    for b in range(2**26, a):
        if gcd(a, b) > 1:
            continue
        c, d, e, f = sb_parents(a, b)
        if d and a/b == c/d:
            print(f"{a}/{b} == {c}/{d}")
        if f and a/b == e/f:
            print(f"{a}/{b} == {e}/{f}")

When run, the first example this prints (after around 30 seconds of runtime on my laptop) is

67114658/67114657 == 67114657/67114656

The next few, which take a few minutes to produce, are:

67118899/67118898 == 67118898/67118897
67121819/67121818 == 67121818/67121817
67123403/67115730 == 67114655/67106983
67124193/67124192 == 67124192/67124191
67125383/67119501 == 67113971/67108090
67126017/67122029 == 67109185/67105198
67126246/67126245 == 67126245/67126244
67128080/67128079 == 67128079/67128078
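As a sanity check, two of the printed pairs can be compared both as floats and as exact fractions. This sketch only re-checks values listed above and uses fractions.Fraction for the exact comparison:

```python
from fractions import Fraction

# Two of the pairs reported by the search above.
pairs = [
    ((67114658, 67114657), (67114657, 67114656)),
    ((67123403, 67115730), (67114655, 67106983)),
]

for (a, b), (c, d) in pairs:
    same_float = a / b == c / d                      # float comparison
    same_exact = Fraction(a, b) == Fraction(c, d)    # exact rational comparison
    det = a * d - b * c                              # should be +1 or -1 in case 1d
    print(f"{a}/{b} vs {c}/{d}: float equal={same_float}, "
          f"exact equal={same_exact}, a*d - b*c = {det}")
```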

This completes the proof in case 1d, which in turn completes the proof of all of case 1 (since the four cases exhaust all possibilities). We’re pretty much done, except that we still have the annoying second case to eliminate. We do that now.

Case 2: a/b and c/d do not live in the same "binade"

This is the negation of case 1: we assume that there’s no integer e so that both a/b and c/d lie in the closed interval [2**e, 2**(e+1)]. The only way that this can happen is if there’s a power of two lying strictly between a/b and c/d: swapping a/b and c/d if necessary, there’s an integer e with

a/b < 2**e < c/d

Now since we’re assuming that a/b and c/d map to the same binary64 float, 2**e, being squeezed between a/b and c/d, must also map to that same binary float, and that float will be exactly equal to 2**e (since powers of two within a reasonable range will convert exactly).

At this point we can discard c/d and just consider a/b. Since a/b and 2**e map to the same float, it follows that the difference between a/b and 2**e is at most half a ulp (because 2**e converts exactly), so 0 < 2**e - a/b ≤ 2**(e-54), or 0 < 2**e * b - a ≤ b*2**(e-54). Moreover, since a/b is so close to 2**e, we have 2**(e-1) < a/b, so b < a * 2**(1-e).

Now consider the case where e ≥ 0. Then 2**e * b - a is an integer, so 1 ≤ b*2**(e-54). Combining with b ≤ a * 2**(1-e) we get 2**53 ≤ a, so a must be huge, and correspondingly max(a, b, c, d) is at least 2**53.

Finally, consider the case where e ≤ 0. Then from 0 < 2**e - a/b ≤ 2**(e-54) we have 0 < b - 2**(-e)*a ≤ b*2**(-54). As before, b - 2**(-e)*a is a positive integer, so 1 ≤ b * 2**-54, so 2**54 ≤ b, and again we’re done.
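A concrete instance of case 2 with e = 0 may help. The fractions below are illustrative choices, not taken from the search. They sit on either side of 1 and both round to exactly 1.0, and the denominator 2**54 matches the bound just derived:

```python
# Python's int / int division is correctly rounded, so these comparisons
# reflect the nearest binary64 float to each exact fraction.
a, b = 2**54 - 1, 2**54   # a/b = 1 - 2**-54, just below 1
c, d = 2**53 + 1, 2**53   # c/d = 1 + 2**-53, just above 1

print(a / b == 1.0, c / d == 1.0)

# One step coarser below 1: 1 - 2**-53 is exactly representable, so it stays distinct.
print((2**53 - 1) / 2**53 == 1.0)
```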

Answered By: Mark Dickinson