How to count the number of digits in numbers in different bases?

Question:

I’m working with numbers in different bases (base-10, base-8, base-16, etc). I’m trying to count the number of characters in each number.

Example

Number: ABCDEF

Number of digits: 6

I know about the method based on logarithms but I’m facing some problems.

  1. This Python script outputs that it failed to calculate the number of digits correctly in 3,969 numbers out of 1,000,000.

  2. I think the method that uses logarithms could be rather slow.

Links:

  • This C program must be very slow (what if I have a very large number?). It also can’t deal with numbers in different bases (for example, base-16).

  • Not a dupe of this as there the OP was asking only about base-10


Edit: certainly I can calculate the length of a string, but what interests me most is whether it is possible to do the calculation without conversion to a string. I’d like to know an algorithm that can do this knowing just the source base and the base to convert to.

Edit2: source-base is base-10 and the base to convert to can be any other base.


How can we calculate the number of digits in numbers in different bases?

If I know the number in base-10, how do I calculate the number of digits in the same number converted to base-16 (base-8, etc) without performing the conversion?

Note: some Python or C code will be greatly appreciated

Asked By: ForceBru

Answers:

I am not sure I understand your question. When you say your initial number is in base b1, does that mean you have a representation of it as a string in base b1? Maybe you want to construct a table that tells you which numbers in base b1 correspond to b2, b2^2, b2^3, … and then compare your number to these to see where it fits.

Otherwise I would go with the algorithm you mentioned, which can easily be adapted to any base.

Input: your integer x and the base b2 in which you want to count the digits.

/* Counts the digits of x in base b2 by repeated division.
   Returns 0 for x == 0. */
int number_of_digits (int x, int b2) {
    int n = 0;
    while (x > 0) {
        x /= b2;
        n++;
    }
    return n;
}

Both methods are linear in n, the number of digits.

EDIT: If you want it to be faster, you can implement this as a binary search. Then you can get O(log(n)).
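
The table-plus-binary-search idea could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the names build_powers and count_digits are hypothetical, the table is built once up to a chosen max_value, and zero is treated as having one digit (the C loop above returns 0 for it).

```python
from bisect import bisect_right

def build_powers(base, max_value):
    """Return [1, base, base**2, ...] up to the largest power <= max_value."""
    powers = [1]
    while powers[-1] * base <= max_value:
        powers.append(powers[-1] * base)
    return powers

def count_digits(x, powers):
    """Number of digits of x (0 <= x <= max_value) in the table's base."""
    if x == 0:
        return 1  # assumed convention: zero is written with one digit
    # bisect_right returns how many table entries are <= x,
    # which equals the digit count of x.
    return bisect_right(powers, x)

powers16 = build_powers(16, 2**64 - 1)
print(count_digits(0xABCDEF, powers16))  # 6
```

The binary search only uses integer comparisons, so it has no rounding issues, but it is limited to values no larger than the max_value the table was built for.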

Answered By: vib

Logarithms shouldn’t really be slow, and you can easily calculate logarithms to any base with this formula: logBaseN(x) = logBaseA(x) / logBaseA(N). You can use ln (base e = 2.718…), logBase10, or whatever you have. So you don’t really need a program; a formula should do it:

num_digets(N, base) = 1 + floor(log(N) / log(base))

where N is your number (N ≥ 1) and base is the base you want that number in.

For more explanation take a look here:
http://www.mathpath.org/concepts/Num/numdigits.htm
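
As a rough illustration, the formula can be transcribed into Python like this. The function name num_digits_log is hypothetical, and the sketch assumes an integer n ≥ 1. Because it relies on floating-point logarithms, inputs that are exact powers of the base are the risky cases and may come out one digit short.

```python
import math

def num_digits_log(n, base):
    """1 + floor(log_base(n)) for an integer n >= 1, using float logs."""
    return 1 + math.floor(math.log(n) / math.log(base))

print(num_digits_log(0xABCDEF, 16))  # 6
print(num_digits_log(255, 2))        # 8
```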

Answered By: kratenko

Note that the NumToStr() function in your Python code is incorrect due to an off-by-one error in your base. It should be:

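def NumToStr(num):
    # alpha (the digit characters) and base are globals from your script
    digits = ''
    while num:
        digits += alpha[num % base]
        num = num // base  # integer division (plain / gives a float in Python 3)
    return digits[::-1]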

Note that checking that this function returns the correct result would have found the error (for example, use alpha="0123456789").

With this fix we get the correct number of digits using the given formula:

nDigits = int(ceil(log(nmb, base)))

except for exact powers of the base (base**0, base**1, base**2, etc.), where it is exactly one less than it should be. This can be fixed by changing the formula slightly to:

nDigits = 1 + floor(log(nmb, base))

Note that even this seems to fail for some inputs (for example, converting from base-10 to base-10, it incorrectly reports 3 digits for 1000 and 6 digits for 1000000). This seems to be due to the inherent imprecision of floats, for example:

print(floor(log(1000, 10)))

outputs 2 instead of the expected 3.

Concerning performance: I would not worry about it for such trivial code until profiling or benchmarking shows it to be an issue. For example, your “very slow” C code would take at most 39 divisions by 10 for a 128-bit number. If you need better performance than this, you would run into the same issue with any trivial method mentioned here. The fastest thing I can think of would be a custom log() function using a combination of a lookup table and linear interpolation, but you would have to be careful about the resulting precision.
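
One way to keep the precision under control is sketched below. Note that this is not the lookup-table log() described above but a swapped-in idea: estimate the digit count from Python's int.bit_length(), then correct the estimate with exact integer comparisons. The name num_digits_exact is hypothetical, the sketch assumes a non-negative integer n and base ≥ 2, and whether it is actually faster than the division loop would need benchmarking.

```python
import math

def num_digits_exact(n, base):
    """Digits of a non-negative int n in the given base (base >= 2)."""
    if n < base:
        return 1
    # n has `bits` binary digits, so 2**(bits - 1) <= n < 2**bits.
    bits = n.bit_length()
    # Rough estimate from the bit count; only the small integer `bits`
    # goes through floating point, never n itself.
    d = max(1, int((bits - 1) / math.log2(base)) + 1)
    # Exact integer correction in both directions.
    while base ** d <= n:        # estimate too small
        d += 1
    while base ** (d - 1) > n:   # estimate too large
        d -= 1
    return d

print(num_digits_exact(1000, 10))  # 4
print(num_digits_exact(125, 5))    # 4
```

Since the final answer is decided by integer comparisons, a slightly wrong float estimate only costs an extra loop iteration instead of a wrong result.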

Answered By: uesp

If you are using Python’s math module for the log, be careful of the off-by-one error for the digits of 1000**i in base 10:

>>> num_digets(1000, 10)  # there are 4 digits
3
>>> [num_digets(1000**i,10)%3 for i in range(1,10)]  # these should all be 1
[0, 0, 0, 0, 0, 0, 0, 0, 0]

import math

def ndigits(self, b):
    n = abs(self)
    if n < b:
        return 1
    d = 1 + math.floor(math.log(n)/math.log(b))
    # paranoid: if n >= b**d, the float log was rounded down,
    # so one digit is missing
    s = n//b**d
    if s:
        assert not s//b, self  # check assumptions: off by at most one
        d += 1
    return d

>>> [ndigits(1000**i,10)%3 for i in range(1,10)]
[1, 1, 1, 1, 1, 1, 1, 1, 1]

The reason this requires extra care is that math.log(1000**i)/math.log(10) can come out a little smaller than the exact integer value:

>>> import math
>>> math.log(1000)/math.log(10)
2.9999999999999996

If you use math.log10 in base 10 it will work as expected, but then it will fail for other bases, like 125 in base 5. Here is a small set of tests for whatever formula you use:

    assert ndigits(1000, 10) == 4
    assert ndigits(125, 5) == 4
    assert ndigits(100, 16) == 2
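
To go beyond three spot checks, a brute-force comparison against the slow but exact division loop might look like the sketch below. The names ndigits_reference and find_mismatches are hypothetical, and the choice to test only the values around each power of the base is an assumption about where such formulas tend to break.

```python
def ndigits_reference(n, b):
    """Exact digit count by repeated integer division (slow but reliable)."""
    n = abs(n)
    d = 1
    while n >= b:
        n //= b
        d += 1
    return d

def find_mismatches(candidate, bases=range(2, 37), max_exp=20):
    """Compare candidate(n, b) with the reference around every power of b."""
    bad = []
    for b in bases:
        for k in range(1, max_exp + 1):
            for n in (b**k - 1, b**k, b**k + 1):
                if candidate(n, b) != ndigits_reference(n, b):
                    bad.append((n, b))
    return bad

# Usage (assumes a function such as ndigits(n, b) is already defined):
# print(find_mismatches(ndigits))
print(find_mismatches(ndigits_reference))  # [] (the reference agrees with itself)
```

An empty list means the candidate agreed with the reference on every tested input; any (n, b) pairs returned are concrete counterexamples to investigate.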

Answered By: smichr