How to count the number of digits in numbers in different bases?
Question:
I’m working with numbers in different bases (base-10, base-8, base-16, etc). I’m trying to count the number of characters in each number.
Example
Number: ABCDEF
Number of digits: 6
I know about the method based on logarithms but I’m facing some problems.
-
This Python script outputs that it failed to calculate the number of digits correctly in 3,969 numbers out of 1,000,000.
-
I think the method that uses logarithms could be rather slow
Links:
-
This C program must be very slow (what if I have a very great number?). It also can’t deal with numbers in different bases (for example, base-16).
-
Not a dupe of this as there the OP was asking only about base-10
Edit: certainly I can calculate the length of a string but what interests me most, is if it is possible to do the calculation without convention to string. I’d want to know the algorithm that could help to do it knowing just the source-base and the base to convert to.
Edit2: source-base is base-10 and the base to convert to can be any other base.
How can we calculate the number of digits in numbers in different bases?
If I know the number in base-10, how do I calculate the number of digits in the same number converted to base-16 (base-8, etc) without performing the conversion?
Note: some Python or C code will be greatly appreciated
Answers:
I am not sure I understand your question. When you say your initial number is in base b1, does it mean you have a representation of it as a string in base b1 ? Maybe you want to construct some table which tells you which number in base b1 correspond to b2, b2^2, b2^3, … and then compare your number to these numbers to see where it fits.
Otherwise I would go with the algorithm you mentioned, that can easily adopted to any base.
Input : your integer x, the base b2 you want to count the digits in.
int number_of_digits (int x, int b2) {
int n = 0;
while (x >0) {
x/=b2;
n++;
}
return n;
}
Both methods are only linear in n.
EDIT : If you want to be faster, you can implement this as a binary search. Then you can get O(log(n)).
Logarithms shouldn’t really be slow. And you can easily calculate logarithms to any base by this formula: logBaseN(x)=logBaseA(x)/logBaseA(N)
– you can use ln
(Base e = 2.718…) or logBase10
or whatever you have. So you don’t really need a program, a formular should do it:
num_digets(N, base) = 1 + floor(log(N) / log(base))
where N
is your number and base
the base you want that number in.
For more explanation take a look here:
http://www.mathpath.org/concepts/Num/numdigits.htm
Note your NumToStr()
function in your Python code is incorrect due to an off-by-one in your base, it should be:
def NumToStr(num):
str=''
while num:
str+=alpha[(num)%base]
num=(num)/base
return ''.join(list(reversed(str)))
Note that checking that this function returns the correct result would have found the error (for example, use alpha="0123456789"
).
With this fix we get the correct number of digits using the given formula:
nDigits = int(ceil(log(nmb, base)))
except for exact powers of the base (base**0
, base**1
, base**2
, etc…) where it is exactly one less than what it should be. This can be fixed by changing the forumla slightly to:
nDigits = 1 + floor(log(nmb, base))
Note that even this seems to fail for some inputs (for example converting from base-10 to base-10 it incorrectly says 3 digits for 1000 and 6 digits for 1000000). This seems to be due to the inherent inprecision of floats, for example:
print floor(log(1000, 10))
outputs 2
instead of the expected 3
.
Concerning your mention about performance, I would not worry about performance issues for such trivial code until you’ve done profiling/benchmarking that shows it to be an issue. For example, your “very slow” C code would only take at most 38 divisions by 10 for a 128-bit number. If you need better performance than this then you would run into the same issue with any trivial method mentioned here. Fastest thing I can think of would be a custom log()
function using a combination of lookup table and linear interpolation but you would have to be careful about the resulting precision.
If you care are using Python’s math module for the log
, be careful of the off-by-one error for the digits in 1000**i
in base 10:
>>> num_digets(1000, 10) # there are 4 digits
3
>>> [num_digets(1000**i,10)%3 for i in range(1,10)] # these should all be 1
[0, 0, 0, 0, 0, 0, 0, 0, 0]
def ndigits(self, b):
n = abs(self)
if n < b:
return 1
d = 1 + math.floor(math.log(n)/math.log(b))
# paranoid
s = n//b**d
if not s:
pass
else:
assert not s//b, self # check assumptions
d += 1
return d
>>> [ndigits(1000**i,10)%3 for i in range(1,10)]
[1, 1, 1, 1, 1, 1, 1, 1, 1]
The reason this requires extra care is that math.log(1000**i)/math.log(10)
is a little smaller than the expected floating integer:
import math
>>> math.log(1000)/math.log(10)
2.9999999999999996
If you use math.log10
in base 10 it will work as expected, but then it will fail for other bases, like 125 in base 5. Here is a small set of tests for whatever formula you use:
assert ndigits(1000, 10) == 4
assert ndigits(125, 5) == 4
assert ndigits(100, 16) == 2
I’m working with numbers in different bases (base-10, base-8, base-16, etc). I’m trying to count the number of characters in each number.
Example
Number:
ABCDEF
Number of digits: 6
I know about the method based on logarithms but I’m facing some problems.
-
This Python script outputs that it failed to calculate the number of digits correctly in 3,969 numbers out of 1,000,000.
-
I think the method that uses logarithms could be rather slow
Links:
-
This C program must be very slow (what if I have a very great number?). It also can’t deal with numbers in different bases (for example, base-16).
-
Not a dupe of this as there the OP was asking only about base-10
Edit: certainly I can calculate the length of a string but what interests me most, is if it is possible to do the calculation without convention to string. I’d want to know the algorithm that could help to do it knowing just the source-base and the base to convert to.
Edit2: source-base is base-10 and the base to convert to can be any other base.
How can we calculate the number of digits in numbers in different bases?
If I know the number in base-10, how do I calculate the number of digits in the same number converted to base-16 (base-8, etc) without performing the conversion?
Note: some Python or C code will be greatly appreciated
I am not sure I understand your question. When you say your initial number is in base b1, does it mean you have a representation of it as a string in base b1 ? Maybe you want to construct some table which tells you which number in base b1 correspond to b2, b2^2, b2^3, … and then compare your number to these numbers to see where it fits.
Otherwise I would go with the algorithm you mentioned, that can easily adopted to any base.
Input : your integer x, the base b2 you want to count the digits in.
int number_of_digits (int x, int b2) {
int n = 0;
while (x >0) {
x/=b2;
n++;
}
return n;
}
Both methods are only linear in n.
EDIT : If you want to be faster, you can implement this as a binary search. Then you can get O(log(n)).
Logarithms shouldn’t really be slow. And you can easily calculate logarithms to any base by this formula: logBaseN(x)=logBaseA(x)/logBaseA(N)
– you can use ln
(Base e = 2.718…) or logBase10
or whatever you have. So you don’t really need a program, a formular should do it:
num_digets(N, base) = 1 + floor(log(N) / log(base))
where N
is your number and base
the base you want that number in.
For more explanation take a look here:
http://www.mathpath.org/concepts/Num/numdigits.htm
Note your NumToStr()
function in your Python code is incorrect due to an off-by-one in your base, it should be:
def NumToStr(num):
str=''
while num:
str+=alpha[(num)%base]
num=(num)/base
return ''.join(list(reversed(str)))
Note that checking that this function returns the correct result would have found the error (for example, use alpha="0123456789"
).
With this fix we get the correct number of digits using the given formula:
nDigits = int(ceil(log(nmb, base)))
except for exact powers of the base (base**0
, base**1
, base**2
, etc…) where it is exactly one less than what it should be. This can be fixed by changing the forumla slightly to:
nDigits = 1 + floor(log(nmb, base))
Note that even this seems to fail for some inputs (for example converting from base-10 to base-10 it incorrectly says 3 digits for 1000 and 6 digits for 1000000). This seems to be due to the inherent inprecision of floats, for example:
print floor(log(1000, 10))
outputs 2
instead of the expected 3
.
Concerning your mention about performance, I would not worry about performance issues for such trivial code until you’ve done profiling/benchmarking that shows it to be an issue. For example, your “very slow” C code would only take at most 38 divisions by 10 for a 128-bit number. If you need better performance than this then you would run into the same issue with any trivial method mentioned here. Fastest thing I can think of would be a custom log()
function using a combination of lookup table and linear interpolation but you would have to be careful about the resulting precision.
If you care are using Python’s math module for the log
, be careful of the off-by-one error for the digits in 1000**i
in base 10:
>>> num_digets(1000, 10) # there are 4 digits
3
>>> [num_digets(1000**i,10)%3 for i in range(1,10)] # these should all be 1
[0, 0, 0, 0, 0, 0, 0, 0, 0]
def ndigits(self, b):
n = abs(self)
if n < b:
return 1
d = 1 + math.floor(math.log(n)/math.log(b))
# paranoid
s = n//b**d
if not s:
pass
else:
assert not s//b, self # check assumptions
d += 1
return d
>>> [ndigits(1000**i,10)%3 for i in range(1,10)]
[1, 1, 1, 1, 1, 1, 1, 1, 1]
The reason this requires extra care is that math.log(1000**i)/math.log(10)
is a little smaller than the expected floating integer:
import math
>>> math.log(1000)/math.log(10)
2.9999999999999996
If you use math.log10
in base 10 it will work as expected, but then it will fail for other bases, like 125 in base 5. Here is a small set of tests for whatever formula you use:
assert ndigits(1000, 10) == 4
assert ndigits(125, 5) == 4
assert ndigits(100, 16) == 2