# Number of occurrences of digit in numbers from 0 to n

## Question:

Given a number n, count the total number of occurrences of the digits 0, 2 and 4 in all the numbers from 0 to n, including n.

Example1:

```
n = 10
output: 4
```

Example2:

```
n = 22
output: 11
```

My Code:

```
n = 22

def count_digit(n):
    count = 0
    for i in range(n+1):
        if '2' in str(i):
            count += 1
        if '0' in str(i):
            count += 1
        if '4' in str(i):
            count += 1
    return count

count_digit(n)
```

Code Output: `10`

Desired Output: `11`

Constraints: `1 <= N <= 10^5`

Note: **The solution should not cause outOfMemoryException or Time Limit Exceeded for large numbers.**

## Answers:

There are numbers in which a desired digit is repeated, such as 20 or 22, so instead of adding 1 you must add the number of times the digit occurs (2 in those cases).

```
>>>
>>> string = ','.join(map(str,range(23)))
>>>
>>> string
'0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22'
>>>
>>> string.count('0') + string.count('2') + string.count('4')
11
>>>
n = 22

def count_digit(n):
    count = 0
    for i in map(str, range(n+1)):
        count += i.count('0')
        count += i.count('2')
        count += i.count('4')
    return count

print(count_digit(n))
```

That solution is fast, but it can be made faster by handling ten numbers at a time:

```
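def count_digit(n):
    i = 0
    count = 0
    s = '024'
    # Process whole blocks of ten numbers [i, i+9] at a time.
    while i + 9 <= n:
        # j counts the target digits in i; i ends in 0, so j >= 1.
        j = 0
        for v in str(i):
            if v in s:
                j += 1
        # The last digits 0, 2 and 4 contribute 3; the j-1 matching prefix
        # digits occur in all ten numbers: 3 + 10*(j-1) == 3*j + 7*(j-1).
        count += 3*j + (7*(j-1))
        i += 10
    # Count the remaining numbers one by one.
    for i in range(i, n+1, 1):
        for v in str(i):
            if v in s:
                count += 1
    return count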
```

You can increment your count like this:

```
def count_digit(n):
    count = 0
    for i in range(n + 1):
        if '2' in str(i):
            count += str(i).count('2')
        if '0' in str(i):
            count += str(i).count('0')
        if '4' in str(i):
            count += str(i).count('4')
    return count
```

In that way, edge cases like 22, 44, and so on are covered!

Another brute force, seems faster:

```
def count_digit(n):
    s = str(list(range(n+1)))
    return sum(map(s.count, '024'))
```

Benchmark with `n = 10**5`:

```
result   time    solution
115474   244 ms  original
138895    51 ms  Kelly
138895   225 ms  islam_abdelmoumen
138895   356 ms  CodingDaveS
```

Code:

```
from timeit import default_timer as time

def original(n):
    count = 0
    for i in range(n+1):
        if '2' in str(i):
            count += 1
        if '0' in str(i):
            count += 1
        if '4' in str(i):
            count += 1
    return count

def Kelly(n):
    s = str(list(range(n+1)))
    return sum(map(s.count, '024'))

def islam_abdelmoumen(n):
    count = 0
    for i in map(str, range(n+1)):
        count += i.count('0')
        count += i.count('2')
        count += i.count('4')
    return count

def CodingDaveS(n):
    count = 0
    for i in range(n + 1):
        if '2' in str(i):
            count += str(i).count('2')
        if '0' in str(i):
            count += str(i).count('0')
        if '4' in str(i):
            count += str(i).count('4')
    return count

funcs = original, Kelly, islam_abdelmoumen, CodingDaveS

print('result time solution')
print()
for _ in range(3):
    for f in funcs:
        t = time()
        print(f(10**5), ' %3d ms ' % ((time()-t)*1e3), f.__name__)
    print()
```

TL;DR: If you do it right, you can compute the count about a thousand times faster for *n* close to 10**5, and since the better algorithm uses time proportional to the number of digits in *n*, it can easily handle even values of *n* too large for a 64-bit integer.

As is often the case with puzzles like this ("in the numbers from x to y, how many…?"), the key is to find a way to compute an aggregate count, ideally in O(1), for a large range. For combinatorics over the string representation of numbers, a convenient range is often something like the set of all numbers whose string representation is a given size, possibly with a specific prefix. In other words, ranges of the form `[prefix*10⁴, prefix*10⁴+9999]`, where the number of 0s in the lower limit is the same as the number of 9s in the upper limit and as the exponent of 10 in the multiplier. (It’s often actually more convenient to use half-open ranges, where the lower limit is inclusive and the upper limit is exclusive, so the above example would be `[prefix*10⁴, (prefix+1)*10⁴)`.)

Also note that if the problem is to compute a count for [x, y), and you only know how to compute [0, y), then you just do two computations, because

```
count [x, y) == count [0, y) - count [0, x)
```

That identity is one of the simplifications which half-open intervals allow.
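As a minimal sketch of that identity, here it is with a brute-force counter standing in for the fast one. The names `count_below` and `count_range` are made up for this illustration; any function that counts over `[0, y)` could be substituted.

```python
def count_below(y, digits="024"):
    """Brute-force count of the target digits in all numbers in [0, y)."""
    return sum(str(i).count(c) for i in range(y) for c in digits)


def count_range(x, y, digits="024"):
    """Count over the half-open range [x, y) using two prefix counts."""
    return count_below(y, digits) - count_below(x, digits)


# Numbers 10..22 contain eight occurrences of 0, 2 or 4.
print(count_range(10, 23))  # 8
```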

That would work nicely with this problem, because it’s clear how many times a digit *d* occurs in the set of all *k*-digit suffixes for a given prefix. (In the 10^k suffixes, every digit has the same frequency as every other digit; there are a total of *k*×10^k digits in those 10^k suffixes, and since all ten digits have the same count, that count must be *k*×10^(k−1).) Then you just have to add the digit count of the prefix: the prefix appears exactly 10^k times, and each appearance contributes the same count.
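A rough sketch of that block count, checked against brute force. The function names are hypothetical, and the sketch assumes `prefix >= 1` so that no leading zeros are involved.

```python
def count_in_prefix_block(prefix, k, digits="024"):
    """Occurrences of the target digits in [prefix*10**k, (prefix+1)*10**k).

    Assumes prefix >= 1, so the numbers have no leading zeros.
    """
    # Each target digit occurs k * 10**(k-1) times among the k-digit suffixes.
    suffix_part = len(digits) * k * 10 ** (k - 1) if k > 0 else 0
    # The prefix is repeated once for each of the 10**k suffixes.
    prefix_part = sum(str(prefix).count(c) for c in digits) * 10 ** k
    return suffix_part + prefix_part


def count_in_prefix_block_bruteforce(prefix, k, digits="024"):
    """Same count, obtained by looking at every number in the block."""
    lo, hi = prefix * 10 ** k, (prefix + 1) * 10 ** k
    return sum(str(i).count(c) for i in range(lo, hi) for c in digits)


for prefix, k in [(7, 3), (72, 2), (724, 1)]:
    assert count_in_prefix_block(prefix, k) == count_in_prefix_block_bruteforce(prefix, k)
```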

So you could take a number like 72483, and decompose it into the following ranges. The number of ranges roughly corresponds to the sum of the digits in 72483, plus a few ranges containing numbers with fewer digits.

- [0, 9]
- [10, 99]
- [100, 999]
- [1000, 9999]
- [10000, 19999]
- [20000, 29999]
- [30000, 39999]
- [40000, 49999]
- [50000, 59999]
- [60000, 69999]
- [70000, 70999]
- [71000, 71999]
- [72000, 72099]
- [72100, 72199]
- [72200, 72299]
- [72300, 72399]
- [72400, 72409]
- [72410, 72419]
- [72420, 72429]
- [72430, 72439]
- [72440, 72449]
- [72450, 72459]
- [72460, 72469]
- [72470, 72479]
- [72480, 72480]
- [72481, 72481]
- [72482, 72482]
- [72483, 72483]
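The list above can be generated mechanically. The following is only a sketch of one way to do it; `decompose` is a name invented here, and it returns inclusive `(low, high)` pairs in the same order as the list.

```python
def decompose(n):
    """Split [0, n] into inclusive ranges that each share a fixed prefix."""
    digits = str(n)
    width = len(digits)
    # All numbers with fewer digits than n: [0, 9], [10, 99], ...
    ranges = [(0, 9)] if width > 1 else []
    for k in range(2, width):
        ranges.append((10 ** (k - 1), 10 ** k - 1))
    prefix = 0
    for idx, ch in enumerate(digits):
        k = width - idx - 1              # number of free suffix digits
        top = int(ch)
        # A multi-digit number cannot start with 0.
        start = 1 if idx == 0 and width > 1 else 0
        # At the last position the range is inclusive of n itself.
        last = top if k == 0 else top - 1
        for d in range(start, last + 1):
            lo = (prefix * 10 + d) * 10 ** k
            ranges.append((lo, lo + 10 ** k - 1))
        prefix = prefix * 10 + top
    return ranges


print(len(decompose(72483)))  # 28, the same number of ranges as listed above
```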

However, in the following code, I used a slightly different algorithm, which turned out to be a bit shorter. It considers the rectangle in which all the numbers from 0 to n are written out, including leading zeros, and then computes counts for each column. A column of digits in a rectangle of sequential integers follows a simple recurring pattern; the frequency can easily be computed by starting with the completely repetitive part of the column. After the complete repetitions, the remaining digits are in order, with each one except the last one appearing the same number of times. It’s probably easiest to understand that by drawing out a small example on a pad of paper, but the following code should also be reasonably clear (I hope).
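Before getting to the full function, here is a small sketch of the column pattern on its own, for a single digit in a single column. The names `column_count` and `column_count_bruteforce` are made up for this illustration, and `limit` is the exclusive upper bound (so it plays the role of n + 1).

```python
def column_count(limit, p, d):
    """Occurrences of digit d in the column with place value p,
    over the zero-padded numbers 0 .. limit-1."""
    # The column repeats every 10*p rows; each full cycle holds p copies of d.
    full_cycles, remainder = divmod(limit, 10 * p)
    # In the partial cycle, digit d occupies rows d*p .. d*p + p - 1.
    partial = min(max(remainder - d * p, 0), p)
    return full_cycles * p + partial


def column_count_bruteforce(limit, p, d):
    """Same count, by inspecting the column digit of every row."""
    return sum((i // p) % 10 == d for i in range(limit))


# Tens column of 0..45: the digit 4 appears in 40, 41, ..., 45.
print(column_count(46, 10, 4))  # 6
```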

The one problem with that is that it counts leading zeros which don’t actually exist, so it needs to be corrected by subtracting the leading zero count. Fortunately, that count is extremely easy to compute. If you consider a range ending with a five-digit number (which itself cannot start with a zero, since it wouldn’t really be a five-digit number if it started with zero), then you can see that the range includes:

- 10000 numbers which start with a zero
- 1000 more numbers which have a second leading zero
- 100 more numbers which have a third leading zero
- 10 more numbers which have a fourth leading zero

No numbers have five leading zeros, because we write 0 as such, not as an empty string.

That adds up to 11110, and it’s easy to see how that generalises. That value can be computed without a loop, as (10⁵ − 1) / 9 − 1. That correction is done at the end of the following function:

```
def countd(m, s=(0,2,4)):
    if m < 0:
        return 0
    m += 1
    rv = 0
    rest = 0
    pos = 1
    while True:
        digit = m % 10
        m //= 10
        rv += m * pos * len(s)
        for d in s:
            if digit > d:
                rv += pos
            elif digit == d:
                rv += rest
        if m == 0:
            break
        rest += digit * pos
        pos *= 10
    if 0 in s:
        rv -= (10 * pos - 1) // 9 - 1
    return rv
```
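The leading-zero correction at the end of that function can be checked on its own. This is just a sketch with invented helper names; it pads every number in `[0, n]` to the width of `n` and compares the number of padding zeros with the closed form.

```python
def leading_zeros_bruteforce(n):
    """Padding zeros needed to write 0..n at the width of n.

    The single digit of the number 0 is a real digit, not padding.
    """
    width = len(str(n))
    return sum(width - len(str(i)) for i in range(n + 1))


def leading_zeros_formula(n):
    """Closed form: a run of `width` ones (11...1), minus 1."""
    width = len(str(n))
    return (10 ** width - 1) // 9 - 1


print(leading_zeros_bruteforce(72483), leading_zeros_formula(72483))  # 11110 11110
```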

That code could almost certainly be tightened up; I was just trying to get the algorithm down. But, as it is, its execution time is measured in microseconds, not milliseconds, even for much larger values of *n*.

Here’s an update of Kelly’s benchmark; I removed the other solutions because they were taking too long for the last value of *n*:

I ended up with a similar answer to rici’s, except maybe with a slightly different phrasing of the numeric formulation. The number of instances of each digit in each position ("counts for each column," as rici described) can be formulated in two parts. The first part is `p * floor(n / (10 * p))`, where `p` is 10 raised to the power of the position. For example, in position 0 (the rightmost), there is one 1 for each ten numbers. Counting the 0’s, however, requires an additional check regarding the population of the current and next position.

To the first part we still need to add the counts attributed to the remainder of the division. For example, for `n = 6`, `floor(6 / 10) = 0`, but we do have one count of 2 and one of 4. We add `p` if the digit in that position in `n` is greater than the digit we’re counting; or, if the digit is the same, we add the value on the right of the digit plus 1 (for example, for `n = 45`, we want to count the 6 instances where 4 appears in position 1: 40, 41, 42, 43, 44, 45).
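To make the two parts concrete, here is a small Python sketch restricted to a single nonzero digit `d`, which avoids the extra check needed for 0. The function names are hypothetical, and the brute-force version is only there for comparison.

```python
def count_nonzero_digit(n, d):
    """Occurrences of the digit d (1..9) in all numbers from 0 to n."""
    assert 1 <= d <= 9
    total = 0
    p = 1                                  # 10 raised to the current position
    while p <= n:
        digit = (n // p) % 10              # digit of n at this position
        total += p * (n // (10 * p))       # first part: complete cycles
        if digit > d:
            total += p                     # d's whole run fits in the remainder
        elif digit == d:
            total += n % p + 1             # value to the right of the digit, plus 1
        p *= 10
    return total


def count_nonzero_digit_bruteforce(n, d):
    return sum(str(i).count(str(d)) for i in range(n + 1))


print(count_nonzero_digit(45, 4))  # 11: five in position 0, six in position 1
```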

JavaScript code, comparing with rici’s instantly for *all* numbers from 1 to 600,000. (If I’m not mistaken, rici’s code wrongly returns 0 for `n = 0`, when the answer should be 1 count.)

```
function countd(m, s = [0,2,4]) {
  if (m <= 0)
    return 0
  m += 1
  let rv = 0
  let rest = 0
  let pos = 1
  while (true) {
    const digit = m % 10
    m = Math.floor(m / 10)
    rv += m * pos * s.length
    for (const d of s) {
      if (digit > d)
        rv += pos
      else if (digit == d)
        rv += rest
    }
    if (m == 0) {
      break
    }
    rest += digit * pos
    pos *= 10
  }
  if (s.includes(0)) {
    rv -= Math.floor((10 * pos - 1) / 9) - 1
  }
  return rv
}

function f(n, ds = [0, 2, 4]) {
  // Value on the right of position
  let curr = 0;
  let m = n;
  // 10 to the power of position
  let p = 1;
  let result = 1;
  while (m) {
    const digit = m % 10;
    m = Math.floor(m / 10);
    for (const d of ds) {
      if (d != 0 || n >= 11 * p) {
        result += p * Math.floor((n - (d ? 0 : 10 * p)) / (10 * p));
      }
      if (digit > d && (d != 0 || m > 0)) {
        result += p;
      } else if (digit == d) {
        result += curr + 1;
      }
    }
    curr += p * digit;
    p *= 10;
  }
  return result;
}

for (let n = 1; n <= 600000; n += 1) {
  const _f = f(n);
  const _countd = countd(n);
  if (_f != _countd) {
    console.log(`n: ${ n }`);
    console.log(_f, _countd);
    break;
  }
}
console.log("Done.");
```