Is it possible to be better than O(N+M) for Codility lesson MaxCounters using python?
Question:
This is the code I am using for the Codility lesson: MaxCounters
def solution(N, A):
counters = [0] * N
max_c = 0
for el in A:
if el >= 1 and el <= N:
counters[el-1] += 1
max_c = max(counters[el-1], max_c)
elif el > N:
counters = [max_c] * N
return counters
Every test passes but the last one (“all max_counter operations”) times out after 7 seconds so the result is just 88% with time complexity of O(N+M).
Is it possible to improve the time complexity of the algorithm and get 100% test result using Python?
The MaxCounters task
You are given N counters, initially set to 0, and you have two possible operations on them:
- increase(X) − counter X is increased by 1,
- max counter − all counters are set to the maximum value of any counter.
A non-empty array A of M integers is given. This array represents consecutive operations:
- if A[K] = X, such that 1 ≤ X ≤ N, then operation K is increase(X),
- if A[K] = N + 1 then operation K is max counter.
Write an efficient algorithm for the following assumptions:
- N and M are integers within the range [1..100,000];
- each element of array A is an integer within the range [1..N + 1].
Answers:
EDIT: following up on the discussion in the comments to this answer, tracking the last operation to avoid unnecessarily resetting the array in successive max_counter operations was the key to achieving the goal. Here’s what the different solutions (one keeping track of the max and the second calculating the max on demand) would look like implementing that change:
def solution(N, A):
counters = [0] * N
max_c = 0
last_was_max = False
for el in A:
if el <= N:
counters[el - 1] += 1
max_c = max(counters[el - 1], max_c)
last_was_max = False
elif not last_was_max:
counters = [max_c] * N
last_was_max = True
return counters
def solution2_1(N, A):
counters = [0] * N
last_was_max = False
for el in A:
if el <= N:
counters[el - 1] += 1
last_was_max = False
elif not last_was_max:
counters = [max(counters)] * N
last_was_max = True
return counters
I am not aware which implementation was used in the submission.
First, you’re wasting some time in your if conditions: there’s no need to check for the integer being greater or equal to 1, that’s a given from the exercise. Then, there’s no need to evaluate a second elif-statement, simply go for an else there. If the first condition is not met, the second will be met by definition of the exercise
Second, according to my testing, just calculating the maximum at the time it is needed is much faster than keeping track of it through all the runs. This is likely due to the fact that the max-operation will only occur very rarely, especially for large values of M and therefore you’re wasting time keeping track of stuff many times vs only calculating the maximum a few times during the run.
Addressing @Nearoo’s comment, it appears reassigning the array does in fact change the physical address, but according to some tests I ran reassigning is much faster than the for loop anyway.
def solution2(N, A):
counters = [0] * N
for el in A:
if el <= N:
counters[el - 1] += 1
else:
counters = [max(counters)] * N
return counters
This solution I came up with outperformed your solution by a solid factor of 2-3 on several test runs with different seed values. Here’s the code to reproduce:
import random
random.seed(101)
N = random.randint(1, 100000)
M = random.randint(1, 100000)
A = [random.randint(1, N + 1) for i in range(M)]
%timeit solution(N,A)
>>> 11.7 ms ± 805 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit solution2(N,A)
>>> 3.63 ms ± 169 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Here’s a 100% solution:
# MaxCounters
def solution(N, A):
count = [0] * N
max_count = 0
last_max = False
for val in A:
if val == N + 1 and last_max == False:
count = [max_count] * N
last_max = True
continue
if val <= N:
count[val - 1] += 1
max_count = max(count[val - 1], max_count)
last_max = False
return count
# MaxCounters
def solution(N, A):
max_counter = 0
list_counters = [0]*N
if N < min(A):
return list_counters
for i in A:
if i <= N:
list_counters[i-1] += 1
max_counter = max(list_counters[i-1], max_counter)
else:
list_counters = [max_counter]*N
return list_counters
I dont know Python.
But my kotlin solution even run faster when more MaxCounter at the extreme test.
Check the Report
▶extreme_large
all max_counter operations✔OK
- 0.280s OK
- 0.264s OK
I believe this test is about avoid the full update on the array during max process.
If you can avoid that you should able to get the 100%.
So try your best to stop a full array update no matter on which language.
I get a 88% at my first try, and I used an hour to getting this original new answer. After a max operation all counter has same value, so its meaning less to keep check each counter until the final max operation. I create the array AFTER I found the last Max Operation, so there has no array full update require at all
actually, this gives a 100% score:
def solution(N, A):
# write your code in Python 3.6
maxcount=0
counter=[maxcount]*N
can_be_updated = True
for J in A:
if(J<=N ):
counter[J-1]+=1
maxcount = max(maxcount,counter[J-1])
can_be_updated = True
else:
if(can_be_updated):
counter = [maxcount]*N
can_be_updated = False
return(counter)
pass
Actually, every solution that involves creating new list every time there is an update will fail on "large_random2" test. You have to cache two max values. One for global max (minimum for every value) and current max (current maximum value in whole array).
def solution(N, A):
global_max = 0
current_max = 0
result = [0] * N
canUpdate = True
for i in A:
if i <= N:
if result[i-1] <= global_max:
result[i-1] = global_max + 1
else:
result[i-1] += 1
current_max = max(current_max, result[i-1])
canUpdate = True
elif canUpdate:
canUpdate = False
global_max = current_max
# print(result)
for i, val in enumerate(result):
if val < global_max:
result[i] = global_max
return result
This is the code I am using for the Codility lesson: MaxCounters
def solution(N, A):
counters = [0] * N
max_c = 0
for el in A:
if el >= 1 and el <= N:
counters[el-1] += 1
max_c = max(counters[el-1], max_c)
elif el > N:
counters = [max_c] * N
return counters
Every test passes but the last one (“all max_counter operations”) times out after 7 seconds so the result is just 88% with time complexity of O(N+M).
Is it possible to improve the time complexity of the algorithm and get 100% test result using Python?
The MaxCounters task
You are given N counters, initially set to 0, and you have two possible operations on them:
- increase(X) − counter X is increased by 1,
- max counter − all counters are set to the maximum value of any counter.
A non-empty array A of M integers is given. This array represents consecutive operations:
- if A[K] = X, such that 1 ≤ X ≤ N, then operation K is increase(X),
- if A[K] = N + 1 then operation K is max counter.
Write an efficient algorithm for the following assumptions:
- N and M are integers within the range [1..100,000];
- each element of array A is an integer within the range [1..N + 1].
EDIT: following up on the discussion in the comments to this answer, tracking the last operation to avoid unnecessarily resetting the array in successive max_counter operations was the key to achieving the goal. Here’s what the different solutions (one keeping track of the max and the second calculating the max on demand) would look like implementing that change:
def solution(N, A):
counters = [0] * N
max_c = 0
last_was_max = False
for el in A:
if el <= N:
counters[el - 1] += 1
max_c = max(counters[el - 1], max_c)
last_was_max = False
elif not last_was_max:
counters = [max_c] * N
last_was_max = True
return counters
def solution2_1(N, A):
counters = [0] * N
last_was_max = False
for el in A:
if el <= N:
counters[el - 1] += 1
last_was_max = False
elif not last_was_max:
counters = [max(counters)] * N
last_was_max = True
return counters
I am not aware which implementation was used in the submission.
First, you’re wasting some time in your if conditions: there’s no need to check for the integer being greater or equal to 1, that’s a given from the exercise. Then, there’s no need to evaluate a second elif-statement, simply go for an else there. If the first condition is not met, the second will be met by definition of the exercise
Second, according to my testing, just calculating the maximum at the time it is needed is much faster than keeping track of it through all the runs. This is likely due to the fact that the max-operation will only occur very rarely, especially for large values of M and therefore you’re wasting time keeping track of stuff many times vs only calculating the maximum a few times during the run.
Addressing @Nearoo’s comment, it appears reassigning the array does in fact change the physical address, but according to some tests I ran reassigning is much faster than the for loop anyway.
def solution2(N, A):
counters = [0] * N
for el in A:
if el <= N:
counters[el - 1] += 1
else:
counters = [max(counters)] * N
return counters
This solution I came up with outperformed your solution by a solid factor of 2-3 on several test runs with different seed values. Here’s the code to reproduce:
import random
random.seed(101)
N = random.randint(1, 100000)
M = random.randint(1, 100000)
A = [random.randint(1, N + 1) for i in range(M)]
%timeit solution(N,A)
>>> 11.7 ms ± 805 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit solution2(N,A)
>>> 3.63 ms ± 169 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Here’s a 100% solution:
# MaxCounters
def solution(N, A):
count = [0] * N
max_count = 0
last_max = False
for val in A:
if val == N + 1 and last_max == False:
count = [max_count] * N
last_max = True
continue
if val <= N:
count[val - 1] += 1
max_count = max(count[val - 1], max_count)
last_max = False
return count
# MaxCounters
def solution(N, A):
max_counter = 0
list_counters = [0]*N
if N < min(A):
return list_counters
for i in A:
if i <= N:
list_counters[i-1] += 1
max_counter = max(list_counters[i-1], max_counter)
else:
list_counters = [max_counter]*N
return list_counters
I dont know Python.
But my kotlin solution even run faster when more MaxCounter at the extreme test.
Check the Report
▶extreme_large
all max_counter operations✔OK
- 0.280s OK
- 0.264s OK
I believe this test is about avoid the full update on the array during max process.
If you can avoid that you should able to get the 100%.
So try your best to stop a full array update no matter on which language.
I get a 88% at my first try, and I used an hour to getting this original new answer. After a max operation all counter has same value, so its meaning less to keep check each counter until the final max operation. I create the array AFTER I found the last Max Operation, so there has no array full update require at all
actually, this gives a 100% score:
def solution(N, A):
# write your code in Python 3.6
maxcount=0
counter=[maxcount]*N
can_be_updated = True
for J in A:
if(J<=N ):
counter[J-1]+=1
maxcount = max(maxcount,counter[J-1])
can_be_updated = True
else:
if(can_be_updated):
counter = [maxcount]*N
can_be_updated = False
return(counter)
pass
Actually, every solution that involves creating new list every time there is an update will fail on "large_random2" test. You have to cache two max values. One for global max (minimum for every value) and current max (current maximum value in whole array).
def solution(N, A):
global_max = 0
current_max = 0
result = [0] * N
canUpdate = True
for i in A:
if i <= N:
if result[i-1] <= global_max:
result[i-1] = global_max + 1
else:
result[i-1] += 1
current_max = max(current_max, result[i-1])
canUpdate = True
elif canUpdate:
canUpdate = False
global_max = current_max
# print(result)
for i, val in enumerate(result):
if val < global_max:
result[i] = global_max
return result