Fastest way to count frequencies of ordered list entries
Question:
I’m counting the occurrences of non-overlapping grouped subsequences of length i in a binary list. For example, given the list [0, 1, 0, 1, 1, 0, 0, 0, 1, 1], I want to count the occurrences of [0,0] (one), [0,1] (two), [1,0] (one), and [1,1] (one).
I have created a function that accomplishes this (see below), but I would like to know whether anything can be done to speed it up. It is already fairly quick compared with my previous versions: it currently takes about 0.03 seconds for a list of length 100,000 with i=2, and about 30 seconds for a list of length 100,000,000 with i=2, so the running time appears to grow linearly with the sequence length. However, my end goal is to do this for multiple values of i, with sequences of lengths near 15 billion. Assuming linearity holds, that would take about 4.2 hours for just i=2 (higher values of i take longer, as there are more unique subsequences to count).
I am unsure whether much more speed can be gained here (at least while still working in Python), but I am open to suggestions on how to accomplish this faster, with any method or language.
from collections import Counter

def subseq_counter(i, l):
    """counts the frequency of unique, non-overlapping, grouped subsequences of length i in a binary list l"""
    grouped = [str(l[k:k + i]) for k in range(0, len(l), i)]
    # groups terms into i-length subsequences
    if len(grouped[len(grouped) - 1]) != len(grouped[0]):
        grouped.pop(len(grouped) - 1)
    # removes the subsequence at the end if it is not of length i
    grouped_sort = sorted(grouped)
    # necessary to make sure the output frequencies correspond to the ascending binary order of the subsequences
    grouped_sort_values = Counter(grouped_sort).values()
    # counts the elements' frequency
    freq_list = list(grouped_sort_values)
    return freq_list
I know that a marginally faster execution time can be obtained by removing the grouped_sort line. However, I need to be able to access the frequencies in the ascending binary order of the subsequences (so for i=2 that would be [0,0], [0,1], [1,0], [1,1]), and I have not figured out a better way around this.
Answers:
Not really sure I understood that last part about the order. It seems unnecessary to build a giant list of subsequences. Use a generator to yield the subsequences to the counter – that way you also don’t have to fiddle with indices:
from collections import Counter

def count_subsequences(sequence, subseq_len=2):
    return Counter(subseq for subseq in zip(*[iter(sequence)] * subseq_len))

sequence = [0, 1, 0, 1, 1, 0, 0, 0, 1, 1]
counter = count_subsequences(sequence)
for subseq in (0, 0), (0, 1), (1, 0), (1, 1):
    print("{}: {}".format(subseq, counter[subseq]))
Output:
(0, 0): 1
(0, 1): 2
(1, 0): 1
(1, 1): 1
In this case, the function returns the counter object itself, and the calling code displays the results in some order.
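If the caller needs a plain list in ascending binary order, as the question asks, one possible way to get it from the counter is sketched below. The helper name ordered_frequencies is made up for illustration and is not part of the answer above; it relies only on Counter returning 0 for missing keys and on itertools.product yielding tuples in ascending order.

```python
from collections import Counter
from itertools import product

def ordered_frequencies(sequence, subseq_len=2):
    """Return counts for every possible group, in ascending binary order."""
    # Same grouping idiom as above: zip pulls subseq_len items at a time
    # from one shared iterator and drops an incomplete trailing group.
    counter = Counter(zip(*[iter(sequence)] * subseq_len))
    # product((0, 1), repeat=n) yields (0, 0), (0, 1), (1, 0), (1, 1), ...
    # and a Counter returns 0 for groups that never occur.
    return [counter[group] for group in product((0, 1), repeat=subseq_len)]

sequence = [0, 1, 0, 1, 1, 0, 0, 0, 1, 1]
print(ordered_frequencies(sequence))     # [1, 2, 1, 1]
print(ordered_frequencies(sequence, 3))  # [0, 1, 1, 0, 0, 0, 1, 0]
```

Unlike sorting the counter's items, this keeps a zero entry for every group that does not appear, so position k always corresponds to the binary value k.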
I don’t know if it is faster, but try:
import numpy as np
# create data
bits = np.random.randint(0, 2, 10000)
def subseq_counter(i: int, l: np.ndarray):
    """
    Counts the subsequences of length i in the array l
    """
    # the array l is reshaped as a matrix of i columns, and
    # matrix-multiplied by the binary weights (powers of 2)
    #            | [[2**2],
    #            |  [2**1],
    #            |  [2**0]]
    #            |____________________
    # [[1,0,1],  | 1*4 + 0*2 + 1*1 = 5
    #  [0,1,0],  | 0*4 + 1*2 + 0*1 = 2
    #  ...,      | ....
    #  [1,1,1]]  | 1*4 + 1*2 + 1*1 = 7
    iBits = l[:i*(l.size//i)].reshape(-1, i)@(2**np.arange(i-1,-1,-1).T)
    unique, counts = np.unique(iBits, return_counts=True)
    print(f"Counts for {i} bits:")
    for u, c in zip(unique, counts):
        print(f"{u:0{i}b}:{c}")
    return unique, counts
subseq_counter(2,bits)
subseq_counter(3,bits)
>>> Counts for 2 bits:
>>> 00:1264
>>> 01:1279
>>> 10:1237
>>> 11:1220
>>> Counts for 3 bits:
>>> 000:425
>>> 001:429
>>> 010:411
>>> 011:395
>>> 100:437
>>> 101:412
>>> 110:407
>>> 111:417
What it does is reshape the list into an array of n rows by i columns and convert each row to an integer by multiplying its bits by the matching powers of two (2**(i-1) down to 2**0) and summing, so that 00 becomes 0, 01 becomes 1, 10 becomes 2, and 11 becomes 3. The counting is then done with np.unique().
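To make the encoding step concrete, here is a small sketch on the example list from the question. The helper name group_values is made up for illustration, and np.bincount is swapped in for np.unique as one possible way to keep a slot for values that never occur; it is not what the answer above uses.

```python
import numpy as np

def group_values(i, bits):
    """Encode each non-overlapping group of i bits as one integer."""
    usable = bits[: i * (bits.size // i)]    # drop an incomplete trailing group
    weights = 2 ** np.arange(i - 1, -1, -1)  # e.g. [4, 2, 1] for i = 3
    return usable.reshape(-1, i) @ weights   # one integer per row

bits = np.array([0, 1, 0, 1, 1, 0, 0, 0, 1, 1])
values = group_values(2, bits)
print(values)                                 # [1 1 2 0 3]
# minlength reserves an entry for every possible value, even unseen ones.
print(np.bincount(values, minlength=2 ** 2))  # [1 2 1 1]
```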
This is a way to do it:
from collections import Counter
from itertools import product

def subseq_counter(i,l):
    freq_list = [0] * 2 ** i
    binaryTupToInt = {binTup:j for j, binTup in enumerate(product((0,1),repeat=i))}
    c = Counter(binaryTupToInt[tuple(l[k:k+i])] for k in range(0, len(l) // i * i, i))
    for k, v in c.items():
        freq_list[k] = v
    return freq_list

l = [0, 1, 0, 1, 1, 0, 0, 0, 1, 1]
i = 2
print(subseq_counter(i, l))
Output:
[1, 2, 1, 1]
Notes:
- Using the above code and changing i to 3 gives:
[0, 1, 1, 0, 0, 0, 1, 0]
This shows the frequency for all possible binary values of length 3 in ascending order, beginning with 0 (binary 0,0,0) and ending with 7 (binary 1,1,1). In other words, 0,0,0 occurs 0 times, 0,0,1 occurs 1 time, 0,1,0 occurs 1 time, 0,1,1 occurs 0 times, etc., through 1,1,1, which occurs 0 times.
- Using the code in the question with i changed to 3 gives:
[1, 1, 1]
This output seems hard to decipher, as it isn’t labeled so that we can easily see that the results with a non-zero value correspond to the 3-digit binary values 0,0,1, 0,1,0 and 1,1,0.
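One way to make either kind of output easier to read is to attach the binary label to each count. The sketch below is only an illustration (the helper name label_frequencies is hypothetical) and assumes the list has exactly 2**i entries, as the freq_list built in this answer does.

```python
def label_frequencies(freq_list):
    """Pair each count with its binary label; assumes len(freq_list) == 2**i."""
    width = (len(freq_list) - 1).bit_length()  # recovers the number of bits i
    return {format(value, f"0{width}b"): count
            for value, count in enumerate(freq_list)}

labeled = label_frequencies([0, 1, 1, 0, 0, 0, 1, 0])
# Show only the groups that actually occur.
print({k: v for k, v in labeled.items() if v})
# {'001': 1, '010': 1, '110': 1}
```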
UPDATE:
Here’s a benchmark of several approaches on an input list of length 55 million (with i set to 2), including OP’s, counting sort (this answer), numpy including list-to-ndarray conversion overhead, and numpy without the overhead:
foo_1 output:
[10000000, 15000000, 15000000, 15000000]
foo_2 output:
[10000000, 15000000, 15000000, 15000000]
foo_3 output:
[10000000 15000000 15000000 15000000]
foo_4 output:
[10000000 15000000 15000000 15000000]
Timeit results:
foo_1 (OP) ran in 32.20719700001064 seconds using 1 iterations
foo_2 (counting sort) ran in 17.91718759998912 seconds using 1 iterations
foo_3 (numpy with list-to-array conversion) ran in 9.713831000000937 seconds using 1 iterations
foo_4 (numpy) ran in 1.695262699999148 seconds using 1 iterations
The clear winner is numpy, though unless the calling program can easily be changed to use ndarrays, the required conversion slows things down by a factor of nearly 6 in this example (9.7 s versus 1.7 s).
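The conversion cost can be measured on its own. The sketch below compares np.array(l) with the np.frombuffer(bytes(l)) route that appears elsewhere in this thread; the function names via_array and via_bytes are made up for illustration, the test list is much smaller than the benchmark input, and no timings are claimed here since they depend on the machine.

```python
import numpy as np
from timeit import timeit

l = [0, 1, 0, 1, 1, 0, 0, 0, 1, 1] * 10_000

def via_array(l):
    # np.array has to inspect every Python int object in the list.
    return np.array(l)

def via_bytes(l):
    # bytes(l) requires every element to be in range(256), which holds for
    # bits; np.frombuffer then views those bytes without copying them
    # (the resulting array is read-only).
    return np.frombuffer(bytes(l), dtype=np.uint8)

assert np.array_equal(via_array(l), via_bytes(l))
print("np.array:     ", timeit(lambda: via_array(l), number=5))
print("np.frombuffer:", timeit(lambda: via_bytes(l), number=5))
```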
Benchmark including some new solutions from me:
For i=2:
2.9 s ± 0.0 s Kelly_NumPy
3.7 s ± 0.0 s Kelly_bytes_count
6.6 s ± 0.0 s Kelly_zip
7.8 s ± 0.1 s Colim_numpy
8.4 s ± 0.0 s Paul_genzip
8.6 s ± 0.0 s Kelly_bytes_split2
10.5 s ± 0.0 s Kelly_bytes_slices2
10.6 s ± 0.1 s Kelly_bytes_split1
16.1 s ± 0.0 s Kelly_bytes_slices1
20.9 s ± 0.1 s constantstranger
45.1 s ± 0.3 s original
For i=5:
2.3 s ± 0.0 s Kelly_NumPy
3.8 s ± 0.0 s Kelly_zip
4.5 s ± 0.0 s Paul_genzip
4.5 s ± 0.0 s Kelly_bytes_split2
5.2 s ± 0.0 s Kelly_bytes_split1
5.4 s ± 0.0 s Kelly_bytes_slices2
7.1 s ± 0.0 s Colim_numpy
7.2 s ± 0.0 s Kelly_bytes_slices1
9.3 s ± 0.0 s constantstranger
20.6 s ± 0.0 s Kelly_bytes_count
25.3 s ± 0.1 s original
This is for a list of length n=1e6, with times multiplied by 100 so that they roughly reflect your timings with length 1e8. I minimally modified the other solutions so they do what your original does, i.e., take a list of ints and return a list of ints in the correct order. One or two of my slower solutions only work if the length is a multiple of their block size; I didn’t bother making them work for all lengths since they’re slower anyway.
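As a reading aid for the fastest entry, Kelly_NumPy, here is its shift-and-OR packing step traced on the example list from the question with i=3. This is only a walk-through of that one step, not a replacement for the benchmarked function.

```python
import numpy as np

l = [0, 1, 0, 1, 1, 0, 0, 0, 1, 1]
i = 3
a = np.frombuffer(bytes(l), np.int8)
stop = a.size // i * i   # 9: ignore the incomplete trailing group
s = a[:stop:i]           # first bit of each group: [0 1 0]
for j in range(1, i):
    # Shift the bits collected so far left by one and append
    # the j-th bit of every group.
    s = (s << 1) | a[j:stop:i]
print(s)                 # [2 6 1], i.e. groups 010, 110, 001
```

Because the packed values are int8, this only fits groups of up to 7 bits. Note also that np.unique reports counts only for values that occur, so the returned list is shorter than 2**i whenever some group is absent from the input.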
Full code (Try it online!):
def Kelly_NumPy(i, l):
    a = np.frombuffer(bytes(l), np.int8)
    stop = a.size // i * i
    s = a[:stop:i]
    for j in range(1, i):
        s = (s << 1) | a[j:stop:i]
    return np.unique(s, return_counts=True)[1].tolist()

def Kelly_zip(i, l):
    ctr = Counter(zip(*[iter(l)]*i))
    return [v for k, v in sorted(ctr.items())]

def Kelly_bytes_slices1(i, l):
    a = bytes(l)
    slices = [a[j:j+i] for j in range(0, len(a)//i*i, i)]
    ctr = Counter(slices)
    return [v for k, v in sorted(ctr.items())]

def Kelly_bytes_slices2(i, l):
    a = bytes(l)
    ig = itemgetter(*(slice(j, j+i) for j in range(0, 1000*i, i)))
    ctr = Counter(chain.from_iterable(
        ig(a[k:k+1000*i])
        for k in range(0, len(l), 1000*i)
    ))
    return [v for k, v in sorted(ctr.items())]

def Kelly_bytes_count(i, l):
    n = len(l)
    a = bytes(l)
    b = bytearray([2]) * (n + n//i)
    for j in range(i):
        b[j+1::i+1] = a[j::i]
    a = b
    ss = [bytes([2])]
    for _ in range(i):
        ss = [s+b for s in ss for b in [bytes([0]), bytes([1])]]
    return [a.count(s) for s in ss]

def Kelly_bytes_split1(i, l):
    n = len(l) // i
    stop = n * i
    a = bytes(l)
    sep = bytearray([2])
    b = sep * (stop + n - 1)
    for j in range(i):
        b[j::i+1] = a[j::i]
    ctr = Counter(bytes(b).split(sep))
    return [v for k, v in sorted(ctr.items())]

def Kelly_bytes_split2(i, l):
    n = len(l) // i
    stop = n * i
    a = bytes(l)
    sep = bytearray([2])
    b = sep * (5000*i + 4999)
    ctr = Counter()
    for k in range(0, stop, 5000*i):
        for j in range(i):
            b[j::i+1] = a[k+j:k+5000*i+j:i]
        ctr.update(bytes(b).split(sep))
    return [v for k, v in sorted(ctr.items())]

def original(i,l):
    grouped = [str(l[k:k + i]) for k in range(0, len(l), i)]
    if len(grouped[len(grouped) - 1]) != len(grouped[0]):
        grouped.pop(len(grouped) - 1)
    grouped_sort = sorted(grouped)
    grouped_sort_values = Counter(grouped_sort).values()
    freq_list = list(grouped_sort_values)
    return freq_list

def Paul_genzip(subseq_len, sequence):
    ctr = Counter(subseq for subseq in zip(*[iter(sequence)] * subseq_len))
    return [v for k, v in sorted(ctr.items())]

def constantstranger(i,l):
    freq_list = [0] * 2 ** i
    binaryTupToInt = {binTup:j for j, binTup in enumerate(product((0,1),repeat=i))}
    c = Counter(binaryTupToInt[tuple(l[k:k+i])] for k in range(0, len(l) // i * i, i))
    for k, v in c.items():
        freq_list[k] = v
    return freq_list

def Colim_numpy(i: int, l):
    l = np.array(l)
    iBits = l[:i*(l.size//i)].reshape(-1, i)@(2**np.arange(i-1,-1,-1).T)
    unique, counts = np.unique(iBits, return_counts=True)
    return counts.tolist()
funcs = [
    original,
    Colim_numpy,
    Paul_genzip,
    constantstranger,
    Kelly_NumPy,
    Kelly_bytes_count,
    Kelly_zip,
    Kelly_bytes_slices1,
    Kelly_bytes_slices2,
    Kelly_bytes_split1,
    Kelly_bytes_split2,
]

from time import time
import os
from collections import Counter
from itertools import repeat, chain, product
import numpy as np
from operator import itemgetter
from statistics import mean, stdev

n = 10**6
i = 2

times = {f: [] for f in funcs}
def stats(f):
    ts = [t/n*1e8 for t in sorted(times[f])[:3]]
    return f'{mean(ts):4.1f} s ± {stdev(ts):3.1f} s '

for _ in range(10):
    l = [b % 2 for b in os.urandom(n)]
    expect = None
    for f in funcs:
        t = time()
        result = f(i, l)
        t = time() - t
        times[f].append(t)
        if expect is None:
            expect = result
        else:
            assert result == expect

for f in sorted(funcs, key=stats):
    print(stats(f), f.__name__,)
This is much faster. It uses Kelly’s idea of using numpy.frombuffer instead of converting the list to a NumPy array, and uses pandas to count unique values, which is faster than numpy.unique for more than 100,000 values:
import numpy as np
import pandas as pd
def subseq_counter(i: int, l):
    l = np.frombuffer(bytes(l), np.int8)
    # int8 weights keep the arrays small, but 2**7 does not fit in int8,
    # so this version is only valid for i <= 7
    iBits = l[:i*(l.size//i)].reshape(-1, i)@(2 **np.arange(i-1, -1, -1).T).astype(np.int8)
    # bug fix: when there is not enough data (highly probable for large i),
    # iBits does not contain every possible value, so returning the unique
    # values as a list may lose information
    answer = [0]*2**i  # empty counter including all possible values
    if len(iBits) > 100000:
        # separate loop names so the parameter i is not shadowed
        for value, count in pd.Series(iBits).value_counts().items():
            answer[value] = count
    else:
        unique, counts = np.unique(iBits, return_counts=True)
        for value, count in zip(unique, counts):
            answer[value] = count
    return answer