Slicing Python Lists using list of elements
Question:
I have the list qf_indexes and I want to slice it into many sublists using another list called used_qf_in:
qf_indexes = [ 4, 18, 32, 46, 60, 74, 88, 102, 116, 130, 144, 158, 172, 186, 200, 214, 228, 242, 256, 270]
used_qf_in= [4, 18, 186, 200, 228, 256]
such that each sublist starts at one element of used_qf_in (element j) and contains all the elements of qf_indexes up to, but not including, the next element of used_qf_in (element j+1).
I tried the following:
used_qf_ind = []
for j in range(len(used_qf_in)):
    print(qf_indexes[used_qf_in[j]: used_qf_in[j+1]])
    used_qf_ind.append(qf_indexes[used_qf_in[j]: used_qf_in[j+1]])
I expected to see:
qf_indexes = [[4],[18, 32, 46, 60, 74, 88, 102, 116, 130, 144, 158, 172],[186], [200, 214, 228, 242], [256, 270]]
But the result I got when printing inside the loop is:
[ 4, 18]
[ 32, 46, 60, 74, 88, 102, 116, 130, 144, 158, 172, 186, 200, 214, 228, 242, 256, 270]
[]
[]
[]
IndexError: list index out of range
Answers:
IIUC (and assuming that both lists are ordered, and all indices in used_qf_in are in qf_indexes):
a = [
    qf_indexes[qf_indexes.index(i):qf_indexes.index(j)]
    for i, j in zip(used_qf_in, used_qf_in[1:])
]
>>> a
[[4],
[18, 32, 46, 60, 74, 88, 102, 116, 130, 144, 158, 172],
[186],
[200, 214],
[228, 242]]
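Because zip(used_qf_in, used_qf_in[1:]) only pairs each boundary with the next one, the result above stops at [228, 242] and has no sublist starting at 256. A minimal sketch that also keeps the tail, under the same assumptions (both lists ordered, every boundary present in qf_indexes); the helper name split_at_values is made up for illustration:

```python
def split_at_values(values, boundaries):
    """Split `values` into sublists that each start at one of `boundaries`."""
    # Position of every boundary value; list.index raises ValueError if one is missing.
    starts = [values.index(b) for b in boundaries]
    # Pair each start with the next one; the last sublist runs to the end of the list.
    ends = starts[1:] + [len(values)]
    return [values[i:j] for i, j in zip(starts, ends)]


qf_indexes = [4, 18, 32, 46, 60, 74, 88, 102, 116, 130,
              144, 158, 172, 186, 200, 214, 228, 242, 256, 270]
used_qf_in = [4, 18, 186, 200, 228, 256]

print(split_at_values(qf_indexes, used_qf_in))
```

Anything in qf_indexes that comes before the first boundary is dropped, which matches the behaviour of the comprehension above.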
Edit: @SomeDude's clever idea to use bisect_left yields a much faster solution. Let's generate some data to measure how much faster:
import numpy as np

def gen(n):
    # strictly increasing values, then a random quarter of them as boundaries
    a = np.random.randint(1, 10, n).cumsum()
    b = np.random.choice(a, n // 4, replace=False)
    return a.tolist(), sorted(b)
# example
np.random.seed(0)
a, b = gen(16)
>>> a
[6, 7, 11, 15, 23, 27, 33, 36, 41, 49, 56, 65, 74, 76, 83, 91]
>>> b
[11, 56, 65, 74]
Now:
def f0(a, b):
    return [a[a.index(i):a.index(j)] for i, j in zip(b, b[1:])]

from bisect import bisect_left

def f_bisect(a, b):
    idxs = [bisect_left(a, ix) for ix in b]
    # the final sublist runs from the last boundary to the end of a (not b)
    return [a[i:j] for i, j in zip(idxs, idxs[1:])] + [a[idxs[-1]:]]
a, b = gen(10_000)
t0 = %timeit -o f0(a, b)
# 869 ms ± 1.29 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
t1 = %timeit -o f_bisect(a, b)
# 1.91 ms ± 326 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
>>> t0.best / t1.best
454.62
Over 400x faster for 10_000 elements and 2500 sublists!
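The %timeit lines above only run inside IPython or Jupyter. As a rough sketch, a similar comparison can be made with the standard library alone; the names split_index and split_bisect and the seed parameter are illustrative, both functions here include the trailing sublist, and the timings will differ from machine to machine:

```python
import random
import timeit
from bisect import bisect_left
from itertools import accumulate


def gen(n, seed=0):
    """Return a strictly increasing list of n ints and a sorted sample of n // 4 of them."""
    rng = random.Random(seed)
    a = list(accumulate(rng.randint(1, 9) for _ in range(n)))
    b = sorted(rng.sample(a, n // 4))
    return a, b


def split_index(a, b):
    # Linear search for every boundary: O(len(a)) per lookup.
    starts = [a.index(x) for x in b]
    return [a[i:j] for i, j in zip(starts, starts[1:] + [len(a)])]


def split_bisect(a, b):
    # Binary search for every boundary: O(log len(a)) per lookup; requires sorted a.
    starts = [bisect_left(a, x) for x in b]
    return [a[i:j] for i, j in zip(starts, starts[1:] + [len(a)])]


a, b = gen(2_000)
assert split_index(a, b) == split_bisect(a, b)  # both strategies must agree

t_index = timeit.timeit(lambda: split_index(a, b), number=5)
t_bisect = timeit.timeit(lambda: split_bisect(a, b), number=5)
print(f"index: {t_index:.4f}s  bisect: {t_bisect:.4f}s")
```

Checking that both functions return the same sublists before timing them guards against comparing a fast but wrong implementation.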
If your qf_indexes is sorted, you can use bisect_left for more efficient slicing.
from bisect import bisect_left
idxs = [bisect_left(qf_indexes, qf) for qf in used_qf_in]
out = [qf_indexes[i:j] for i, j in zip(idxs, idxs[1:])] + [qf_indexes[idxs[-1]:]]
print(out)
[[4],
[18, 32, 46, 60, 74, 88, 102, 116, 130, 144, 158, 172],
[186],
[200, 214],
[228, 242],
[256, 270]]
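One caveat: bisect_left returns an insertion point even when the value is not in the list, so a boundary missing from qf_indexes would silently cut the list at the next larger element. A hedged sketch that checks each hit, assuming qf_indexes is sorted; find_sorted is a hypothetical helper name, not part of the bisect module:

```python
from bisect import bisect_left


def find_sorted(sorted_values, target):
    """Return the position of `target` in `sorted_values`, or raise ValueError."""
    pos = bisect_left(sorted_values, target)
    # bisect_left only gives an insertion point, so confirm the value is really there.
    if pos == len(sorted_values) or sorted_values[pos] != target:
        raise ValueError(f"{target!r} is not in the list")
    return pos


qf_indexes = [4, 18, 32, 46, 60, 74, 88, 102, 116, 130,
              144, 158, 172, 186, 200, 214, 228, 242, 256, 270]
used_qf_in = [4, 18, 186, 200, 228, 256]

idxs = [find_sorted(qf_indexes, qf) for qf in used_qf_in]
# Close the last slice with len(qf_indexes) so the tail is kept.
out = [qf_indexes[i:j] for i, j in zip(idxs, idxs[1:] + [len(qf_indexes)])]
print(out)
```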
You can first find the indexes of the numbers with something like this:
indexes = []
cur_index = 0
for i, n in enumerate(qf_indexes):
    if n == used_qf_in[cur_index]:
        cur_index += 1
        indexes.append(i)
        if cur_index >= len(used_qf_in):
            break
or with a comprehension like
indexes = [qf_indexes.index(n) for n in used_qf_in]
and then create the sub-lists
final_list = [qf_indexes[indexes[x]: indexes[x+1]] for x in range(len(used_qf_in)-1)]
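As written, final_list ends at the last boundary, so the elements from 256 onward are left out. A possible variant, sketched under the assumption that the values in qf_indexes are unique, looks the positions up in a dictionary built in a single pass and closes the last sublist at the end of the list; positions and bounds are illustrative names:

```python
qf_indexes = [4, 18, 32, 46, 60, 74, 88, 102, 116, 130,
              144, 158, 172, 186, 200, 214, 228, 242, 256, 270]
used_qf_in = [4, 18, 186, 200, 228, 256]

# Map each value to its position in one pass (assumes the values are unique).
positions = {value: i for i, value in enumerate(qf_indexes)}
indexes = [positions[n] for n in used_qf_in]  # KeyError if a boundary is missing

# Append len(qf_indexes) as a closing bound so the last sublist is kept too.
bounds = indexes + [len(qf_indexes)]
final_list = [qf_indexes[bounds[x]:bounds[x + 1]] for x in range(len(indexes))]
print(final_list)
```

Unlike the bisect approach, the dictionary lookup does not need qf_indexes to be sorted, though the boundaries in used_qf_in should still appear in the same order as in qf_indexes.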
You can use a for loop to iterate over the values in the slicing list and use the index method of the sliced list to get the index of each value.
qf_indexes = [ 4, 18, 32, 46, 60, 74, 88, 102, 116, 130, 144, 158, 172, 186, 200, 214, 228, 242, 256, 270]
used_qf_in = [4, 18, 186, 200, 228, 256]
result = []
start = None
for value in used_qf_in:
    index = qf_indexes.index(value)
    # close the previous sublist just before the current boundary
    if start is not None:
        result.append(qf_indexes[start:index])
    start = index
# append the remaining elements
if start is not None:
    result.append(qf_indexes[start:])
print(result)
[[4], [18, 32, 46, 60, 74, 88, 102, 116, 130, 144, 158, 172], [186],
[200, 214], [228, 242], [256, 270]]