Creating a new list from a subset of a list using indexes in Python

Question:

A list:

a = ['a', 'b', 'c', 3, 4, 'd', 6, 7, 8]

I want a new list built from a subset of a, namely a[0:2], a[4], and a[6:],

that is, I want the list ['a', 'b', 4, 6, 7, 8].

Asked By: user2783615


Answers:

Try new_list = a[0:2] + [a[4]] + a[6:].

Or more generally, something like this:

from itertools import chain
new_list = list(chain(a[0:2], [a[4]], a[6:]))

This works with other sequences as well, and may be faster for long lists because it avoids building the intermediate lists that repeated + creates.
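As a small illustration of the "other sequences" point, here is a sketch that assumes the question's data is stored in a tuple (the name t is made up for this example). Plain concatenation mixes a tuple with a list and fails, while chain only needs iterables:

```python
from itertools import chain

# Assumption for illustration: the same data as in the question, but as a tuple.
t = ('a', 'b', 'c', 3, 4, 'd', 6, 7, 8)

# chain accepts any iterables, so tuple slices and a one-element list mix freely.
new_list = list(chain(t[0:2], [t[4]], t[6:]))
print(new_list)  # ['a', 'b', 4, 6, 7, 8]

# The + version breaks here, because a tuple cannot be concatenated with a list.
error = None
try:
    t[0:2] + [t[4]] + t[6:]
except TypeError as exc:
    error = exc
print(type(error).__name__)  # TypeError
```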

Or you could do this:

def chain_elements_or_slices(*elements_or_slices):
    new_list = []
    for i in elements_or_slices:
        if isinstance(i, list):
            new_list.extend(i)
        else:
            new_list.append(i)
    return new_list

new_list = chain_elements_or_slices(a[0:2], a[4], a[6:])

But beware, this would lead to problems if some of the elements in your list were themselves lists.
To solve this, either use one of the previous solutions, or replace a[4] with a[4:5] (or more generally a[n] with a[n:n+1]).
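To make the pitfall concrete, here is a minimal sketch. The list named nested is hypothetical data in which the element at index 4 is itself a list:

```python
def chain_elements_or_slices(*elements_or_slices):
    new_list = []
    for i in elements_or_slices:
        if isinstance(i, list):
            new_list.extend(i)
        else:
            new_list.append(i)
    return new_list

# Hypothetical data: the element at index 4 is a list.
nested = ['a', 'b', 'c', 3, [4, 5], 'd', 6]

# nested[4] is a list, so the function treats it as a slice and flattens it.
flattened = chain_elements_or_slices(nested[0:2], nested[4], nested[6:])
print(flattened)  # ['a', 'b', 4, 5, 6]

# nested[4:5] is a one-element slice, so the inner list survives intact.
kept = chain_elements_or_slices(nested[0:2], nested[4:5], nested[6:])
print(kept)  # ['a', 'b', [4, 5], 6]
```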

Answered By: rlms

The following definition might be more efficient than the first solution proposed, because it allocates the result list once:

def new_list_from_intervals(original_list, *intervals):
    n = sum(j - i for i, j in intervals)
    new_list = [None] * n
    index = 0
    for i, j in intervals:
        for k in range(i, j):
            new_list[index] = original_list[k]
            index += 1

    return new_list

Then you can use it like this, with the list a from the question:

new_list = new_list_from_intervals(a, (0, 2), (4, 5), (6, len(a)))
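As a quick check, the sketch below runs the function on the list from the question and compares it with a shorter variant based on slicing. The name from_intervals_sliced is made up for this comparison and is not part of the answer above; which of the two is faster has not been measured here.

```python
def new_list_from_intervals(original_list, *intervals):
    # Preallocate the result, then copy each half-open interval [i, j).
    n = sum(j - i for i, j in intervals)
    new_list = [None] * n
    index = 0
    for i, j in intervals:
        for k in range(i, j):
            new_list[index] = original_list[k]
            index += 1
    return new_list

def from_intervals_sliced(original_list, *intervals):
    # Hypothetical alternative: let slicing and extend do the copying.
    new_list = []
    for start, stop in intervals:
        new_list.extend(original_list[start:stop])
    return new_list

a = ['a', 'b', 'c', 3, 4, 'd', 6, 7, 8]
intervals = [(0, 2), (4, 5), (6, len(a))]

result = new_list_from_intervals(a, *intervals)
print(result)  # ['a', 'b', 4, 6, 7, 8]
print(from_intervals_sliced(a, *intervals) == result)  # True
```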
Answered By: Mmmh mmh

Suppose

a = ['a', 'b', 'c', 3, 4, 'd', 6, 7, 8]

and the list of indexes is stored in

b = [0, 1, 4, 6, 7, 8]

then a simple one-line solution is

c = [a[i] for i in b]
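If the wanted positions are given as slices, as in the question, the index list does not have to be typed out by hand. A minimal sketch, where the name indices is chosen for this example:

```python
a = ['a', 'b', 'c', 3, 4, 'd', 6, 7, 8]

# Build the index list from the pieces a[0:2], a[4] and a[6:].
indices = [*range(0, 2), 4, *range(6, len(a))]
print(indices)  # [0, 1, 4, 6, 7, 8]

c = [a[i] for i in indices]
print(c)  # ['a', 'b', 4, 6, 7, 8]
```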
Answered By: G. Cohen

This thread is years old and I do not know whether the method existed at the time, but the fastest solution I found in 2022 is not mentioned in the answers so far.
My example list contains the integers from 1 to 6, and I want to retrieve 4 items from it.

I used the %timeit magic of Jupyter Notebook / IPython on a Windows 10 system with Python 3.7.4 installed.

I added a numpy approach just to see how fast it is. It might take more time with the mixed type collection from the original question.

The fastest solution appears to be itemgetter from the operator module (Standard Library). If it does not matter whether the return is a tuple or a list, use itemgetter as is or otherwise use a list conversion. Both cases are faster than the other solutions.

from itertools import chain
import numpy as np
from operator import itemgetter
# Example list and the indices of the items to retrieve
my_list = [1, 2, 3, 4, 5, 6]
item_indices = [2, 0, 1, 5]
# Time each approach
%timeit itemgetter(*item_indices)(my_list)
%timeit list(itemgetter(*item_indices)(my_list))
%timeit [my_list[item] for item in item_indices]
%timeit list(np.array(my_list)[item_indices])
%timeit list(chain(my_list[2:3], my_list[0:1], my_list[1:2], my_list[5:6]))

and the output is:

184 ns ± 14.5 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
251 ns ± 11.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
283 ns ± 85.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
4.3 µs ± 260 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
663 ns ± 49.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

I would be interested to know whether the fastest solution changes with the size of the list and the number of items extracted, but the sizes above are typical for my current project.
If someone finds the time to investigate this further, please let me know.
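One way to start such an investigation outside a notebook is the standard timeit module. The sketch below is only a starting point: the function name time_approaches, the list sizes, and the repeat count are arbitrary assumptions, and it compares just two of the approaches. It prints timings but makes no claim about which approach wins.

```python
import random
import timeit
from operator import itemgetter

def time_approaches(list_size, n_items, number=200):
    """Time two approaches for one list size; returns total seconds per approach."""
    data = list(range(list_size))
    # n_items should be at least 2: with a single index, itemgetter
    # returns the bare item instead of a tuple.
    indices = random.sample(range(list_size), n_items)

    def with_itemgetter():
        return list(itemgetter(*indices)(data))

    def with_comprehension():
        return [data[i] for i in indices]

    return {
        "itemgetter": timeit.timeit(with_itemgetter, number=number),
        "comprehension": timeit.timeit(with_comprehension, number=number),
    }

# Arbitrary sizes, chosen only to span a few orders of magnitude.
for size, k in [(6, 4), (1000, 100), (100000, 10000)]:
    print(size, k, time_approaches(size, k))
```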

Answered By: FEngineer