Access multiple elements of list knowing their index

Question:

I need to choose some elements from the given list, knowing their index. Let say I would like to create a new list, which contains element with index 1, 2, 5, from given list [-2, 1, 5, 3, 8, 5, 6]. What I did is:

a = [-2,1,5,3,8,5,6]
b = [1,2,5]
c = [ a[i] for i in b]

Is there any better way to do it? something like c = a[b] ?

Asked By: hoang tran

||

Answers:

Alternatives:

>>> map(a.__getitem__, b)
[1, 5, 5]

>>> import operator
>>> operator.itemgetter(*b)(a)
(1, 5, 5)
Answered By: falsetru

You can use operator.itemgetter:

from operator import itemgetter 
a = [-2, 1, 5, 3, 8, 5, 6]
b = [1, 2, 5]
print(itemgetter(*b)(a))
# Result:
(1, 5, 5)

Or you can use numpy:

import numpy as np
a = np.array([-2, 1, 5, 3, 8, 5, 6])
b = [1, 2, 5]
print(list(a[b]))
# Result:
[1, 5, 5]

But really, your current solution is fine. It’s probably the neatest out of all of them.

Answered By: TerryA

My answer does not use numpy or python collections.

One trivial way to find elements would be as follows:

a = [-2, 1, 5, 3, 8, 5, 6]
b = [1, 2, 5]
c = [i for i in a if i in b]

Drawback: This method may not work for larger lists. Using numpy is recommended for larger lists.

Answered By: Lavya 'Orion'

Basic and not very extensive testing comparing the execution time of the five supplied answers:

def numpyIndexValues(a, b):
    na = np.array(a)
    nb = np.array(b)
    out = list(na[nb])
    return out

def mapIndexValues(a, b):
    out = map(a.__getitem__, b)
    return list(out)

def getIndexValues(a, b):
    out = operator.itemgetter(*b)(a)
    return out

def pythonLoopOverlap(a, b):
    c = [ a[i] for i in b]
    return c

multipleListItemValues = lambda searchList, ind: [searchList[i] for i in ind]

using the following input:

a = range(0, 10000000)
b = range(500, 500000)

simple python loop was the quickest with lambda operation a close second, mapIndexValues and getIndexValues were consistently pretty similar with numpy method significantly slower after converting lists to numpy arrays.If data is already in numpy arrays the numpyIndexValues method with the numpy.array conversion removed is quickest.

numpyIndexValues -> time:1.38940598 (when converted the lists to numpy arrays)
numpyIndexValues -> time:0.0193445 (using numpy array instead of python list as input, and conversion code removed)
mapIndexValues -> time:0.06477512099999999
getIndexValues -> time:0.06391049500000001
multipleListItemValues -> time:0.043773591
pythonLoopOverlap -> time:0.043021754999999995
Answered By: Don Smythe

I’m sure this has already been considered: If the amount of indices in b is small and constant, one could just write the result like:

c = [a[b[0]]] + [a[b[1]]] + [a[b[2]]]

Or even simpler if the indices itself are constants…

c = [a[1]] + [a[2]] + [a[5]]

Or if there is a consecutive range of indices…

c = a[1:3] + [a[5]]
Answered By: ecp

Another solution could be via pandas Series:

import pandas as pd

a = pd.Series([-2, 1, 5, 3, 8, 5, 6])
b = [1, 2, 5]
c = a[b]

You can then convert c back to a list if you want:

c = list(c)
Answered By: BossaNova

Static indexes and small list?

Don’t forget that if the list is small and the indexes don’t change, as in your example, sometimes the best thing is to use sequence unpacking:

_,a1,a2,_,_,a3,_ = a

The performance is much better and you can also save one line of code:

 %timeit _,a1,b1,_,_,c1,_ = a
10000000 loops, best of 3: 154 ns per loop 
%timeit itemgetter(*b)(a)
1000000 loops, best of 3: 753 ns per loop
 %timeit [ a[i] for i in b]
1000000 loops, best of 3: 777 ns per loop
 %timeit map(a.__getitem__, b)
1000000 loops, best of 3: 1.42 µs per loop
Answered By: G M

Here’s a simpler way:

a = [-2,1,5,3,8,5,6]
b = [1,2,5]
c = [e for i, e in enumerate(a) if i in b]
Answered By: Max Sirwa

Kind of pythonic way:

c = [x for x in a if a.index(x) in b]
Answered By: Dmitry K. Chebanov

List comprehension is clearly the most immediate and easiest to remember – in addition to being quite pythonic!

In any case, among the proposed solutions, it is not the fastest (I have run my test on Windows using Python 3.8.3):

import timeit
from itertools import compress
import random
from operator import itemgetter
import pandas as pd

__N_TESTS__ = 10_000

vector = [str(x) for x in range(100)]
filter_indeces = sorted(random.sample(range(100), 10))
filter_boolean = random.choices([True, False], k=100)

# Different ways for selecting elements given indeces

# list comprehension
def f1(v, f):
   return [v[i] for i in filter_indeces]

# itemgetter
def f2(v, f):
   return itemgetter(*f)(v)

# using pandas.Series
# this is immensely slow
def f3(v, f):
   return list(pd.Series(v)[f])

# using map and __getitem__
def f4(v, f):
   return list(map(v.__getitem__, f))

# using enumerate!
def f5(v, f):
   return [x for i, x in enumerate(v) if i in f]

# using numpy array
def f6(v, f):
   return list(np.array(v)[f])

print("{:30s}:{:f} secs".format("List comprehension", timeit.timeit(lambda:f1(vector, filter_indeces), number=__N_TESTS__)))
print("{:30s}:{:f} secs".format("Operator.itemgetter", timeit.timeit(lambda:f2(vector, filter_indeces), number=__N_TESTS__)))
print("{:30s}:{:f} secs".format("Using Pandas series", timeit.timeit(lambda:f3(vector, filter_indeces), number=__N_TESTS__)))
print("{:30s}:{:f} secs".format("Using map and __getitem__", timeit.timeit(lambda: f4(vector, filter_indeces), number=__N_TESTS__)))
print("{:30s}:{:f} secs".format("Enumeration (Why anyway?)", timeit.timeit(lambda: f5(vector, filter_indeces), number=__N_TESTS__)))

My results are:

List comprehension :0.007113 secs
Operator.itemgetter :0.003247 secs
Using Pandas series :2.977286 secs
Using map and getitem :0.005029 secs
Enumeration (Why anyway?) :0.135156 secs
Numpy :0.157018 secs

Answered By: nikeros

The results for the latest pandas==1.4.2 as of June 2022 are as follows.

Note that simple slicing is no longer possible and benchmark results are faster.

import timeit
import pandas as pd
print(pd.__version__)
# 1.4.2

pd.Series([-2, 1, 5, 3, 8, 5, 6])[1, 2, 5]
# KeyError: 'key of type tuple not found and not a MultiIndex'

pd.Series([-2, 1, 5, 3, 8, 5, 6]).iloc[[1, 2, 5]].tolist()
# [1, 5, 5]

def extract_multiple_elements():
    return pd.Series([-2, 1, 5, 3, 8, 5, 6]).iloc[[1, 2, 5]].tolist()

__N_TESTS__ = 10_000
t1 = timeit.timeit(extract_multiple_elements, number=__N_TESTS__)
print(round(t1, 3), 'seconds')
# 1.035 seconds
Answered By: Keiku