Python: Find in list
Question:
I use the following to check if item is in my_list:
if item in my_list:
    print("Desired item is in list")
Is "if item in my_list:" the most "pythonic" way of finding an item in a list?
EDIT FOR REOPENING: the question has been considered a duplicate, but I’m not entirely convinced: this question is roughly "what is the most Pythonic way to find an element in a list", and the first answer to it is really extensive, covering all the Python ways to do this.
The linked duplicate question and its corresponding answer, on the other hand, focus almost exclusively on the ‘in’ keyword in Python. I think that is really limiting compared to the current question.
And I think the answer to this current question is more relevant and elaborate than the answer to the proposed duplicate.
Answers:
As for your first question: "if item in my_list:" is perfectly fine and should work if item equals one of the elements inside my_list. The item must exactly match an item in the list. For instance, "abc" and "ABC" do not match. Floating point values in particular may suffer from inaccuracy. For instance, 1 - 1/3 != 2/3.
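To illustrate the floating point caveat, here is a minimal sketch of a tolerance-based membership check. The helper name contains_close and its tolerance defaults are assumptions made for this example; only math.isclose comes from the standard library.

```python
import math

def contains_close(values, target, rel_tol=1e-9, abs_tol=0.0):
    """Return True if any element of values is approximately equal to target.

    Hypothetical helper for illustration; the tolerances are arbitrary defaults.
    """
    return any(
        math.isclose(v, target, rel_tol=rel_tol, abs_tol=abs_tol) for v in values
    )

values = [1 - 1/3]

# Exact membership fails because the two floats differ in the last bit.
exact = 2/3 in values

# A tolerance-based check treats them as the same value.
approx = contains_close(values, 2/3)

print(exact, approx)
```

Whether a tolerance is appropriate, and how large it should be, depends on the data, so this should be read as one possible approach rather than a general replacement for in.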
As for your second question: There are actually several possible ways of "finding" things in lists.
Checking if something is inside
This is the use case you describe: Checking whether something is inside a list or not. As you know, you can use the in operator for that:
3 in [1, 2, 3] # => True
Filtering a collection
That is, finding all elements in a sequence that meet a certain condition. You can use a list comprehension or a generator expression for that:
matches = [x for x in lst if fulfills_some_condition(x)]
matches = (x for x in lst if x > 6)
The latter will return a generator, which you can imagine as a sort of lazy list that will only be built as soon as you iterate through it. By the way, the first one is exactly equivalent to
matches = filter(fulfills_some_condition, lst)
in Python 2. Here you can see higher-order functions at work. In Python 3, filter doesn’t return a list, but a lazy, generator-like iterator.
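The difference between the list comprehension and filter in Python 3 can be seen in a small sketch. The sample list and the predicate is_large are made up for this example.

```python
lst = [3, 8, 1, 12, 7]

def is_large(x):
    """Hypothetical predicate used only for this illustration."""
    return x > 6

# The list comprehension builds the full list immediately.
as_list = [x for x in lst if is_large(x)]

# In Python 3, filter() returns a lazy iterator; wrap it in list() to
# materialize the matches.
lazy = filter(is_large, lst)
from_filter = list(lazy)

# The iterator is now exhausted, so consuming it again yields nothing.
leftover = list(lazy)

print(as_list, from_filter, leftover)
```

If the matches need to be traversed more than once, storing them in a list is usually the simpler choice.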
Finding the first occurrence
If you only want the first thing that matches a condition (but you don’t know what it is yet), it’s fine to use a for loop (possibly using the else clause as well, which is not really well-known). You can also use
next(x for x in lst if ...)
which will return the first match or raise a StopIteration if none is found. Alternatively, you can use
next((x for x in lst if ...), [default value])
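The for loop with an else clause mentioned above might look like the following sketch. The function name first_match and the choice of None as the fallback are assumptions for this example.

```python
def first_match(lst, condition):
    """Return the first element satisfying condition, or None if there is none."""
    for x in lst:
        if condition(x):
            result = x
            break
    else:
        # The else branch of a for loop runs only when the loop finished
        # without hitting break, i.e. when nothing matched.
        result = None
    return result

print(first_match([1, 4, 9, 16], lambda x: x > 5))
print(first_match([1, 2], lambda x: x > 5))
```

The next(..., default) form expresses the same idea in one line; the loop version can be easier to extend when extra work has to happen on a match.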
Finding the location of an item
For lists, there’s also the index method that can sometimes be useful if you want to know where a certain element is in the list:
[1,2,3].index(2) # => 1
[1,2,3].index(4) # => ValueError
However, note that if you have duplicates, .index always returns the lowest index:
[1,2,3,2].index(2) # => 1
If there are duplicates and you want all the indexes then you can use enumerate() instead:
[i for i,x in enumerate([1,2,3,2]) if x==2] # => [1, 3]
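Because index raises ValueError for a missing element, callers often wrap it. Here is a minimal sketch; the wrapper name index_or_none is an assumption for this example and not part of the list API.

```python
def index_or_none(lst, value):
    """Return the lowest index of value in lst, or None if it is absent."""
    try:
        return lst.index(value)
    except ValueError:
        # list.index raises ValueError when the value is not present.
        return None

print(index_or_none([1, 2, 3, 2], 2))
print(index_or_none([1, 2, 3], 4))
```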
If you want to find one element or None, use the default argument of next; it won’t raise StopIteration if the item was not found in the list:
first_or_default = next((x for x in lst if ...), None)
Check that there is no additional or unwanted whitespace in the items of the list of strings.
That can be the reason why the items cannot be found.
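A small sketch of how stray whitespace can defeat a membership test, and one way to normalize the items first. The sample strings are invented for this example.

```python
raw_items = [" test", "ok\n", "ok1 "]

# Exact matching fails because of the leading space in " test".
found_raw = "test" in raw_items

# Stripping leading and trailing whitespace from each item first
# makes the comparison behave as intended.
cleaned = [s.strip() for s in raw_items]
found_cleaned = "test" in cleaned

print(found_raw, found_cleaned, cleaned)
```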
While the answer from Niklas B. is pretty comprehensive, when we want to find an item in a list it is sometimes useful to get its index:
next((i for i, x in enumerate(lst) if [condition on x]), [default value])
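As a concrete instance of the pattern above, the following sketch looks up the index of the first odd number. The sample list and the use of None as the default are assumptions for this example.

```python
lst = [4, 8, 15, 16, 23, 42]

# Index of the first element satisfying the condition, or None if no
# element does. None is used rather than -1 because -1 is itself a
# valid list index.
first_odd_index = next((i for i, x in enumerate(lst) if x % 2 == 1), None)
no_match_index = next((i for i, x in enumerate(lst) if x > 100), None)

print(first_odd_index, no_match_index)
```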
Finding the first occurrence
There’s a recipe for that in the itertools documentation:
def first_true(iterable, default=False, pred=None):
    """Returns the first true value in the iterable.
    If no true value is found, returns *default*
    If *pred* is not None, returns the first item
    for which pred(item) is true.
    """
    # first_true([a,b,c], x) --> a or b or c or x
    # first_true([a,b], x, f) --> a if f(a) else b if f(b) else x
    return next(filter(pred, iterable), default)
For example, the following code finds the first odd number in a list:
>>> first_true([2,3,4,5], None, lambda x: x%2==1)
3
You can copy/paste it or install more-itertools, where this recipe is already included:
pip3 install more-itertools
You may want to use one of two possible searches while working with a list of strings:
- if a list element is equal to an item ('example' is in ['one', 'example', 'two']):
if item in your_list: some_function_on_true()
'ex' in ['one', 'ex', 'two'] # => True
'ex_1' in ['one', 'ex', 'two'] # => False
- if a list element is like an item ('ex' is in ['one', 'example', 'two'] or 'example_1' is in ['one', 'example', 'two']):
matches = [el for el in your_list if item in el]
or
matches = [el for el in your_list if el in item]
then just check len(matches) or read them if needed.
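The two kinds of search can be compared side by side in a short sketch. The variable names mirror the ones used above; the sample values are invented. When only a yes/no answer is needed, any() is a possible alternative to building the list and checking its length.

```python
your_list = ["one", "example", "two"]

# Exact match: the item must equal a whole list element.
exact = "ex" in your_list

# Substring match: the item occurs inside a list element.
item = "ex"
contains_item = [el for el in your_list if item in el]

# Reverse substring match: a list element occurs inside the item.
longer_item = "example_1"
contained_in_item = [el for el in your_list if el in longer_item]

# any() stops at the first hit instead of collecting every match.
has_partial = any(item in el for el in your_list)

print(exact, contains_item, contained_in_item, has_partial)
```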
Another alternative: you can check if an item is in a list with if item in list:, but this is order O(n). If you are dealing with big lists of items and all you need to know is whether something is a member of your list, you can convert the list to a set first and take advantage of constant time set lookup:
my_set = set(my_list)
if item in my_set:  # much faster on average than using a list
    pass  # do something
Not going to be the correct solution in every case, but for some cases this might give you better performance.
Note that creating the set with set(my_list) is also O(n), so if you only need to do this once then it isn’t any faster to do it this way. If you need to repeatedly check membership though, then this will be O(1) for every lookup after that initial set creation.
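The repeated-lookup case could be sketched like this. The data and the query values are made up for the example, and the approach assumes the list elements are hashable, since sets cannot hold unhashable items such as lists or dicts.

```python
my_list = list(range(100_000))
queries = [5, 99_999, 100_000, -1]

# Pay the O(n) conversion cost once...
my_set = set(my_list)

# ...then each membership test is O(1) on average.
hits = [q for q in queries if q in my_set]

print(hits)
```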
Instead of using list.index(x), which returns the index of x if it is found in the list or raises a ValueError if x is not found, you could use list.count(x), which returns the number of occurrences of x in the list (confirming that x is indeed in the list) or 0 if x is absent. The nice thing about count() is that it doesn’t break your code or require you to catch an exception when x is not found.
Definition and Usage
The count() method returns the number of elements with the specified value.
Syntax
list.count(value)
example:
fruits = ['apple', 'banana', 'cherry']
x = fruits.count("cherry")
Question’s example:
item = someSortOfSelection()
if myList.count(item) >= 1:
    doMySpecialFunction(item)
If you are going to check whether a value exists in the collection only once, then using the ‘in’ operator is fine. However, if you are going to check more than once, then I recommend using the bisect module. Keep in mind that the data must be sorted for the bisect module to work, so you sort the data once and then you can use bisect. On my machine, using the bisect module is about 12 times faster than using the ‘in’ operator.
Here is an example of code using Python 3.8 and above syntax:
import bisect
from timeit import timeit
def bisect_search(container, value):
    return (
        (index := bisect.bisect_left(container, value)) < len(container)
        and container[index] == value
    )
data = list(range(1000))
# value to search
true_value = 666
false_value = 66666
# times to test
ttt = 1000
print(f"{bisect_search(data, true_value)=} {bisect_search(data, false_value)=}")
t1 = timeit(lambda: true_value in data, number=ttt)
t2 = timeit(lambda: bisect_search(data, true_value), number=ttt)
print("Performance:", f"{t1=:.4f}, {t2=:.4f}, diffs {t1/t2=:.2f}")
Output:
bisect_search(data, true_value)=True bisect_search(data, false_value)=False
Performance: t1=0.0220, t2=0.0019, diffs t1/t2=11.71
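Since the benchmark above uses data that is already sorted, the following sketch shows the sort-once step on unsorted input and what can go wrong if it is skipped. The function name sorted_contains and the sample data are assumptions for this example.

```python
import bisect

def sorted_contains(sorted_data, value):
    """Binary-search membership test; only valid when sorted_data is sorted."""
    i = bisect.bisect_left(sorted_data, value)
    return i < len(sorted_data) and sorted_data[i] == value

data = [42, 7, 19, 3, 88]

# Sort once, then reuse the sorted copy for every lookup.
sorted_data = sorted(data)

present = sorted_contains(sorted_data, 19)
absent = sorted_contains(sorted_data, 20)

# On the unsorted list, binary search can miss a value that is there.
missed = sorted_contains(data, 42)

print(present, absent, missed)
```

Sorting is O(n log n), so this tends to pay off only when many lookups follow; for hashable items a set is usually the simpler option.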
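import re
lstr = [1, 2, 3]
lstr = map(str, lstr)  # convert the items to strings so they can be matched
r = re.compile('^(3){1}')  # matches strings that start with "3"
results = list(filter(r.match, lstr))
print(results)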
You said that in several of your trials there may have been whitespace and line feeds interfering. That’s why I’m giving you this solution.
myList = [" test", "ok", "ok1"]
item = "test"  # someSortOfSelection()
if True in list(map(lambda el: item in el, myList)):
    doMySpecialFunction(item)
for_loop
def for_loop(l, target):
    for i in l:
        if i == target:
            return i
    return None
l = [1, 2, 3, 4, 5]
print(for_loop(l, 0))
print(for_loop(l, 1))
# None
# 1
next
def _next(l, target):
    return next((i for i in l if i == target), None)
l = [1, 2, 3, 4, 5]
print(_next(l, 0))
print(_next(l, 1))
# None
# 1
more_itertools
more_itertools.first_true(iterable, default=None, pred=None)
install
pip install more-itertools
or use it directly
def first_true(iterable, default=None, pred=None):
    return next(filter(pred, iterable), default)
from more_itertools import first_true
l = [1, 2, 3, 4, 5]
print(first_true(l, pred=lambda x: x == 0))
print(first_true(l, pred=lambda x: x == 1))
# None
# 1
Compare
method | time/s |
---|---|
for_loop | 2.77 |
next() | 3.64 |
more_itertools.first_true() | 3.82 or 10.86 |
import timeit
import more_itertools
def for_loop():
    for i in range(10000000):
        if i == 9999999:
            return i
    return None
def _next():
    return next((i for i in range(10000000) if i == 9999999), None)
def first_true():
    return more_itertools.first_true(range(10000000), pred=lambda x: x == 9999999)
def first_true_2():
    return more_itertools.first_true((i for i in range(10000000) if i == 9999999))
print(timeit.timeit(for_loop, number=10))
print(timeit.timeit(_next, number=10))
print(timeit.timeit(first_true, number=10))
print(timeit.timeit(first_true_2, number=10))
# 2.7730861
# 3.6409407000000003
# 10.869996399999998
# 3.8214487000000013
in works with a list() of dict()s too:
a = [ {"a":1}, {"b":1, "c":1} ]
b = {"c":1 , "b":1} # <-- No matter the order
if b in a:
    print("b is in a")
At least in Python 3.8.10, the order of the keys does not matter.
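When only part of a dict is known, the next-with-default pattern shown earlier in this thread can be combined with a key lookup. This is a sketch with invented records and key names.

```python
records = [
    {"id": 1, "name": "alpha"},
    {"id": 2, "name": "beta"},
    {"name": "no id here"},
]

# Whole-dict membership compares keys and values, not insertion order.
whole = {"name": "beta", "id": 2} in records

# Partial match: first record whose "id" equals 2, or None if there is none.
# dict.get avoids a KeyError for records without an "id" key.
by_id = next((r for r in records if r.get("id") == 2), None)
missing = next((r for r in records if r.get("id") == 99), None)

print(whole, by_id, missing)
```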