python – Find all indexes for the matching value in a list of ordered dict
Question:
I have a list of ordered dicts that looks like this:
[OrderedDict([('a', 1), ('b', 2)]), OrderedDict([('a', 1), ('b', 3)]), OrderedDict([('a', 2), ('b', 2)]), OrderedDict([('a', 3), ('b', 2)]), OrderedDict([('a', 1), ('b', 3)])]
I want to store in an array the indexes of the list elements whose 'a' value is 1.
So my result list would contain:
[0,1,4]
I have a straightforward loop that gets these values, but since my original list holds more than a million ordered dicts, it takes a long time to fetch them.
for ele in range(len(liso)):
    if liso[ele]['a'] == 1:
        giso.add(ele)
Can someone help me rewrite the above script using map or filter to optimize the query?
Answers:
List Comprehension: [i for i, x in enumerate(liso) if x['a'] == 1]
Filter:
If you’re using Python 2: filter(lambda i: liso[i]['a'] == 1, xrange(len(liso)))
If you’re using Python 3: list(filter(lambda i: liso[i]['a'] == 1, range(len(liso))))
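For example, running the list comprehension against the sample data from the question returns the expected indexes (a minimal, self-contained check):

```python
from collections import OrderedDict

liso = [
    OrderedDict([('a', 1), ('b', 2)]),
    OrderedDict([('a', 1), ('b', 3)]),
    OrderedDict([('a', 2), ('b', 2)]),
    OrderedDict([('a', 3), ('b', 2)]),
    OrderedDict([('a', 1), ('b', 3)]),
]

# Indexes of elements whose 'a' value is 1
giso = [i for i, x in enumerate(liso) if x['a'] == 1]
print(giso)  # [0, 1, 4]
```

Note that this is still a single linear scan; it avoids the per-lookup indexing of `liso[ele]` but cannot beat O(n) per query.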
Ducks would work well here. It builds a database-style index on Python objects.
pip install ducks
from collections import OrderedDict
from ducks import Dex
objects = [OrderedDict([('a', 1), ('b', 2)]), OrderedDict([('a', 1), ('b', 3)]), OrderedDict([('a', 2), ('b', 2)]), OrderedDict([('a', 3), ('b', 2)]), OrderedDict([('a', 1), ('b', 3)])]
# build index on objects
dex = Dex(objects, ['a', 'b'])
# get matching objects
dex[{'a': 1}] # gets objects where a == 1
dex[{'b': 2}] # gets objects where b == 2
dex[{'a': 1, 'b': 2}] # gets objects where a == 1 and b == 2
dex[{'a': {'<': 3}}] # gets objects where a < 3
On a million-item dataset, such indexed queries typically run 10 to 100x faster than filter / list comprehensions, which must scan every object on every query.
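If adding a dependency is not an option, the same idea can be sketched with the standard library: one pass builds a dict mapping each 'a' value to the list of positions where it occurs, after which every lookup is effectively O(1). (The name `index_a` is chosen here for illustration; it is not part of any library.)

```python
from collections import OrderedDict, defaultdict

liso = [
    OrderedDict([('a', 1), ('b', 2)]),
    OrderedDict([('a', 1), ('b', 3)]),
    OrderedDict([('a', 2), ('b', 2)]),
    OrderedDict([('a', 3), ('b', 2)]),
    OrderedDict([('a', 1), ('b', 3)]),
]

# One pass to build the index: 'a' value -> list of positions
index_a = defaultdict(list)
for i, d in enumerate(liso):
    index_a[d['a']].append(i)

print(index_a[1])  # [0, 1, 4]
```

Unlike the Dex queries above, this returns the indexes the question asked for rather than the objects themselves, at the cost of having to rebuild the index whenever the list changes.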