iterator function does not iterate over all elements
Question:
I have two functions:
# function to get number of wanted atom
def atom_number_grabber(sum_formula, wanted_atom):
    match = re.match(r"([A-Z][a-z]*)([0-9]*)", sum_formula, re.I)
    if match:
        items = match.groups()
    if items[0] == wanted_atom:
        atom_number = items[1]
        if not atom_number:
            atom_number = "1"
        return atom_number
    else:
        pass
and
# function to iterate over all elements
def iterator(sum_formula_list, atom_number_grabber, wanted_atom):
    for sum_formula in sum_formula_list:
        return atom_number_grabber(sum_formula, wanted_atom)
However, when I use my iterator function, it does not iterate over all of the elements in my list:
test_list = ["C25", "H32", "O8"]
iterator(sum_formula_list = test_list, atom_number_grabber = atom_number_grabber, wanted_atom = "O")
output:
desired output:
8
To my surprise, the function only iterates over the first element; so if I change my wanted_atom to "C", the code works properly:
iterator(sum_formula_list = test_list, atom_number_grabber = atom_number_grabber, wanted_atom = "C")
output
25
Answers:
When you have an unconditional return in a for loop, that loop will only iterate once, and then exit.
Given that your function is called iterator, maybe you want to return … an iterator. In that case you could use yield instead of return. Make it conditional so that you only yield when you have a result. Like so:
def iterator(sum_formula_list, atom_number_grabber, wanted_atom):
    for sum_formula in sum_formula_list:
        result = atom_number_grabber(sum_formula, wanted_atom)
        if result:
            yield result
Now the main program can consume that iterator, for instance with the * unpacking operator:
print(*iterator(sum_formula_list = test_list, atom_number_grabber = atom_number_grabber, wanted_atom = "O"))
Output: 8
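To make the difference concrete, here is a minimal, self-contained sketch that contrasts the unconditional return with the conditional yield. The name first_only is hypothetical and only labels the original behaviour, and atom_number_grabber is written here with the second if nested inside the first (an assumption about what was intended). Besides *, a generator can also be consumed with list() or with next() and a default value.

```python
import re

def atom_number_grabber(sum_formula, wanted_atom):
    # Same logic as in the question, but with the second `if` nested inside
    # `if match:` so that `items` is always defined when it is used.
    match = re.match(r"([A-Z][a-z]*)([0-9]*)", sum_formula, re.I)
    if match:
        items = match.groups()
        if items[0] == wanted_atom:
            return items[1] or "1"
    return None

def first_only(sum_formula_list, wanted_atom):
    # Unconditional return: the function exits during the first iteration.
    for sum_formula in sum_formula_list:
        return atom_number_grabber(sum_formula, wanted_atom)

def iterator(sum_formula_list, wanted_atom):
    # Conditional yield: every element is visited, only hits are produced.
    for sum_formula in sum_formula_list:
        result = atom_number_grabber(sum_formula, wanted_atom)
        if result:
            yield result

test_list = ["C25", "H32", "O8"]

print(first_only(test_list, "O"))            # None: only "C25" was checked
print(list(iterator(test_list, "O")))        # ['8']
print(next(iterator(test_list, "N"), None))  # None: no match, so the default is returned
```

Note that nothing runs inside a generator function until it is consumed, so calling iterator(...) on its own only creates the generator object.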
Other comments
- The indentation is wrong in your first function: the second if might be executed when items is not defined. This second if block should be inside the block of the first if.
- else: pass is useless.
- As you use the re.I flag, there is no use in mixing upper and lower case with [A-Z][a-z]*. Just do [A-Z]+ then.
- You can access groups through indexing on the match object. There is no need to retrieve the groups. Just be aware that match[0] is the whole match, and match[1] is the first group.
- if not atom_number can be done with a logical or operator, like return match[2] or "1"
- There is no need to pass atom_number_grabber as argument: it can be used directly.
- It seems more practical to have the first function only take care of extracting the parts, while the second function would do the filtering on the wanted atom.
- The second function can use comprehension syntax.
Here is how it could be done:
import re

# function to split string into atom name and number
def atom_parts(sum_formula):
    match = re.match(r"([A-Z]*)([0-9]*)", sum_formula, re.I)
    return match[1], int("0" + match[2]) or 1

# function to iterate over all elements
def iterator(sum_formula_list, wanted_atom):
    return (number for name, number in map(atom_parts, sum_formula_list)
            if name == wanted_atom)

test_list = ["C25", "H32", "O8"]
print(*iterator(sum_formula_list = test_list, wanted_atom = "O")) # 8
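The two small tricks used above (indexing the match object, and falling back to 1 with or) can be checked in isolation. This is just an illustrative sketch; the variable names with_digits and no_digits are made up for the example, and indexing a match object like this needs Python 3.6 or later.

```python
import re

pattern = r"([A-Z]+)([0-9]*)"

with_digits = re.match(pattern, "O8", re.I)
print(with_digits[0])  # O8  (the whole match)
print(with_digits[1])  # O   (first group: atom name)
print(with_digits[2])  # 8   (second group: digits, still a string)

no_digits = re.match(pattern, "O", re.I)
print(repr(no_digits[2]))  # ''  (the group matched, but it is empty)

# An empty string is falsy, so `or` supplies the default.
print(no_digits[2] or "1")  # 1

# Prefixing "0" keeps int() from failing on an empty string;
# the resulting 0 is falsy, so `or 1` again supplies the default.
print(int("0" + no_digits[2]) or 1)    # 1
print(int("0" + with_digits[2]) or 1)  # 8
```

One side effect of the int(...) or 1 form is that an explicit count of 0 (such as "O0") would also come out as 1, which is probably harmless for sum formulas.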