Python: extract floats from a list of strings (e.g. '31.99 AUD')
Question:
I used openpyxl to read a list of amounts from an Excel file and saved it in a list, but the items are strings like this:
['31.40 AUD', ' 32.99 AUD', '37.24 AUD']
I need to extract the float from each string so that I can store the values in a new list and compute their total.
Desired output:
[31.40, 32.99, 37.24]
I have already tried this:
import re
newList = re.findall(r"\d+\.\d+", tot[0])
print(newList)
Output:
['31.40']
But how can I apply this to all the items in the list?
I am new to Python. This is just for some work I do; I wanted to compute the total with Python instead of using Excel's find & replace option.
Thanks.
Answers:
Is it possible to use a string split instead? I think it would be much simpler:
ls1 = ['32.46 AUD', '17.34 AUD']
myFloats = []
for aString in ls1:
    # split() with no argument splits on any whitespace and ignores leading spaces
    aFloat = float(aString.split()[0])
    myFloats.append(aFloat)
You can use the map function:
inList = ['31.40 AUD', ' 32.99 AUD', '37.24 AUD']
output = list(map(lambda elem: float(elem.split()[0]), inList))
print(output)
Output:
[31.4, 32.99, 37.24]
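Since the stated goal is the total, the resulting list can be passed to the built-in sum. A minimal sketch, where the names in_list, amounts and total are chosen here only for illustration:

```python
in_list = ['31.40 AUD', ' 32.99 AUD', '37.24 AUD']

# Take the first whitespace-separated token of each string and convert it
amounts = [float(elem.split()[0]) for elem in in_list]
total = sum(amounts)

# Format to two decimals, since binary floats may carry tiny rounding errors
print(f"{total:.2f}")  # 101.63
```

Formatting the result for display, rather than comparing the raw float, sidesteps the usual floating-point representation noise.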
If you want to get the list of values with a regex, try
import re
tot = ['31.40 AUD', ' 32.99 AUD', '37.24 AUD']
newList = [float(re.search(r'\d+\.\d+', fl).group(0)) for fl in tot]
print(newList)
# [31.4, 32.99, 37.24]
but using split seems to be the easier solution in this case
tot = ['31.40 AUD', ' 32.99 AUD', '37.24 AUD']
newList = [float(item.split()[0]) for item in tot]
print(newList)
# [31.4, 32.99, 37.24]
If the second substring is always the same ("AUD"), you can also try
tot = ['31.40 AUD', ' 32.99 AUD', '37.24 AUD']
newList = [float(item.rstrip(' AUD')) for item in tot]
print(newList)
# [31.4, 32.99, 37.24]
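One caveat worth knowing: rstrip treats its argument as a set of characters to remove, not as a literal suffix. It works for the data above because every amount ends in a digit, but it can mangle strings with a different currency code. A possible alternative, assuming Python 3.9 or later for str.removesuffix, with strip_currency being a hypothetical helper name:

```python
def strip_currency(text, code="AUD"):
    """Remove a literal trailing currency code (hypothetical helper)."""
    return text.strip().removesuffix(code).strip()

tot = ['31.40 AUD', ' 32.99 AUD', '37.24 AUD']
new_list = [float(strip_currency(item)) for item in tot]
print(new_list)  # [31.4, 32.99, 37.24]

# rstrip removes any trailing ' ', 'A', 'U' or 'D', so a USD amount is damaged
print('31.40 USD'.rstrip(' AUD'))    # 31.40 US
# removesuffix leaves the string alone when the suffix does not match
print(strip_currency('31.40 USD'))   # 31.40 USD
```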
You should consider handling errors. Here is one way, for instance:
import re
import math

def float_from_string(str_):
    # Try to extract a floating-point number; return nan if none is found
    r = re.search(r'\d+\.\d+', str_)
    return float(r.group()) if r else math.nan

tot = ['31.40 AUD', ' 32.99 AUD', '37.24 AUD', ' nonumberhere AUD']
totfloat = [float_from_string(i) for i in tot]
print(totfloat)
Returns:
[31.4, 32.99, 37.24, nan]
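Keep in mind that a single nan makes the whole sum nan, so the placeholders probably need to be filtered out before totalling. A rough sketch, assuming a list shaped like the output above (the names valid and skipped are illustrative):

```python
import math

totfloat = [31.4, 32.99, 37.24, math.nan]

# sum() propagates nan, so the unfiltered total is not usable
print(math.isnan(sum(totfloat)))  # True

# Keep only the entries that were parsed successfully
valid = [x for x in totfloat if not math.isnan(x)]
skipped = len(totfloat) - len(valid)

print(f"{sum(valid):.2f}")  # 101.63
print(skipped)              # 1
```

Reporting how many entries were skipped makes it easier to notice when the source spreadsheet contains unexpected values.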
Considering that the list is as follows
l = ['31.40 AUD', ' 32.99 AUD', '37.24 AUD']
There are various ways to extract the floats. Below are five possible options.
Option 1
Using a regular expression with Python’s re module and a list comprehension, as follows
import re
regex = re.compile(r'(\d+\.\d+)')
l = [float(regex.search(x).group(1)) for x in l]
[Out]:
[31.4, 32.99, 37.24]
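Note that a pattern of the form digits, dot, digits only matches amounts that contain a decimal point. If some cells might hold whole amounts such as '40 AUD' (an assumption, since the sample data has none), a slightly more permissive pattern could look like this sketch:

```python
import re

# Digits, optionally followed by a dot and more digits
NUMBER = re.compile(r'\d+(?:\.\d+)?')

values = ['31.40 AUD', ' 32.99 AUD', '40 AUD']  # '40 AUD' is a made-up example
floats = [float(NUMBER.search(x).group()) for x in values]
print(floats)  # [31.4, 32.99, 40.0]
```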
Option 2
Using str.strip and str.split as follows
l = [float(x.strip().split(' ')[0]) for x in l]
[Out]:
[31.4, 32.99, 37.24]
Option 3
Using str.split as follows
l = [float(x.split()[0]) for x in l]
[Out]:
[31.4, 32.99, 37.24]
Option 4
One approach would be to remove the space and the currency (AUD) with str.strip, as follows
l = [float(x.strip(' AUD')) for x in l]
[Out]:
[31.4, 32.99, 37.24]
Assuming that one has a list with the various currencies (let’s say AUD, USD, and EUR), and since one’s list only has AUD, one can use str.strip with the first entry, as follows
hl = [' AUD', ' USD', ' EUR']
l = [float(x.strip(hl[0])) for x in l]
[Out]:
[31.4, 32.99, 37.24]
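If the list ever did contain more than one of those currencies, indexing a single entry of hl would no longer be enough. A hedged sketch of one way to handle that, where to_float and the mixed sample list are hypothetical and not part of the original data:

```python
hl = [' AUD', ' USD', ' EUR']
mixed = ['31.40 AUD', ' 32.99 USD', '37.24 EUR']  # made-up mixed-currency data

def to_float(text, suffixes=hl):
    """Drop whichever known currency suffix is present, then convert."""
    text = text.strip()
    for suffix in suffixes:
        if text.endswith(suffix):
            # Cut off exactly the matched suffix, not a set of characters
            return float(text[:-len(suffix)])
    return float(text)  # no known suffix: try the text as-is

print([to_float(x) for x in mixed])  # [31.4, 32.99, 37.24]
```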
Option 5
Another approach that works for this use case is to slice the first six characters of each string, as follows
l = [float(x[:6]) for x in l]
[Out]:
[31.4, 32.99, 37.24]
Note, however, that the slice length depends on how many characters each amount occupies, so one might have to adjust the number or resort to a different method, depending on the floats in the strings on one’s list.
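To illustrate why fixed-width slicing is fragile, here is a small sketch with made-up amounts of different lengths. A longer amount is silently truncated, and a shorter one raises an error, whereas splitting on the space handles both:

```python
samples = ['31.40 AUD', '1131.45 AUD', '5.00 AUD']  # hypothetical amounts

# Slicing: the second value silently loses its last digit
print(float(samples[1][:6]))  # 1131.4

# Slicing: the third value includes part of the currency and fails
try:
    float(samples[2][:6])     # '5.00 A'
except ValueError as exc:
    print('slice failed:', exc)

# Partitioning on the first space works regardless of the amount's length
parsed = [float(s.strip().partition(' ')[0]) for s in samples]
print(parsed)  # [31.4, 1131.45, 5.0]
```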