Python: extract floats from a list of strings (AUD 31.99)

Question:

I used openpyxl to read a list of amounts from an Excel file and saved them in a list, but the items are strings like this:

['31.40 AUD', ' 32.99 AUD', '37.24 AUD']

I need to extract the float from each string so that I can save the values in a new list and then get their total.

Desired output:

[31.40, 32.99, 37.24]

I have already tried these:

import re

newList = re.findall(r"\d+\.\d+", tot[0])
print(newList)

Output:

['31.40']

But how can I apply this to every item in the list?

I am new to Python. This is just for some work I do, and I wanted to get the total using Python instead of Excel's find & replace option.
Thanks

Asked By: Anamul Choudhury


Answers:

Could you use str.split instead? I think it would be much simpler:

ls1 = ['32.46 AUD', '17.34 AUD']

myFloats = []
for aString in ls1:
    aFloat = float(aString.split()[0])
    myFloats.append(aFloat)
Answered By: Davidhall

You can use the map function:

inList = ['31.40 AUD', ' 32.99 AUD', '37.24 AUD']
output = list(map(lambda elem: float(elem.split()[0]), inList))
print(output)

Output:

[31.4, 32.99, 37.24]
Answered By: Vasilis G.

If you want to get the list of values with a regex, try

import re

tot = ['31.40 AUD', ' 32.99 AUD', '37.24 AUD']
# \d+\.\d+ matches digits, a literal dot, then digits
newList = [float(re.search(r'\d+\.\d+', fl).group(0)) for fl in tot]
print(newList)
# [31.4, 32.99, 37.24]

but using split seems to be the easier solution in this case

tot = ['31.40 AUD', ' 32.99 AUD', '37.24 AUD']
newList = [float(item.split()[0]) for item in tot]
print(newList)
# [31.4, 32.99, 37.24]

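If the second substring is always the same ("AUD"), you can also try

tot = ['31.40 AUD', ' 32.99 AUD', '37.24 AUD']
# rstrip(' AUD') strips any trailing ' ', 'A', 'U' or 'D' characters (a set, not a literal suffix)
newList = [float(item.rstrip(' AUD')) for item in tot]
print(newList)
# [31.4, 32.99, 37.24]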
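
One caveat with rstrip: its argument is treated as a set of characters to strip, not as a literal suffix. It works for this data only because none of those characters can end the number itself. Here is a minimal sketch of a suffix-based alternative; the helper name strip_suffix is made up for illustration (on Python 3.9+, str.removesuffix covers the same need).

```python
def strip_suffix(text, suffix):
    # Hypothetical helper: drop `suffix` only if `text` really ends with it.
    text = text.strip()
    return text[:-len(suffix)] if suffix and text.endswith(suffix) else text

tot = ['31.40 AUD', ' 32.99 AUD', '37.24 AUD']
new_list = [float(strip_suffix(item, 'AUD')) for item in tot]
print(new_list)

# rstrip removes trailing characters one by one from the given set,
# so it also eats letters that do not spell the suffix:
print('12.50 UDA'.rstrip(' AUD'))        # the letters are stripped anyway
print(strip_suffix('12.50 UDA', 'AUD'))  # unchanged, there is no 'AUD' suffix
```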
Answered By: Andersson

You should consider handling errors. Here is one way, for instance:

import re
import math

def float_from_string(str_):
    # Try to extract a floating-point number; return nan if none is found
    r = re.search(r'\d+\.\d+', str_)
    return float(r.group()) if r else math.nan

tot = ['31.40 AUD', ' 32.99 AUD', '37.24 AUD', ' nonumberhere AUD']
totfloat = [float_from_string(i) for i in tot]

print(totfloat)

Returns:

[31.4, 32.99, 37.24, nan]
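
Since the goal in the question is a total, note that a single nan in the list makes sum() return nan as well. Here is a minimal sketch of one way around that, under the assumption that unparseable entries should simply be skipped. The names values, valid and total are illustrative.

```python
import math
import re

def float_from_string(text):
    # Return the first decimal number found in text, or nan if there is none.
    match = re.search(r'\d+\.\d+', text)
    return float(match.group()) if match else math.nan

tot = ['31.40 AUD', ' 32.99 AUD', '37.24 AUD', ' nonumberhere AUD']
values = [float_from_string(item) for item in tot]

# Assumption: entries without a number are skipped rather than treated as errors.
valid = [v for v in values if not math.isnan(v)]

# fsum limits accumulated rounding error; round to cents for display.
total = round(math.fsum(valid), 2)
print(total)
```

For real accounting work, decimal.Decimal may be a better fit than float, since it avoids binary rounding of amounts like 0.10.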
Answered By: Anton vBR

Consider the following list:

l = ['31.40 AUD', ' 32.99 AUD', '37.24 AUD']

There are various ways to extract the floats. Below are five possible options.


Option 1

Using a regular expression from Python's re module in a list comprehension, as follows

import re

# digits, a literal dot, then digits
regex = re.compile(r'(\d+\.\d+)')
l = [float(regex.search(x).group(1)) for x in l]

[Out]: 

[31.4, 32.99, 37.24]
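
A pattern of the form "digits, dot, digits" needs a decimal point, so a whole amount such as '40 AUD' would not match and search would return None. Here is a hedged sketch of a slightly more permissive pattern, assuming amounts may be negative or lack decimals. The sample list is invented for illustration.

```python
import re

# Optional sign, digits, then an optional fractional part.
number = re.compile(r'[-+]?\d+(?:\.\d+)?')

samples = ['31.40 AUD', ' 40 AUD', '-5.5 AUD']  # illustrative data, not from the question
floats = [float(number.search(s).group()) for s in samples]
print(floats)
```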

Option 2

Using str.strip and str.split as follows

l = [float(x.strip().split(' ')[0]) for x in l]

[Out]: 

[31.4, 32.99, 37.24]

Option 3

Using str.split as follows

l = [float(x.split()[0]) for x in l]

[Out]: 

[31.4, 32.99, 37.24]

Option 4

One approach would be to remove the spaces and the currency code (AUD) using str.strip, which strips any of the given characters from both ends, as follows

l = [float(x.strip(' AUD')) for x in l]

[Out]: 

[31.4, 32.99, 37.24]

Assuming that one keeps a list of the possible currency suffixes (let's say AUD, USD, and EUR) and that the data only contains AUD, one can pass the first entry to str.strip as follows

hl = [' AUD', ' USD', ' EUR']

l = [float(x.strip(hl[0])) for x in l]

[Out]: 

[31.4, 32.99, 37.24]
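
If the data really did mix several currencies, stripping only hl[0] would not be enough. Here is a minimal sketch that splits each item into an amount and a currency code and totals per currency. The mixed sample data and the name totals are assumptions for illustration.

```python
from collections import defaultdict

mixed = ['31.40 AUD', ' 32.99 USD', '37.24 AUD', '10.00 EUR']  # invented sample

totals = defaultdict(float)
for item in mixed:
    # split() with no argument ignores leading and trailing whitespace
    amount, currency = item.split()
    totals[currency] += float(amount)

print(dict(totals))
```

This assumes every item has exactly two whitespace-separated parts; anything else raises ValueError at the unpacking step.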

Option 5

Another approach that works for this use case would be as follows

l = [float(x[:6]) for x in l]

[Out]: 

[31.4, 32.99, 37.24]

Note, however, that one might have to adjust the slice length or resort to a different method, depending on how many characters the numbers in one's list take up.

Answered By: Gonçalo Peres