How to get the real number after a string in a file

Question:

I have files that contain both strings and floats, and I want to find the float that follows a specific string. Any help writing a function that reads the file, looks for that string, and returns the float after it would be much appreciated.

Thanks

An example of a file is

lines = """aaaaaaaaaaaaaaa  bbbbbbbbbbbbbbb  cccccccccc
qq vvv rrr ssssa 22.6
zzzzx bbbb 12.0
xxxxxxxxxx -1.099
zzzz bbb nnn 33.5"""

Here is what I have tried:

import re

lines = """aaaaaaaaaaaaaaa  bbbbbbbbbbbbbbb  cccccccccc
qq vvv rrr ssssa 22.6
zzzzx bbbb 12.0
xxxxxxxxxx -1.099
zzzz bbb nnn 33.5"""

str_to_search = 'xxxxxxxxxx'
num = re.findall(r'^' + str_to_search + r' (\d+\.\d+)', lines, flags=re.M)
print(num)

This works only when the number has no sign: if the number after the string ‘xxxxxxxxxx’ is 1.099 rather than -1.099, it is found. How can I generalize the pattern so that it matches both positive numbers (no sign) and negative numbers (a leading minus sign)?

Asked By: AbuStack

Answers:

You can use regex

(-?\d+\.?\d*)

import re

lines = """aaaaaaaaaaaaaaa  bbbbbbbbbbbbbbb  cccccccccc
qq vvv rrr ssssa 22.6
zzzzx bbbb 12.0
xxxxxxxxxx -1.099
zzzz bbb nnn 33.5
xxxxxxxxxx 1.099"""

str_to_search = "xxxxxxxxxx"
num = re.findall(fr"(?m)^{str_to_search}\s+(-?\d+\.?\d*)", lines)
print(num)

Prints:

['-1.099', '1.099']
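
The question asks for a function rather than a one-off call, so here is a minimal sketch that wraps the same pattern. The name `floats_after` and the shortened `sample` text are only illustrative. It assumes the search string sits at the start of a line, and it passes the string through `re.escape` so that characters such as `.` or `(` in it are matched literally.

```python
import re

def floats_after(text, label):
    """Return every number that follows `label` at the start of a line."""
    # re.escape keeps regex metacharacters in the label literal.
    pattern = rf"^{re.escape(label)}\s+(-?\d+\.?\d*)"
    # re.M makes ^ match at the start of every line, not just the first.
    return [float(m) for m in re.findall(pattern, text, flags=re.M)]

sample = """qq vvv rrr ssssa 22.6
xxxxxxxxxx -1.099
zzzz bbb nnn 33.5
xxxxxxxxxx 1.099"""

print(floats_after(sample, "xxxxxxxxxx"))  # [-1.099, 1.099]
```

To run it on a real file, read the contents first, for example with `open(path).read()`, and pass the resulting string as `text`.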
Answered By: Andrej Kesely

You can change the regex to following:

num = re.findall(r'^' + str_to_search + r' (-?\d+\.?\d*)', lines, flags=re.M)
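
To see what the changed number pattern accepts compared with the one in the question, here is a small check. The helper `accepted` and the candidate strings are made up for illustration. Note that the new pattern also accepts integers and a trailing dot, but not a leading `+` or a number that starts with the dot.

```python
import re

old = re.compile(r"\d+\.\d+")     # digits, a literal dot, digits
new = re.compile(r"-?\d+\.?\d*")  # optional minus, digits, optional dot, optional digits

def accepted(pattern, candidates):
    # fullmatch succeeds only if the whole string fits the pattern.
    return [s for s in candidates if pattern.fullmatch(s)]

candidates = ["1.099", "-1.099", "12", "5.", "+3.2", ".5"]

print(accepted(old, candidates))  # ['1.099']
print(accepted(new, candidates))  # ['1.099', '-1.099', '12', '5.']
```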
Answered By: Abinashbunty

I would just split the entire file content at every space. This gives a list of all the strings and floats. Then use list.index() to find the index of the string you are searching for, and wrap that in try/except so that your code won't stop if the string is not in the contents. Then just read the next element and try to convert it to a float.
Code:

lines = """aaaaaaaaaaaaaaa  bbbbbbbbbbbbbbb  cccccccccc
qq vvv rrr ssssa 22.6
zzzzx bbbb 12.0
xxxxxxxxxx -1.099
zzzz bbb nnn 33.5"""

lines = lines.replace("\n", " ").split(" ") # replace the newlines with spaces to split them as well

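try:
    float_index = lines.index("xxxxxxxxxx") + 1 # Get the element after the string you are trying to find

    num = float(lines[float_index])
except (ValueError, IndexError) as e:
    # ValueError: string not found, or the next element is not a number
    # IndexError: the string is the last element, so nothing follows it
    print(e)
else:
    print(num) # only reached when the conversion succeeded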

If you are looking for a solution in regex, use Andrej Kesely’s answer.
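
The same idea can be written as a function. This is a sketch with the hypothetical name `float_after`. It calls `str.split()` with no argument, which splits on any run of whitespace (spaces, tabs, newlines), so the newline replacement is not needed and double spaces do not produce empty elements. It returns `None` when the string is missing or is never followed by a number.

```python
def float_after(text, label):
    """Return the first float that directly follows `label`, or None."""
    tokens = text.split()  # no argument: split on any run of whitespace
    # Stop one short of the end so tokens[i + 1] always exists.
    for i, token in enumerate(tokens[:-1]):
        if token == label:
            try:
                return float(tokens[i + 1])
            except ValueError:
                continue  # next token is not a number; keep looking
    return None

sample = """aaaaaaaaaaaaaaa  bbbbbbbbbbbbbbb  cccccccccc
zzzzx bbbb 12.0
xxxxxxxxxx -1.099"""

print(float_after(sample, "xxxxxxxxxx"))  # -1.099
print(float_after(sample, "missing"))     # None
```

Unlike the regex versions, this matches the string anywhere in the text, not only at the start of a line, and only when it stands alone as a whitespace-separated word.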

Answered By: MarshiDev