How to obtain a substring of a large text string in Python?

Question:

I have the following format of text files, which are outputs of an API:

TASK [Do this]
OK: {
    "changed":false,
    "msg": "check ok"
}

TASK [Do that]
OK

TASK [Do x]
Fatal: "Error message x"

TASK [Do y]
OK

TASK [Do z]
Fatal: "Stopped because of previous error"

The number of lines and tasks before and after the "Fatal" error varies, and I am only interested in the "Error message x" part.

Code as of now:

import requests

url = "<API URL>"  # placeholder for the real API URL
r = requests.get(url, verify=False, allow_redirects=True, headers=headers, timeout=10)
output = r.text

I tried using output.split("Fatal", 1)[1], but it raises "list index out of range" and also seems to mess up the text, adding a lot of \n.

Asked By: Zodi

||

Answers:

You should be able to do that fairly easily with regular expressions, using the re package. If "Error message x" can occur more than once, something along the lines of

import re
someVar = re.findall("Error message x", output)

should return a list of every string in the output text that matches the pattern. Note that the match is case-sensitive by default. findall also works if only one occurrence is possible; it then simply returns a list with a single element.
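
To illustrate the behaviour described above, here is a minimal sketch. The `output` string is a shortened, hypothetical stand-in for `r.text` from the question. It shows that a literal pattern yields one list entry per occurrence, and that a pattern with no match yields an empty list rather than raising IndexError the way `split(...)[1]` can.

```python
import re

# Hypothetical, shortened stand-in for r.text from the question.
output = (
    'TASK [Do x]\n'
    'Fatal: "Error message x"\n'
    '\n'
    'TASK [Do z]\n'
    'Fatal: "Stopped because of previous error"\n'
)

# A literal pattern returns one copy of the literal per occurrence.
literal_hits = re.findall("Error message x", output)

# A pattern that never occurs gives an empty list, not an exception.
no_hits = re.findall("Error message q", output)

print(literal_hits)  # ['Error message x']
print(no_hits)       # []
```

Because the result is always a list, checking `if literal_hits:` before indexing is enough to avoid the "list index out of range" problem.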

Here is a helpful introduction to re:
https://www.w3schools.com/python/python_regex.asp

Answered By: murrag

You can use the re package and a regular expression to search for the text you need. There are probably better patterns, but I wrote this one quickly using regex101.com: Fatal: "(.+)"

import re

s = '''TASK [Do this]
OK: {
    "changed":false,
    "msg": "check ok"
}

TASK [Do that]
OK

TASK [Do x]
Fatal: "Error message x"

TASK [Do y]
OK

TASK [Do z]
Fatal: "Stopped because of previous error"'''

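# Capture the quoted text that follows each 'Fatal: ' marker
errors = re.findall(r'Fatal: "(.+)"', s)

# Print every captured error message, one per line
for x in errors:
    print(x)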
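
Since the question only asks for the first error ("Error message x"), a variation using re.search may fit better than collecting every match. This is only a sketch: the helper name `first_fatal_message` and the `sample` text are hypothetical, and it assumes each error sits on its own line starting with `Fatal: "`.

```python
import re

def first_fatal_message(text):
    """Return the quoted message of the first 'Fatal:' line, or None."""
    # re.MULTILINE lets ^ match at the start of every line, not just the text.
    match = re.search(r'^Fatal: "(.*)"', text, flags=re.MULTILINE)
    return match.group(1) if match else None

# Hypothetical, shortened sample in the format shown in the question.
sample = (
    'TASK [Do that]\n'
    'OK\n'
    '\n'
    'TASK [Do x]\n'
    'Fatal: "Error message x"\n'
    '\n'
    'TASK [Do z]\n'
    'Fatal: "Stopped because of previous error"\n'
)

print(first_fatal_message(sample))              # Error message x
print(first_fatal_message('TASK [a]\nOK\n'))    # None
```

re.search stops at the first match, so later "Fatal" lines caused by the original failure are ignored, and the None return makes the no-error case explicit.
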
Answered By: Tom Kaufman