How to check for several keywords in a text file and append what's after each one of them into three different lists

Question:

Here’s my problem:

  • I have a text document just like
    this
  • I need to append the text after the keywords "Name :", "Nationality:", and "Age :" to three corresponding lists (between "Name" and ":" (and "Age" and ":") there are some spaces which this editor doesn’t let me include in plain text);
  • The function I’m using only has one input which is the file name itself;
  • The function needs to return the three different lists: l_name, l_nationality, and l_age.

Here’s my current attempt:

    def read_information(file_name):
     l_name = []
     l_nationality = []
     l_age = []
    
    with open (file_name) as file:
        information = file.read()
        if text.startswith("Name       :"):
            l_name = information.append()
        if text.startswith("Nationality:"):
            l_nationality = information.append()
        if text.startswith("Age        :"):
            l_age = information.append()    
    return l_name, l_nationality, l_age

The output isn’t exactly what I want, it’s rather nothing, and I don’t understand what I’m exactly doing wrong.

Answers:

First your code can’t work at all because of several issues:

  • information.append() doesn’t exist as information is a string
  • text variable isn’t defined

Then other errors

  • l_name = .. would reassign something else to the variable, you want to keep the list
  • you didn’t extract the value part
def read_information(file_name):
    l_name = []
    l_nationality = []
    l_age = []

    with open(file_name) as file:
        for line in file:
            if not line.strip():  # skips empty line
                continue
            value = line.split(":")[1].strip()
            if line.startswith("Name"):
                l_name.append(value)
            if line.startswith("Nationality"):
                l_nationality.append(value)
            if line.startswith("Age"):
                l_age.append(value)

    return l_name, l_nationality, l_age

Execution would give

print(read_information("data.txt"))
# (['John', 'Jack'], ['English', 'Spanish'], ['12', '15'])

Now you could do differently, with a dict

def read_information(file_name):
    result = {}
    with open(file_name) as file:
        for line in file:
            if not line.strip():  # skips empty line
                continue
            key, value = line.split(":")
            result.setdefault(key.strip(), []).append(value.strip())
    return result
print(read_information("data.txt"))
# {'Name': ['John', 'Jack'], 'Nationality': ['English', 'Spanish'], 'Age': ['12', '15']}
Answered By: azro

You can also use list comprehension and yield each list then do unpacking:

def read_information(file_name):
    with open(file_name) as file:
        information = file.read().splitlines()
        yield [info.split(':')[1] for info in information if 'Name' in info.split(':')[0]]
        yield [info.split(':')[1] for info in information if 'Nationality' in info.split(':')[0]]
        yield [info.split(':')[1] for info in information if 'Age' in info.split(':')[0]]

l_name, l_nationality, l_age = read_information('file_name')

print(l_name); print(l_nationality); print(l_age)

# [' Anthony Hopkins', ' Ezequiel Ruiz', ' Gabriel Jesus', ' Kyllian mbappé', ' Diego Maradona']
# [' English', ' Spanish', ' Brazilian', ' French', ' Argentinian']
# [' 23', ' 39', ' 15', ' 27', ' 50']
Answered By: Arifa Chan
Categories: questions Tags:
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.