Returning keywords found in a text file from a keyword list to a new file?

Question:

Introduction:

I’m currently building a keyword detection program. It is given a number of ‘.txt’ files and loops through them, searching for a keyword in them from a list of keywords, returning which files contained the keyword. The keywords are stored in a list in a separate python file, which is then imported into the main program file.

Goal:

The goal I want to achieve is to print which keyword was found out of the list when it parses the text file. So for example when it searches the text files and "Hello" is in the keyword list, I want the output to be "Hello, found in example_text01.txt". At the moment it just returns if a keyword was found or not. Ideally the process should look like what is below.

Example Wordlist:

word_list = ["Demo", "Text", "Hello", "Example"]

Example Text:

Hello how are you?

Desired Outcome:

"Hello, found in example_text01.txt"

What I Have Tried:

  • Tried to use the in keyword.

Ran without errors but it would skip any text file with a keyword and not process it.

  • Made the keyword file plain text and used readlines() to parse the text.

Received the following error: AttributeError: 'list' object has no attribute 'readlines'

  • Returning keyword class when writing the result document.

Just returned <class 'ast.keyword'>

Code:

The following is the code I am currently using.

from os import listdir

keywords = ['Hello', 'Example', 'Keywords']

# Create and open result.txt where results of keyword scan will be stored
with open("/PATH/TO/result.txt", "w") as f:
    # Path to the folder the .txt files are stored in
    for filename in listdir("/PATH/TO/txt"):
        # Opens each text file as it is processed through the loop
        with open('/PATH/TO/CURRENT/TEXT/FILE/IN/txt/example.txt') as currentFile:
            text = currentFile.read()
            if any(keyword in text for keyword in keywords):
                f.write('Keyword found in ' + filename[:-4] + '\n')
            else:
                f.write('No keyword in ' + filename[:-4] + '\n')

Currently, for each text file, the program writes to ‘result.txt’ whether or not a keyword from the list was found. I would also like it to record which keyword was found. Any help would be greatly appreciated, thanks!

Asked By: syntaxerrorSteve

||

Answers:

Why not just modify the bottom part?

Instead of

    if any(keyword in text for keyword in keywords):
        f.write('Keyword found in ' + filename[:-4] + '\n')
    else:
        f.write('No keyword in ' + filename[:-4] + '\n')

use

    ...
    for k in keywords:
        f.write((f'Keyword "{k}" found in ' if k in text else 'No keyword in ') + filename[:-4] + '\n')

Answered By: v0rtex20k

Just change:

if any(keyword in text for keyword in keywords):
    f.write('Keyword found in ' + filename[:-4] + '\n')
else:
    f.write('No keyword in ' + filename[:-4] + '\n')

to:

keywordsFound = [k for k in keywords if k in text]  # get all found keywords
if keywordsFound:  # if keywords were found
    for k in keywordsFound:  # for each found keyword
        f.write(f'{k}, found in {filename[:-4]}\n')  # say it was found
else:
    f.write(f'No keyword in {filename[:-4]}\n')  # if none were found, say so

This gets each keyword that is found in the file then writes to the other file.
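
As a minimal sketch of the same idea, the matching and the message formatting can be pulled into two small functions so they can be tried without touching any files. The function names find_keywords and report_lines are hypothetical, introduced only for this illustration; the word list and sample text are the ones from the question.

```python
def find_keywords(text, keywords):
    """Return the keywords that occur in text as substrings, in list order."""
    return [k for k in keywords if k in text]


def report_lines(name, text, keywords):
    """Build the result lines for one file (name is passed without its extension)."""
    found = find_keywords(text, keywords)
    if not found:
        return [f"No keyword in {name}"]
    return [f"{k}, found in {name}" for k in found]


# Word list and sample text taken from the question
word_list = ["Demo", "Text", "Hello", "Example"]
lines = report_lines("example_text01", "Hello how are you?", word_list)
print("\n".join(lines))  # prints: Hello, found in example_text01
```

Note that `k in text` is a case-sensitive substring test, so "Text" would also match inside "Textbook", while "hello" would not match "Hello".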

If you want only the first keyword that is found you can use:

keywordsFound = [k for k in keywords if k in text]  # get all found keywords
if keywordsFound:  # if keywords were found
    k = keywordsFound[0]  # get only first keyword
    f.write(f'{k}, found in {filename[:-4]}\n')  # say it was found
else:
    f.write(f'No keyword in {filename[:-4]}\n')  # if none were found, say so
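
An alternative sketch for the first-match case, assuming only one keyword per file is needed: the built-in next() with a generator expression stops at the first hit instead of building the whole list. The helper name first_keyword is hypothetical and used only for illustration.

```python
def first_keyword(text, keywords):
    """Return the first keyword (in list order) found in text, or None if none match."""
    return next((k for k in keywords if k in text), None)


# Word list taken from the question
word_list = ["Demo", "Text", "Hello", "Example"]
match = first_keyword("Example Text says Hello", word_list)
print(match)  # prints: Text
```

"First" here means first in the keyword list, not first in the text: "Example" appears earlier in the sample string, but "Text" comes earlier in word_list.
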
Answered By: Eli Harold

I would like to get your small code sample running, but I get this error. Do you happen to know the reason?

File "/home/ddl-devlop/tools/PyPDF2/Keyword test/Key_file.py", line 4
with open("/home/ddl-devlop/tools/PyPDF2/Keyfile_test/result.txt", "w") as f:
IndentationError: unexpected indent

Answered By: papso