How to write all the sentences with the word "apple" from a txt file

Question:

I have tried this using regex, loops and simple functions, but I cannot figure it out. Here is some of the code I have tried.

import re
fp = open("apple.txt")
re.findall(r"([^.]*?apple[^.]*.)",fp)
Asked By: chaitanya chitturi


Answers:

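with open("./WABA-crash.txt", "r") as infile:
    # Replace newlines with spaces so sentences that span lines stay intact
    content = infile.read().replace('\n', ' ')
# Split on full stops and trim the whitespace around each sentence
sentences = list(map(str.strip, content.split(".")))
with open("./resultFile.txt", "w") as outfile:
    for result in sentences:
        if 'police' in result or 'Police' in result:
            # Put back the full stop that split() removed
            outfile.write(f'{result}.\n')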

There is no need for regex. It might not be the cleanest answer, but it works. This is the beauty of Python. When working with files, I recommend using with open(...) instead of a bare open(), since it automatically closes the file for you. Otherwise you would need to call the close() method once you are done with the file.
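To illustrate the point about closing files, here is a minimal sketch comparing the two styles. The scratch file and the variable names are made up for the example and are not part of the code above.

```python
import os
import tempfile

# Hypothetical scratch file, created only so the sketch is self-contained.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

# Manual handling: close() must be called explicitly, ideally in a
# finally block so it still runs if the write raises an exception.
fp = open(path, "w")
try:
    fp.write("An apple a day.\n")
finally:
    fp.close()

# Context manager: the file is closed when the block exits,
# even if an exception is raised inside it.
with open(path) as fp_in:
    text = fp_in.read()

print(fp.closed, fp_in.closed)
```

Both files end up closed, but the with version needs no cleanup code of its own.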

Hope this helps you! Enjoy

Answered By: DJ Freeman

Instead of using a regex, which can get quite complicated if it has to be robust (you need to account for many different sentence forms), a good solution may be to use an NLP library such as NLTK or spaCy.

Here is how to tokenize with nltk:

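from nltk import tokenize

# Note: sent_tokenize needs NLTK's Punkt tokenizer data, which has to be
# downloaded once, e.g. nltk.download("punkt") ("punkt_tab" on newer releases).
with open("WABA-crash.txt") as file:
    content = file.read()

# Split the text into sentences, then keep those mentioning "police"
sentences = tokenize.sent_tokenize(content)
police_sentences = [x for x in sentences if "police" in x]

print(police_sentences)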
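
Note that a plain "police" in x check is case-sensitive and also matches longer words such as "policeman". If that matters, one possible refinement is a case-insensitive whole-word test. The sketch below uses only the standard library; the helper name sentences_containing and the sample sentences are made up for illustration, and in practice the list would come from sent_tokenize.

```python
import re


def sentences_containing(sentences, word):
    """Return the sentences that contain `word` as a whole word, ignoring case."""
    # \b marks a word boundary; re.escape guards against special characters.
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    return [s for s in sentences if pattern.search(s)]


# Hypothetical sample data standing in for the tokenizer output.
sample = [
    "Police arrived at the scene.",
    "The policeman left early.",
    "Nobody called the police.",
    "It was a quiet night.",
]
matches = sentences_containing(sample, "police")
print(matches)
```

Here the first and third sample sentences are kept, while "policeman" is skipped because the match must end on a word boundary.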
Answered By: Liutprand