Check whether any element of a list exists in the lines of a text file (not working) in Python

Question:

I have tried many ways but I do not get any output at all. I have a list containing different types of strings:

lst=['ATCGG','GTAACGCT','AATCGAT',...]

and I have a text file as below:

>seq1
NNNGTAACGCTNNN
>seq2
NNNNNAATCGATNNNN
>seq3
NNNNNNNN
.
.
.

I want to print lines of the text file if any item in the list exists in the line. Based on the examples above, the desired output should be:

NNNGTAACGCTNNN
NNNNNAATCGATNNNN

I used the code below, but nothing gets printed:

main_file =  open('test_file.txt', 'r')
contn = main_file.read()
#print(contn)


for dna in contn:
    if any(i in dna for i in lst):
        print(dna)
Asked By: Apex

||

Answers:

An explicit-loop version, which may be more intuitive:

def find_lines_containing_any(filename, wanted_list):
    with open(filename, 'r') as dna_file:
        for dna_line in dna_file:
            for wanted in wanted_list:
                if wanted in dna_line:
                    yield dna_line
                    break  # stop at the first match so each line is yielded only once

for dna in find_lines_containing_any('test_file.txt', lst):
    print(dna, end='')  # each line already ends with a newline
Answered By: Sparkofska

You need readlines() instead of read(). read() returns the whole file as a single string, so when you iterate over it, dna is just an individual character, and no item of the list can be found inside a single character.

contn = main_file.read()

should be

contn = main_file.readlines()
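
To make the difference concrete, here is a minimal sketch that compares the two calls. It assumes an in-memory io.StringIO object (the name sample is hypothetical) as a stand-in for test_file.txt, so it runs without a file on disk:

```python
import io

# Hypothetical in-memory stand-in for test_file.txt (contents taken from the question).
sample = ">seq1\nNNNGTAACGCTNNN\n>seq2\nNNNNNAATCGATNNNN\n>seq3\nNNNNNNNN\n"

# read() returns one str; iterating over a str yields single characters.
chars = list(io.StringIO(sample).read())

# readlines() returns a list of lines, each still ending in "\n".
lines = io.StringIO(sample).readlines()

print(chars[:3])  # the first three items are single characters
print(lines[:2])  # the first two items are whole lines
```

Note that the lines returned by readlines() keep their trailing newline, which matters when printing them.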
Answered By: Sayse

With for dna in contn, you are iterating over individual characters, because read() returns a str object. You can simply iterate over the file object, which yields one line at a time:

with open('test_file.txt', 'r') as main_file:  # the file is closed automatically
    for line in main_file:
        if any(i in line for i in lst):
            print(line, end='')  # line already ends with a newline
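
Lines read from a file keep their trailing newline, and the sample file also contains header lines starting with >. The following is a sketch only, assuming the headers should never be reported and that only the three list items shown in the question are searched for. The helper name matching_sequences and the in-memory sample are hypothetical:

```python
import io

def matching_sequences(lines, patterns):
    """Yield each sequence line (newline stripped) that contains any pattern."""
    for line in lines:
        seq = line.rstrip("\n")
        # Assumption: lines starting with ">" are record headers, not sequences.
        if not seq or seq.startswith(">"):
            continue
        if any(p in seq for p in patterns):
            yield seq

# Only the items visible in the question; the elided rest of the list is left out.
lst = ['ATCGG', 'GTAACGCT', 'AATCGAT']

# Hypothetical in-memory stand-in for test_file.txt.
sample = io.StringIO(">seq1\nNNNGTAACGCTNNN\n>seq2\nNNNNNAATCGATNNNN\n>seq3\nNNNNNNNN\n")

result = list(matching_sequences(sample, lst))
for seq in result:
    print(seq)
```

Because the function accepts any iterable of lines, an open file object can be passed in place of sample.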
Answered By: Krishna Chaurasia