Compare URLs from two different text files in Python

Question:

I have a text file (links.txt) in the following format:

www.independent.co.uk www.bbc.co.uk www.theguardian.com www.telegraph.co.uk 
www.dailymail.co.uk en.wikipedia.org www.huffingtonpost.co.uk www.bbc.co.uk 
www.newsnow.co.uk www.express.co.uk 

I have another text file (keys.txt) in the following format:

www.independent.co.uk www.bbc.co.uk www.theguardian.com

I want to compare the two text files and print the URLs that are common to both.

I tried using the urltools package in Python but couldn't get it to work for multiple URLs.

Asked By: clawstack

Answers:

How about this:

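# Read both files; "with" closes each file automatically
with open('links.txt', 'r') as links:
    links_data = links.read()

with open('keys.txt', 'r') as keys:
    keys_data = keys.read()

# Split on whitespace so whole URLs are compared, not substrings
links_set = set(links_data.split())
keys_split = keys_data.split()

for url in keys_split:
    if url in links_set:
        print(url)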

Just make sure that links.txt and keys.txt are in the current working directory and everything should work fine. I'm assuming your URLs will always be separated by whitespace (spaces or newlines).
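
As a minimal sketch of the same comparison done on whole URLs rather than substrings, the version below splits both inputs on whitespace and tests membership in a set. The function name common_urls and the inline sample strings are illustrative assumptions and not part of the original answer; in practice the two strings would come from reading links.txt and keys.txt.

```python
def common_urls(links_text, keys_text):
    """Return URLs from keys_text that also occur in links_text.

    Order follows keys_text and duplicates are reported once.
    """
    link_set = set(links_text.split())  # whole whitespace-separated tokens
    seen = set()
    result = []
    for url in keys_text.split():
        if url in link_set and url not in seen:
            seen.add(url)
            result.append(url)
    return result


# Sample data copied from the question, held in strings instead of files
links_text = """www.independent.co.uk www.bbc.co.uk www.theguardian.com www.telegraph.co.uk
www.dailymail.co.uk en.wikipedia.org www.huffingtonpost.co.uk www.bbc.co.uk
www.newsnow.co.uk www.express.co.uk"""
keys_text = "www.independent.co.uk www.bbc.co.uk www.theguardian.com"

for url in common_urls(links_text, keys_text):
    print(url)
```

Because membership is tested against a set of whole tokens, a key such as bbc.co.uk would not match www.bbc.co.uk, whereas a substring check against the raw file contents would report it as a match.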

Answered By: agillgilla

To print the URLs from keys.txt that do not appear in links.txt, instead of the common ones, just change the condition to not in. Here is the complete code:

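# Read both files; "with" closes each file automatically
with open('links.txt', 'r') as links:
    links_data = links.read()

with open('keys.txt', 'r') as keys:
    keys_data = keys.read()

# Split on whitespace so whole URLs are compared, not substrings
links_set = set(links_data.split())
keys_split = keys_data.split()

for url in keys_split:
    if url not in links_set:
        print(url)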
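
If "unique" can mean either direction, a set difference covers both cases. The following is only a sketch, assuming output order does not matter (results are sorted); the helper name unique_urls and the short sample strings are hypothetical.

```python
def unique_urls(a_text, b_text):
    """Return the sorted URLs that appear in a_text but not in b_text."""
    return sorted(set(a_text.split()) - set(b_text.split()))


# Hypothetical sample data standing in for the contents of the two files
links_text = "www.bbc.co.uk www.express.co.uk en.wikipedia.org"
keys_text = "www.bbc.co.uk www.theguardian.com"

print(unique_urls(keys_text, links_text))  # keys that are missing from links
print(unique_urls(links_text, keys_text))  # links that are missing from keys
```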
Answered By: Puneet Verma