Read content of txt files into lists to find duplicates

Question:

I’m new to Python.

My code should read 2 different .txt files into lists and compare them to find and delete duplicates.

Code

import os
dir = os.listdir
T = "Albums"
if T not in dir():
    os.mkdir("Albums")
with open('list.txt','w+') as f:
    linesA = f.readlines()
    print(linesA)   # output empty
with open('completed.txt','w+') as t:
    linesB = t.readlines()
    print(linesB)  # output empty
for i in linesA[:]:
    if i in linesB:
        linesA.remove(i)
print(linesA)
print(linesB)

I tried the code above with the following inputs:

  • in list.txt I wrote (on separate lines) A, B and C.
  • in completed.txt I wrote (also on separate lines) A and B.

It should first have printed the contents of the lists, but they were empty for some reason.

Why are the read lists empty?

Asked By: Serialgamer07

Answers:

This makes little sense.

dir = os.listdir

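You wanted to call os.listdir().
What you did was assign a reference to that function to the name dir,
without calling the function on that line.
The later dir() call does work, but the alias shadows the built-in dir
and hides what is actually being called.

Better to dispense with dir and just phrase it this way: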

if T not in os.listdir():
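
To make the difference between referencing and calling a function concrete, here is a minimal sketch. The temporary directory and the variable names are assumptions of mine, used only so the example does not touch your real working directory.

```python
import os
import tempfile

# Work in a throwaway directory so the example is self-contained (assumption).
with tempfile.TemporaryDirectory() as workdir:
    listing_func = os.listdir        # only a reference; nothing is listed yet
    entries = os.listdir(workdir)    # calling the function returns a list of names

    is_function = callable(listing_func)

    # Create the "Albums" folder only if it is not already there.
    album_dir = os.path.join(workdir, "Albums")
    if "Albums" not in entries:
        os.mkdir(album_dir)
    created = os.path.isdir(album_dir)
```

Here entries is an empty list because the temporary directory starts out empty, so the folder gets created.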

with open('list.txt','w+') as f:
    linesA = f.readlines()
    ...
with open('completed.txt','w+') as t:
    linesB = t.readlines()

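You wanted to open those with 'r' read mode,
rather than write.
Opening with 'w+' truncates the file to zero length,
so by the time readlines() runs there is nothing left to read.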
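
A small sketch of the two modes side by side may help. The temporary directory and the sample contents (A, B, C, as in the question) are assumptions for illustration only.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "list.txt")

    # Create the file with three lines, as described in the question.
    with open(path, "w") as f:
        f.write("A\nB\nC\n")

    # 'r' opens for reading and leaves the contents alone.
    with open(path, "r") as f:
        lines_r = f.readlines()

    # 'w+' truncates the file as soon as it is opened, before any read.
    with open(path, "w+") as f:
        lines_w = f.readlines()

    # The data is really gone from disk, not just from the list.
    size_after = os.path.getsize(path)
```

lines_r holds the three lines, lines_w is empty, and the file on disk has size 0 afterwards.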

Answered By: J_H

Does this help:

  • I suggest using not os.path.exists(entry) instead of entry not in os.listdir(). It’s not relevant to the problem, but I point it out anyway. (Also, you shadowed the built-in dir function.)
  • I’ve split up the file contents using split("\n").
  • I’ve changed the mode the files are opened with to r+, which, unlike w+, doesn’t clear the file.

Please note that if you want to use readlines you have to remove the trailing newline from each entry.
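
If you would rather keep readlines, a sketch of that variant could look like this. The temporary directory, the sample contents and the set-based filter are my own assumptions, not part of the original code. Stripping matters because a last line without a trailing newline ("B") would otherwise not compare equal to the same entry with one ("B\n").

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    path_a = os.path.join(workdir, "list.txt")
    path_b = os.path.join(workdir, "completed.txt")

    # Sample inputs from the question; completed.txt has no final newline.
    with open(path_a, "w") as f:
        f.write("A\nB\nC\n")
    with open(path_b, "w") as f:
        f.write("A\nB")

    # readlines() keeps the "\n" on each line, so strip it off.
    with open(path_a) as f:
        lines_a = [line.rstrip("\n") for line in f.readlines()]
    with open(path_b) as f:
        lines_b = [line.rstrip("\n") for line in f.readlines()]

    # Keep only the entries that are not already completed.
    completed = set(lines_b)
    remaining = [entry for entry in lines_a if entry not in completed]
```

Building a new list avoids removing items from a list while looping over it.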

import os

# Create the two sample files ('w+' is fine here, we want to overwrite them)
with open('list.txt','w+') as file:
    file.write("Foo\n")
    file.write("Bar")
with open('completed.txt','w+') as file:
    file.write("Bar\n")
    file.write("Python")

T = "Albums"
if not os.path.exists(T):
    os.mkdir(T)
# 'r+' opens for reading and writing without clearing the file
with open('list.txt','r+') as f:
    linesA = f.read().split("\n")
    print(linesA)
with open('completed.txt','r+') as t:
    linesB = t.read().split("\n")
    print(linesB)
# Iterate over a copy so that removing from linesA is safe
for entry in list(linesA):
    if entry in linesB:
        linesA.remove(entry)
print(linesA)
print(linesB)

Output:

['Foo', 'Bar']
['Bar', 'Python']
['Foo']
['Bar', 'Python']

Answered By: Nineteendo