How to optimize reading and cleaning a file?

Question:

I have a file that contains strings separated by spaces, tabs, and carriage returns:

one     two

    three

         four

I’m trying to remove all the spaces, tabs, and carriage returns:

def txt_cleaning(fname):
    new_txt = []
    with open(fname) as f:
        fname = f.readline().strip()
        new_txt += [line.split() for line in f.readlines()]
    return new_txt

Output:

[['one'], ['two'], [], ['three'], [], ['four']]

Expected output, without importing any libraries:

['one', 'two', 'three', 'four']
Asked By: L10B

Answers:

def txt_cleaning(fname):
    new_text = []
    with open(fname) as f:
        for line in f:
            # split() with no arguments splits on any run of whitespace
            # and never returns empty strings
            new_text += line.split()
    return new_text

Or

def txt_cleaning(fname):
    with open(fname) as f:
        # Read the whole file and split on any run of whitespace
        return f.read().split()
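
If the file is large, a variation on the loop version may help: iterating over the file object reads one line at a time, so neither the whole text nor a full list of lines is held in memory. This is only a sketch; the name txt_cleaning_lazy is made up for illustration.

```python
def txt_cleaning_lazy(fname):
    # Hypothetical name, chosen only for this illustration.
    words = []
    with open(fname) as f:
        # Iterating over the file object yields one line at a time.
        for line in f:
            # split() with no arguments splits on any run of whitespace
            # and skips empty strings, so blank lines add nothing.
            words.extend(line.split())
    return words
```

Whether this is measurably faster than f.read().split() depends on the file size; for small files the one-liner is likely just as good.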
Answered By: Johnny Mopp

My method:

  • use read (not readline) to get the whole text as a single string
  • replace tabs and newlines with a space
  • split

def txt_cleaning(fname):
    with open(fname) as f:
        return f.read().replace('\t', ' ').replace('\n', ' ').split()
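
To see what the three steps do, here is a small check on an in-memory string rather than a file. The sample text is an assumption made up for this example. It also shows that split() with no arguments already treats tabs, newlines and carriage returns as separators, so the replace step should be optional.

```python
# Assumed sample text standing in for the file contents.
sample = "one     two\n\n    three\n\n\t\t four\r\n"

# The three steps: turn tabs and newlines into spaces, then split.
with_replace = sample.replace("\t", " ").replace("\n", " ").split()

# split() alone: any run of whitespace counts as a single separator.
without_replace = sample.split()

print(with_replace)      # ['one', 'two', 'three', 'four']
print(without_replace)   # ['one', 'two', 'three', 'four']
```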
Answered By: user3435121