startswith TypeError in function

Question:

Here is the code:

    def readFasta(filename):
        """ Reads a sequence in Fasta format """
        fp = open(filename, 'rb')
        header = ""
        seq = ""
        while True:
            line = fp.readline()
            if (line == ""):
                break
            if (line.startswith('>')):
                header = line[1:].strip()
            else:
            seq = fp.read().replace('\n','')
            seq = seq.replace('\r','')          # for windows
                break
        fp.close()
        return (header, seq)

    FASTAsequence = readFasta("MusChr01.fa")

The error I’m getting is:

TypeError: startswith first arg must be bytes or a tuple of bytes, not str

But the first argument to startswith is supposed to be a string according to the docs… so what is going on?

I’m assuming I’m using at least Python 3 since I’m using the latest version of LiClipse.

Asked By: user2287873


Answers:

It’s because you’re opening the file in bytes mode, and so you’re calling bytes.startswith() and not str.startswith().

You need to do line.startswith(b'>'), which will make '>' a bytes literal.
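
A minimal sketch of the difference, assuming a small throwaway FASTA file (the file contents and variable names here are made up for illustration):

```python
import os
import tempfile

# Hypothetical two-line FASTA file, created only for this demonstration.
with tempfile.NamedTemporaryFile("wb", suffix=".fa", delete=False) as tmp:
    tmp.write(b">chr1 example\nACGT\n")
    path = tmp.name

with open(path, "rb") as fp:          # binary mode: readline() returns bytes
    line = fp.readline()

os.remove(path)

is_bytes = isinstance(line, bytes)    # True, because the file was opened with 'rb'
matches = line.startswith(b">")       # a bytes prefix is accepted

try:
    line.startswith(">")              # a str prefix on a bytes object
    raised = False
except TypeError:
    raised = True                     # the same TypeError as in the question
```

If the file stays in binary mode, the other string literals in the function would likely need the same treatment: `line == ""` is never true for bytes (the end-of-file check would have to compare against `b""`), and `replace('\n', '')` would need bytes arguments as well.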

Answered By: TerryA

I don't have your file to test with, but try opening it in text mode with an explicit UTF-8 encoding:

    fp = open(filename, 'r', encoding='utf-8')

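
For illustration, here is a rough sketch of how the whole function might look in text mode. The name `read_fasta_text` is hypothetical, and the sketch assumes the file holds a single record and is valid UTF-8; it is not a drop-in copy of the original logic.

```python
import os
import tempfile


def read_fasta_text(filename):
    """Read a single-record FASTA file opened in text mode (sketch)."""
    header = ""
    seq_parts = []
    # In text mode every line is a str, so '>' can stay an ordinary string.
    with open(filename, "r", encoding="utf-8") as fp:
        for line in fp:
            if line.startswith(">"):
                header = line[1:].strip()
            else:
                # strip() drops the trailing newline (and any stray '\r')
                seq_parts.append(line.strip())
    return header, "".join(seq_parts)


# Usage with a made-up file that has Windows line endings.
with tempfile.NamedTemporaryFile("wb", suffix=".fa", delete=False) as tmp:
    tmp.write(b">chr1 example\r\nACGT\r\nTTGA\r\n")
    path = tmp.name

header, seq = read_fasta_text(path)
os.remove(path)
```

Using `with` closes the file even if an exception is raised partway through, so the explicit `fp.close()` is no longer needed.
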
Answered By: Andre Odendaal

If you want to keep opening the file in binary mode, replacing 'STR' with bytes('STR'.encode('utf-8')) works for me.
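
A short sketch of that approach, which may be handy when the prefix lives in a variable rather than a literal (the variable names and sample line are made up for illustration):

```python
prefix = ">"                                # an ordinary str
as_bytes = prefix.encode("utf-8")           # already a bytes object: b'>'
wrapped = bytes(prefix.encode("utf-8"))     # same value; bytes() only copies it

line = b">chr1 example\n"                   # what a binary-mode readline() returns
ok = line.startswith(as_bytes)              # bytes prefix on a bytes line
```

Since `str.encode()` already returns bytes, the outer `bytes(...)` call is harmless but not required.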

Answered By: wenching