unicode-string

Strange character added when decoding with urllib

Question: I'm trying to parse a query string like this: filename=logo.txt\x80\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01x&filename=.hidden.txt Since it mixes bytes and text, I tried to alter it so that it produces the desired escaped URL output, like so: extended = 'filename=logo.txt\x80\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01x&filename=.hidden.txt' fixbytes = bytes(extended, 'utf-8') fixbytes = fixbytes.decode("unicode_escape") algoext = '?' + …

Total answers: 3
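The question text is truncated, so the following is only a sketch of one likely cause, not the accepted answer. It assumes the "strange character" is the extra byte 0xC2 that appears when the code point U+0080 is encoded as UTF-8 (two bytes) instead of Latin-1 (one byte). The shortened sample value and the variable names are illustrative.

```python
from urllib.parse import quote_from_bytes

# Shortened, illustrative stand-in for the value in the question.
# In a Python str literal, "\x80" is the code point U+0080, not a raw byte.
text = "logo.txt\x80\x00\x01x"

# UTF-8 encodes U+0080 as two bytes (0xC2 0x80), which adds a stray byte.
as_utf8 = text.encode("utf-8")

# Latin-1 maps code points 0..255 one-to-one onto bytes 0x00..0xFF.
as_latin1 = text.encode("latin-1")

print(quote_from_bytes(as_utf8))    # contains %C2%80
print(quote_from_bytes(as_latin1))  # contains %80 only
```

If the goal is a percent-encoded URL that carries the raw bytes unchanged, encoding with Latin-1 (or building the value as a `bytes` literal from the start) avoids the extra byte.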

Python 3: os.walk() file paths UnicodeEncodeError: 'utf-8' codec can't encode: surrogates not allowed

Question: This code: for root, dirs, files in os.walk('.'): print(root) gives me this error: UnicodeEncodeError: 'utf-8' codec can't encode character '\udcc3' in position 27: surrogates not allowed How do I walk through a file tree without getting toxic strings like this? Asked …

Total answers: 4
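As a hedged illustration of one way to handle this (not necessarily what the answers recommend): on POSIX systems Python decodes file names with the `surrogateescape` error handler, so a byte that is not valid in the filesystem encoding becomes a lone surrogate such as `\udcc3`. The sketch below assumes a UTF-8 filesystem encoding; `printable` and `walk_and_print` are hypothetical helper names.

```python
import os


def printable(path: str) -> str:
    """Return a version of path that can always be encoded as UTF-8."""
    # Undo surrogateescape to get the original bytes back
    # (assumption: the filesystem encoding is UTF-8).
    raw = path.encode("utf-8", errors="surrogateescape")
    # Decode again, replacing undecodable bytes with U+FFFD.
    return raw.decode("utf-8", errors="replace")


def walk_and_print(top: str) -> None:
    """Walk a tree and print every directory without UnicodeEncodeError."""
    for root, dirs, files in os.walk(top):
        print(printable(root))
```

Calling `walk_and_print('.')` prints each directory, with a replacement character where the name held undecodable bytes. The original `root` string should still be used for any file operations, since only that value maps back to the real name on disk.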