utf-8 | Page 2

Can you safely read utf8 and latin1 files with a naïve try-except block?

Question: I believe that any valid latin1 character will either be interpreted correctly by Python's utf8 decoder or raise an error. I therefore claim that if you work only with utf8 or latin1 files, you can safely write the following …

Total answers: 3
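
A minimal sketch of the try-except pattern the question describes, assuming each file is known to be either UTF-8 or Latin-1. The function name read_utf8_or_latin1 is hypothetical, not taken from the question. One caveat: some Latin-1 byte sequences are also valid UTF-8, so the claim does not hold for every possible Latin-1 text.

```python
from pathlib import Path


def read_utf8_or_latin1(path):
    """Decode a file as UTF-8, falling back to Latin-1 on failure."""
    raw = Path(path).read_bytes()
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a character, so this never raises.
        return raw.decode("latin-1")


# Caveat: the Latin-1 text "Ã©" is stored as the bytes C3 A9,
# which also happen to be the valid UTF-8 encoding of "é".
ambiguous = "Ã©".encode("latin-1")
misread = ambiguous.decode("utf-8")  # no error is raised here
```

In practice such collisions need a high byte followed by the right continuation bytes, which is uncommon in ordinary Latin-1 prose but not impossible.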

How to decode a file when writing in YAML format

Question: I am trying to write a dictionary that contains Tibetan words to a file in YAML format. The problem is that I couldn't encode/decode the file when writing the YAML file. Here is the code: with open('tibetan_dict.yml', 'w', encoding='utf-8') as outfile: yaml.dump(tibetan_dict, outfile, default_flow_style=False) tibetan_dict contains: {'ཀ་ཅ': '༡.་ནོར་རྫས་ཀྱི་སྤྱི་མིང་སྟེ། …

Total answers: 1

Python 3.9.x created CSV with non-English (Unicode) characters (UTF-8 encoded) does not show correctly when opened in Excel (Windows)

Question: My original Python 2.7 code that created the CSV file with non-English characters used the not-recommended hack of reload(sys) followed by sys.setdefaultencoding('utf8') to achieve UTF-8 compatibility (changed from ASCII). In addition, I've added …

Total answers: 1
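
A minimal sketch of one commonly used approach in Python 3, assuming the problem is that Excel on Windows does not recognize a UTF-8 file without a byte order mark. The helper name write_csv_for_excel and the sample rows are hypothetical. The "utf-8-sig" codec writes the BOM at the start of the file.

```python
import csv


def write_csv_for_excel(path, rows):
    """Write rows as UTF-8 with a leading BOM."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", encoding="utf-8-sig", newline="") as f:
        csv.writer(f).writerows(rows)
```

Reading the file back with encoding="utf-8-sig" strips the BOM again, so round trips within Python stay clean.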

UnicodeDecodeError during file transfer between Linux and Windows using Python socket programming

Question: I am trying to send an image file from a Raspberry Pi (the client) to a laptop (the server). When I run client.py on the Raspberry Pi (Linux) and server.py on the laptop (Windows), both connected over a LAN, I get the following error message …

Total answers: 1

UTF-8 characters in python string even after decoding from UTF-8?

Question: I’m working on converting portions of XHTML to JSON objects. I finally got everything in JSON form, but some UTF-8 character codes are being printed. Example: { "p": { "@class": "para-p", "#text": "I\u2019m not on Earth." } } This should be: { "p": { …

Total answers: 2
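
A short illustration of where such escape codes can come from, assuming the JSON was produced with Python's json module (the excerpt does not say which tool was used). By default json.dumps escapes non-ASCII characters; passing ensure_ascii=False keeps them as literal characters. Both outputs parse back to the same object.

```python
import json

obj = {"p": {"@class": "para-p", "#text": "I\u2019m not on Earth."}}

# Default: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps(obj)

# ensure_ascii=False writes the right single quotation mark itself.
readable = json.dumps(obj, ensure_ascii=False)
```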

Why are these two non-English strings (with exactly the same appearance) different in Python?

Question: I’m reading a CSV file with pandas' pd.read_csv. I want to remove all columns except Mã NPP. However, the string Mã NPP that I type on the keyboard is not the same as the one in the dataframe's column names. …

Total answers: 1
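
One possible cause, shown as a sketch under the assumption that the two strings differ in Unicode normalization form (the excerpt does not confirm this). The letter "ã" can be stored as a single precomposed code point or as "a" plus a combining tilde. The two look identical but compare unequal until both are normalized.

```python
import unicodedata

composed = "M\u00e3 NPP"      # "ã" as one code point (U+00E3)
decomposed = "Ma\u0303 NPP"   # "a" followed by combining tilde (U+0303)

same_before = composed == decomposed

# Normalizing both sides to NFC makes the comparison succeed.
same_after = (
    unicodedata.normalize("NFC", composed)
    == unicodedata.normalize("NFC", decomposed)
)
```

Applying the same normalization to the column names and to the typed string before comparing them would be the analogous step in the pandas case.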

Python opening files with utf-8 file names

Question: In my code I used something like file = open(path + '/' + filename, 'wb') to write the file, but in my attempt to support non-ASCII filenames, I encode it like this: naming = path + '/' + filename file = open(naming.encode('utf-8', 'surrogateescape'), 'wb') write binary data… so the file is named something like …

Total answers: 2

Preventing Python requests.post from encoding strings to UTF-8

Question: I am making an API call to an appliance, passing a message in a JSON payload via HTTP POST. Although I am not doing any character encoding myself, the string received is encoded in UTF-8. Unfortunately, the appliance manufacturer requires no encoding for the message, and characters with accents …

Total answers: 1

Removing literal backslashes from utf-8 encoded strings in python

Question: I have a bunch of strings containing UTF-8 encoded symbols, for example '\u00f0\u009f\u0098\u0086'. In that case, it represents this emoji 😆, encoded in UTF-8. I want to be able to replace it with the literal emoji. The solution someone recommended to me was to encode …

Total answers: 2
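
A minimal sketch, assuming the string already holds the four code points U+00F0, U+009F, U+0098, U+0086 (that is, each character stands for one byte of the UTF-8 encoding). Encoding with Latin-1 maps each of those characters back to its byte, and decoding the bytes as UTF-8 yields the emoji. The function name fix_byte_string is hypothetical.

```python
def fix_byte_string(text):
    """Reinterpret characters in the range U+0000..U+00FF as UTF-8 bytes."""
    # Latin-1 maps code point N to byte N, so no information is lost.
    return text.encode("latin-1").decode("utf-8")


emoji = fix_byte_string("\u00f0\u009f\u0098\u0086")
```

This raises UnicodeEncodeError if the input contains characters above U+00FF, so it only suits strings that are entirely of this form.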

How Python decodes UTF8 Encoding in String Format

Question: I have a UTF-8 string: s = '\346\235\216\346\265\267\347\216\211' I need to decode it, but currently I can only do it this way: result = eval(bytes(f"b'{s}'", encoding="utf8")).decode('utf-8') This is not safe, so is there a better way? Asked by: yternal. Answers: you can …

Total answers: 3
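
A sketch of an eval-free alternative, assuming s contains literal backslashes followed by octal digits (which the use of eval on a bytes literal suggests). The function name decode_octal_escapes is hypothetical. The unicode_escape codec interprets the escapes, and a Latin-1 round trip turns the resulting characters back into raw bytes for UTF-8 decoding.

```python
def decode_octal_escapes(s):
    """Decode text like r'\\346\\235\\216' into the UTF-8 string it encodes."""
    # Step 1: interpret the backslash escapes; each yields one character
    # in the range U+0000..U+00FF.
    chars = s.encode("ascii").decode("unicode_escape")
    # Step 2: map those characters back to bytes, then decode as UTF-8.
    return chars.encode("latin-1").decode("utf-8")


s = r"\346\235\216\346\265\267\347\216\211"
result = decode_octal_escapes(s)
```

Unlike eval, this never executes the input as code. It does assume the escaped text itself is pure ASCII.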