utf-8 | py4u

Using UTF-8 in Python 3 string literals

Question: I have a script I’m writing where I need to print the character sequence "Qä" to the terminal. My terminal is using UTF-8 encoding. My file has # -*- coding: utf-8 -*- at the top of it, which I think is not actually necessary for Python 3, …

Total answers: 1

Preserving special characters when writing to a CSV – What encoding to use?

Question: I am trying to save the string "the United Nations’ Sustainable Development Goals (SDGs)" into a CSV. If I use utf-8 as the encoding, the apostrophe in the string gets converted to an ASCII char. import csv str_ = "the United …

Total answers: 1

How to make Python treat literal string as UTF-8 encoded string

Question: I have some strings in Python loaded from a file. They look like lists, but are actually strings, for example: example_string = '["hello", "there", "w\u00e5rld"]' I can easily convert it into an actual list of strings: def string_to_list(string_list: str) -> List[str]: converted = string_list.replace('"', …

Total answers: 1
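
For a string like the one in the excerpt above, one possible approach is to parse it as JSON, since JSON parsing both builds the list and resolves the literal \u00e5 escape. This is only a minimal sketch: it assumes the strings loaded from the file are valid JSON arrays, which the truncated question does not confirm. The raw string below stands in for text read from a file, where the backslash is a real character.

```python
import json

# Raw string: mimics text read from a file, where \u00e5 is six literal characters.
example_string = r'["hello", "there", "w\u00e5rld"]'

# json.loads parses the array and decodes the \uXXXX escape in one step.
# Assumption: the input is valid JSON (double-quoted strings, no trailing commas).
words = json.loads(example_string)

print(words)
```

If the file content is not valid JSON, json.loads raises json.JSONDecodeError, so this sketch would need a different parser in that case.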

Python: Convert several/multiple .docx file from ANSI to UTF-8 on a particular folder

Question: I am not a very good programmer, but I want to write Python code that converts multiple .docx files in a particular folder from ANSI to UTF-8. I will start with this, but I don’t know how to continue, how to select …

Total answers: 1

Convert non UTF-8 ASCII literals in otherwise UTF-8 text to their respective character

Question: I have UTF-8 encoded text that has been mangled and contains some 'cp1252' ASCII literals. I am trying to isolate the literals and convert them one by one; however, the following code does not work and I can’t understand why… text …

Total answers: 1

UTF-8 support in reportlab (Python)

Question: Problem: I can’t create a PDF from UTF-8 encoded text using reportlab. What I get is a document full of black squares. Prerequisites: pip install faker reportlab. Code: import tempfile from faker import Faker from reportlab.lib.pagesizes import letter from reportlab.lib.styles import getSampleStyleSheet from reportlab.lib.units import …

Total answers: 1

'utf-8' codec can't decode byte 0xa0 in position 121: invalid start byte

Question: How do I resolve this issue? Thank you. from google.colab import drive drive.mount('/content/drive') path = "/content/drive/MyDrive/Colab Notebooks/Meteo.csv" Meteo = pd.read_csv(path, parse_dates=[['Date', 'Time']]) Meteo.head() Sample of CSV. I’m looking for help fixing the error, because I cannot get the output to display yet. …

Total answers: 1
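
The error in the title above can be reproduced with the standard library alone. In UTF-8, 0xa0 is only valid as a continuation byte, so it cannot start a character; in single-byte encodings such as cp1252 or Latin-1 the same byte is a non-breaking space. The sketch below uses made-up sample bytes, and cp1252 is an assumption, since the real encoding of the CSV in the question is not known.

```python
# Hypothetical sample bytes containing 0xa0 (non-breaking space in cp1252).
raw = b"Temp\xa0(C),Date\n21.5,2020-01-01\n"

try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    # Same family of error as in the question title.
    error_message = str(exc)
    print(error_message)

# Assumption: the data is really cp1252; decoding then succeeds.
text = raw.decode("cp1252")
print(repr(text))
```

pandas.read_csv accepts an encoding argument, so if a file really is cp1252, passing encoding="cp1252" there would be the analogous step.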

Python utf-8 encoding not following unicode rules

Question: Background: I’ve got a byte file that is encoded using Unicode. However, I can’t figure out the right method to get Python to decode it to a string. Sometimes it uses 1-byte ASCII text. The majority of the time it uses 2-byte "plain latin" text, but it …

Total answers: 1

Socket error: "UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte"

Question: I am trying to implement a way to send and receive files using the socket library, but when I run the code, I keep getting the error "UnicodeDecodeError: 'utf-8' codec can’t decode byte 0xff in position 0: invalid start …

Total answers: 1

How to combine two code points to get one?

Question: I know that the Unicode code point for Á is U+00C1. I have read on many forums and in articles that I can also make an Á by combining the characters ´ (Unicode: U+00B4) and A (Unicode: U+0041). My question is simple: how do I do it? I …

Total answers: 2
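
A minimal sketch of one way to compose the two code points, using Python's standard unicodedata module. One detail differs from the excerpt above: U+00B4 is a spacing acute accent and does not combine with a preceding letter; the combining acute accent is U+0301. NFC normalization then merges A followed by U+0301 into the single code point U+00C1. The variable names are illustrative only.

```python
import unicodedata

base = "\u0041"       # LATIN CAPITAL LETTER A
combining = "\u0301"  # COMBINING ACUTE ACCENT (not U+00B4, which is a spacing character)

# Two code points that render as one accented letter.
decomposed = base + combining

# NFC composes canonically equivalent sequences into precomposed characters.
composed = unicodedata.normalize("NFC", decomposed)

print(len(decomposed), len(composed))
print(composed == "\u00c1")
```

The reverse direction, unicodedata.normalize("NFD", ...), splits U+00C1 back into the base letter and the combining accent.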