How to handle Japanese characters?

Question:

I receive input in Japanese from another source that is outside my control.

But I get this error:

UnicodeEncodeError: 'charmap' codec can't encode characters in position 15-41: character maps to <undefined>

Code:

import mutagen

def addTag(fpath, title, albumName):
    audio = mutagen.File(fpath, easy=True)
    audio.add_tags()
    audio['title'] = title
    audio['album'] = albumName
    audio.save(fpath)

# The Code below this comment is out of my control but this is how it is implemented
file = "1.mp3"
title = "We Must Go TV"
album = "アニメ「風が強く吹いている」オリジナルサウンドトラック"
addTag(file, title, album)

Answers:

Read the documentation: https://docs.python.org/3/howto/unicode.html

It explains how to process and include non-ASCII text in your Python code. Essentially, you can use a Unicode escape sequence inside a string literal to represent a single character. This string holds one character (ル):

ru = u'\u30EB'

You could also try to force the string to be a unicode object in Python 2:

album = u"アニメ「風が強く吹いている」オリジナルサウンドトラック"

In Python 3, all strings are already Unicode by default, so the u prefix is not needed there.
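The 'charmap' error in the question usually means the text is being encoded with a legacy code page somewhere, not that the string itself is wrong. The following is a minimal sketch that reproduces that kind of failure, assuming the code page involved is cp1252 (the question does not say which one it is, or where in the program the encoding happens):

```python
album = "アニメ「風が強く吹いている」オリジナルサウンドトラック"

# Assumption: the failing codec is cp1252, which has no Japanese characters,
# so encoding this string with it raises UnicodeEncodeError.
try:
    album.encode("cp1252")
    cp1252_ok = True
    error_message = ""
except UnicodeEncodeError as exc:
    cp1252_ok = False
    error_message = str(exc)

# UTF-8 can represent every Unicode character, so this round trip is lossless.
utf8_bytes = album.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")

print(cp1252_ok)
print(error_message)
print(round_trip == album)
```

If the error comes from printing or writing to a file, choosing UTF-8 explicitly at that point (for example open(path, "w", encoding="utf-8")) is one way to avoid it.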

Also check out this informative video: https://www.youtube.com/watch?v=oEbNWXhS_mk

Answered By: krmogi

There are two solutions.

1. The end of the error message tells you which codec has the problem; in my case it was cp1252. I was able to insert the value by first encoding the string to UTF-8 and then decoding it with the failing codec (cp1252) while ignoring errors. The data gets inserted, but the unsupported characters are replaced with garbage characters, e.g. ð¥ð¢ð ð¡ð­ ð“ð. This is not the best way, but it lets you insert the data without errors:

newCatalogue.product_description = newCatalogue.product_description.encode('utf-8').decode('cp1252', 'ignore')
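To show why this workaround loses data, here is a small sketch using a short sample string (the sample is chosen only for illustration). It applies the same encode/decode step and then tries to reverse it:

```python
original = "「風」"

# Same step as the workaround: reinterpret the UTF-8 bytes as cp1252,
# silently dropping any byte that cp1252 cannot decode.
mojibake = original.encode("utf-8").decode("cp1252", "ignore")

# Try to undo the step. This fails here because a byte was dropped.
try:
    recovered = mojibake.encode("cp1252").decode("utf-8")
except UnicodeError:
    recovered = None

print(mojibake)
print(recovered)
```

The stored value is no longer the Japanese text and cannot always be converted back, so this approach only makes sense when getting the insert to succeed matters more than preserving the text.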

2. The second solution, and the one I actually used: I added the charset to the connection URL, and the data was inserted correctly.

mysql://user:pass@localhost/dbname?charset=utf8mb4
Answered By: Mahmoud Magdy