Can I insert UTF8 encoded characters into a Latin-1 table if I know only Latin-1 characters will be used?

Question:

I have 10 tables in a database. 9 of them only store data with standard ascii 1-byte characters supported by Latin-1. 1 of them requires that I store special characters that are only supported by UTF8. I would like to use the same MySQL connection object (using Python’s PyMySQL library) to populate all 10 tables.

Previously, when creating the MySQL connection object, I did not specify the character set and it defaulted to Latin-1. That was fine when I was only populating the 9 Latin-1 tables. Now that I am populating the UTF8 table, I modified the connection object by passing in the parameter charset='utf8mb4' to the PyMySQL connection object function:

import pymysql

# Connect to the database
connection = pymysql.connect(host='localhost',
                             user='user',
                             password='passwd',
                             db='db',
                             charset='utf8mb4',
                             cursorclass=pymysql.cursors.DictCursor)

Now I am confident that, when inserting into my UTF8 MySQL table, all of my data is being stored fine. However, I am unsure if problems may arise when using my UTF8 connection object and inserting into the Latin-1 tables. After my first rounds of testing, everything looks great.

Is there anything I have overlooked? Are there any potential issues with inserting UTF8 encoded characters into a Latin-1 table?

Answers:

Hi, UTF-8 and Latin-1 are both simple encodings, but they do not cover the same set of characters, so problems can occur if you pass UTF-8 data that contains characters not in Latin-1. Double encoding can also occur in this process.
Here is a link to insert utf8 to latin

Answered By: jai dutt

It can be done. But… You must set some things correctly, else you will get any of several forms of garbage.
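To make "garbage" concrete, here is a minimal pure-Python sketch (no database involved) of what typically happens when UTF-8 bytes are read as if they were Latin-1 and then encoded again. The sample string is only an illustration and is not taken from the question.

```python
# Illustration only: UTF-8 bytes misread as Latin-1, then re-encoded.
text = "café"

utf8_bytes = text.encode("utf-8")         # the bytes a UTF-8 client sends
misread = utf8_bytes.decode("latin-1")    # same bytes, wrongly treated as Latin-1
double_encoded = misread.encode("utf-8")  # the misread text encoded a second time

print(misread)               # mojibake instead of the original text
print(double_encoded.hex())  # each accented character now takes four bytes
```

Seeing two odd characters where one accented character should be is usually a sign that the declared client encoding and the real byte encoding disagree somewhere.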

If the bytes in your client are UTF-8 encoded, then you must tell MySQL that fact. This is usually done on the connect string. Your charset='utf8mb4' connection argument does that. Here are some Python-specific tips: http://mysql.rjweb.org/doc.php/charcoll#python

Meanwhile, the column(s) in the table(s) can be either latin1 or utf8 (since you are sure the data is limited to the characters that are common between them).
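If you want to verify the "only common characters" assumption before inserting, a rough sketch follows. The helper name fits_latin1 is hypothetical. The check is deliberately conservative: it accepts only ASCII plus U+00A0 through U+00FF, so it may reject a few characters (such as the euro sign) that MySQL's cp1252-based latin1 can actually store.

```python
def fits_latin1(text: str) -> bool:
    """Return True if every character is ASCII or in U+00A0..U+00FF.

    Hypothetical helper; conservative on purpose, see the note above.
    """
    return all(ord(ch) < 0x80 or 0xA0 <= ord(ch) <= 0xFF for ch in text)


print(fits_latin1("café"))    # accented Latin letters pass
print(fits_latin1("日本語"))  # characters outside Latin-1 fail
```

Rows that fail the check could be logged or rejected before they reach a latin1 column, rather than being stored as question marks.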

A character example: é is hex E9 in latin1 and C3A9 in MySQL’s utf8 (or utf8mb4). The conversion will occur during INSERT and SELECT if you correctly state the client's encoding.
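The two hex values can be reproduced in plain Python, which is a quick way to see which encoding a given byte string is in. This only mirrors the byte values mentioned above and does not touch MySQL.

```python
ch = "é"

latin1_hex = ch.encode("latin-1").hex().upper()  # one byte in Latin-1
utf8_hex = ch.encode("utf-8").hex().upper()      # two bytes in UTF-8

print(latin1_hex)
print(utf8_hex)
```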

(For your purposes, either utf8 or utf8mb4 will work.)

If you have further troubles, see Trouble with utf8 characters; what I see is not what I stored and/or provide SHOW CREATE TABLE and hex of some offending character.

Answered By: Rick James

I had the same problem and solved it by using the CONVERT and CAST functions:

mycursor.execute(
    "INSERT INTO `topics` (`title`, parent_id) "
    "VALUES (convert(cast(convert(%s using utf8) as binary) using latin1), 0)",
    (name,),
)

Answered By: Sadegh Ghanbari