How to Deal with Chinese Data Problems in MySQL

2025-04-19 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article explains how to deal with Chinese data problems in MySQL, with the aim of being useful in practice. Let's get straight to it.

Chinese data problem

The Chinese data problem is, at its core, a character set problem.

Computers only work with binary data, while humans work with characters (symbols), so there has to be a mapping between binary values and characters. That mapping is a character set.

When we insert Chinese data into the server through a MySQL client, the insert may fail. A likely cause is that the client and the server are set to different character sets, for example:

If the client's character set is gbk, one Chinese character is encoded as two bytes.

If the server's character set is utf8, a common Chinese character is encoded as three bytes.

If the server interprets the client's gbk bytes as utf8, the conversion goes wrong and the Chinese data cannot be inserted.
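The byte counts above are easy to check outside MySQL. The following is a minimal sketch in Python (used here only for illustration; it is not part of the original walkthrough) that encodes one Chinese character in both character sets and then tries to read the GBK bytes as UTF-8. The sample character and the variable names are assumptions chosen for the example.

```python
# Illustration only: compare how one Chinese character is encoded
# in GBK and in UTF-8. The sample character is an arbitrary choice.
ch = "中"

gbk_bytes = ch.encode("gbk")      # what a GBK client would send
utf8_bytes = ch.encode("utf-8")   # what a UTF-8 server would expect

print(len(gbk_bytes), gbk_bytes.hex())    # 2 d6d0
print(len(utf8_bytes), utf8_bytes.hex())  # 3 e4b8ad

# If the GBK bytes are interpreted as UTF-8, decoding fails,
# because 0xd0 is not a valid UTF-8 continuation byte.
try:
    gbk_bytes.decode("utf-8")
    gbk_is_valid_utf8 = True
except UnicodeDecodeError:
    gbk_is_valid_utf8 = False

print(gbk_is_valid_utf8)  # False
```

The same two bytes that mean one character in GBK are simply invalid input when read as UTF-8, which is the kind of mismatch that makes the insert fail.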

Many characteristics of a database server are controlled by variables on the server, which the server reads to decide how to behave. We can therefore use the following statement to see which character sets the server recognizes:

-- View all character sets recognized by the server
show character set;

The query shows that the server supports a large number of character sets.

Since the server supports so many character sets, it must use some default character set when communicating with clients. We can view these defaults with the following statement:

-- View the server's default character sets for external processing
show variables like 'character_set%';

Note 1: the character set the server assumes by default for data sent by the client (character_set_client) is utf8

Note 2: the connection layer character set (character_set_connection) is utf8

Note 3: the character set of the current database (character_set_database) is utf8

Note 4: the server's default character set for external processing is utf8.

From this query we can see that the server's default character set for external processing is utf8.

Next, we check the character set used by the client through the client's properties:

This reveals the root of the problem: the client uses gbk, while the server's default character set for external processing is utf8, so the two do not match.

Now that the cause is known, the solution is to change the character set in which the server receives client data to gbk.

-- Change the character set in which the server receives client data to GBK (case-insensitive)
set character_set_client = gbk;

Now, when we insert Chinese data again, the insert succeeds. But when we query the data, we find another problem: the Chinese data we just inserted is displayed as garbled text. This is expected, because when querying, the data comes from the server (utf8) and is parsed by the client, and the client only understands gbk.

The solution is therefore to change the character set of the data the server sends to the client to gbk.

-- Change the character set of the data the server sends to the client to GBK (case-insensitive)
set character_set_results = gbk;
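The garbling on the read side, and why converting the results fixes it, can be reproduced with plain byte handling. This is a rough sketch in Python, for illustration only: the sample text is an assumption, and `errors="replace"` stands in for a terminal that prints a placeholder for bytes it cannot decode.

```python
# Illustration only: a GBK client reading UTF-8 bytes from the server.
text = "中文"
server_bytes = text.encode("utf-8")   # bytes a utf8 server would return

# The client interprets those bytes as GBK. Anything GBK cannot
# decode is replaced with U+FFFD instead of raising an error.
shown = server_bytes.decode("gbk", errors="replace")
print(shown == text)   # False: the client displays garbled text

# If the data is converted to GBK before it is sent (roughly what
# character_set_results = gbk asks the server to do), the client
# decodes it correctly.
converted = text.encode("gbk")
print(converted.decode("gbk") == text)   # True
```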

With this, the problem of inserting Chinese data into the server is solved!

In addition, note the form of the SQL statement we used earlier:

-- Modifies the session level only: it applies to the current client connection and is lost when the connection is closed
set variable = value;

This means that every time we restart the client, we have to set each variable again, which is tedious. A quicker way is:

set names charset_name;

For example,

/*
 * Equivalent to:
 * set character_set_client = gbk;
 * set character_set_results = gbk;
 * set character_set_connection = gbk;
 */
set names gbk;

The statement above changes the values of three variables at once. Of these, connection is the connection layer, the intermediate step in character set conversion. Conversion is more efficient if it matches the character sets of client and results, but it still works if it does not.
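To tie the variables together, here is a deliberately simplified model in Python of one insert followed by one select. It is a sketch under stated assumptions, not how MySQL is implemented: the function `insert_and_select` and its parameters are hypothetical names, character_set_connection is left out because it is only an intermediate step, and undecodable bytes are replaced where a real server might reject the statement with an error.

```python
def insert_and_select(text, terminal_charset, character_set_client,
                      character_set_results, storage_charset="utf-8"):
    """Simplified model of inserting text and reading it back."""
    # 1. The terminal encodes what the user typed.
    sent = text.encode(terminal_charset)
    # 2. The server decodes those bytes as character_set_client
    #    and stores the result in the storage character set.
    stored = sent.decode(character_set_client, errors="replace").encode(storage_charset)
    # 3. The server converts stored data to character_set_results.
    returned = stored.decode(storage_charset).encode(character_set_results, errors="replace")
    # 4. The terminal decodes whatever bytes it receives.
    return returned.decode(terminal_charset, errors="replace")


sample = "中文"  # arbitrary sample text

# Server still on utf8 for both directions: the text does not survive.
print(insert_and_select(sample, "gbk", "utf-8", "utf-8") == sample)  # False
# Only character_set_client changed: stored correctly, displayed garbled.
print(insert_and_select(sample, "gbk", "gbk", "utf-8") == sample)    # False
# Both directions set to gbk, as "set names gbk" does: round trip works.
print(insert_and_select(sample, "gbk", "gbk", "gbk") == sample)      # True
```

The three calls correspond to the three stages of the walkthrough: the original mismatch, the state after changing only character_set_client, and the state after set names gbk.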

That covers how to deal with Chinese data problems in MySQL. We hope you find it helpful.
