How to deal with MySQL character set problems under Linux

2025-01-28 Update From: SLTechnology News&Howtos shulou

Shulou (Shulou.com) 05/31

This article explains how to deal with MySQL character set problems under Linux. Many people are unfamiliar with the topic, so it is shared here for reference.

Before version 4.1, MySQL did not support multiple languages: it saved the data you gave it "intact" and read it back "intact". Version 4.1 and later support multiple languages, which means that MySQL transcodes data for you during input and output. The conversion is determined by the client-side encoding and the server-side encoding: data is converted from the client-side encoding to the server-side encoding on input, and from the server-side encoding to the client-side encoding on output.
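The transcoding rule can be illustrated outside MySQL. The following is a minimal Python sketch, not MySQL's implementation: the function names to_server and to_client are hypothetical, gbk is assumed as the client encoding, and Python codec names stand in for MySQL character set names.

```python
def to_server(client_bytes: bytes, client_charset: str, server_charset: str) -> bytes:
    """Input direction: re-encode bytes from the client encoding to the server encoding."""
    return client_bytes.decode(client_charset).encode(server_charset)


def to_client(server_bytes: bytes, server_charset: str, client_charset: str) -> bytes:
    """Output direction: re-encode stored bytes from the server encoding to the client encoding."""
    return server_bytes.decode(server_charset).encode(client_charset)


sent = "中文".encode("gbk")                  # what a gbk client sends
stored = to_server(sent, "gbk", "utf-8")     # what a utf8 server keeps
returned = to_client(stored, "utf-8", "gbk") # what the client gets back

print(len(sent), len(stored), returned == sent)  # prints "4 6 True"
```

The byte length changes between client and server, but the round trip is lossless because both encodings can represent the characters.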

MySQL 4.1.x began to support the following:

Storing characters using multiple character sets (Character Set)

Comparing strings using multiple collations (Collation)

Mixing strings with different character sets or collations on the same server, in the same database, or even in the same table

Defining the character set and collation at any level

Character set support (Character Set Support) in MySQL 4.1 and later has two aspects: the character set (Character Set) and the collation (Collation). Character sets and collations have default settings at four levels: server, database, table, and connection.

MySQL character set settings

character_set_server: default internal operation character set

character_set_database: the default character set of the currently selected database

vi /etc/my.cnf  -- global options

[mysqld]

default-character-set=utf8

This only changes the settings for the storage layer (server, database, table, column, system) and has no effect on the communication layer between the client and the server. (On MySQL 5.5 and later, the [mysqld] section uses character-set-server instead of default-character-set.)

character_set_client: the character set used by the client source data

character_set_connection: connection layer character set

character_set_results: query result character set

vi /etc/my.cnf  -- global options

[mysql]

default-character-set=utf8

To modify the values for the current session:

The SET NAMES 'x' statement is equivalent to these three statements:

mysql> SET character_set_client = x;

mysql> SET character_set_results = x;

mysql> SET character_set_connection = x;

character_set_system: system metadata (field names, etc.) character set

There are also variables that start with collation_ and correspond to the variables above; they describe the collation.

The database character set can also be set with SQL:

CREATE DATABASE db_name DEFAULT CHARACTER SET charset_name;

ALTER DATABASE db_name DEFAULT CHARACTER SET charset_name;

Character set conversion process in MySQL

1. When MySQL Server receives a request, it converts the request data from character_set_client to character_set_connection.

2. Before the internal operation, it converts the request data from character_set_connection to the internal operation character set, which is determined as follows:

Use the CHARACTER SET value of each data field.

If that value does not exist, use the DEFAULT CHARACTER SET value of the corresponding table (a MySQL extension, not part of the SQL standard).

If that value does not exist, use the DEFAULT CHARACTER SET value of the corresponding database.

If that value does not exist, use the character_set_server value.

3. It converts the result of the operation from the internal operation character set to character_set_results.
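The three steps and the fallback order can be sketched in Python. This is an illustration under stated assumptions, not MySQL's code: internal_charset and handle_request are hypothetical names, None stands for "no value defined at this level", and Python codec names stand in for MySQL character set names.

```python
from typing import Optional


def internal_charset(column: Optional[str], table: Optional[str],
                     database: Optional[str], server: str) -> str:
    """Return the first defined character set, checking the most specific level first."""
    for candidate in (column, table, database):
        if candidate is not None:
            return candidate
    return server


def handle_request(data: bytes, client: str, connection: str,
                   internal: str, results: str) -> bytes:
    """Apply the three conversions in order and return the bytes sent back to the client."""
    on_connection = data.decode(client).encode(connection)        # step 1
    stored = on_connection.decode(connection).encode(internal)    # step 2
    return stored.decode(internal).encode(results)                # step 3


# No column charset is defined, so the table-level default wins over database and server.
internal = internal_charset(column=None, table="gbk", database="utf8", server="latin1")
reply = handle_request("中文".encode("utf8"), "utf8", "utf8", internal, "utf8")
print(internal, reply.decode("utf8") == "中文")  # prints "gbk True"
```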

Some means of detecting character set problems

SHOW CHARACTER SET

SHOW COLLATION

SHOW VARIABLES LIKE 'character%'

SHOW VARIABLES LIKE 'collation%'

SQL functions HEX, LENGTH, CHAR_LENGTH

SQL functions CHARSET, COLLATION

STATUS
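To show what HEX, LENGTH and CHAR_LENGTH reveal, here is a rough Python analogue. It is only a sketch: the helper name inspect_value is hypothetical, and it mimics the three SQL functions with bytes.hex(), the byte count and the character count.

```python
def inspect_value(raw: bytes, charset: str) -> tuple:
    """Return (hex dump, byte length, character length) for a stored value."""
    return raw.hex().upper(), len(raw), len(raw.decode(charset))


# The same character stored in two different character sets.
as_utf8 = inspect_value("中".encode("utf-8"), "utf-8")
as_gbk = inspect_value("中".encode("gbk"), "gbk")

print(as_utf8)  # ('E4B8AD', 3, 1)
print(as_gbk)   # ('D6D0', 2, 1)
```

A byte length or hex dump that does not match the expected encoding of a known character is a quick sign that a conversion went wrong somewhere along the way.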

Analysis of common problems

Case 1: utf8-encoded data is inserted into a table whose default character set is utf8 without setting the connection character set first, and the connection character set is set to utf8 when querying.

- With the MySQL server's default settings, character_set_client, character_set_connection, and character_set_results are all latin1 at insert time.

- The inserted data goes through the character set conversion latin1 => latin1 => utf8. In this process, each inserted Chinese character grows from 3 bytes to 6 bytes.

- The query result goes through the character set conversion utf8 => utf8, so the stored 6 bytes are returned unchanged, which produces garbled text.
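The growth from 3 bytes to 6 bytes can be reproduced with plain byte conversions. This is a minimal sketch that assumes Python's latin-1 codec as a stand-in for MySQL's latin1; the variable names are illustrative.

```python
original = "中".encode("utf-8")   # 3 bytes sent by the client: E4 B8 AD

# Insert: the server treats each byte as one latin1 character and converts it to utf8.
stored = original.decode("latin-1").encode("utf-8")

# Query with a utf8 connection: utf8 => utf8, so the stored bytes come back unchanged.
returned = stored

print(len(original), len(stored))            # prints "3 6"
print(returned.decode("utf-8") == "中")       # prints "False"

# In this sketch the wrong conversion is reversible: undoing it restores the original bytes.
repaired = stored.decode("utf-8").encode("latin-1")
print(repaired == original)                  # prints "True"
```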

Case 2: utf8-encoded data is inserted into a table whose default character set is latin1, with the connection character set set to utf8 beforehand.

- Following the connection character set setting, character_set_client, character_set_connection, and character_set_results are all utf8 at insert time.

- The inserted data goes through the character set conversion utf8 => utf8 => latin1. Any Unicode character in the original data outside the range U+0000 to U+00FF is converted to the "?" (0x3F) symbol because it cannot be represented in the latin1 character set. No matter how the connection character set is set afterwards, the original content cannot be restored.

The tests:

Environment: Red Hat Enterprise Linux Server release 5.5 (Tikanga), LANG=zh_CN.UTF-8

MySQL 5.0.77

Test 1:

vi /etc/my.cnf

[mysqld]

default-character-set=utf8

mysql> show variables like 'char%';

+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | latin1                     |
| character_set_connection | latin1                     |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | latin1                     |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+

Result: the data is displayed normally on the server side.

Accessed from the client (EMS SQL Manager for MySQL on Windows XP), the data is garbled.

Test 2:

mysql> SET NAMES utf8;

mysql> show variables like 'char%';

+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+

The data in Test 1 becomes garbled, and the data in Test 2 is displayed normally.

The data of test 2 is also displayed normally on the client side.

Test 3: insert data from the client

mysql> SET NAMES utf8;

mysql> SET character_set_results = gb2312;

mysql> INSERT INTO test VALUES ('test');

Result: the data is displayed normally on the client but garbled on the server.

Ideas for dealing with garbled text:

Keep the server-side and client-side character sets consistent.

Server-side encoding is determined by the character set (Character Set) and the collation (Collation).

As mentioned above, MySQL determines which character set is used on the server side and the client side based on the following variables:

character_set_client: client character set

character_set_connection: character set used for the client-server connection

character_set_results: character set of the data returned by a SELECT query

character_set_database: character set used by the database

In other words, as long as these variables use a consistent character set, garbled text should not occur.

In many cases this setting solves the garbled text problem, but it cannot rule it out completely. The reason is that character_set_client and character_set_connection only ensure consistency with the character_set_database encoding, while character_set_results ensures that the results returned by SELECT match the program's encoding. For example, if your database (character_set_database) uses utf8, then character_set_client and character_set_connection should also be utf8. Your program, however, may not use utf8. If it uses gbk and character_set_results is also set to utf8, the output will be garbled. In that case, set character_set_results to gbk so that the results returned by the database match the encoding of your program.
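The gbk example can be played through in Python. This is a sketch under assumptions: send_results and program_reads are hypothetical helpers standing in for the server's result conversion and for a program that decodes everything as gbk, and Python codec names stand in for MySQL character set names.

```python
stored = "中文".encode("utf-8")  # bytes held in a utf8 database


def send_results(stored_bytes: bytes, results_charset: str) -> bytes:
    """Convert stored utf8 bytes to the character set named by character_set_results."""
    return stored_bytes.decode("utf-8").encode(results_charset)


def program_reads(received: bytes, program_charset: str = "gbk") -> str:
    """Decode received bytes the way a gbk program would, replacing undecodable bytes."""
    return received.decode(program_charset, errors="replace")


mismatched = program_reads(send_results(stored, "utf-8"))  # results utf8, program gbk
matched = program_reads(send_results(stored, "gbk"))       # results gbk, program gbk

print(mismatched == "中文", matched == "中文")  # prints "False True"
```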

Note:

1. Ensure that the data stored in the database is consistent with the database encoding, that is, the data encoding matches character_set_database.

2. Ensure that the character set used for communication is consistent with that of the database, that is, character_set_client, character_set_connection, and character_set_database match.

3. Ensure that the data returned by SELECT is consistent with the program's encoding, that is, character_set_results matches the encoding of the program.

4. Ensure that the program's encoding is consistent with the encoding used by the browser.

One way to change the setting is by recompiling. To change the default server character set and collation when building from source, pass --with-charset and --with-collation as parameters to configure. For example:

shell> ./configure --with-charset=latin1
