Shulou (Shulou.com), 05/31. Updated 2025-01-28. From: SLTechnology News&Howtos, Database.
This article explains how to handle the MySQL character set under Linux.
Before version 4.1, MySQL did not support multiple languages: it stored the data you gave it byte for byte and returned it byte for byte. Version 4.1 and later support multiple languages, which means that MySQL transcodes data for you on input and output. The conversion is determined by the client-side encoding and the server-side encoding: on input, data is converted from the client-side encoding to the server-side encoding; on output, it is converted from the server-side encoding back to the client-side encoding.
Starting with MySQL 4.1.x, you can:
Store characters using multiple character sets (Character Set)
Compare strings using multiple collations (Collation)
Mix strings with different character sets or collations on the same server, in the same database, or even in the same table
Define the character set and collation at any level
Character set support (Character Set Support) in MySQL 4.1 and later has two aspects: the character set (Character Set) and the collation (Collation). Character sets and collations have default settings at four levels: server, database, table, and connection.
MySQL character set settings
character_set_server: the default character set for internal operations
character_set_database: the default character set of the currently selected database
vi /etc/my.cnf -- global options
[mysqld]
default-character-set=utf8
This changes only the storage-layer settings (server, database, table, column, system) and has no effect on the communication layer between client and server. (The [mysqld] option default-character-set was removed in MySQL 5.5; later versions use character-set-server instead.)
character_set_client: the character set of the data sent by the client
character_set_connection: the connection-layer character set
character_set_results: the character set of query results
vi /etc/my.cnf -- global options
[mysql]
default-character-set=utf8
To change the values for the current session:
The statement SET NAMES 'x' is equivalent to these three statements:
mysql> SET character_set_client = x;
mysql> SET character_set_results = x;
mysql> SET character_set_connection = x;
character_set_system: the character set of system metadata (column names and so on)
There are also variables that start with collation_, corresponding to the variables above, which describe the collation.
The database default can also be set with SQL:
CREATE DATABASE db_name DEFAULT CHARACTER SET charset_name;
ALTER DATABASE db_name DEFAULT CHARACTER SET charset_name;
Character set conversion process in MySQL
1. When MySQL Server receives a request, it converts the request data from character_set_client to character_set_connection.
2. Before the internal operation, it converts the request data from character_set_connection to the internal operation character set, which is determined as follows:
Use the CHARACTER SET value of the data column.
If that value does not exist, use the DEFAULT CHARACTER SET value of the corresponding table (a MySQL extension, not part of the SQL standard).
If that value does not exist, use the DEFAULT CHARACTER SET value of the corresponding database.
If that value does not exist, use the value of character_set_server.
3. It converts the result of the operation from the internal operation character set to character_set_results.
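The three conversion steps above can be sketched in Python as a rough model. This is only an illustration, not MySQL's implementation: the function names are hypothetical, Python codecs stand in for MySQL character sets (Python's latin1 is ISO-8859-1, while MySQL's latin1 is closer to cp1252), and the fallback order follows the list in step 2.

```python
def resolve_internal_charset(column=None, table=None, database=None, server="latin1"):
    """Return the first character set defined, from column level down to server level."""
    for charset in (column, table, database):
        if charset is not None:
            return charset
    return server  # character_set_server is always defined


def convert(data: bytes, source: str, target: str) -> bytes:
    """Re-encode bytes; characters the target cannot represent become '?'."""
    return data.decode(source).encode(target, errors="replace")


def process_request(data: bytes, client: str, connection: str, internal: str, results: str):
    """Model steps 1 to 3: client -> connection -> internal -> results."""
    received = convert(data, client, connection)      # step 1
    stored = convert(received, connection, internal)  # step 2
    returned = convert(stored, internal, results)     # step 3
    return stored, returned


# Hypothetical setup: the table defines utf8, so it wins over the database default.
internal = resolve_internal_charset(table="utf8", database="latin1")
stored, returned = process_request("中".encode("utf8"), "utf8", "utf8", internal, "utf8")
```

When every stage uses the same character set, as in this call, the bytes pass through unchanged; the problems analyzed later arise when the stages disagree.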
Tools for detecting character set problems
SHOW CHARACTER SET
SHOW COLLATION
SHOW VARIABLES LIKE 'character%'
SHOW VARIABLES LIKE 'collation%'
SQL functions HEX, LENGTH, CHAR_LENGTH
SQL functions CHARSET, COLLATION
STATUS
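To show what HEX, LENGTH, and CHAR_LENGTH reveal, here is a small Python analogue. It is a sketch under assumptions: inspect is a hypothetical helper, and Python codecs stand in for MySQL character sets. LENGTH counts bytes while CHAR_LENGTH counts characters, so a mismatch between the expected and actual values points to a conversion problem.

```python
def inspect(text: str, charset: str = "utf8") -> dict:
    """Mimic HEX(), LENGTH() and CHAR_LENGTH() for text stored in the given charset."""
    raw = text.encode(charset)
    return {
        "hex": raw.hex().upper(),  # like HEX(col)
        "length": len(raw),        # like LENGTH(col): number of bytes
        "char_length": len(text),  # like CHAR_LENGTH(col): number of characters
    }


healthy = inspect("中")
# The same character after its utf8 bytes were misread as latin1 and re-encoded as utf8.
double_encoded = inspect("中".encode("utf8").decode("latin1"))

print(healthy)         # {'hex': 'E4B8AD', 'length': 3, 'char_length': 1}
print(double_encoded)  # {'hex': 'C3A4C2B8C2AD', 'length': 6, 'char_length': 3}
```

A single Chinese character that reports a CHAR_LENGTH of 3 and a LENGTH of 6 in a utf8 column is a typical sign of the double conversion described in the next section.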
Analysis of common problems
utf8-encoded data is inserted into a table whose default character set is utf8 without setting the connection character set first, and the connection character set is set to utf8 when querying.
- At insert time, character_set_client, character_set_connection, and character_set_results are all latin1, following the MySQL server defaults.
- The inserted data goes through the conversion latin1 => latin1 => utf8. In this process, each inserted Chinese character grows from 3 bytes to 6 bytes.
- The query result goes through the conversion utf8 => utf8, so the stored 6 bytes are returned unchanged, which produces garbled text.
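The growth from 3 bytes to 6 bytes can be reproduced with a short Python simulation. This is a model of the scenario, not MySQL itself, and it assumes Python's latin1 codec as a stand-in for MySQL's latin1. The final repair step is an addition for illustration and is not taken from the tests in this article.

```python
original = "中".encode("utf8")  # 3 bytes sent by the client

# Insert: the connection is latin1, so the server treats each byte as one
# latin1 character and converts it to utf8 for storage.
stored = original.decode("latin1").encode("utf8")

# Query: the connection is now utf8, so the stored bytes come back unchanged.
returned = stored
garbled = returned.decode("utf8")  # three accented or invisible characters, not "中"

# Undoing the wrong conversion step recovers the text in this model.
repaired = returned.decode("utf8").encode("latin1").decode("utf8")

print(len(original), len(stored))  # 3 6
print(repaired == "中")            # True
```

Because no information is lost in this direction, the data can in principle be repaired, unlike the second problem below.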
The connection character set is set to utf8 before utf8-encoded data is inserted into a table whose default character set is latin1.
- At insert time, character_set_client, character_set_connection, and character_set_results are all utf8, following the connection character set setting.
- The inserted data goes through the conversion utf8 => utf8 => latin1. Any Unicode character in the original data outside the range U+0000 to U+00FF is converted to "?" (0x3F) because it cannot be represented in latin1. The original content cannot be recovered afterwards, however the connection character set is set.
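The lossy conversion can be illustrated the same way. In this sketch, store_as_latin1 is a hypothetical helper and Python's latin1 codec again stands in for MySQL's latin1; the replacement of unrepresentable characters with "?" mirrors the behavior described above.

```python
def store_as_latin1(text: str) -> bytes:
    """Convert text to latin1, replacing characters outside U+0000..U+00FF with '?'."""
    return text.encode("latin1", errors="replace")


stored = store_as_latin1("é中文")
read_back = stored.decode("latin1")

print(stored)     # b'\xe9??'
print(read_back)  # é??
```

The character é survives because it lies inside the latin1 range, while each Chinese character collapses to the same byte 0x3F, so nothing in the stored data distinguishes one lost character from another.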
Test details:
Environment: Red Hat Enterprise Linux Server release 5.5 (Tikanga), LANG=zh_CN.UTF-8
MySQL 5.0.77
Test 1:
vi /etc/my.cnf
[mysqld]
default-character-set=utf8
mysql> show variables like 'char%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | latin1                     |
| character_set_connection | latin1                     |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | latin1                     |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
Result: the data displays normally on the server.
Accessed from the client (EMS SQL Manager for MySQL on Windows XP), the data is garbled.
Test 2:
set names utf8;
mysql> show variables like 'char%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
The data from Test 1 becomes garbled, and the data from Test 2 displays normally.
The data from Test 2 also displays normally on the client.
If you insert data from the client, as in Test 3, the data is garbled on the server.
Test 3:
set names utf8;
set character_set_results=gb2312;
insert into test values ('test');
Result: the data displays normally on the client and is garbled on the server.
Ideas for dealing with garbled text:
Keep the server-side and client-side character sets consistent.
The server-side encoding is determined by the character set (Character Set) and the collation (Collation).
As mentioned above, MySQL determines the character sets used on the server side and the client side from the following variables:
character_set_client: the client character set
character_set_connection: the character set used for the client-server connection
character_set_results: the character set of the data returned by a SELECT query
character_set_database: the character set used by the database
In other words, as long as these variables use a consistent character set, garbled text should not occur.
In many cases this setting solves the problem, but it cannot rule out garbled text completely. The reason is that character_set_client and character_set_connection only ensure consistency with the character_set_database encoding, while character_set_results ensures that the results returned by SELECT match the encoding of your program. For example, if your database (character_set_database) uses utf8, you should make sure that character_set_client and character_set_connection are also utf8. Your program, however, may not use utf8. If it uses gbk and you set character_set_results to utf8, the output will be garbled; in that case, set character_set_results to gbk so that the results returned by the database match the encoding of your program.
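The gbk example can be modeled in Python as follows. This is a sketch under assumptions: fetch is a hypothetical helper that plays the role of the final conversion to character_set_results, and the program is assumed to decode everything it receives as gbk.

```python
def fetch(stored: bytes, database_charset: str, results_charset: str) -> bytes:
    """Convert stored bytes to the character set requested for query results."""
    return stored.decode(database_charset).encode(results_charset, errors="replace")


stored = "中文".encode("utf8")  # data held in a utf8 database

# character_set_results = gbk: the gbk program decodes the result correctly.
matched = fetch(stored, "utf8", "gbk").decode("gbk")

# character_set_results = utf8: the gbk program misreads the utf8 bytes.
mismatched = fetch(stored, "utf8", "utf8").decode("gbk", errors="replace")

print(matched == "中文")     # True
print(mismatched == "中文")  # False
```

The stored bytes are identical in both cases; only the character set of the returned result decides whether the program can read them.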
Note:
1. Ensure that the data stored in the database matches the database encoding, that is, the data encoding is consistent with character_set_database.
2. Ensure that the communication character set matches the database, that is, character_set_client and character_set_connection are consistent with character_set_database.
3. Ensure that the results returned by SELECT match the program's encoding, that is, character_set_results is consistent with the program's encoding.
4. Ensure that the program's encoding matches the browser's encoding, that is, the program's encoding is consistent with the encoding the browser uses.
Another way to change the settings is to recompile. To change the default server character set and collation when building from source, pass --with-charset and --with-collation to configure. For example:
shell> ./configure --with-charset=latin1