2025-02-02 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article explains the specific causes of garbled text in MySQL and how to set the UTF8 data format.
One of the most painful experiences when using MySQL is getting garbled text back. Setting the encoding to UTF8 solves this problem. Today we look at why that works and how to set it.
MySQL character format
Character set
In programming languages, we use Unicode to represent Chinese characters so that they are not garbled, and we encode them as UTF8 to reduce network bandwidth and save storage space. Readers who are not clear on the difference between the two can refer to the article on the past and present of the Unicode character set and UTF8 encoding.
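The difference between a Unicode code point and its UTF8 bytes can also be observed from inside MySQL. The following is a minimal sketch that is not part of the original article. It writes the character 中 (U+4E2D) as a hex literal so that the result does not depend on the client's own encoding, and it assumes the server has the utf8 and ucs2 character sets available.

```sql
-- 中 is Unicode code point U+4E2D; its UTF8 encoding is the three bytes E4 B8 AD.
SELECT
    CHAR_LENGTH(_utf8 X'E4B8AD')             AS characters,      -- length in characters
    LENGTH(_utf8 X'E4B8AD')                  AS bytes,           -- length in bytes
    HEX(CONVERT(_utf8 X'E4B8AD' USING ucs2)) AS code_point_hex;  -- same character re-encoded as UCS-2
```

On such a server the query should report one character, three bytes, and 4E2D: the character is the same, only its byte representation changes with the encoding.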
The same applies in MySQL. We can view the encodings (character sets) configured for the current database instance:
mysql> show variables like '%char%';
+--------------------------+----------------------------------+
| Variable_name            | Value                            |
+--------------------------+----------------------------------+
| character_set_client     | latin1                           |
| character_set_connection | latin1                           |
| character_set_database   | latin1                           |
| character_set_filesystem | binary                           |
| character_set_results    | latin1                           |
| character_set_server     | latin1                           |
| character_set_system     | utf8                             |
| character_sets_dir       | /usr/local/mysql/share/charsets/ |
+--------------------------+----------------------------------+
8 rows in set (0.00 sec)
The table shows the current character sets. First, a few values that you do not need to worry about:
character_set_filesystem | binary: the character set used for file names on the file system. The default is binary, which means no conversion.
character_set_system | utf8: the character set the server uses to store its own metadata, such as identifiers. The default is utf8.
character_sets_dir | /usr/local/mysql/share/charsets/: the directory that holds the available character set files.
The remaining parameters are the ones that cause garbled text in day-to-day reads and writes:
- character_set_client: the character set of the data sent by the client
- character_set_connection: the character set that the server converts client data to after receiving it
- character_set_database: the character set of the default database; if there is no default database, the value of character_set_server is used
- character_set_results: the character set of the result set returned to the client
- character_set_server: the default character set of the database server
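The client-facing variables in this list can be inspected and changed for the current session only. A minimal sketch follows. SET NAMES is a standard MySQL statement that sets character_set_client, character_set_connection and character_set_results together; the value utf8 follows this article's choice, and utf8mb4 could be substituted.

```sql
-- Inspect the three session variables that describe the client connection.
SELECT @@character_set_client,
       @@character_set_connection,
       @@character_set_results;

-- Set all three at once, for this session only.
SET NAMES utf8;

-- The server and database defaults are not changed by SET NAMES.
SELECT @@character_set_server,
       @@character_set_database;
```

Because the change is per session, every new connection has to issue it again unless the client library or the server configuration supplies the same default.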
The character set conversion process has four steps:
1. The client sends a request to the database, and the data it sends is encoded in the character_set_client character set.
2. After receiving the client's data, the MySQL instance converts it to the character_set_connection character set.
3. For internal operations, the data is converted to the internal operation character set, which is chosen in this order:
(1) the character set defined on the data column, if one is set
(2) otherwise, the default character set of the table
(3) otherwise, the default character set of the database
(4) otherwise, the value of character_set_server
4. The result is converted from the internal operation character set to character_set_results and returned to the client.
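The fallback order in step 3 can be observed by creating a table in which one column sets its own character set and another does not. This is only a sketch: the names charset_demo and notes are hypothetical placeholders, and it assumes a server that accepts the utf8 name used in this article (newer servers may report it as utf8mb3).

```sql
-- Hypothetical names: charset_demo (database) and notes (table).
CREATE DATABASE charset_demo DEFAULT CHARACTER SET utf8;
USE charset_demo;

CREATE TABLE notes (
    id    INT PRIMARY KEY,
    title VARCHAR(100),                       -- no column setting: falls back to the table, then the database
    body  VARCHAR(100) CHARACTER SET latin1   -- column-level setting takes priority
);

-- The Collation column shows which character set each string column ended up with.
SHOW FULL COLUMNS FROM notes;
```

Here title should inherit utf8 from the database default, while body keeps latin1 because the column-level setting is consulted first.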
Character order
Before we talk about character order, we need to know some basics:
A character is the smallest semantic symbol in human language, for example 'A' or 'B'.
Given a series of characters, each character is assigned a numeric value that represents it; this value is the character's encoding. For example, if we assign 0 to the character 'A' and 1 to the character 'B', then 0 is the encoding of 'A'.
Given a series of characters and their encodings, the set of all these character and encoding pairs is a character set. For example, if the list of characters is {'A', 'B'}, then {'A' => 0, 'B' => 1} is a character set.
Character order (collation) refers to the rules for comparing characters within the same character set.
Once the character order is fixed, it defines for a character set which characters are equivalent and how characters are ordered relative to each other.
Each character order corresponds to exactly one character set, but a character set can have multiple character orders, one of which is its default character order (default collation).
Character order names in MySQL follow a naming convention: they start with the name of the character set they belong to and end with _ci (case-insensitive), _cs (case-sensitive), or _bin (binary, compared by encoded value). For example, under the character order utf8_general_ci, the characters 'a' and 'A' are equivalent.
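The effect of the suffix can be checked directly with a comparison. The sketch below is an illustration rather than part of the original article; it assumes a server on which the utf8 character orders named here are available, and it uses the _utf8 introducer so that the literals do not depend on the connection character set.

```sql
-- List the utf8-family character sets together with their default character order.
SHOW CHARACTER SET LIKE 'utf8%';

-- Compare 'a' and 'A' under a case-insensitive and a binary character order.
SELECT
    _utf8'a' COLLATE utf8_general_ci = _utf8'A' AS equal_under_ci,
    _utf8'a' COLLATE utf8_bin        = _utf8'A' AS equal_under_bin;
```

The first comparison should return 1 (equivalent) and the second 0, since _bin compares the encoded values and 'a' and 'A' are encoded differently.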
So character order is distinct from character set: it governs equality and ordering comparisons of database fields. Let's look at the character orders set on the MySQL instance:
mysql> show variables like 'collation%';
+----------------------+-------------------+
| Variable_name        | Value             |
+----------------------+-------------------+
| collation_connection | latin1_swedish_ci |
| collation_database   | latin1_swedish_ci |
| collation_server     | latin1_swedish_ci |
+----------------------+-------------------+
3 rows in set (0.00 sec)
The common character orders for utf8 are utf8_unicode_ci, utf8_general_ci and utf8_bin. What is the difference between them?
1. utf8_bin: strings are compared by the binary value of each character, so the comparison is case-sensitive. Use it when content must be compared exactly as stored.
2. utf8_general_ci: comparison is fast but less accurate. It is suitable when the data is Chinese and English.
3. utf8_unicode_ci: comparison is more accurate but slightly slower. It is suitable for languages such as German, French and Russian.
For the detailed differences, see the summary of the differences between the collations utf8_unicode_ci and utf8_general_ci in MySQL.
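One commonly cited case where the two case-insensitive character orders disagree is the German letter ß. The sketch below is an illustration under assumptions: it writes ß as the hex literal X'C39F' (its UTF8 bytes) so that the result does not depend on the connection character set, and it assumes the utf8 character orders named above are available on the server.

```sql
-- X'C39F' is the UTF8 encoding of the German letter ß (U+00DF).
SELECT
    _utf8 X'C39F' COLLATE utf8_general_ci = _utf8'ss' AS general_equals_ss,
    _utf8 X'C39F' COLLATE utf8_general_ci = _utf8's'  AS general_equals_s,
    _utf8 X'C39F' COLLATE utf8_unicode_ci = _utf8'ss' AS unicode_equals_ss;
```

The MySQL manual documents that utf8_general_ci treats ß as equal to s, while utf8_unicode_ci treats it as equal to ss, which is the behavior German speakers expect. This is the kind of accuracy the slower character order buys.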
Modify character set and character order
If garbled text appears when working through a MySQL connection, it is almost certainly because the various character set and character order settings are inconsistent. The latin1 default shown above (the default before MySQL 8.0) does not support Chinese. Since we are in China, we choose utf8, which handles Chinese and many other languages well. We therefore need to change the character sets and character orders discussed above to utf8.
You can also choose utf8mb4, which additionally supports storing emoji: MySQL's utf8 stores at most three bytes per character, and emoji need four.
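A minimal sketch of how these settings might be changed follows. It is one possible sequence under stated assumptions, not a procedure taken from the original article: charset_demo and notes are hypothetical database and table names, utf8 can be replaced by utf8mb4 (with a matching character order such as utf8mb4_general_ci), and the SET GLOBAL statements require administrative privileges.

```sql
-- Hypothetical names: charset_demo (database) and notes (table).

-- 1. Session: make the client, connection and result character sets agree.
SET NAMES utf8;

-- 2. Server defaults for new connections. These are lost on restart unless
--    the same values are also placed in the server option file.
SET GLOBAL character_set_server = 'utf8';
SET GLOBAL collation_server = 'utf8_general_ci';

-- 3. Default for tables created in this database from now on.
ALTER DATABASE charset_demo CHARACTER SET utf8 COLLATE utf8_general_ci;

-- 4. Convert an existing table, including its string columns.
ALTER TABLE charset_demo.notes CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;

-- 5. Verify that the values are now consistent.
SHOW VARIABLES LIKE 'character_set%';
SHOW VARIABLES LIKE 'collation%';
```

Converting a table assumes that its existing data is correctly stored in the old character set. If text was already written through a mismatched connection, take a backup and check a few rows before converting, because the conversion can make the damage permanent.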