In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-01-28 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)05/31 Report--
This article shows you what sort rules there are in MySQL, which are concise and easy to understand, which will definitely brighten your eyes. I hope you can get something through the detailed introduction of this article.
Brief introduction
When it comes to collation, it is inseparable from the character set. Strictly speaking, the collation depends on the character set.
Character set is used to define how MySQL stores different characters, and collation generally refers to the rules for comparing and sorting strings in the character set. A character set can correspond to multiple collations, but a collation can only correspond to a specified character set, and two different character sets cannot have the same collation.
In the above figure, the Collation column represents the sorting method, and the Charset column represents the character set. You can see that the utf8 character set corresponds to many sorting methods. The value of each item in that column is different, and each item corresponds to a unique character set, in this case, the utf8 character set.
Default collation
Default collation for character set
Mysql > show character set like 'utf8%' +-+ | Charset | Description | Default collation | Maxlen | + -+ | utf8 | UTF-8 Unicode | utf8_general_ci | 3 | | utf8mb4 | UTF-8 Unicode | utf8mb4_general_ci | 4 | +-+ 2 rows in set (0.00 sec)
Each character set has a default collation, and you can query the default collation for the specified character set through the following SQL statement
The above example is the default sorting method in which the query character set prefix contains utf8, from which you can see:
The default sorting method for the utf8 character set is that the maximum length of characters in the utf8_general_ci character set is 3 bytes.
The default sorting method for the utf8mb4 character set is that the maximum length of characters in the utf8mb4_general_ci character set is 4 bytes.
Default collation for the database
The default character set of the MySQL server can be modified under [mysqld] in the / etc/my.cnf configuration.
For example, you need to set the default character set of the MySQL server to utf8 and the default collation to utf8_general_ci. You only need to add the following subitems under [mysqld] of the / etc/my.cnf configuration file.
Character-set-server=utf8 collation-server=utf8_general_ci
If you do not specify a character set when creating a database, the MySQL server default character set and default collation will be used
Suppose: in the following example, the default character set and default collation for the MySQL server are utf8 and utf8_general_ci, respectively.
Mysql > create database at; Query OK, 1 row affected (0.00 sec) mysql > select SCHEMA_NAME,DEFAULT_CHARACTER_SET_NAME,DEFAULT_COLLATION_NAME from INFORMATION_SCHEMA.SCHEMATA where SCHEMA_NAME='at' +-+ | SCHEMA_NAME | DEFAULT_CHARACTER_SET_NAME | DEFAULT_COLLATION_NAME | +-+- -+-+ | at | utf8 | utf8_general_ci | + -+ 1 row in set (0.00 sec)
In the above example, the at database is created, no character set and collation are specified for the database, and the default character set and collation of the MySQL server are used
Query the default character set and default collation of at database through SQL statement, and the result is the same as the default character set and default collation of MySQL server.
Collation naming and name suffix
Naming
The naming of a collation begins with the name of the character set associated with it, followed by one or more suffixes to represent the specified character set.
For example, utf8_general_ci and utf8_bin are the two collations of the utf8 character set, and latin1_swedish_ci is the collation of the latin1 character set.
Note: the binary character set has only one collation, and its collation name is the same as the character set name, which is also binary
Suffix
The suffix of the collation name has a special meaning. According to the suffix, you can know whether the collation is case-sensitive, accent-sensitive, binary, and so on. Some of the suffixes are listed below.
_ ci: case-insensitive, abbreviation of Case-insensitive _ cs: case-sensitive, abbreviation of Case-sensitive _ ai: insensitive to stress, abbreviation of Accent-insensitive _ as: stress-sensitive, abbreviation of Accent-sensitive _ bin: binary
1: case-insensitive
Mysql > SET NAMES 'utf8' COLLATE' utf8_unicode_ci'; Query OK, 0 rows affected (0.02 sec) mysql > select 'a' ='A'; +-+ |'a' ='A' | +-+ | 1 | +-+ 1 row in set (0.00 sec)
In the example, the collation of utf8_unicode_ci is case-insensitive, so the characters an and An are treated as the same characters.
2: case sensitive
Mysql > SET NAMES 'latin1' COLLATE' latin1_general_cs'; Query OK, 0 rows affected (0 sec) mysql > select'a'='A'; +-+ |'a' ='A'| +-+ | 0 | +-+ 1 row in set (0 sec)
In the example, the collation of latin1_general_cs is case sensitive, so the character an and the character An are considered to be two different characters.
3: binary
Mysql > SET NAMES 'utf8' COLLATE' utf8_bin'; Query OK, 0 rows affected (0 sec) mysql > select'a'='A'; +-+ |'a' ='A'| +-+ | 0 | +-+ 1 row in set (0 sec) mysql > select'à'='a' +-+ |'à'='a' | +-+ | 0 | +-+ 1 row in set (0.00 sec)
The collation used in the above example is utf8_bin. As you can see from the result, the utf8_bin collation is case-sensitive as well as accented characters.
4: whether to distinguish stress
Accented characters are characters similar to à, "," and ". Indistinguishable accent means that the characters an and à, e and", and n and "are considered to be the same character.
For collations that are not binary (with the suffix _ bin), if the collation name suffix does not contain _ ai and _ as, the _ ci in the collation name implies _ ai by default, and _ cs implies _ as by default.
For example, the utf8_unicode_ci collation is case-insensitive and implicitly insensitive to stress
The latin1_general_cs collation is case-sensitive and implicitly case-sensitive
For details, please see the following example
Mysql > SET NAMES 'utf8' COLLATE' utf8_unicode_ci'; Query OK, 0 rows affected (0.02 sec) mysql > select'à'='a'; +-+ |'à'='a' | +-+ | 1 | +-+ 1 row in set (0.00 sec) mysql > SET NAMES 'latin1' COLLATE' latin1_general_cs' Query OK, 0 rows affected (0.00 sec) mysql > select'à'='a sec; +-+ |'à'='a' | +-+ | 0 | +-+ 1 row in set (0.01 sec)
View collation
View the collation of the database
Mysql > select SCHEMA_NAME,DEFAULT_CHARACTER_SET_NAME,DEFAULT_COLLATION_NAME from INFORMATION_SCHEMA.SCHEMATA where SCHEMA_NAME='at' +-+ | SCHEMA_NAME | DEFAULT_CHARACTER_SET_NAME | DEFAULT_COLLATION_NAME | +-+- -+-+ | at | latin1 | latin1_swedish_ci | + -+ 1 row in set (0.00 sec)
The above example is to look at the character set and collation of the at database. From the result, we can see that the collation of the at database is latin1_swedish_ci.
Method 2: directly query the value of collation_database variable
Mysql > use at; Database changed mysql > show variables like 'collation_database' +-+-+ | Variable_name | Value | +-- + | collation_database | latin1_swedish_ci | +- -+-+ 1 row in set (0.00 sec)
In the instance, the use at SQL statement switches to the at database, and then uses the show variables like 'collation_database'; statement to query the collation of the at database.
View the collation of the table
Method 1: by viewing the TABLE_COLLATION field in the INFORMATION_SCHEMA.TABLES table according to the database name and table name, you can get the sorting rules of the specified table in the specified database. The specific example is as follows:
Mysql > SELECT TABLE_SCHEMA, TABLE_NAME,TABLE_COLLATION FROM INFORMATION_SCHEMA.TABLES where TABLE_SCHEMA='test' and TABLE_NAME = 'tc' +-+ | TABLE_SCHEMA | TABLE_NAME | TABLE_COLLATION | +-+ | test | tc | Utf8_general_ci | +-+ 1 row in set (0.00 sec)
Method 2: execute the show create table table name statement to view
Mysql > show create table tc\ G * * 1. Row * * Table: tc Create Table: CREATE TABLE `tc` (`id` int (11) NOT NULL, `a` char (32) NOT NULL, `b` char (32) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL `c` char (32) CHARACTER SET latin1 COLLATE latin1_general_cs NOT NULL) ENGINE=InnoDB DEFAULT CHARSET=utf8 1 row in set (0.00 sec)
In the above example, show create table tc\ G is used to view the creation statement of the tc table. Generally, the collation is specified when the table is created. The collation of the specified table is not shown in the example, because the default collation of the character set is used. The character set of the tc table is utf8, and the default collation is utf8_general_ci.
View the collation of a field
Method 2 above to view the collation section of the table is to view the table creation statement, and the collation of the field can also be seen in the table creation statement. If the collation of the specified field is not displayed, the collation cannot be seen in the statement that creates the table, which means that the field uses the default sorting method of the character set.
Mysql > show create table tc\ G * * 1. Row * * Table: tc Create Table: CREATE TABLE `tc` (`id` int (11) NOT NULL, `a` char (32) NOT NULL, `b` char (32) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL `c` char (32) CHARACTER SET latin1 COLLATE latin1_general_cs NOT NULL) ENGINE=InnoDB DEFAULT CHARSET=utf8 1 row in set (0.00 sec)
In the above example, the character set of table tc is utf8
Field a does not display the specified collation, it uses the default collation utf8_general_ci of the utf8 character set
The collation of field b is utf8_bin
The collation of field c is latin1_general_cs
Modify collation
Modify the collation of the database
The default collation of the database can be modified through the alter database database name collate new collation name SQL statement
Mysql > select SCHEMA_NAME,DEFAULT_CHARACTER_SET_NAME,DEFAULT_COLLATION_NAME from INFORMATION_SCHEMA.SCHEMATA where SCHEMA_NAME='at' +-+ | SCHEMA_NAME | DEFAULT_CHARACTER_SET_NAME | DEFAULT_COLLATION_NAME | +-+- -+-+ | at | utf8 | utf8_general_ci | + -+ 1 row in set (0.00 sec) mysql > alter database at collate utf8_bin Query OK, 1 row affected (0.00 sec) mysql > select SCHEMA_NAME,DEFAULT_CHARACTER_SET_NAME,DEFAULT_COLLATION_NAME from INFORMATION_SCHEMA.SCHEMATA where SCHEMA_NAME='at' +-+ | SCHEMA_NAME | DEFAULT_CHARACTER_SET_NAME | DEFAULT_COLLATION_NAME | +-+- -+-+ | at | utf8 | utf8_bin | + -+ mysql > alter database at collate latin1_swedish_ci Query OK, 1 row affected (0.00 sec) mysql > select SCHEMA_NAME,DEFAULT_CHARACTER_SET_NAME,DEFAULT_COLLATION_NAME from INFORMATION_SCHEMA.SCHEMATA where SCHEMA_NAME='at' +-+ | SCHEMA_NAME | DEFAULT_CHARACTER_SET_NAME | DEFAULT_COLLATION_NAME | +-+- -+-+ | at | latin1 | latin1_swedish_ci | + -+ 1 row in set (0.00 sec)
In the above example, the initial character set and collation of the at database are utf8 and utf8_general_ci, respectively.
SQL statement: alter database at collate utf8_bin; changes the character set and collation to utf8 and utf8_bin
SQL statement: alter database at collate latin1_swedish_ci; changes the character set and collation to latin1 and latin1_swedish_ci
Because the latin1_swedish_ci collation belongs to the latin1 character set, when the at database collation is changed to latin1_swedish_ci, the character set will be changed from utf8 to latin1 accordingly.
Modify the collation of a table
Collate the new collation name through the alter table table name; the statement can modify the collation of the table
Note: the above statement modifies the table collation and does not affect the collation of existing fields, but only the default collation of subsequent newly added fields.
Mysql > show create table tc\ G * * 1. Row * * Table: tc Create Table: CREATE TABLE `tc` (`id` int (11) NOT NULL, `a` char (32) NOT NULL, `b` char (32) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL `c` char (32) CHARACTER SET latin1 COLLATE latin1_general_cs NOT NULL) ENGINE=InnoDB DEFAULT CHARSET=utf8 1 row in set (0.00 sec) mysql > alter table tc collate latin1_swedish_ci Query OK, 0 rows affected (0.00 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql > show create table tc\ G * * 1. Row * * Table: tc Create Table: CREATE TABLE `tc` (`id` int (11) NOT NULL, `a` char (32) CHARACTER SET utf8 NOT NULL `b` char (32) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL, `c` char (32) CHARACTER SET latin1 COLLATE latin1_general_cs NOT NULL) ENGINE=InnoDB DEFAULT CHARSET=latin1 1 row in set (0.00 sec)
In the above example, the initial character set of the tc table is utf8, and the default collation is that utf8_general_ci executes the alter table tc collate latin1_swedish_ci;SQL statement to change the collation of the table to latin1_swedish_ci
Because the latin1_swedish_ci collation belongs to the latin1 character set, the character set of the current table will also be changed to latin1
Before modifying the table collation, the collation of field an is the default utf8_general_ci, which is not shown in the statement that created the table
After the collation is modified, field a displays its original character set because of the change in the table's default collation.
Modify the collation of a field
Mysql > show create table tc\ G * * 1. Row * * Table: tc Create Table: CREATE TABLE `tc` (`id` int (11) NOT NULL, `a` char (32) CHARACTER SET utf8 NOT NULL, `b` char (32) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL `c` char (32) CHARACTER SET latin1 COLLATE latin1_general_cs NOT NULL, `d` char (32) DEFAULT NULL) ENGINE=InnoDB DEFAULT CHARSET=latin1 1 row in set (0.00 sec) mysql > alter table tc modify b char (32) not null collate latin1_general_cs Query OK, 0 rows affected (0.05sec) Records: 0 Duplicates: 0 Warnings: 0 mysql > show create table tc\ G * * 1. Row * * Table: tc Create Table: CREATE TABLE `tc` (`id` int (11) NOT NULL, `a` char (32) CHARACTER SET utf8 NOT NULL `b` char (32) CHARACTER SET latin1 COLLATE latin1_general_cs NOT NULL, `c` char (32) CHARACTER SET latin1 COLLATE latin1_general_cs NOT NULL, `d` char (32) DEFAULT NULL) ENGINE=InnoDB DEFAULT CHARSET=latin1 1 row in set (0.00 sec)
In the above example, the original character set and collation of field b are utf8 and utf8_bin, respectively.
The statement alter table tc modify b char (32) not null collate latin1_general_cs; modifies the collation of field b to latin1_general_cs. Since the latin1_general_cs collation belongs to the latin1 character set, the character set and collation of field b have changed after the modification.
What are the sorting rules in MySQL? have you learned any knowledge or skills? If you want to learn more skills or enrich your knowledge reserve, you are welcome to follow the industry information channel.
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.