In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-02-24 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)05/31 Report--
What is the role of Collation in SQL Server? aiming at this question, this article introduces the corresponding analysis and solution in detail, hoping to help more partners who want to solve this problem to find a more simple and feasible way.
The composition of Collation
After talking about the storage form of characters, we can talk about what collation is. Collation describes what rules data follow to describe characters in the database, and how characters are sorted and compared. In SQL Server, Collation consists of two parts, for example, a collation in China is Chinese_PRC_CI_AI_WS, the first half refers to the supported character set, corresponding to code page, such as Chinese_PRC corresponding to the code page is 936, all the characters that can be used are defined in this code page. The second half of the CI_AI_WS is used to represent collations, such as:
_ CI (CS) indicates whether letters are case-sensitive, CI is not case-sensitive, CS is case-sensitive. If case-sensitive, lowercase letters are sorted before uppercase; if they are not case-sensitive, they are sorted as case-sensitive letters.
_ AI (AS) indicates whether to distinguish stress, AI does not distinguish, AS does not distinguish. If the stress is not distinguished, then "a" and "a" are sorted as the same characters.
_ KI (KS) indicates whether to distinguish the types of katakana, KI does not distinguish, KS does not distinguish. It is used in Japanese.
_ WI (WS) indicates whether to distinguish full half angles, WI does not distinguish, and WS distinguishes. Half-width is single byte, full-width is double-byte.
Four levels of Collation
Collation has a total of four levels, namely server-level, database-level, column-level and expression-level.
Server-level Collations
The server-level collation is specified when the database instance is installed, and if not specified, windows collation is used as the server-level collation. Windows collation is determined by the regional language in the operating system.
Because we chose Chinese (Simplified,PRC), our default server-level collations is: Chinese_PRC_. Server-level collation is also the default collation for system and user databases. In general, once server-level collation is set, it cannot be changed unless you export all the objects and data in the database, create a master, and then import the data back.
Database-level collations
Database-level collations can be found in create database. When collate, you specify that if you want to modify database-level collations, you can use alter database. Collate to modify. In general, you cannot modify the collations of the system database (master, etc.) unless you use the previously mentioned method of modifying server-level collations to modify the system database.
Column-level collations
When you create or change a table, you can use the COLLATE clause to specify a collation for each string column. Of course, you can also modify column-level collations.
Expression-level collations
Expression-level means to specify collations when executing a sql statement, such as:
SELECT name FROM customer ORDER BY name COLLATE Latin1_General_CS_AI
This query statement indicates that it is sorted according to the collation of Latin1_General_CS_AI. One of the benefits of Expression-level collations is its flexibility.
Lab 1: resolve conflicts caused by collation mismatch
An error is reported when joining tables in two databases with different collations levels. It is possible to use expression-level collations to specify which collations to use to solve the problem. For example, using Collate Database_Default will resolve the conflict by defining or converting the field to the default collation of the current database.
Step1:
Create two tables, the first using the default collation and the second specifying collation on the stuname column.
Create table student1
(
Stuid int not null
Stuname nvarchar (20) not null
)
Create table student2
(
Stuid int not null
Stuname nvarchar (20) COLLATE Latin1_General_CS_AI not null
)
-- find the table connection Step2:
Select s1.century journal s2.* from student1 s1century student 2 s2 where s1.stuname=s2.stuname
Execute the above query and report an error as follows:
Cannot resolve the collation conflict between "Latin1_General_CS_AI" and "Chinese_PRC_CI_AS" in the equal to operation.
Then use Collate Database_Default in expression-level
Select s1.century journal s2.* from student1 s1century student 2 s2 where s1.stuname=s2.stuname Collate Database_Default
The above query was executed successfully.
It is important to note that collation can only be used on columns of string type, and an error will be reported if you use collate on an int column.
Experiment 2: the impact of changing collation on the database.
Objective 1:
Create a database to see if the default database collation of the data is the same as server collation.
Objective 2:
Insert Chinese into the database where database collation is Chinese_PRC_CI_AS, and then change collation to Latin1_General_CS_AI to see if the saved data has changed. If you change collation back to Chinese_PRC_CI_AS again, what will happen?
Purpose 3:
In the case of Latin1_General_CS_AI collation, insert Chinese, what will happen and how to solve it.
Experiment 1: test the storage length of nvarchar and varchar
-- create a database TESTDB3 with the default collation of Chinese_PRC_CI_AS
USE TESTDB1
CREATE TABLE test
(
Lastname NVARCHAR (8) NOT NULL,--nvarchar type, double byte storage
Title VARCHAR (8) NOT NULL,-- varchar type, single byte storage
)
Insert into test values ('name 1' title 1')
Select * from test
Insert into test values ('123456789);-- String or binary data would be truncated.
Insert into test values ('12345678')
Insert into test values (12345678')
Insert into test values ('one two three four five six seven eight', 'one two three four')
Select * from test
-- Summary:
/ *
1.nvarchar (n), which is stored in characters, whether English or Chinese. It can store up to n Chinese or English, but the storage space occupied is 2n+2 bytes.
2.varchar (n) is stored in bytes, with a maximum of n English letters and 2 Chinese characters. But the storage space occupied is n bytes.
* /
-- experiment on the impact of changes in 2:collation on data.
USE TESTDB1
Select * from test
-- step1: change the collation of the modified database from the default Chinese_PRC_CI_AS to Latin1_General_CS_AI. This collation is used in English-speaking countries.
Use master
Alter database TESTDB1 collate Latin1_General_CS_AI
-- step2: the database in the instance that is different from the default collation can be found through the following statement, and our previous TESTDB1 can be found.
Use master
SELECT
NAME AS DATABASE_NAME
, DATABASEPROPERTYEX (NAME,'COLLATION') AS DBCOLLATION
, SERVERPROPERTY ('COLLATION') AS SERVERCOLLATION
FROM SYS.DATABASES
WHERE CONVERT (SYSNAME,DATABASEPROPERTYEX (NAME,'COLLATION')) SERVERPROPERTY ('COLLATION')
-- step3: look at the data in the table after modifying the collation and find that the data has not changed.
USE TESTDB1
Select * from test
-- step4: insert data under the new collation.
Insert into test values ('name 1' title 1');-- after insertion, I found that Chinese has become garbled "?"
Select * from test
-- step5: because lastname is of type nvarchar, we specify nvarchar when inserting.
Insert into test values (N' name 2 'title 2');-- at this time found that the name is not garbled, the title is needless to say or garbled?
Select * from test
-- step6: try inserting nvarchar data in a field of type varchar.
Insert into test values (N 'name 3 minutes N' title 3');-- found that the title is not garbled.
Select * from test
--
-- step7: change the collation from Latin1_General_CS_AI to the default Chinese_PRC_CI_AS.
Use master
Alter database TESTDB1 collate Chinese_PRC_CI_AS
-- step8: look at the data in the table after modifying the collation and find that the data has not changed.
USE TESTDB1
Select * from test
-- step9: insert characters without specifying unicode type. The query is displayed in Chinese, indicating that the current collation uses unicode encoding by default.
Insert into test values ('name 414' title 4')
Select * from test
Summary:
The change of collation does not change the data originally stored in the database, what it is, or what it is after modification, has not changed.
Latin1_General_CS_AI defaults to non-unicode, so inserting Chinese into garbled code under this collation must be specified when inserting data in the form of unicode, that is, adding the keyword "N", while Chinese_PRC_CI_AS this collation uses double-byte code page, which defines all Chinese characters, so there is no need to specify the keyword "N" when inserting data.
You can insert nvarchar data into columns of the varchar data type, that is, using varcha to store unicode data.
The answer to the question about the role of Collation in SQL Server is shared here. I hope the above content can be of some help to you. If you still have a lot of doubts to be solved, you can follow the industry information channel to learn more about it.
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.