Shulou (Shulou.com), updated 2025-02-02
The database I use is Oracle 11.2.0.4, and the database character set is AL32UTF8. It runs in a virtual machine at 192.168.10.5, and the database name is test1.
The client is Windows 7 on the same physical machine. The experiment connects from this Windows 7 client and looks at how Chinese characters are displayed there.
View the client character set:
C:\Users\Administrator> echo %NLS_LANG%
SIMPLIFIED CHINESE_CHINA.AL32UTF8
Set the client character set:
C:\Users\Administrator> set NLS_LANG=SIMPLIFIED CHINESE_CHINA.ZHS16GBK
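An NLS_LANG value has the form LANGUAGE_TERRITORY.CHARSET, and only the CHARSET part changes between the two settings used in this experiment. As a minimal sketch (the helper name parse_nls_lang is made up for illustration and is not part of any Oracle tool), the value can be split like this:

```python
from typing import NamedTuple


class NlsLang(NamedTuple):
    language: str
    territory: str
    charset: str


def parse_nls_lang(value: str) -> NlsLang:
    """Split an NLS_LANG value of the form LANGUAGE_TERRITORY.CHARSET."""
    locale_part, _, charset = value.strip().partition(".")
    language, _, territory = locale_part.partition("_")
    return NlsLang(language, territory, charset)


before = parse_nls_lang("SIMPLIFIED CHINESE_CHINA.AL32UTF8")
after = parse_nls_lang("SIMPLIFIED CHINESE_CHINA.ZHS16GBK")

# Language and territory stay the same; only the character set differs.
print(before.charset, "->", after.charset)
```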
Session 1: set the client character set to ZHS16GBK (by changing the character set part of the NLS_LANG registry key to ZHS16GBK) and insert the two Chinese characters 中国 ("China") into a table in test1, whose database character set is AL32UTF8.
C:\Users\Administrator> echo %NLS_LANG%
SIMPLIFIED CHINESE_CHINA.ZHS16GBK
C:\Users\Administrator> sqlplus sys/oracle@test1 as sysdba
SQL> create table test (col1 number(1), col2 varchar2(10));
Table created.
SQL> insert into test values (1, '中国');  -- 1 marks session 1
1 row created.
SQL> commit;
Commit complete.
Session 2: set the client character set to AL32UTF8 (by changing the character set part of the NLS_LANG registry key to AL32UTF8), which is the same as the database character set, and insert the same two Chinese characters as in session 1.
C:\Users\Administrator> echo %NLS_LANG%
SIMPLIFIED CHINESE_CHINA.AL32UTF8
C:\Users\Administrator> sqlplus sys/oracle@test1 as sysdba
SQL*Plus: Release 11.2.0.1.0 Production on 11? 28 10:35:33 2016
Copyright (c) 1982, 2010, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> insert into test values (2, '中国');  -- 2 marks session 2
1 row created.
SQL> commit;
Commit complete.
-- session 1
SQL> select * from test;
      COL1 COL2
---------- ----------
         1 中国
         2 ? ??
-- session 2
SQL> select * from test;
      COL1 COL2
---------- ----------
         1 涓浗
         2 中国
The results from session 1 and session 2 show that the same characters (that is, what we typed and saw on screen, 中国, "China") come back garbled depending on the character set configured in the input environment (the client environment variable).
On the ZHS16GBK client, the Chinese text entered by the AL32UTF8 client is garbled; see col2 of the row with col1=2:
SQL> select * from test;
      COL1 COL2
---------- ----------
         1 中国
         2 ? ??
On the AL32UTF8 client, the Chinese text entered by the ZHS16GBK client turns into different characters; see col2 of the row with col1=1:
SQL> select * from test;
      COL1 COL2
---------- ----------
         1 涓浗      -- six bytes read as three GBK characters
         2 中国
SQL> select col1, dump(col2,1016) from test;
      COL1
----------
DUMP(COL2,1016)
--------------------------------------------------
         1
Typ=1 Len=6 CharacterSet=AL32UTF8: e4,b8,ad,e5,9b,bd
         2
Typ=1 Len=4 CharacterSet=AL32UTF8: d6,d0,b9,fa
The same Chinese characters entered from different clients are stored in the database with different encodings.
中国 entered by session 1 is stored in the database as e4,b8,ad,e5,9b,bd.
中国 entered by session 2 is stored in the database as d6,d0,b9,fa.
Session 1:
中国 -> e4,b8,ad,e5,9b,bd. The client is ZHS16GBK and the database is AL32UTF8. The database sees that the client character set differs from its own, so the characters are converted to AL32UTF8 (three bytes per Chinese character).
Session 2:
中国 -> d6,d0,b9,fa. The client is declared as AL32UTF8 and the database is AL32UTF8, so the database is deceived: it performs no conversion and stores the ZHS16GBK bytes (two bytes per character) as they are, whereas a correct AL32UTF8 encoding uses three bytes per Chinese character.
In session 2 the database sees that the client character set is the same as its own, so Oracle performs no conversion, because it assumes the encodings on both sides are identical.
But we have deceived the database: although the client character set is set to match the database, the bytes actually sent are ZHS16GBK-encoded, because that is the encoding Windows uses.
In ZHS16GBK, 中国 is encoded as d6,d0,b9,fa. Oracle does not check these bytes and stores them in the database as they are.
The client character set setting tells the database which encoding the client is sending. If it matches the database character set, the bytes are stored directly; if it differs, they are converted.
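The write path described above can be imitated in a few lines of Python. This is only a sketch under stated assumptions: the function store and the CODECS table are invented for illustration, and Python's "gbk" and "utf-8" codecs stand in for Oracle's ZHS16GBK and AL32UTF8 conversion tables, which are not identical in every detail.

```python
# Python codec names standing in for the Oracle character sets (an assumption
# of this sketch; Oracle's own conversion tables are not used here).
CODECS = {"AL32UTF8": "utf-8", "ZHS16GBK": "gbk"}


def store(client_bytes: bytes, nls_lang_charset: str, db_charset: str) -> bytes:
    """Return the bytes that end up in the database column."""
    if nls_lang_charset == db_charset:
        # Declared character sets match: bytes are stored unchanged and unchecked.
        return client_bytes
    # Declared character sets differ: convert client encoding -> database encoding.
    text = client_bytes.decode(CODECS[nls_lang_charset])
    return text.encode(CODECS[db_charset])


# The Windows console sends GBK bytes, whatever NLS_LANG claims.
typed = "中国".encode("gbk")

session1 = store(typed, "ZHS16GBK", "AL32UTF8")  # setting matches the real bytes
session2 = store(typed, "AL32UTF8", "AL32UTF8")  # setting does not match the real bytes

print(session1.hex(","))
print(session2.hex(","))
```

The two printed byte sequences correspond to the two DUMP results: the converted UTF-8 bytes for session 1 and the untouched GBK bytes for session 2.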
A query reverses the conversion that is applied when data is stored.
Session 1 query
SQL> select * from test;
      COL1 COL2
---------- ----------
         1 中国
         2 ? ??
When session 1 queries, Oracle fetches the two values from the table and converts them to ZHS16GBK using the mapping between the AL32UTF8 and ZHS16GBK character sets. The bytes e4,b8,ad,e5,9b,bd map to the ZHS16GBK code d6,d0,b9,fa, which is 中国, so this value is displayed correctly.
The bytes d6,d0,b9,fa, however, are stored in the database as if they were AL32UTF8.
Interpreted as AL32UTF8, they do not correspond to any character that exists or can be displayed in ZHS16GBK (the character set Windows uses for display), so the AL32UTF8 to ZHS16GBK conversion finds no mapping.
Oracle therefore substitutes "?" characters, which is why the value entered by session 2 appears garbled.
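The lossy step on the read path can be sketched in Python as well. The function name fetch_as_gbk_client is hypothetical, and Python's codecs only approximate Oracle here: the replacement bytes Python produces are not the same ones Oracle produces, but the effect (the original characters cannot be recovered) is the same.

```python
def fetch_as_gbk_client(stored: bytes) -> bytes:
    """Convert column bytes from the database encoding (UTF-8) to GBK.

    Anything that cannot be decoded or mapped is replaced, which is the
    lossy step; Python's replacement bytes differ from Oracle's.
    """
    text = stored.decode("utf-8", errors="replace")
    return text.encode("gbk", errors="replace")


row1 = bytes.fromhex("e4b8ade59bbd")  # stored by session 1 (valid UTF-8)
row2 = bytes.fromhex("d6d0b9fa")      # stored by session 2 (GBK bytes, not valid UTF-8)

expected = "中国".encode("gbk")

# Row 1 converts cleanly back to the GBK bytes the client can display.
print(fetch_as_gbk_client(row1).hex(","))
# Row 2 does not come back as the original characters.
print(fetch_as_gbk_client(row2) == expected)
```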
Session 2 query
SQL> select * from test;
      COL1 COL2
---------- ----------
         1 涓浗
         2 中国
When session 2 queries, Oracle fetches the two values from the table. Because the client character set (NLS_LANG) and the database character set are the same, Oracle skips character conversion
and returns the stored bytes directly to the client. The bytes d6,d0,b9,fa are returned as they are, and the character set actually used on the client side is ZHS16GBK. (Although NLS_LANG says AL32UTF8, Windows itself still uses ZHS16GBK, so the bytes are interpreted as ZHS16GBK when displayed.)
In ZHS16GBK this code corresponds to the two characters 中国, so the value is displayed correctly.
The bytes e4,b8,ad,e5,9b,bd are also returned as they are. ZHS16GBK stores each Chinese character in two bytes, and Windows interprets the bytes as ZHS16GBK for the same reason as above.
So these 6 bytes are read as three ZHS16GBK characters, which is the garbled text 涓浗 that we see.
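What the console does with the unconverted bytes can be sketched as follows. This is an illustration only: display_on_gbk_console is an invented helper, and Python's "gbk" codec is assumed to be close enough to the Windows code page for this purpose.

```python
def display_on_gbk_console(raw: bytes) -> str:
    """Interpret bytes the way a GBK console would.

    Byte pairs that the codec cannot decode become U+FFFD instead of
    raising an error.
    """
    return raw.decode("gbk", errors="replace")


row1 = bytes.fromhex("e4b8ade59bbd")  # UTF-8 bytes stored by session 1
row2 = bytes.fromhex("d6d0b9fa")      # GBK bytes stored by session 2

# Row 2: the GBK bytes are read back exactly as they were typed.
print(display_on_gbk_console(row2) == "中国")
# Row 1: the UTF-8 bytes are paired up two by two and misread as GBK.
print(display_on_gbk_console(row1) == "中国")
```

The first comparison is true and the second is false, which matches the session 2 query result: row 2 looks right only because two wrong steps cancel out.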
Experiment: the effect of the exp/imp client character set on characters
Create two databases:
test1: source database, character set SIMPLIFIED CHINESE_CHINA.AL32UTF8
test2: target database, character set SIMPLIFIED CHINESE_CHINA.ZHS16GBK
Create the user jiang/oracle and create the test table.
oracle@linux5:/oracle> export NLS_LANG="SIMPLIFIED CHINESE_CHINA.ZHS16GBK"   # environment variable
SQL> select col1, dump(col2,1016) from test;
      COL1
----------
DUMP(COL2,1016)
--------------------------------------------------
         2
Typ=1 Len=4 CharacterSet=AL32UTF8: d6,d0,b9,fa
         1
Typ=1 Len=6 CharacterSet=AL32UTF8: e4,b8,ad,e5,9b,bd
Export: Release 11.2.0.4.0 - Production on Monday November 28 17:05:50 2016
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Export done in ZHS16GBK character set and AL16UTF16 NCHAR character set
server uses AL32UTF8 character set (possible charset conversion)
. exporting pre-schema procedural objects and actions
. exporting foreign function library names for user JIANG
. exporting PUBLIC type synonyms
. exporting private type synonyms
. exporting object type definitions for user JIANG
About to export JIANG's objects ...
. exporting database links
. exporting sequence numbers
. exporting cluster definitions
. about to export JIANG's tables via Conventional Path ...
. . exporting table                           TEST          2 rows exported
. exporting synonyms
. exporting views
. exporting stored procedures
. exporting operators
. exporting referential integrity constraints
. exporting triggers
. exporting indextypes
. exporting bitmap, functional and extensible indexes
. exporting posttables actions
. exporting materialized views
. exporting snapshot logs
. exporting job queues
. exporting refresh groups and children
. exporting dimensions
. exporting post-schema procedural objects and actions
. exporting statistics
Export terminated successfully without warnings.
Import into test2, whose database character set is ZHS16GBK:
oracle@linux5:/oracle> imp jiang/oracle@test2 fromuser=jiang touser=jiang file=/backup/test1.dmp log=/backup/testimp.log
Import: Release 11.2.0.4.0 - Production on Monday November 28 17:28:32 2016
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Export file created by EXPORT:V11.02.00 via conventional path
import done in ZHS16GBK character set and AL16UTF16 NCHAR character set
. . importing table                         "TEST"          2 rows imported
Import terminated successfully without warnings.
oracle@linux5:/oracle>
SQL> select * from test;
      COL1 COL2
---------- ----------
         2 ? ??
         1 中国
SQL> select col1, dump(col2,1016) from test;
      COL1
----------
DUMP(COL2,1016)
--------------------------------------------------
         2
Typ=1 Len=4 CharacterSet=ZHS16GBK: a3,bf,3f,3f
         1
Typ=1 Len=4 CharacterSet=ZHS16GBK: d6,d0,b9,fa
Each Chinese character now takes 2 bytes: the data is stored in ZHS16GBK encoding, so the encoding was converted on its way into the target database.
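Decoding the two dumped byte sequences shows what the target database now holds. This is a small check written for illustration, with Python's "gbk" codec standing in for ZHS16GBK:

```python
imported_row2 = bytes.fromhex("a3bf3f3f")  # dump of the row with COL1 = 2 in test2
imported_row1 = bytes.fromhex("d6d0b9fa")  # dump of the row with COL1 = 1 in test2

row2_text = imported_row2.decode("gbk")
row1_text = imported_row1.decode("gbk")

# Row 2 holds three literal question marks: one full-width (U+FF1F)
# followed by two ASCII ones (U+003F).
print([hex(ord(ch)) for ch in row2_text])
# Row 1 is a correct GBK encoding of the original text.
print(row1_text == "中国")
```

The question marks in row 2 are real stored characters, not a display problem, so the original text of that row cannot be recovered from test2.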
Reference http://blog.csdn.net/wuweilong/article/details/39694531
Source database test1 (1), character set AL32UTF8 → EXP client (2) → IMP client (3) → target database test2 (4), character set ZHS16GBK
During migration the data passes through these four points. At each step of the data flow (the three arrows above), the character sets at the two ends of the arrow are compared in turn.
If they are the same, no conversion takes place; if they differ, the data is converted. If every pair of adjacent points has different character sets, the data is converted three times.
Because the database character sets at (1) and (4) are fixed, the best setting is to make the client character sets at (2) and (3) the same as (1).
Character set conversion then happens at most once, at (3) → (4). The precondition is that the character set of (4) must be a superset of the character set of (1). The client character set is set through the NLS_LANG environment variable.
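The comparison along the three arrows can be written down as a short sketch. The function conversions and the labels in the two lists are made up for illustration; the character sets in the first list are the ones used in the experiment above, and the second list is the recommended setting.

```python
from typing import List, Tuple


def conversions(chain: List[Tuple[str, str]]) -> List[str]:
    """List the hops between adjacent points whose character sets differ."""
    hops = []
    for (name_a, cs_a), (name_b, cs_b) in zip(chain, chain[1:]):
        if cs_a != cs_b:
            hops.append(f"{name_a} -> {name_b}")
    return hops


experiment = [
    ("(1) test1", "AL32UTF8"),
    ("(2) exp client", "ZHS16GBK"),
    ("(3) imp client", "ZHS16GBK"),
    ("(4) test2", "ZHS16GBK"),
]
recommended = [
    ("(1) test1", "AL32UTF8"),
    ("(2) exp client", "AL32UTF8"),
    ("(3) imp client", "AL32UTF8"),
    ("(4) test2", "ZHS16GBK"),
]

print(conversions(experiment))
print(conversions(recommended))
```

In the experiment the single conversion happens at export time, (1) to (2); with the recommended setting it moves to the last hop, (3) to (4).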
Writing data: the character conversion process
The database simply checks whether the client character set (from the client environment variable) matches its own. If they match, the characters are not converted and are stored directly in the database; if they differ, they are converted.
The client character set setting tells the database which encoding the client is sending.
When the database sees that the client character set is the same as its own, Oracle performs no conversion, because it assumes the encodings on both sides are identical.
In session 2 we deceived the database: the client character set was set to match the database, but Windows 7 actually sent ZHS16GBK-encoded bytes (because this is the encoding Windows uses).
In ZHS16GBK, 中国 is encoded as d6,d0,b9,fa. Oracle does not check these bytes and stores them in the database as they are.
Querying data: the character conversion process (Windows 7 always displays in ZHS16GBK)
When session 1 queries, Oracle fetches the two values from the table and converts them to ZHS16GBK using the mapping between the AL32UTF8 and ZHS16GBK character sets. The bytes e4,b8,ad,e5,9b,bd map to the ZHS16GBK code d6,d0,b9,fa, which is 中国, so this value is displayed correctly.
The bytes d6,d0,b9,fa, however, are stored in the database as if they were AL32UTF8.
Interpreted as AL32UTF8, they do not correspond to any character that exists or can be displayed in ZHS16GBK, so the AL32UTF8 to ZHS16GBK conversion finds no mapping.
Oracle therefore substitutes "?" characters, which is why the value entered by session 2 appears garbled.