
How to fix garbled text caused by character sets in MySQL

Updated 2025-04-05. From: SLTechnology News&Howtos (shulou), Database

Shulou (Shulou.com), 05/31

This article explains how to fix garbled text caused by mismatched character sets in MySQL.

Fixing garbled text caused by MySQL character sets

character-set-server / default-character-set: the server character set, which is used by default.

character-set-database: the database character set.

character-set-table: the table character set.

Priority increases in that order: a table setting overrides the database setting, which overrides the server setting. In most cases you only need to set character-set-server. If you then create databases and tables without specifying a character set, they all use the server character set.
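
The priority rule above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class EffectiveCharset and its resolve method are hypothetical names (they are not part of MySQL or JDBC), and a null argument stands for "not specified at this level".

```java
public class EffectiveCharset {

    // Hypothetical helper: return the most specific character set that was
    // specified. A null argument means "not specified at this level".
    static String resolve(String server, String database, String table) {
        if (table != null) {
            return table;      // a table setting has the highest priority
        }
        if (database != null) {
            return database;   // then the database setting
        }
        return server;         // otherwise fall back to the server default
    }

    public static void main(String[] args) {
        // Only the server character set is configured.
        System.out.println(resolve("utf8", null, null));
        // The database overrides the server.
        System.out.println(resolve("utf8", "gbk", null));
        // The table overrides both.
        System.out.println(resolve("utf8", "gbk", "gb2312"));
    }
}
```

The sketch shows why setting only the server character set is usually enough: when nothing more specific is given, every lookup falls through to it.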

character-set-client: the client character set. When the client sends a request to the server, the request is encoded in this character set.

character-set-results: the result character set. When the server returns results or messages to the client, they are encoded in this character set.

On the client side, if character-set-results is not defined, the character-set-client character set is used as the default. So you only need to set character-set-client.

To handle Chinese, you can set both character-set-server and character-set-client to GB2312. To handle several languages at once, set both to UTF8.

Chinese text in MySQL

To avoid garbled text, set the following three MySQL system variables to the same character set as the server character set (character-set-server) before executing any SQL statement.

character_set_client: the client character set.

character_set_results: the result character set.

character_set_connection: the connection character set.

You can set all three at once by sending one statement to MySQL: SET NAMES gb2312;

About GBK, GB2312, UTF8

UTF-8 (Unicode Transformation Format, 8-bit) permits a BOM but usually omits it. It is a multi-byte encoding designed for international text: it uses 8 bits (one byte) for each English character and 24 bits (three bytes) for each commonly used Chinese character. UTF-8 covers the characters needed by all countries, so it is a highly portable international encoding. UTF-8 text can be displayed in any browser that supports the UTF-8 character set. For example, Chinese text encoded as UTF-8 displays correctly in an English-language version of IE without the Chinese language support package for IE.

GBK is an extension of the national standard GB2312 and remains compatible with it. GBK encodes each Chinese character in two bytes, with the highest bit set to 1 to distinguish Chinese characters from single-byte ASCII characters such as English letters. GBK covers the Chinese characters in common use, but it is a national encoding and is less portable than UTF-8. In return, Chinese text stored as UTF-8 takes more space in the database than the same text stored as GBK.
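
The size difference described above is easy to check with the standard Java charset API. The sketch below is only an illustration: the class name EncodedLength and the helper byteLength are hypothetical, and it assumes the JDK in use ships the GBK charset (standard JDK distributions do).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodedLength {

    // Number of bytes the text occupies in the given character set.
    static int byteLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        Charset gbk = Charset.forName("GBK");
        String english = "A";
        String chinese = "\u4e2d"; // the character "zhong" (middle), written as an escape

        System.out.println("English, UTF-8: " + byteLength(english, StandardCharsets.UTF_8));
        System.out.println("English, GBK:   " + byteLength(english, gbk));
        System.out.println("Chinese, UTF-8: " + byteLength(chinese, StandardCharsets.UTF_8));
        System.out.println("Chinese, GBK:   " + byteLength(chinese, gbk));
    }
}
```

The English letter takes one byte in both encodings, while the Chinese character takes three bytes in UTF-8 and two in GBK.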

GBK, GB2312, and similar encodings cannot be converted directly to or from UTF8. The conversion has to go through Unicode:

GBK, GB2312 -> Unicode -> UTF8

UTF8 -> Unicode -> GBK, GB2312
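
In Java the intermediate Unicode step is simply a String, so the two-step conversion above can be sketched as follows. The class name ViaUnicode and the transcode helper are hypothetical names chosen for this illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ViaUnicode {

    // Convert bytes from one encoding to another by way of Unicode.
    static byte[] transcode(byte[] input, Charset from, Charset to) {
        String unicode = new String(input, from); // step 1: decode to Unicode
        return unicode.getBytes(to);              // step 2: encode to the target
    }

    public static void main(String[] args) {
        Charset gbk = Charset.forName("GBK");
        // GBK bytes of the character U+4E2D
        byte[] gbkBytes = {(byte) 0xD6, (byte) 0xD0};

        byte[] utf8Bytes = transcode(gbkBytes, gbk, StandardCharsets.UTF_8);
        for (byte b : utf8Bytes) {
            System.out.printf("%02X ", b);
        }
        System.out.println();
    }
}
```

The same helper works in the other direction by swapping the two charset arguments.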

For a website with mostly English text, UTF-8 is recommended because it saves space. However, many forum plug-ins currently support only GBK.

GB2312 is a subset of GBK, and GBK is a subset of GB18030.

GBK is a large character set that includes Han characters used in Chinese, Japanese, and Korean.

GB2312 or GBK is sometimes recommended for Chinese-only websites, but problems can still occur.

To avoid garbled text altogether, use UTF-8. It also makes later internationalization much easier.

UTF-8 can be thought of as one large character set that covers most written languages.

One benefit of UTF-8 is that users in other regions (such as Hong Kong and Taiwan) can read your text without garbling and without installing Simplified Chinese support.

gb2312 is a Simplified Chinese encoding

gbk supports both Simplified and Traditional Chinese

big5 supports Traditional Chinese

utf-8 supports almost all characters
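
The coverage differences listed above can be probed with the standard CharsetEncoder.canEncode method. This is a minimal sketch, assuming the JDK in use provides the GB2312, GBK and Big5 charsets; the class name CharsetCoverage and its helper are hypothetical.

```java
import java.nio.charset.Charset;

public class CharsetCoverage {

    // True if every character of the text can be encoded in the named charset.
    static boolean canEncode(String charsetName, String text) {
        return Charset.forName(charsetName).newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String common = "\u4e2d";       // a character used in both Simplified and Traditional text
        String traditional = "\u9ad4";  // the Traditional form of the character for "body"

        for (String name : new String[] {"GB2312", "GBK", "Big5", "UTF-8"}) {
            System.out.println(name
                    + ": common=" + canEncode(name, common)
                    + ", traditional=" + canEncode(name, traditional));
        }
    }
}
```

A check like this can be useful before choosing a column character set, because text that the target charset cannot encode will not be stored intact.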

First, work out where the garbling occurs:

1. The data is garbled when it is written to the database.

2. The query result is garbled when it is returned.

Which of these situations are you in?

Start by running the following at the mysql command line to view the MySQL character set settings:

mysql> show variables like '%char%';

+--------------------------+----------------------------------------+
| Variable_name            | Value                                  |
+--------------------------+----------------------------------------+
| character_set_client     | gbk                                    |
| character_set_connection | gbk                                    |
| character_set_database   | gbk                                    |
| character_set_filesystem | binary                                 |
| character_set_results    | gbk                                    |
| character_set_server     | gbk                                    |
| character_set_system     | utf8                                   |
| character_sets_dir       | /usr/local/mysql/share/mysql/charsets/ |
+--------------------------+----------------------------------------+

The result shows the character set settings for the client, the database connection, the database, the file system, the query results, the server, and the system.

The file system character set is fixed, and the system and server character sets are determined at installation, so they are not involved in the garbling.

Garbling depends on the character set settings of the client, the database connection, the database, and the query results.

* Note: what counts as the client depends on how you access MySQL. With command-line access, the command-line window is the client. With a connection such as JDBC, the program is the client.

When Chinese data is written to MySQL, it is transcoded at the client, the database connection, and the database in turn.

When a query runs and the results are returned, they are transcoded at the database connection and the client in turn.

It should now be clear that garbling arises at one or more of these links: the database, the client, the query results, and the database connection.

Next, let's solve this problem.

When logging in to the database, connect with mysql --default-character-set=charset -u root -p, where charset is the name of the character set. If you then run show variables like '%char%'; you will see that the character sets of the client, the database connection, and the query results have all been set to the character set chosen at login.

If you are already logged in, run set names charset; to get the same effect. It is equivalent to the following commands:

set character_set_client = charset;

set character_set_connection = charset;

set character_set_results = charset;
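
The equivalence described above can be written down as a small Java helper that produces the three statements for a given character set. This is a sketch only: the class SetNamesExpansion and its expand method are hypothetical, the name check is an added precaution that is not in the original text, and the statements are built as strings rather than sent to a server.

```java
import java.util.Arrays;
import java.util.List;

public class SetNamesExpansion {

    // Build the three statements that the text gives as the equivalent of
    // "set names <charset>". Only the statement text is produced here.
    static List<String> expand(String charset) {
        // Precaution (an assumption of this sketch): accept only simple names
        // such as gbk, utf8 or latin1, so nothing else ends up in the SQL.
        if (!charset.matches("[A-Za-z0-9_]+")) {
            throw new IllegalArgumentException("unexpected charset name: " + charset);
        }
        return Arrays.asList(
                "SET character_set_client = " + charset,
                "SET character_set_connection = " + charset,
                "SET character_set_results = " + charset);
    }

    public static void main(String[] args) {
        for (String statement : expand("gbk")) {
            System.out.println(statement + ";");
        }
    }
}
```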

If you connect to the database through JDBC, write the URL as follows, where charset is the name of the character set:

URL=jdbc:mysql://localhost:3306/abs?useUnicode=true&characterEncoding=charset
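
A small helper can assemble a URL of this shape. The sketch below only builds the string and does not open a connection (that would need a MySQL JDBC driver on the classpath and a running server). The class name JdbcUrl and its build method are hypothetical, and the host, port and database values are the ones from the example URL.

```java
public class JdbcUrl {

    // Assemble a MySQL JDBC URL that declares the connection encoding.
    static String build(String host, int port, String database, String encoding) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database
                + "?useUnicode=true&characterEncoding=" + encoding;
    }

    public static void main(String[] args) {
        System.out.println(build("localhost", 3306, "abs", "UTF-8"));
    }
}
```

Keeping the encoding in one place like this makes it harder for different parts of an application to connect with different character sets.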

Front ends such as JSP pages must also set the matching character set.

You can set the database character set by changing the MySQL startup configuration, or force it for one database by adding default character set charset to the create database statement.

With these settings, the character set is the same throughout writing and reading, and no garbling occurs.

Why is Chinese typed directly at the command line not garbled, even without any of these settings?

At the command line, the character set settings of the client, the database connection, and the query results do not change between writing and reading. The Chinese input goes through a series of conversions on the way in, and the same conversions are reversed on the way out, so it comes back in its original character set and does not look garbled.

But this does not mean that the Chinese text is stored correctly in the database.

For example, suppose the database uses utf8, the client uses GBK, and the connection uses the default ISO8859-1 (latin1 in MySQL). When the client sends a Chinese string such as "中文", it passes the GBK-encoded bytes to the connection layer. The connection layer forwards those bytes to the database as if they were ISO8859-1, and the database stores them as utf8. Reading that field as utf8 yields garbled text, which means the Chinese data was already garbled when it was written to the database.

When the same client runs a query, the write-side conversions are applied in reverse: the wrong utf8 bytes are converted back to the correct GBK bytes and displayed correctly.
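
This round trip can be reproduced without a database by applying the same decode and encode steps in Java. The sketch below is a simulation under stated assumptions, not what the MySQL server does internally: the class MojibakeDemo and its methods are hypothetical, GBK plays the client, ISO-8859-1 plays the connection, and UTF-8 plays the database.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    static final Charset GBK = Charset.forName("GBK");

    // Write path: the client sends GBK bytes, the connection treats them as
    // latin1 characters, and the database stores those characters as utf8.
    static byte[] storedBytes(String typed) {
        byte[] clientBytes = typed.getBytes(GBK);
        String asLatin1 = new String(clientBytes, StandardCharsets.ISO_8859_1);
        return asLatin1.getBytes(StandardCharsets.UTF_8);
    }

    // Read path on the same client: the write-side steps in reverse.
    static String readBySameClient(byte[] stored) {
        String asLatin1 = new String(stored, StandardCharsets.UTF_8);
        byte[] connectionBytes = asLatin1.getBytes(StandardCharsets.ISO_8859_1);
        return new String(connectionBytes, GBK);
    }

    // What a correctly configured utf8 reader would see in the same field.
    static String readAsUtf8(byte[] stored) {
        return new String(stored, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String typed = "\u4e2d\u6587"; // the two-character word for "Chinese"
        byte[] stored = storedBytes(typed);

        System.out.println("stored byte count: " + stored.length);
        System.out.println("same client sees the original: "
                + readBySameClient(stored).equals(typed));
        System.out.println("utf8 reader sees the original: "
                + readAsUtf8(stored).equals(typed));
    }
}
```

The same client recovers the original text only because it repeats the mistake in reverse. Any reader that interprets the stored bytes as the utf8 they claim to be gets garbled text.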

/* Setting the encoding in Java */

First, where can the encoding be set in Java?

For JSP pages (*.jsp), set the encoding in the page directive (the pageEncoding and contentType attributes).

The following methods apply to JSPs, servlets, and actions (*.java):

request.setCharacterEncoding("UTF-8");

response.setCharacterEncoding("UTF-8");

For HTML pages (*.htm; *.html), declare the charset in a meta tag.

Tomcat encoding setting (server.xml)

MySQL commands for setting the encoding

SET character_set_client = utf8;

SET character_set_connection = utf8;

SET character_set_database = utf8;

SET character_set_results = utf8; /* note: this one is especially useful */

SET character_set_server = utf8;

SET collation_connection = utf8_bin;

SET collation_database = utf8_bin;

SET collation_server = utf8_bin;

Configure the default encoding in my.ini

default-character-set=utf8

Set the encoding on the connection

jdbc:mysql://192.168.0.5:3306/test?characterEncoding=utf8

/* Java encoding names and their MySQL equivalents */

Common encodings in Java: UTF-8; GBK; GB2312; ISO-8859-1

Corresponding encodings in MySQL: utf8; gbk; gb2312; latin1
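
The correspondence above can be kept in a lookup table so that application code never has to hard-code both spellings. This is a minimal sketch: the class CharsetNames and its toMysql method are hypothetical, and the table contains only the four pairs listed above.

```java
import java.nio.charset.Charset;
import java.util.HashMap;
import java.util.Map;

public class CharsetNames {

    // Java canonical charset name -> MySQL character set name
    private static final Map<String, String> JAVA_TO_MYSQL = new HashMap<>();

    static {
        JAVA_TO_MYSQL.put("UTF-8", "utf8");
        JAVA_TO_MYSQL.put("GBK", "gbk");
        JAVA_TO_MYSQL.put("GB2312", "gb2312");
        JAVA_TO_MYSQL.put("ISO-8859-1", "latin1");
    }

    // Translate a Java charset name into the MySQL name from the table.
    static String toMysql(String javaName) {
        // Charset.forName resolves the name to its canonical form.
        String canonical = Charset.forName(javaName).name();
        String mysqlName = JAVA_TO_MYSQL.get(canonical);
        if (mysqlName == null) {
            throw new IllegalArgumentException("no mapping for " + canonical);
        }
        return mysqlName;
    }

    public static void main(String[] args) {
        for (String name : new String[] {"UTF-8", "GBK", "GB2312", "ISO-8859-1"}) {
            System.out.println(name + " -> " + toMysql(name));
        }
    }
}
```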

/* Using a filter */

// Encoding filter (SetCharacterEncodingFilter.java)

package com.sorc;

import java.io.*;
import javax.servlet.*;

public class SetCharacterEncodingFilter implements Filter {

    private FilterConfig filterConfig;
    private String encoding = null;

    // Handle the passed-in FilterConfig
    public void init(FilterConfig filterConfig) {
        this.filterConfig = filterConfig;
        // Read the "encoding" init parameter declared in web.xml
        encoding = filterConfig.getInitParameter("encoding");
    }

    // Process the request/response pair
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain filterChain) {
        try {
            // Set the encoding before any request parameter is read
            if (encoding != null) {
                request.setCharacterEncoding(encoding);
            }
            filterChain.doFilter(request, response);
        } catch (ServletException sx) {
            filterConfig.getServletContext().log(sx.getMessage());
        } catch (IOException iox) {
            filterConfig.getServletContext().log(iox.getMessage());
        }
    }

    // Clean up resources
    public void destroy() {
    }
}

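// Filter configuration in web.xml

<filter>
    <filter-name>setcharacterencodingfilter</filter-name>
    <filter-class>com.sorc.SetCharacterEncodingFilter</filter-class>
    <init-param>
        <param-name>encoding</param-name>
        <param-value>utf8</param-value>
    </init-param>
</filter>
<filter-mapping>
    <filter-name>setcharacterencodingfilter</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>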

/* Complete solutions based on the settings above */

1. Using GBK

This is the simplest setup: use gbk for the database, then use a filter that sets the encoding to gbk.

Result: no garbling when adding data, when reading it back, in database management tools, or in exported SQL structure and data.

2. Using UTF-8

Set every encoding to UTF-8:

Database encoding: utf8

Filter encoding: utf8

Database connection: ?characterEncoding=utf8

Then run SET character_set_results = gbk in the database management tool or at the mysql command line.

Result: no garbling when adding data, when reading it back, or in database management tools. Exported SQL structure and data are garbled.

3. Using UTF-8 on the pages and latin1 in the database

JSP, Java, and Tomcat: UTF-8

Filter: utf8

Database connection: ?characterEncoding=latin1

Other database settings: latin1

Then run SET character_set_results = gbk in the database management tool or at the mysql command line.

Result: no garbling when adding data, when reading it back, or in database management tools. Exported SQL structure and data are garbled.

That is how to fix garbled text caused by character sets in MySQL.


© 2024 shulou.com SLNews company. All rights reserved.
