
How to Understand the Troublesome Character Set Problem in Web Development

2025-04-04 Update From: SLTechnology News&Howtos shulou

Shulou(Shulou.com)06/02 Report--

This article explains how to make sense of the troublesome character set problem in web development. The content is concise and easy to follow, and I hope the detailed walkthrough gives you something useful.

I remember that when I first did Java web development, encoding problems confused me: text would display correctly for a while and then turn into garbage again. At the time I mostly fixed things without understanding why, because the project schedule came first. Later, when I had time, I went through the whole system and finally worked out the full picture.

In C++ CGI development we like to use latin1, which is a single-byte encoding. It saves space when the data is stored in MySQL, and C++ makes it relatively easy to control data down to the byte level. So once this is wrapped in a framework, it is not much of a problem.
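To make the "single-byte" point concrete, here is a minimal sketch that compares how many bytes the same text takes in ISO-8859-1 (latin1) and in UTF-8. The class and method names are made up for illustration, and the strings use Unicode escapes so the example does not depend on the encoding of the source file.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingSize {
    // Number of bytes the text occupies in the given charset.
    static int byteLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String latinText = "caf\u00e9";        // "café"
        String chineseText = "\u4e2d\u6587";   // "中文"

        // latin1 always uses one byte per character it can represent.
        System.out.println(byteLength(latinText, StandardCharsets.ISO_8859_1));
        // UTF-8 uses one to four bytes per character.
        System.out.println(byteLength(latinText, StandardCharsets.UTF_8));
        System.out.println(byteLength(chineseText, StandardCharsets.UTF_8));
    }
}
```

Note that latin1 cannot represent Chinese characters at all; the space saving only applies when the application treats the stored data as raw bytes.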

In Java, many places involve an encoding conversion, and if any one of them is left unset, garbled text shows up everywhere. Broadly, the places to check are the browser, the server, the database, and the operating system.

Browser:

If you use a template language, the HTML needs to declare the character set used for display. This tells the browser which encoding to use when rendering the page.

As an aside, this is the order in which the browser determines the encoding:

1. If the HTTP header declares a charset, the HTTP header is used.

2. If the HTTP header does not set one, the meta tag is parsed.

3. If there is no meta tag, the browser detects the encoding itself, provided auto-detect is enabled.

4. Otherwise, the character encoding of the local UI is used.
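The precedence above can be sketched in a few lines of Java. This is only an illustration of the ordering, not how any real browser is implemented: the class name, the method signature, and the idea of passing the auto-detect result and the local default in as plain strings are all assumptions made for the example.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CharsetResolver {
    // Matches "charset=..." inside a Content-Type header or a meta tag.
    private static final Pattern CHARSET =
            Pattern.compile("charset\\s*=\\s*[\"']?([\\w.:-]+)", Pattern.CASE_INSENSITIVE);
    // Matches a whole <meta ...> tag.
    private static final Pattern META =
            Pattern.compile("<meta[^>]*>", Pattern.CASE_INSENSITIVE);

    /**
     * @param contentTypeHeader value of the HTTP Content-Type header, or null
     * @param html              page source, or null
     * @param detected          result of auto-detection, or null if disabled
     * @param localDefault      encoding of the local UI
     */
    static String resolve(String contentTypeHeader, String html,
                          String detected, String localDefault) {
        // 1. A charset in the HTTP header wins.
        String fromHeader = find(contentTypeHeader);
        if (fromHeader != null) {
            return fromHeader;
        }
        // 2. Otherwise look for a charset in a meta tag.
        if (html != null) {
            Matcher meta = META.matcher(html);
            while (meta.find()) {
                String fromMeta = find(meta.group());
                if (fromMeta != null) {
                    return fromMeta;
                }
            }
        }
        // 3. Otherwise use the auto-detected encoding, if there is one.
        if (detected != null) {
            return detected;
        }
        // 4. Otherwise fall back to the local default.
        return localDefault;
    }

    // Extracts the charset name from the text, upper-cased, or null if absent.
    private static String find(String text) {
        if (text == null) {
            return null;
        }
        Matcher m = CHARSET.matcher(text);
        return m.find() ? m.group(1).toUpperCase(Locale.ROOT) : null;
    }

    public static void main(String[] args) {
        String page = "<html><head><meta charset=\"gbk\"></head></html>";
        System.out.println(resolve("text/html; charset=utf-8", page, null, "ISO-8859-1"));
        System.out.println(resolve("text/html", page, null, "ISO-8859-1"));
    }
}
```

The practical consequence is that a wrong charset in the HTTP header cannot be corrected by the meta tag, which is why the server-side settings discussed next matter.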

Server:

For dynamic pages such as JSP, you need to set the encoding in the JSP header, using the contentType and pageEncoding attributes of the page directive. With UTF-8 set there, the J2EE server outputs the whole page as UTF-8 when it processes the JSP; otherwise it outputs the page in the default encoding, ISO-8859-1.

As we all know, a JSP is compiled into a servlet. The equivalent setting in a servlet is:

public void service(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException {
    // Declare the content type and character set before writing any output
    response.setContentType("text/html;charset=utf-8");
}

Don't overlook the commonly used Spring utility class CharacterEncodingFilter, a very practical transcoding filter. When you use Struts or Spring MVC, it sets the character encoding for requests that have none set. The configuration is as follows:

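<filter>
    <filter-name>Set CharacterEncoding</filter-name>
    <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
    <init-param>
        <param-name>encoding</param-name>
        <param-value>UTF-8</param-value>
    </init-param>
</filter>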

What if text is still garbled? Parameters passed with GET (handled in doGet) are especially prone to this, because they travel in the URL rather than in the request body. You only need to set the character set on the Tomcat connector, using the URIEncoding attribute of the Connector element (the file is generally at conf/server.xml under the Tomcat installation directory).
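To see what this kind of garbling looks like at the byte level, here is a minimal sketch that assumes the container decoded UTF-8 bytes as ISO-8859-1. The class and method names are hypothetical. Because ISO-8859-1 maps every byte value to a character, the mistake is reversible in code, which is a common workaround when the server configuration cannot be changed.

```java
import java.nio.charset.StandardCharsets;

class QueryParamRepair {
    // Simulates the fault: UTF-8 bytes interpreted as ISO-8859-1.
    static String misdecode(String original) {
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    // Reverses the fault: recover the raw bytes, then decode them as UTF-8.
    static String repair(String garbled) {
        byte[] rawBytes = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(rawBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587";          // "中文"
        String garbled = misdecode(original);       // six latin1 characters
        System.out.println(garbled.length());
        System.out.println(repair(garbled).equals(original));
    }
}
```

Setting the encoding on the connector is still the cleaner fix, since it corrects every parameter in one place instead of at each call site.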

During development, don't forget that each Java source file has its own encoding. Right-click the class file and view its properties to check it.

Suppose you forgot to change the file encoding during development, so the files use the Windows default, GBK, and now you have to deploy to a Linux machine that uses UTF-8. With that many files you cannot convert them one by one. The fix is actually simple: pass the option -Dfile.encoding=GBK on the java command line.
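The option above changes the JVM's default charset. An alternative in code is to name the charset explicitly wherever bytes are turned into text, so the result does not depend on the platform. The sketch below illustrates the difference; the class and method names are invented for the example, and it assumes the JDK in use ships the GBK charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ExplicitCharset {
    // Decodes bytes with a charset chosen by the caller, not the JVM default.
    static String decode(byte[] data, Charset charset) {
        return new String(data, charset);
    }

    public static void main(String[] args) {
        Charset gbk = Charset.forName("GBK");       // assumed to be available
        String original = "\u4e2d\u6587";           // "中文"
        byte[] gbkBytes = original.getBytes(gbk);   // what a GBK-encoded file contains

        // The JVM default is what -Dfile.encoding influences.
        System.out.println("JVM default: " + Charset.defaultCharset());
        // Decoding with the right charset restores the text.
        System.out.println(decode(gbkBytes, gbk).equals(original));
        // Decoding the same bytes as UTF-8 produces garbage.
        System.out.println(decode(gbkBytes, StandardCharsets.UTF_8).equals(original));
    }
}
```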

When compiling Java code with Ant, set the source character set on the javac task. That way, log messages written to a file or to the console will not be garbled.

With Maven, the character set used at compile time is set like this:

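<plugin>
    <artifactId>maven-compiler-plugin</artifactId>
    <version>2.5</version>
    <configuration>
        <optimize>true</optimize>
        <showDeprecation>false</showDeprecation>
        <debuglevel>lines,source</debuglevel>
        <source>1.6</source>
        <target>1.6</target>
        <encoding>UTF-8</encoding>
        <meminitial>128m</meminitial>
        <maxmem>768m</maxmem>
    </configuration>
</plugin>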

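The sqlmap SQL XML files and the Spring XML files also need an encoding set, because they are moved across platforms. Add an XML declaration with an encoding attribute at the top of each file:

<?xml version="1.0" encoding="UTF-8"?>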

Database:

Here are the character set settings for MySQL, the most widely used database. Open the MySQL configuration file (usually /etc/my.cnf on Linux; my.ini in the MySQL installation directory on Windows). The settings are as follows:

[mysqld]
character_set_server = utf8

[mysql]
default-character-set = utf8

JDBC also needs the encoding set in the connection URL:

jdbc:mysql://192.168.0.237:3306/dzh_db?useUnicode=true&characterEncoding=UTF-8

With all of these set, Chinese text generally causes no problems.

Recently, however, I hit a rather amusing problem. I used to think that once every character set was configured, any data could be written to the database, but some characters still could not, such as ●■★. When I converted these characters to bytes, they turned out not to be three-byte UTF-8 sequences, which left me sweating. After some searching, I found that this can be handled by filtering out the special UTF-8 characters.

public static String Utf2String(byte buf[]) {
    int len = buf.length;
    StringBuffer sb = new StringBuffer(len / 2);
    for (int i = 0; i < len; i++) {
        if (by2int(buf[i]) ...
        // ... the remainder of this listing is garbled and truncated in the source
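The listing above is cut off and cannot be recovered in full. As a separate minimal sketch of the filtering idea, not the author's original method, the code below removes every character whose UTF-8 form is longer than three bytes. It assumes those are the characters the database rejects, since MySQL's utf8 character set stores at most three bytes per character. The class and method names are hypothetical.

```java
class Utf8Filter {
    /**
     * Removes supplementary code points (U+10000 and above), which are
     * exactly the characters that need four bytes in UTF-8.
     */
    static String stripFourByteChars(String input) {
        StringBuilder sb = new StringBuilder(input.length());
        input.codePoints()
             .filter(cp -> !Character.isSupplementaryCodePoint(cp))
             .forEach(sb::appendCodePoint);
        return sb.toString();
    }

    public static void main(String[] args) {
        // "a", then U+1F600 written as a surrogate pair, then "b"
        String withFourByteChar = "a\uD83D\uDE00b";
        System.out.println(stripFourByteChars(withFourByteChar));
    }
}
```

If the characters must be kept rather than dropped, the usual alternative is to switch the MySQL column and connection to the utf8mb4 character set, which stores four-byte sequences.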
