2025-03-26 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article explains how to handle character encoding problems in Java and in network transmission.
I recently ran into encoding problems while working on FTP search, so I looked into them.
A Java String holds Unicode text internally as UTF-16: each char is a two-byte code unit, and characters outside the Basic Multilingual Plane take two chars.
Encoding and decoding in Java work as follows:
String str = "hi好啊me";
byte[] gbkBytes = str.getBytes("GBK"); // encode the String's Unicode content as GBK bytes
String string = new String(gbkBytes, "GBK"); // decode the GBK bytes in gbkBytes back into a Unicode Java String
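As a minimal sketch of the point above, the following prints how many bytes the same string takes in different charsets. The class name StringBytesDemo and its helper methods are illustrative and not part of the original article. It assumes the running JDK ships the GBK charset, and it writes the Chinese characters as Unicode escapes so that the source-file encoding does not matter.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class StringBytesDemo {
    /** Number of bytes the string occupies when encoded with the given charset. */
    static int byteLength(String s, Charset cs) {
        return s.getBytes(cs).length;
    }

    /** Encodes to bytes and decodes back with the same charset. */
    static String roundTrip(String s, Charset cs) {
        return new String(s.getBytes(cs), cs);
    }

    public static void main(String[] args) {
        Charset gbk = Charset.forName("GBK"); // assumption: GBK is available in this JDK
        String str = "hi\u597D\u554Ame";      // the article's sample string "hi好啊me"

        System.out.println("chars (UTF-16 code units): " + str.length());
        System.out.println("GBK bytes:   " + byteLength(str, gbk));
        System.out.println("UTF-8 bytes: " + byteLength(str, StandardCharsets.UTF_8));
        System.out.println("GBK round trip intact: " + roundTrip(str, gbk).equals(str));

        // A code point outside the Basic Multilingual Plane needs two chars.
        String emoji = "\uD83D\uDE00"; // U+1F600
        System.out.println("emoji chars: " + emoji.length()
                + ", code points: " + emoji.codePointCount(0, emoji.length()));
    }
}
```

The same six chars take a different number of bytes in each charset, which is why the receiver has to know which charset the sender used.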
1. Encoding of form data
The problem is that on the network we do not know which encoding the client used for the byte stream it sends (the browser encodes the data before sending it, and browsers differ).
The solution is to encode explicitly on the sending side with URLEncoder.encode(str, "utf-8") and to decode on the receiving side with URLDecoder.decode(strReceive, "utf-8"). The charset passed to the two methods must of course be the same.
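A small sketch of that round trip, and of what happens when the two sides disagree, might look like the following. The class FormEncodingDemo and its two wrapper methods are hypothetical names introduced here for illustration. The wrappers only turn the checked UnsupportedEncodingException into an unchecked one.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class FormEncodingDemo {
    /** Percent-encodes a form value using the named charset. */
    static String encode(String value, String charset) {
        try {
            return URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    /** Decodes a percent-encoded form value using the named charset. */
    static String decode(String value, String charset) {
        try {
            return URLDecoder.decode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String str = "hi\u597D\u554Ame"; // "hi好啊me", written with escapes

        String sent = encode(str, "utf-8");
        System.out.println("sent:            " + sent);

        // Same charset on both sides: the value survives.
        System.out.println("decoded (utf-8): " + decode(sent, "utf-8").equals(str));

        // Different charset on the receiving side: the value is garbled.
        System.out.println("decoded (GBK):   " + decode(sent, "GBK").equals(str));
    }
}
```

Note that URLEncoder implements the application/x-www-form-urlencoded format, so a space becomes '+' rather than %20.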
2. Encoding of URLs
However, the approach above only suits form data, not URLs. The reason is that URLEncoder also encodes '/', so the request fails when the browser sends it. For a URL of the form http://IP/subdirectory, leave the http://IP/ part untouched (it contains no Chinese characters anyway) and encode the rest segment by segment, splitting on '/'.
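To see the problem concretely, the short sketch below passes a path and a whole URL through URLEncoder unchanged. The class name SlashEncodingDemo and the host name h are made up for the example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class SlashEncodingDemo {
    /** Encodes the whole string in one call, which is the approach that breaks URLs. */
    static String encodeWhole(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        // The path separator is escaped, so the server no longer sees two segments.
        System.out.println(encodeWhole("a/b"));

        // The scheme separator "://" is escaped as well.
        System.out.println(encodeWhole("http://h/a"));
    }
}
```

Both '/' and ':' come out percent-encoded, which is why the scheme and host must be kept out of the encoder and the path must be encoded one segment at a time.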
The code is as follows:
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

/**
 * Wrapper around {@link URLEncoder#encode(String, String)} that does not encode
 * the '/' character; all other characters are encoded segment by segment.
 *
 * @param str
 *            the URL to encode
 * @param encoding
 *            the charset name
 * @return the string split on '/', with each segment encoded separately
 *         using the given charset
 * @version: 2012_01_10
 *
 * Note: ':' is not handled. Encoding "http://" directly gives a wrong result,
 * so strip that part first, or use
 * {@link #encodeURLAfterHost(String, String)} instead.
 *
 * Note: encoding '/' along with the rest breaks the URL request.
 */
public static String encodeURL(String str, String encoding) {
    final char splitter = '/';
    try {
        StringBuilder sb = new StringBuilder(2 * str.length());
        int start = 0;
        for (int i = 0; i < str.length(); i++) {
            if (str.charAt(i) == splitter) {
                sb.append(URLEncoder.encode(str.substring(start, i), encoding));
                sb.append(splitter);
                start = i + 1;
            }
        }
        if (start < str.length())
            sb.append(URLEncoder.encode(str.substring(start), encoding));
        return sb.toString();
    } catch (UnsupportedEncodingException e) {
        e.printStackTrace();
    }
    return null;
}

/**
 * Splits the part of the URL after the IP address on '/' and encodes it
 * segment by segment.
 *
 * Wrapper around {@link URLEncoder#encode(String, String)} that encodes
 * neither the '/' character nor the site part (such as ftp://a.b.c.d/; it is
 * detected by looking for three '/' characters, the first two of which must
 * be adjacent). Everything else is encoded segment by segment.
 *
 * @param str
 *            the URL to encode
 * @param encoding
 *            the charset name
 * @return the URL with the part after the IP address split on '/' and each
 *         segment encoded separately; the rest is unchanged
 * @version: 2012_01_10
 *
 * Note: encoding '/' along with the rest breaks the URL request.
 */
public static String encodeURLAfterHost(String str, String encoding) {
    final char splitter = '/';
    int index = str.indexOf(splitter); // position of the first '/'
    index++; // move to the next position
    if (index < str.length() && str.charAt(index) == splitter) { // is the first '/' followed by another '/', as in ftp://
        index++; // start from the next position
        index = str.indexOf(splitter, index); // the third '/', e.g. the last '/' in ftp://anonymous:tmp@g.cn:219.223.168.20/
        if (index > 0) {
            return str.substring(0, index + 1)
                    + encodeURL(str.substring(index + 1), encoding); // e.g. ftp://anonymous:tmp@g.cn:219.223.168.20/Sky
        } else
            return str; // e.g. ftp://anonymous:tmp@g.cn:219.223.168.20
    }
    return encodeURL(str, encoding);
}

/**
 * Splits the part of the URL after the IP address on '/' and decodes it
 * segment by segment. This method is the counterpart of
 * {@link #encodeURLAfterHost(String, String)}.
 *
 * @param str
 *            the URL to decode
 * @param encoding
 *            the charset in which str was encoded
 * @return the URL with the part after the IP address split on '/' and each
 *         segment decoded separately; the rest is unchanged
 * @version: 2012_01_10
 *
 * Note: decoding '/' along with the rest breaks the URL request.
 */
public static String decodeURLAfterHost(String str, String encoding) {
    final char splitter = '/';
    int index = str.indexOf(splitter); // position of the first '/'
    index++; // move to the next position
    if (index < str.length() && str.charAt(index) == splitter) { // is the first '/' followed by another '/', as in ftp://
        index++; // start from the next position
        index = str.indexOf(splitter, index); // the third '/', e.g. the last '/' in ftp://anonymous:tmp@g.cn:219.223.168.20/
        if (index > 0) {
            return str.substring(0, index + 1)
                    + decodeURL(str.substring(index + 1), encoding); // e.g. ftp://anonymous:tmp@g.cn:219.223.168.20/Sky
        } else
            return str; // e.g. ftp://anonymous:tmp@g.cn:219.223.168.20
    }
    return decodeURL(str, encoding);
}

/**
 * This method is the counterpart of {@link #encodeURL(String, String)}.
 *
 * Wrapper around {@link URLDecoder#decode(String, String)} that does not
 * decode the '/' character; all other characters are decoded segment by segment.
 *
 * @param str
 *            the URL to decode
 * @param encoding
 *            the charset in which str was encoded
 * @return the string split on '/', with each segment decoded separately
 *         using the given charset
 * @version: 2012_01_10
 *
 * Note: decoding '/' along with the rest breaks the URL request.
 */
public static String decodeURL(String str, String encoding) {
    final char splitter = '/';
    try {
        StringBuilder sb = new StringBuilder(str.length());
        int start = 0;
        for (int i = 0; i < str.length(); i++) {
            if (str.charAt(i) == splitter) {
                sb.append(URLDecoder.decode(str.substring(start, i), encoding));
                sb.append(splitter);
                start = i + 1;
            }
        }
        if (start < str.length())
            sb.append(URLDecoder.decode(str.substring(start), encoding));
        return sb.toString();
    } catch (UnsupportedEncodingException e) {
        e.printStackTrace();
    }
    return null;
}

3. Can garbled text still be recovered?
The question is as follows:
It appears that replacing utf-8 with iso8859-1 in the figure works, whereas utf-8 fails when the string contains Chinese (the English part is still parsed correctly). After all, a GBK byte stream may be invalid as UTF-8, and how invalid bytes get decoded, and whether that step is reversible, is hard to say. ISO-8859-1 maps every byte value to exactly one character, so decoding and re-encoding with it loses nothing, whereas a UTF-8 decoder replaces each malformed sequence with U+FFFD.
The test code is as follows:

package tests;

import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/**
 * @author LC
 * @version: 2012_01_12
 */
public class TestEncoding {
    static String utf8 = "utf-8";
    static String iso = "iso-8859-1";
    static String gbk = "GBK";

    public static void main(String[] args) throws UnsupportedEncodingException {
        String str = "hi好啊me";
        // System.out.println("Hex value of '?': 3F");
        // System.err
        // .println("When Chinese text is encoded with a charset that cannot represent it (such as iso-8859-1), each such character is replaced by the code for '?'");
        System.out.println("Original string:\t\t\t\t\t\t" + str);
        String utf8_encoded = URLEncoder.encode(str, "utf-8");
        System.out.println("After URLEncoder.encode() with UTF-8:\t\t" + utf8_encoded);
        String gbk_encoded = URLEncoder.encode(str, "GBK");
        System.out.println("After URLEncoder.encode() with GBK:\t\t" + gbk_encoded);
        testEncoding(str, utf8, gbk);
        testEncoding(str, gbk, utf8);
        testEncoding(str, gbk, iso);
        printBytesInDifferentEncoding(str);
        printBytesInDifferentEncoding(utf8_encoded);
        printBytesInDifferentEncoding(gbk_encoded);
    }

    /**
     * Tests whether decoding with the wrong charset and then re-encoding
     * affects the original data.
     *
     * @param str
     *            the input string; any Java String will do
     * @param encodingTrue
     *            charset 1, simulating the encoding of the original data
     * @param encondingMidian
     *            charset 2, simulating the intermediate charset
     * @throws UnsupportedEncodingException
     */
    public static void testEncoding(String str, String encodingTrue,
            String encondingMidian) throws UnsupportedEncodingException {
        System.out.println();
        System.out.printf(
                "%s-encoded bytes -> decoded with %s into a Java String (Unicode) -> encoded back to bytes with %s -> decoded with %s into a Java String\n",
                encodingTrue, encondingMidian, encondingMidian, encodingTrue);
        System.out.println("Original string:\t\t" + str);
        byte[] trueEncodingBytes = str.getBytes(encodingTrue);
        System.out.println("Original bytes:\t\t" + bytesToHexString(trueEncodingBytes)
                + "\t\t// bytes encoded with " + encodingTrue);
        String encodeUseMedianEncoding = new String(trueEncodingBytes, encondingMidian);
        System.out.println("Intermediate string:\t\t" + encodeUseMedianEncoding
                + "\t\t// the original bytes decoded with " + encondingMidian);
        byte[] midianBytes = encodeUseMedianEncoding.getBytes("Unicode");
        System.out.println("Intermediate bytes:\t\t" + bytesToHexString(midianBytes)
                + "\t\t// the Unicode bytes of the intermediate string (matches Java's in-memory data)");
        byte[] redecodedBytes = encodeUseMedianEncoding.getBytes(encondingMidian);
        System.out.println("Re-encoded bytes:\t\t" + bytesToHexString(redecodedBytes)
                + "\t\t// bytes of the intermediate string, to be decoded with " + encodingTrue);
        String restored = new String(redecodedBytes, encodingTrue);
        // equals() rather than endsWith(): a garbled prefix must count as a mismatch
        System.out.println("Decoded string:\t\t" + restored
                + "\t\tSame as the original data? " + restored.equals(str));
    }

    /**
     * Encodes the string as GBK, UTF-8 and iso-8859-1 bytes and prints them.
     *
     * @param str
     * @throws UnsupportedEncodingException
     */
    public static void printBytesInDifferentEncoding(String str)
            throws UnsupportedEncodingException {
        System.out.println("");
        System.out.println("Original String:\t\t" + str + "\t\tlength: " + str.length());
        String unicodeBytes = bytesToHexString(str.getBytes("unicode"));
        System.out.println("Unicode bytes:\t\t" + unicodeBytes);
        String gbkBytes = bytesToHexString(str.getBytes("GBK"));
        System.out.println("GBK bytes:\t\t" + gbkBytes);
        String utf8Bytes = bytesToHexString(str.getBytes("utf-8"));
        System.out.println("UTF-8 bytes:\t\t" + utf8Bytes);
        String iso8859Bytes = bytesToHexString(str.getBytes("iso-8859-1"));
        System.out.println("iso8859-1 bytes:\t" + iso8859Bytes + "\t\tlength: "
                + iso8859Bytes.length() / 3);
        System.out.println("Note that Unicode prepends the two bytes FE FF, followed by two bytes per character");
    }

    /**
     * Converts each byte of the array to a two-digit hexadecimal string,
     * separated by spaces.
     *
     * @param bytes
     *            the byte sequence to convert
     * @return the converted string
     */
    public static final String bytesToHexString(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (int i = 0; i < bytes.length; i++) {
            String hex = Integer.toHexString(bytes[i] & 0xff); // & 0xff clears the sign-extended high bits (all 1s) of a negative byte
            if (hex.length() == 1)
                sb.append('0');
            sb.append(hex);
            sb.append(" ");
        }
        return sb.toString().toUpperCase();
    }
}
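The question from section 3 can also be reduced to a few lines. The sketch below is not the author's test program: MojibakeDemo and throughWrongCharset are hypothetical names, and it assumes the JDK ships the GBK charset. It encodes the text with the real charset, decodes it with the wrong one, then reverses both steps and checks whether the original text comes back.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    /**
     * Encodes s with `actual`, decodes the bytes with `wrong` (producing
     * garbled text), then re-encodes with `wrong` and decodes with `actual`.
     */
    static String throughWrongCharset(String s, Charset actual, Charset wrong) {
        byte[] originalBytes = s.getBytes(actual);
        String garbled = new String(originalBytes, wrong);
        byte[] recoveredBytes = garbled.getBytes(wrong);
        return new String(recoveredBytes, actual);
    }

    public static void main(String[] args) {
        Charset gbk = Charset.forName("GBK"); // assumption: GBK is available in this JDK
        String str = "hi\u597D\u554Ame";      // "hi好啊me", written with escapes

        // ISO-8859-1 maps each byte to one char, so no information is lost.
        String viaIso = throughWrongCharset(str, gbk, StandardCharsets.ISO_8859_1);
        System.out.println("GBK via ISO-8859-1 recovered: " + viaIso.equals(str));

        // The UTF-8 decoder substitutes U+FFFD for malformed bytes, so the
        // original GBK bytes cannot be rebuilt.
        String viaUtf8 = throughWrongCharset(str, gbk, StandardCharsets.UTF_8);
        System.out.println("GBK via UTF-8 recovered:      " + viaUtf8.equals(str));
    }
}
```

In practice this means that text mis-decoded as ISO-8859-1 can be repaired after the fact, while text mis-decoded as UTF-8 generally cannot.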