2025-02-25 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
This article explains the URL encoding and the character encodings supported in HTML5. The material is quite practical, so it is shared here in the hope that you gain something from reading it.
URL encoding
URL encoding converts unprintable characters, and characters that have a special meaning in URLs, into a representation that Web browsers and servers understand and generally accept. These characters include:
ASCII control characters - Unprintable characters, typically used for output control. The range is 00-1F in hexadecimal (0-31 in decimal) plus 7F (127 in decimal). The complete coding table is provided below.
Non-ASCII characters - Characters outside the 128-character ASCII set. This range is part of the ISO-Latin character set and covers its "upper half", hexadecimal 80-FF (decimal 128-255). The complete coding table is provided below.
Reserved characters - The dollar sign, ampersand, plus sign, comma, slash, colon, semicolon, equals sign, question mark, and "at" sign. Each of these symbols has a special meaning within a URL, so it must be encoded when it is meant as literal data. The complete coding table is provided below.
Unsafe characters - The space, quotation mark, less-than sign, greater-than sign, pound sign, percent sign, left curly brace, right curly brace, pipe, backslash, caret, tilde, left bracket, right bracket, and grave accent. For various reasons these characters can be misinterpreted when they appear in a URL, so they should always be encoded. The complete coding table is provided below.
The encoding notation replaces the intended character with three characters: a percent sign followed by two hexadecimal digits that give the character's position in the ASCII character set.
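The three-character notation described above can be sketched in a few lines of Python. The helper name percent_encode_char is hypothetical, and the sketch assumes a single-byte code point (0-255), which is the range this article's tables cover.

```python
def percent_encode_char(ch: str) -> str:
    """Return the %XX form of one character with a code point in 0-255."""
    code = ord(ch)
    if code > 0xFF:
        # Assumption of this sketch: only single-byte code points are handled.
        raise ValueError("code point does not fit in one byte")
    # A percent sign followed by exactly two hexadecimal digits.
    return f"%{code:02x}"


for ch in (" ", "&", "\t"):
    print(repr(ch), "->", percent_encode_char(ch))
```

The two-digit zero padding matters: a tab (code 9) must be written as %09, not %9.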
Example
One of the most common special characters is the space. A space cannot be entered directly in a URL. In the ASCII character set a space is hexadecimal 20, so %20 is used to represent a space in a request to a server.
For example, the URL http://www.example.com/new%20pricing.html actually retrieves a document named "new pricing.html" from www.example.com.
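As a rough illustration, Python's standard urllib.parse module performs this conversion in both directions. The file name and host come from the example above; the variable names are only illustrative.

```python
from urllib.parse import quote, unquote

filename = "new pricing.html"

# quote() replaces the space with its percent-encoded form.
encoded = quote(filename)
url = "http://www.example.com/" + encoded

# unquote() reverses the encoding, which is roughly what a server does
# before looking up the document.
decoded = unquote(encoded)

print(url)
print(decoded)
```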
ASCII control character encoding
This includes hexadecimal character codes 00-1F (decimal 0-31) and 7F (decimal 127).
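Decimal  Hex  Character        URL encoding
0        00                    %00
1        01                    %01
2        02                    %02
3        03                    %03
4        04                    %04
5        05                    %05
6        06                    %06
7        07                    %07
8        08   backspace        %08
9        09   tab              %09
10       0a   newline          %0a
11       0b                    %0b
12       0c                    %0c
13       0d   carriage return  %0d
14       0e                    %0e
15       0f                    %0f
16       10                    %10
17       11                    %11
18       12                    %12
19       13                    %13
20       14                    %14
21       15                    %15
22       16                    %16
23       17                    %17
24       18                    %18
25       19                    %19
26       1a                    %1a
27       1b                    %1b
28       1c                    %1c
29       1d                    %1d
30       1e                    %1e
31       1f                    %1f
127      7f                    %7f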
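Because the URL code is derived mechanically from the hexadecimal value, the rows for this range can be generated rather than typed. The following is a minimal sketch; the function name control_code_rows is hypothetical.

```python
def control_code_rows():
    """Yield (decimal, hex_value, url_code) for 0x00-0x1F and 0x7F."""
    for code in list(range(0x00, 0x20)) + [0x7F]:
        hex_value = f"{code:02x}"
        yield code, hex_value, "%" + hex_value


rows = list(control_code_rows())
for decimal, hex_value, url_code in rows:
    print(f"{decimal:>3}  {hex_value}  {url_code}")
```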
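Non-ASCII character encoding
This covers the "upper half" of the ISO-Latin character set, hexadecimal 80-FF (decimal 128-255). The printable characters listed for 80-9F follow the Windows-1252 extension, because ISO-8859-1 itself assigns control codes to that range.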
Decimal  Hex  Character         URL encoding
128      80   €                 %80
129      81   (unused)          %81
130      82   ‚                 %82
131      83   ƒ                 %83
132      84   „                 %84
133      85   …                 %85
134      86   †                 %86
135      87   ‡                 %87
136      88   ˆ                 %88
137      89   ‰                 %89
138      8a   Š                 %8a
139      8b   ‹                 %8b
140      8c   Œ                 %8c
141      8d   (unused)          %8d
142      8e   Ž                 %8e
143      8f   (unused)          %8f
144      90   (unused)          %90
145      91   ‘                 %91
146      92   ’                 %92
147      93   “                 %93
148      94   ”                 %94
149      95   •                 %95
150      96   –                 %96
151      97   (em dash)         %97
152      98   ˜                 %98
153      99   ™                 %99
154      9a   š                 %9a
155      9b   ›                 %9b
156      9c   œ                 %9c
157      9d   (unused)          %9d
158      9e   ž                 %9e
159      9f   Ÿ                 %9f
160      a0   (no-break space)  %a0
161      a1   ¡                 %a1
162      a2   ¢                 %a2
163      a3   £                 %a3
164      a4   ¤                 %a4
165      a5   ¥                 %a5
166      a6   ¦                 %a6
167      a7   §                 %a7
168      a8   ¨                 %a8
169      a9   ©                 %a9
170      aa   ª                 %aa
171      ab   «                 %ab
172      ac   ¬                 %ac
173      ad   (soft hyphen)     %ad
174      ae   ®                 %ae
175      af   ¯                 %af
176      b0   °                 %b0
177      b1   ±                 %b1
178      b2   ²                 %b2
179      b3   ³                 %b3
180      b4   ´                 %b4
181      b5   µ                 %b5
182      b6   ¶                 %b6
183      b7   ·                 %b7
184      b8   ¸                 %b8
185      b9   ¹                 %b9
186      ba   º                 %ba
187      bb   »                 %bb
188      bc   ¼                 %bc
189      bd   ½                 %bd
190      be   ¾                 %be
191      bf   ¿                 %bf
192      c0   À                 %c0
193      c1   Á                 %c1
194      c2   Â                 %c2
195      c3   Ã                 %c3
196      c4   Ä                 %c4
197      c5   Å                 %c5
198      c6   Æ                 %c6
199      c7   Ç                 %c7
200      c8   È                 %c8
201      c9   É                 %c9
202      ca   Ê                 %ca
203      cb   Ë                 %cb
204      cc   Ì                 %cc
205      cd   Í                 %cd
206      ce   Î                 %ce
207      cf   Ï                 %cf
208      d0   Ð                 %d0
209      d1   Ñ                 %d1
210      d2   Ò                 %d2
211      d3   Ó                 %d3
212      d4   Ô                 %d4
213      d5   Õ                 %d5
214      d6   Ö                 %d6
215      d7   ×                 %d7
216      d8   Ø                 %d8
217      d9   Ù                 %d9
218      da   Ú                 %da
219      db   Û                 %db
220      dc   Ü                 %dc
221      dd   Ý                 %dd
222      de   Þ                 %de
223      df   ß                 %df
224      e0   à                 %e0
225      e1   á                 %e1
226      e2   â                 %e2
227      e3   ã                 %e3
228      e4   ä                 %e4
229      e5   å                 %e5
230      e6   æ                 %e6
231      e7   ç                 %e7
232      e8   è                 %e8
233      e9   é                 %e9
234      ea   ê                 %ea
235      eb   ë                 %eb
236      ec   ì                 %ec
237      ed   í                 %ed
238      ee   î                 %ee
239      ef   ï                 %ef
240      f0   ð                 %f0
241      f1   ñ                 %f1
242      f2   ò                 %f2
243      f3   ó                 %f3
244      f4   ô                 %f4
245      f5   õ                 %f5
246      f6   ö                 %f6
247      f7   ÷                 %f7
248      f8   ø                 %f8
249      f9   ù                 %f9
250      fa   ú                 %fa
251      fb   û                 %fb
252      fc   ü                 %fc
253      fd   ý                 %fd
254      fe   þ                 %fe
255      ff   ÿ                 %ff
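The single-byte codes in this range assume a one-byte character set such as ISO-Latin. The sketch below uses Python's urllib.parse to show that form next to the UTF-8 form that most current software produces by default. The comparison with UTF-8 is an addition for context and is not part of the table above. Python emits uppercase hexadecimal digits, which are equivalent to the lowercase digits used in this article.

```python
from urllib.parse import quote, unquote

ch = "é"  # U+00E9, decimal 233

# One byte (0xE9) when the character is encoded as ISO-8859-1 (Latin-1).
latin1_form = quote(ch, encoding="latin-1")

# Two bytes (0xC3 0xA9) when the character is encoded as UTF-8.
utf8_form = quote(ch, encoding="utf-8")

print(latin1_form, utf8_form)

# Decoding must use the same charset that was used for encoding.
print(unquote("%e9", encoding="latin-1"))
```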
Reserved character encoding
The following table is used to encode reserved characters.
Decimal  Hex  Character  URL encoding
36       24   $          %24
38       26   &          %26
43       2b   +          %2b
44       2c   ,          %2c
47       2f   /          %2f
58       3a   :          %3a
59       3b   ;          %3b
61       3d   =          %3d
63       3f   ?          %3f
64       40   @          %40
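Reserved characters only need encoding when they appear as data rather than as URL syntax, for example inside a query-string value. A minimal sketch with Python's urllib.parse follows; the parameter names q and lang are invented for illustration.

```python
from urllib.parse import quote, urlencode

value = "a&b=c/d?e"

# safe="" tells quote() to encode "/" as well; by default it is left alone.
encoded_value = quote(value, safe="")

# urlencode() encodes each value and joins the pairs with "&" and "=".
query = urlencode({"q": value, "lang": "en"})

print(encoded_value)
print(query)
```

Without the encoding, the "&" and "=" inside the value would be read as separators between parameters.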
Unsafe character encoding
The following table is used to encode unsafe characters.
Decimal  Hex  Character  URL encoding
32       20   space      %20
34       22   "          %22
60       3c   <          %3c
62       3e   >          %3e
35       23   #          %23
37       25   %          %25
123      7b   {          %7b
125      7d   }          %7d
124      7c   |          %7c
92       5c   \          %5c
94       5e   ^          %5e
126      7e   ~          %7e
91       5b   [          %5b
93       5d   ]          %5d
96       60   `          %60
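The unsafe set is small enough to handle with a lookup. The sketch below is a hand-rolled illustration, not a library API: the names UNSAFE and encode_unsafe are hypothetical. It is meant for raw text only, because running it on an already encoded string would encode the percent signs a second time.

```python
# The fifteen unsafe characters listed in this section.
UNSAFE = set(' "<>#%{}|\\^~[]`')


def encode_unsafe(text: str) -> str:
    """Percent-encode every unsafe character in raw (not yet encoded) text."""
    return "".join(
        f"%{ord(ch):02x}" if ch in UNSAFE else ch
        for ch in text
    )


print(encode_unsafe("a b|c"))
print(encode_unsafe("{x}"))
```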
Character encoding
Character encoding is a method of converting bytes into characters. To validate or display an HTML document, a program must choose a character encoding. HTML5 authors have three ways to set the character encoding:
HTTP Content-Type header:
If you are writing a CGI program or something similar, you can declare any character encoding with the HTTP Content-Type header:
Here is a simple example:
print "Content-Type: text/html; charset=utf-8\r\n";
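For comparison, the same header can be built and read back in Python. This is a sketch under stated assumptions: the helper names content_type_header and charset_from_header are hypothetical, and the parsing relies on the standard library's email.message.Message, which understands this header syntax.

```python
from email.message import Message


def content_type_header(charset: str = "utf-8") -> str:
    """Build the header line a CGI-style program would print."""
    return f"Content-Type: text/html; charset={charset}\r\n"


def charset_from_header(value: str):
    """Extract the charset parameter from a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = value
    # get_content_charset() returns the charset in lowercase.
    return msg.get_content_charset()


print(content_type_header(), end="")
print(charset_from_header("text/html; charset=UTF-8"))
```

In a real response the header block ends with an empty line before the body starts.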
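<meta> element:
You can declare the encoding with a <meta> element that has a charset attribute. The declaration must appear within the first 1024 bytes of the HTML5 document (early HTML5 drafts said 512 bytes):
Here is a simplified example:
<meta charset="utf-8">
The syntax above replaces the older <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> form, although that older syntax is still allowed.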
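A program can look for such a declaration by scanning only the start of the document. The following is a simplified sketch, not the full prescan algorithm that browsers implement: the regular expression and the name sniff_meta_charset are assumptions made for illustration. The window size is left as a parameter, and its default of 1024 bytes is taken from the current HTML standard.

```python
import re

# Matches charset=... inside a <meta> tag, in both the short and http-equiv forms.
META_CHARSET = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9_\-]+)',
    re.IGNORECASE,
)


def sniff_meta_charset(document: bytes, limit: int = 1024):
    """Return the declared charset found in the first `limit` bytes, or None."""
    match = META_CHARSET.search(document[:limit])
    if match is None:
        return None
    return match.group(1).decode("ascii").lower()


page = b'<!DOCTYPE html><html><head><meta charset="utf-8"><title>t</title>'
print(sniff_meta_charset(page))
```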
Unicode Byte Order Mark (BOM)
A byte order mark (BOM) is the character U+FEFF placed at the beginning of a data stream, where it serves as a signature that defines the byte order and encoding form, primarily of unmarked plaintext files.
Many Windows programs (including Windows Notepad) add the bytes 0xEF, 0xBB, 0xBF to the start of any document saved as UTF-8. This is the UTF-8 encoding of the Unicode byte order mark, commonly called the UTF-8 BOM, even though it has nothing to do with byte order in UTF-8.
For HTML5 documents, a BOM character may be placed at the beginning of the file, where it provides a signature for the encoding used.
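Detecting and stripping the UTF-8 BOM takes only a few lines with Python's standard codecs module. The sample document and the helper name has_utf8_bom are illustrative.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Report whether the data starts with the bytes 0xEF 0xBB 0xBF."""
    return data.startswith(codecs.BOM_UTF8)


# A document as an editor that writes a BOM might save it.
sample = codecs.BOM_UTF8 + "<!DOCTYPE html>".encode("utf-8")

# The "utf-8-sig" codec removes a leading BOM while decoding, if one is present.
text = sample.decode("utf-8-sig")

print(has_utf8_bom(sample))
print(text)
```

Decoding the same bytes with plain "utf-8" would keep U+FEFF as the first character of the text.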
That covers the URL encoding and character encoding supported in HTML5. Some of these points are likely to come up in day-to-day work, and I hope this article helps you learn more about them.