2025-03-31 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
This article explains "what are the types of coding": the common character and data encodings, what each one is for, and how they relate to one another. The explanations are short and meant to be easy to follow.
ASCII
ASCII, the American Standard Code for Information Interchange, is a character encoding developed in the United States. It defines 128 characters: the English letters, digits, punctuation marks, and control characters.
Each ASCII code occupies only the low 7 bits of a byte; the highest bit is always 0.
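The 7-bit rule can be checked directly. The following is a minimal Python sketch; the helper name `high_bit_clear` and the sample string are illustrative assumptions, not part of any standard.

```python
# Every ASCII character has a code from 0 to 127, so the top bit of its byte is 0.
text = "Hello, ASCII!"
data = text.encode("ascii")  # raises UnicodeEncodeError for non-ASCII input


def high_bit_clear(raw: bytes) -> bool:
    """Return True when no byte in raw has its highest bit (0x80) set."""
    return all(b & 0x80 == 0 for b in raw)


# Show a few characters with their decimal code and 8-bit binary form.
for ch in "Az09":
    print(ch, ord(ch), format(ord(ch), "08b"))

print(high_bit_clear(data))
```

For example, `A` has code 65, which is `01000001` in binary: the leading bit is 0 and the remaining seven bits carry the value.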
BASE64
BASE64 is a set of binary-to-text encoding rules that represent binary data as an ASCII string by treating the data as a sequence of radix-64 digits.
BASE64 is commonly used when binary data has to be stored on or transmitted over a medium designed to handle text. Encoding the data this way keeps it intact, because the medium never sees byte values that it might modify in transit.
BASE64 converts every three 8-bit bytes into four 6-bit groups (3 × 8 = 4 × 6 = 24 bits). Each 6-bit group is then padded with two high-order zero bits to form a full 8-bit byte, whose value selects one character from a 64-character alphabet. The encoded string is therefore about 1/3 longer than the original.
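The three-bytes-to-four-characters step can be sketched by hand and compared with Python's standard `base64` module. The function `encode_group` and the `ALPHABET` constant below are illustrative names; the sketch deliberately handles only complete 3-byte groups and leaves out the `=` padding used for shorter input.

```python
import base64

# The standard Base64 alphabet: index 0..63 maps to one ASCII character.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def encode_group(three: bytes) -> str:
    """Encode exactly three bytes as four Base64 characters."""
    if len(three) != 3:
        raise ValueError("this sketch handles only full 3-byte groups")
    bits = int.from_bytes(three, "big")  # one 24-bit integer
    # Cut the 24 bits into four 6-bit indexes, most significant first.
    indexes = [(bits >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    return "".join(ALPHABET[i] for i in indexes)


sample = b"Man"
manual = encode_group(sample)
library = base64.b64encode(sample).decode("ascii")
print(manual, library)
```

Both calls produce `TWFu` for the input `Man`. Encoding 300 bytes gives 400 characters, which is the 1/3 growth described above.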
BCD
BCD stands for Binary-Coded Decimal; it is also known as binary-decimal code.
BCD is a form of binary encoding in which each decimal digit is encoded separately as a 4-bit binary value.
When a decimal number has to be stored, converting each digit directly to its BCD code is much more efficient than converting the whole number to pure binary by repeated division and collecting the remainders.
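As an illustration, here is a small Python sketch of packed BCD, where two decimal digits share one byte. Packed BCD is only one of several BCD layouts, and the function names `to_packed_bcd` and `from_packed_bcd` are assumptions made for this example.

```python
def to_packed_bcd(number: int) -> bytes:
    """Store each decimal digit in 4 bits, two digits per byte."""
    if number < 0:
        raise ValueError("this sketch handles non-negative integers only")
    digits = str(number)
    if len(digits) % 2:
        digits = "0" + digits  # pad to an even number of digits
    return bytes(
        (int(digits[i]) << 4) | int(digits[i + 1])
        for i in range(0, len(digits), 2)
    )


def from_packed_bcd(raw: bytes) -> int:
    """Decode packed BCD back to an integer."""
    value = 0
    for byte in raw:
        high, low = byte >> 4, byte & 0x0F
        if high > 9 or low > 9:
            raise ValueError("nibble is not a decimal digit")
        value = value * 100 + high * 10 + low
    return value


print(to_packed_bcd(1234).hex())         # digits are readable in the hex dump
print((1234).to_bytes(2, "big").hex())   # pure binary form of the same number
```

The BCD bytes for 1234 print as `1234`, while the pure binary form of the same number prints as `04d2`. The digit-by-digit mapping needs no division at all.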
UNICODE
Unicode, as its name suggests, is a unified encoding for all symbols.
There are many encoding schemes in the world, and the same binary value can be interpreted as different symbols under different schemes.
Therefore, to open a text file you must know which encoding it uses; if you interpret it with the wrong one, the text comes out garbled.
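The garbling is easy to reproduce. The Python sketch below decodes the same bytes with two different encodings; the sample string and the choice of Latin-1 as the "wrong" encoding are assumptions made for illustration.

```python
original = "\u4e2d\u6587"        # the two Chinese characters for "Chinese"
raw = original.encode("utf-8")   # bytes as they would be stored in a file

right = raw.decode("utf-8")      # decoded with the encoding used to write it
wrong = raw.decode("latin-1")    # Latin-1 accepts any byte, so no error is raised

print(len(raw), repr(right), repr(wrong))
```

The bytes are identical in both cases. Decoding them as Latin-1 yields six unrelated characters instead of two Chinese ones, and no exception signals the mistake.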
Unicode itself is only a character set: it assigns a number (a code point) to each character but does not specify how that number is stored. Older software commonly used UCS-2 (UCS stands for Universal Coded Character Set), which stores every character in exactly two bytes. Two bytes cannot cover every code point, however, so depending on the storage format some characters need 3 or even 4 bytes.
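To make the difference between a code point and its stored form concrete, the following Python sketch encodes three sample characters in three storage formats. The sample characters are arbitrary choices.

```python
samples = ("A", "\u4e2d", "\U0001F600")  # Latin letter, Chinese character, emoji

sizes = {}
for ch in samples:
    sizes[ch] = (
        len(ch.encode("utf-8")),
        len(ch.encode("utf-16-be")),
        len(ch.encode("utf-32-be")),
    )
    print(f"U+{ord(ch):04X}", sizes[ch])

# A code point above U+FFFF does not fit in two bytes; UTF-16 uses a surrogate pair.
pair = "\U0001F600".encode("utf-16-be").hex()
print(pair)
```

The code point of each character is fixed, but the number of bytes depends on the storage format. U+1F600 needs four bytes in all three formats, which is why a strict two-byte scheme cannot represent it.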
A Unicode text file may begin with a BOM (Byte Order Mark). In the two-byte storage format the BOM is 2 bytes long and indicates whether the data is stored in big-endian or little-endian order: if the first two bytes of the document are FF FE, the file is little-endian; if they are FE FF, it is big-endian.
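A byte-order check along these lines can be sketched in Python using the BOM constants from the standard `codecs` module. The function name `detect_utf16_byte_order` is hypothetical, and the sketch ignores the 4-byte UTF-32 BOM, whose little-endian form also starts with FF FE.

```python
import codecs


def detect_utf16_byte_order(raw: bytes) -> str:
    """Guess the byte order of two-byte Unicode data from its first two bytes."""
    if raw.startswith(codecs.BOM_UTF16_LE):   # FF FE
        return "little-endian"
    if raw.startswith(codecs.BOM_UTF16_BE):   # FE FF
        return "big-endian"
    return "unknown"  # no BOM present; the byte order must be known another way


little = codecs.BOM_UTF16_LE + "A".encode("utf-16-le")
big = codecs.BOM_UTF16_BE + "A".encode("utf-16-be")
print(detect_utf16_byte_order(little), detect_utf16_byte_order(big))
```

A file without a BOM returns `unknown` here, since the mark is optional and many files omit it.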
UTF8
With the popularity of the Internet, there is a strong demand for a unified coding method.
UTF-8 is the most widely used way to implement Unicode on the Internet.
Other implementations include UTF-16 (characters represented by two or four bytes) and UTF-32 (characters represented by four bytes), but they are rarely used on the Internet.
Again, the relationship here is that UTF-8 is one of the ways Unicode is implemented.
One of the most important features of UTF-8 is that it is a variable-length encoding.
It uses 1 to 4 bytes to represent a symbol, and the number of bytes depends on the symbol.
UTF-8's encoding rules are simple; there are only two:
1) For a single-byte symbol, the first bit of the byte is set to 0 and the remaining 7 bits hold the symbol's Unicode code point. For English letters, the UTF-8 code is therefore identical to the ASCII code.
2) For an n-byte symbol (n > 1), the first n bits of the first byte are set to 1 and bit n + 1 is set to 0; each of the following bytes begins with the two bits 10. All remaining bit positions are filled with the binary form of the symbol's Unicode code point.
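The two rules can be turned into a short hand-written encoder and checked against Python's built-in UTF-8 codec. The function name `utf8_encode` is hypothetical, and the sketch does not reject the surrogate range U+D800 to U+DFFF the way a real codec does.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one Unicode code point following the two UTF-8 rules."""
    if code_point < 0:
        raise ValueError("code point must be non-negative")
    if code_point < 0x80:
        return bytes([code_point])       # rule 1: 0xxxxxxx
    if code_point < 0x800:
        n = 2
    elif code_point < 0x10000:
        n = 3
    elif code_point < 0x110000:
        n = 4
    else:
        raise ValueError("code point is outside the Unicode range")
    tail = []
    for _ in range(n - 1):
        # rule 2: each following byte is 10xxxxxx and carries 6 payload bits
        tail.append(0b10000000 | (code_point & 0b00111111))
        code_point >>= 6
    # rule 2: the first byte starts with n ones followed by a zero
    prefix = (0xFF << (8 - n)) & 0xFF
    return bytes([prefix | code_point] + tail[::-1])


for cp in (0x41, 0xE9, 0x4E2D, 0x1F600):
    print(f"U+{cp:04X}", utf8_encode(cp).hex(), chr(cp).encode("utf-8").hex())
```

For U+4E2D the result is `e4 b8 ad`: the first byte `11100100` announces three bytes, and the two following bytes `10111000` and `10101101` each start with `10`.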
ANSI
ANSI is named after the American National Standards Institute. Strictly speaking it is not an encoding at all: it is the name Windows uses for the system's default code page, and the term is used in this sense only on Windows.
On English-language Windows, ANSI means an ASCII-compatible code page (Windows-1252). On Simplified Chinese Windows, ANSI means GB2312 (in practice code page 936, GBK, which extends GB2312), and on Traditional Chinese Windows it means Big5.
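The effect of the system language on "ANSI" text can be imitated in Python by decoding the same two bytes with different code pages. The codec names `cp1252`, `gbk`, and `big5` are Python's names for these code pages, and the sample bytes are an arbitrary choice.

```python
raw = bytes([0xD6, 0xD0])  # the GB2312/GBK bytes of the Chinese character U+4E2D

decoded = {}
for codec in ("cp1252", "gbk", "big5"):
    # errors="replace" keeps the loop running if a code page rejects the bytes
    decoded[codec] = raw.decode(codec, errors="replace")
    print(codec, repr(decoded[codec]))

# Plain ASCII bytes decode the same way under all three code pages.
ascii_results = {b"abc".decode(codec) for codec in ("cp1252", "gbk", "big5")}
print(ascii_results)
```

Only the `gbk` result is the intended character. A file saved as "ANSI" on one Windows language version can therefore look garbled on another, while pure ASCII content survives unchanged.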
GB2312
A common encoding for Simplified Chinese that uses 2 bytes to represent one Chinese character. Two bytes could in theory distinguish 65,536 values, but GB2312 itself defines only 6,763 Chinese characters plus 682 other symbols.
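The two-byte layout can be inspected with Python's `gb2312` codec. The sketch below assumes the usual byte-level form of GB2312, in which each byte of a Chinese character is its row or cell number (1 to 94) plus 0xA0; the variable names are illustrative.

```python
ch = "\u4e2d"                      # a common Chinese character
encoded = ch.encode("gb2312")      # two bytes, both with the high bit set

# Subtracting 0xA0 recovers the position in the 94 x 94 code table.
row = encoded[0] - 0xA0
cell = encoded[1] - 0xA0
print(encoded.hex(), row, cell)

# ASCII characters stay single-byte, so English and Chinese text can be mixed.
mixed = ("A" + ch).encode("gb2312")
print(len(mixed))
```

The character encodes as `d6 d0`, which is row 54, cell 48 of the table. A 94 × 94 grid has 8,836 positions, far fewer than 65,536.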
Thank you for reading. That concludes "what are the types of coding". After working through this article you should have a better understanding of the encoding types; how each one behaves in a specific case is best verified in practice.
© 2024 shulou.com SLNews company. All rights reserved.