2025-04-05 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article explains the specific differences between the utf8 and utf8mb4 encodings in MySQL.
I. Brief introduction
MySQL added the utf8mb4 character set in version 5.5.3. The name mb4 is usually read as "most bytes 4": each character takes at most four bytes, which lets the character set cover four-byte Unicode characters. utf8mb4 is a superset of utf8, so existing utf8 data needs no conversion beyond changing the character set to utf8mb4. If saving space is the priority, utf8 is generally sufficient.
II. Content description
Since utf8 can already store most Chinese characters, why use utf8mb4? The reason is that a character in MySQL's utf8 encoding is at most 3 bytes long, and inserting a 4-byte character raises an error. The largest code point that three-byte UTF-8 can encode is U+FFFF, the end of the Basic Multilingual Plane (BMP) in Unicode. In other words, no Unicode character outside the BMP can be stored in MySQL's utf8 character set. This includes emoji (special Unicode characters commonly seen on iOS and Android phones), many less commonly used Chinese characters, and any newly added Unicode characters.
III. The root cause of the problem
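As a rough illustration of the 3-byte limit described above, the following Python sketch reports how many bytes each character takes in UTF-8 and whether a string contains anything outside the BMP. The helper names `needs_utf8mb4` and `utf8_byte_lengths` are hypothetical and are not part of MySQL or any library.

```python
def needs_utf8mb4(text: str) -> bool:
    """Return True if any character lies outside the BMP (code point > U+FFFF)."""
    return any(ord(ch) > 0xFFFF for ch in text)


def utf8_byte_lengths(text: str) -> list[int]:
    """UTF-8 encoded length, in bytes, of each character in text."""
    return [len(ch.encode("utf-8")) for ch in text]


# ASCII, common Chinese characters, an emoji (U+1F600),
# and a rare CJK character (U+20BB7).
samples = ["abc", "中文", "\U0001F600", "\U00020BB7"]
for s in samples:
    print(repr(s), utf8_byte_lengths(s), needs_utf8mb4(s))
```

Strings for which `needs_utf8mb4` returns True contain at least one 4-byte character, which is the case a utf8 column cannot store.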
The original UTF-8 format used one to six bytes and could encode up to 31 bits. The current UTF-8 specification uses only one to four bytes and can encode up to 21 bits, just enough to represent all 17 Unicode planes.
In MySQL, utf8 is a character set that supports only UTF-8 characters up to three bytes long, which corresponds to the Basic Multilingual Plane in Unicode.
Why does utf8 in MySQL only support UTF-8 characters that are up to three bytes long?
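The bit counts above follow from the UTF-8 byte layout: the lead byte of an n-byte sequence spends n + 1 bits on its length prefix, and each continuation byte carries 6 payload bits. A minimal sketch of that arithmetic, with `payload_bits` as a hypothetical helper name:

```python
def payload_bits(n_bytes: int) -> int:
    """Number of code point bits an n-byte UTF-8 sequence can carry."""
    if n_bytes == 1:
        return 7  # 0xxxxxxx
    # The lead byte keeps 7 - n payload bits; each continuation byte keeps 6.
    return (7 - n_bytes) + 6 * (n_bytes - 1)


# 17 planes of 65,536 code points each, U+0000 through U+10FFFF.
UNICODE_CODE_POINTS = 17 * 0x10000

for n in (1, 2, 3, 4, 6):
    print(n, "byte(s):", payload_bits(n), "bits")
print("code points in 17 planes:", UNICODE_CODE_POINTS)
```

Three bytes give 16 bits, which covers exactly U+0000 to U+FFFF. The 17 planes need more than 20 bits but fit within 21, which is why four bytes are enough.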
I thought about it for a moment. Maybe it was because Unicode had no supplementary planes yet when MySQL development began. At that time, the Unicode committee had a dream that "65535 characters is enough for the world to use". String lengths in MySQL are counted in characters, not bytes, and for the CHAR data type MySQL has to reserve enough space for the whole string. With the utf8 character set, the space reserved is the maximum byte length of a utf8 character multiplied by the length of the string, so capping utf8 at 3 bytes per character keeps that reservation small. For example, for CHAR(100) MySQL reserves 300 bytes. As for why later versions did not extend utf8 to 4-byte UTF-8 characters, I think one reason is backward compatibility, and another is that characters outside the Basic Multilingual Plane are rarely used.
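The CHAR reservation rule described above can be written as a one-line calculation. This sketch only reproduces the article's arithmetic. The table `MAX_BYTES_PER_CHAR` and the function `char_reserved_bytes` are hypothetical names, and the real on-disk size may differ by storage engine and row format.

```python
# Maximum bytes per character for the two character sets discussed here.
MAX_BYTES_PER_CHAR = {"utf8": 3, "utf8mb4": 4}


def char_reserved_bytes(length: int, charset: str) -> int:
    """Bytes reserved for CHAR(length): declared length times max bytes per character."""
    return length * MAX_BYTES_PER_CHAR[charset]


print(char_reserved_bytes(100, "utf8"))     # CHAR(100) with utf8
print(char_reserved_bytes(100, "utf8mb4"))  # CHAR(100) with utf8mb4
```

Under this rule the same CHAR(100) column grows from 300 to 400 reserved bytes when moved to utf8mb4.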
To store 4-byte UTF-8 characters in MySQL, you need the utf8mb4 character set, which is only available from version 5.5.3 onward (check your version with: SELECT VERSION();). For better compatibility, I think we should always use utf8mb4 instead of utf8. For CHAR columns, utf8mb4 consumes somewhat more space, so, following MySQL's official recommendation, use VARCHAR instead of CHAR.
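Once the version string is known, the 5.5.3 requirement can be checked programmatically. The sketch below assumes the string starts with three dot-separated numbers, as in "5.5.3" or "5.1.73-log". The function `supports_utf8mb4` is a hypothetical helper and does not query a server itself.

```python
import re


def supports_utf8mb4(version: str) -> bool:
    """Return True if a MySQL version string is 5.5.3 or later."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # Compare as a tuple of integers so that 5.10.0 sorts after 5.5.3.
    return tuple(int(part) for part in match.groups()) >= (5, 5, 3)


for v in ("5.1.73-log", "5.5.2", "5.5.3", "8.0.36"):
    print(v, supports_utf8mb4(v))
```

Comparing integer tuples rather than raw strings avoids the trap where "5.10.0" sorts before "5.5.3" lexically.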