2025-02-24 Update From: SLTechnology News&Howtos
This article explains the difference between utf8 and utf8mb4 in MySQL. Many people are still unsure about it, so it is shared here for reference. I hope you find it useful.
I. Introduction
MySQL added the utf8mb4 encoding in version 5.5.3. The "mb4" suffix is read as "most bytes 4", that is, at most four bytes per character, and it exists specifically to support four-byte Unicode characters. Fortunately, utf8mb4 is a superset of utf8, so nothing needs to be converted beyond changing the encoding to utf8mb4. If saving space is the priority, utf8 is usually enough.
II. Description of content
Since utf8 can already store most Chinese characters, why use utf8mb4? The utf8 encoding supported by MySQL allows at most 3 bytes per character. If you try to insert a character that needs 4 bytes, the insert fails with an error. The largest Unicode code point that three bytes of UTF-8 can encode is 0xFFFF, which is the upper bound of the Basic Multilingual Plane (BMP). In other words, any Unicode character outside the BMP cannot be stored using MySQL's utf8 character set. This includes emoji (special Unicode characters common on iOS and Android phones), many rarely used Chinese characters, and any newly added Unicode characters.
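The 3-byte limit can be checked outside the database. The following is a minimal Python sketch, not MySQL code: the helper names `utf8_length` and `fits_mysql_utf8` are made up for illustration, and the check assumes that "fits in utf8" simply means "every code point is at most 0xFFFF", as described above.

```python
def utf8_length(ch: str) -> int:
    """Number of bytes the single character `ch` takes in standard UTF-8."""
    return len(ch.encode("utf-8"))


def fits_mysql_utf8(text: str) -> bool:
    """True if every character lies in the BMP (code point <= 0xFFFF),
    i.e. needs at most three UTF-8 bytes (hypothetical helper)."""
    return all(ord(ch) <= 0xFFFF for ch in text)


# ASCII letter, accented Latin letter, common Chinese character,
# an emoji (U+1F600) and a rare Chinese character (U+20000).
samples = ["A", "\u00e9", "\u4e2d", "\U0001F600", "\U00020000"]

for ch in samples:
    print(f"U+{ord(ch):04X}: {utf8_length(ch)} byte(s), "
          f"fits 3-byte utf8: {fits_mysql_utf8(ch)}")
```

The last two samples need four bytes each, so they are the kind of character that requires utf8mb4.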
III. Root causes of problems
The original UTF-8 format used one to six bytes and could encode up to 31-bit characters. The latest UTF-8 specification uses only one to four bytes and can encode up to 21 bits, just enough to represent all 17 Unicode planes.
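The bit counts above follow from the UTF-8 byte layout: a multi-byte sequence spends part of its lead byte on a length marker and carries 6 payload bits in each continuation byte. A short Python sketch of that arithmetic (the function name `payload_bits` is only illustrative):

```python
def payload_bits(n_bytes: int) -> int:
    """Payload bits carried by a UTF-8 sequence of `n_bytes` bytes (1 to 6)."""
    if n_bytes == 1:
        return 7  # 0xxxxxxx
    # Lead byte: n_bytes one-bits, a zero-bit, then the remaining payload bits.
    lead_bits = 7 - n_bytes
    # Each continuation byte (10xxxxxx) carries 6 payload bits.
    return lead_bits + 6 * (n_bytes - 1)


# 17 planes of 65,536 code points each.
UNICODE_CODE_POINTS = 17 * 0x10000

for n in range(1, 7):
    print(n, "byte(s):", payload_bits(n), "bits")

print("code points in 17 planes:", UNICODE_CODE_POINTS)
print("3 bytes cover one plane only:", 2 ** payload_bits(3) == 0x10000)
print("4 bytes cover all planes:", 2 ** payload_bits(4) >= UNICODE_CODE_POINTS)
```

Three bytes give 16 bits, exactly one plane (the BMP). Four bytes give 21 bits, and since 2^20 is smaller than the 1,114,112 code points in 17 planes, 21 bits is the smallest width that covers them all.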
utf8 is a MySQL character set that only supports UTF-8 characters up to three bytes long, which corresponds to the Basic Multilingual Plane in Unicode.
Why does utf8 in MySQL only support UTF-8 characters with a maximum length of three bytes?
My guess is that when MySQL development began, Unicode did not yet have supplementary planes. At the time, the Unicode committee was still dreaming that "65535 characters are enough for the whole world." String length in MySQL is counted in characters rather than bytes, and for the CHAR data type enough space must be reserved for the whole string. With the utf8 character set, the space to reserve is the maximum number of bytes per character multiplied by the declared string length, so the maximum character length of utf8 was naturally capped at 3. For example, for CHAR(100) MySQL reserves 300 bytes. As for why later versions did not extend utf8 to 4-byte UTF-8 characters, I think one reason is backward compatibility, and the other is that characters outside the Basic Multilingual Plane are rarely used.
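The reservation arithmetic for CHAR columns is simple enough to write down. This Python sketch only reproduces the calculation described above; the table `MAX_BYTES_PER_CHAR` and the function `char_reserved_bytes` are illustrative names, and the real on-disk size also depends on the storage engine and row format, which this sketch ignores.

```python
# Maximum bytes per character for the two character sets discussed here.
MAX_BYTES_PER_CHAR = {"utf8": 3, "utf8mb4": 4}


def char_reserved_bytes(length: int, charset: str) -> int:
    """Worst-case bytes for a CHAR(length) column: declared length in
    characters times the longest character of the character set."""
    return length * MAX_BYTES_PER_CHAR[charset]


print(char_reserved_bytes(100, "utf8"))     # 300
print(char_reserved_bytes(100, "utf8mb4"))  # 400
```

Under this simplified model, the same CHAR(100) column grows by one third when the character set is switched from utf8 to utf8mb4.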
To store 4-byte UTF-8 characters in MySQL, you need the utf8mb4 character set, which is only available from version 5.5.3 onward (check your version with select version();). In my view, for better compatibility you should always use utf8mb4 instead of utf8. For CHAR data, utf8mb4 consumes somewhat more space; MySQL's official recommendation is to use VARCHAR instead of CHAR.
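If the version check has to happen in application code, the string returned by select version(); can be compared against 5.5.3. The sketch below is one possible way to do that in Python. The function name `supports_utf8mb4` is hypothetical, and it assumes the version string starts with three dot-separated numbers (for example "5.5.3-m3" or "8.0.36"); version strings from other MySQL-compatible servers get no special handling.

```python
import re


def supports_utf8mb4(version: str) -> bool:
    """True if a MySQL version string is 5.5.3 or newer (hypothetical helper)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    # Tuples compare element by element, so this is a proper version comparison.
    return (major, minor, patch) >= (5, 5, 3)


for v in ["5.1.73-log", "5.5.2", "5.5.3-m3", "8.0.36"]:
    print(v, "->", supports_utf8mb4(v))
```

Comparing integer tuples rather than raw strings avoids mistakes such as treating "5.10.0" as older than "5.5.3".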
That is the difference between utf8 and utf8mb4 in MySQL. Thank you for reading! I hope this article has given you a clearer understanding and proves helpful.