The solution of error reporting of Emoji emoticons into MySQL Database

Updated: 2025-02-22. Source: SLTechnology News&Howtos (shulou), Database

Author: Jane.Hoo

Last week, developers reported a problem: a front-end application that collects Weibo (microblog) posts failed to insert a few records every day. The application log showed:

java.sql.SQLException: Incorrect string value: '\xF0\x9F\x92\xAA",... ' for column 'raw_json' at row 1, Exception:org.springframework.jdbc.UncategorizedSQLException:

Inspecting the raw_json field of the failed records (the original screenshot is not reproduced here) shows that the inserted value contains emoji.

This points to a character encoding problem. Our database uses MySQL's utf8 character set, which stores at most 3 bytes per character. That is enough for ordinary text, including Chinese and the older emoticon symbols, but the emoji entered on mobile devices take 4 bytes in UTF-8, so MySQL's utf8 cannot store them. The bytes \xF0\x9F\x92\xAA in the error message are one such 4-byte sequence (U+1F4AA). For MySQL databases that serve mobile applications, it is advisable to adopt the utf8mb4 character set from the start, and to treat this as a standard technical choice for mobile Internet projects.
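As a quick check of the byte counts above, the following is a minimal Java sketch that decodes the four bytes quoted in the error message and compares UTF-8 lengths. The class and method names (EmojiBytes, firstCodePoint, utf8Length) are made up for illustration and are not part of the original application.

```java
import java.nio.charset.StandardCharsets;

// Illustrative helper (hypothetical names), not code from the original application.
class EmojiBytes {
    // Decode raw bytes as UTF-8 and format the first code point, e.g. "U+1F4AA".
    static String firstCodePoint(byte[] utf8Bytes) {
        String s = new String(utf8Bytes, StandardCharsets.UTF_8);
        return String.format("U+%04X", s.codePointAt(0));
    }

    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // The four bytes quoted in the SQLException message.
        byte[] fromLog = {(byte) 0xF0, (byte) 0x9F, (byte) 0x92, (byte) 0xAA};
        String emoji = new String(fromLog, StandardCharsets.UTF_8);

        System.out.println(firstCodePoint(fromLog)); // U+1F4AA
        System.out.println(utf8Length("A"));         // 1 byte: ASCII letter
        System.out.println(utf8Length("\u4E2D"));    // 3 bytes: a common Chinese character
        System.out.println(utf8Length(emoji));       // 4 bytes: too wide for MySQL utf8
    }
}
```

The last value is the one that matters here: a 4-byte character cannot fit in a character set limited to 3 bytes per character.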

utf8 and utf8mb4:

UTF-8: Unicode Transformation Format, 8-bit. A BOM is allowed but usually omitted. It is a variable-length multibyte encoding designed for international text: it uses 8 bits (one byte) for ASCII characters such as English letters and 24 bits (three bytes) for most Chinese characters, and up to four bytes for characters outside the Basic Multilingual Plane. UTF-8 can represent the characters of every language, which makes it a universal international encoding. UTF-8 encoded text displays correctly in any browser that supports UTF-8, whatever the locale. For example, a UTF-8 encoded page displays Chinese correctly in an English version of IE, and the user does not need to download IE's Chinese language support pack.

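utf8mb4: MySQL added the utf8mb4 character set in version 5.5.3. The mb4 suffix indicates that a character may take up to 4 bytes, and the character set exists specifically to support four-byte Unicode characters, which MySQL's 3-byte utf8 cannot store.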
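To find out which incoming strings actually need utf8mb4, an application could screen text for code points above U+FFFF before inserting it. The sketch below is one hedged way to do that in Java; Utf8mb4Check and requiresUtf8mb4 are hypothetical names, and the original article does not describe such a check.

```java
// Hypothetical pre-insert check; class and method names are illustrative only.
class Utf8mb4Check {
    // True if the text contains any code point above U+FFFF. Such code points
    // take four bytes in UTF-8 and do not fit MySQL's 3-byte utf8 character set.
    static boolean requiresUtf8mb4(String text) {
        return text.codePoints().anyMatch(Character::isSupplementaryCodePoint);
    }

    public static void main(String[] args) {
        // U+1F4AA is the emoji whose bytes appear in the error message.
        String emoji = new String(Character.toChars(0x1F4AA));

        System.out.println(requiresUtf8mb4("plain text"));  // false
        System.out.println(requiresUtf8mb4("go " + emoji)); // true
    }
}
```

A check like this is useful for logging or diagnosing affected rows. It is not a substitute for switching the database to utf8mb4.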

With the cause identified, we moved on to the fix. Since only the raw_json column contains emoji, we first changed the character set of that column:

ALTER TABLE xx_pnl_weibo_usershow MODIFY `raw_json` varchar(3000) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

Test result: the insert still failed.

Modify the character set of the table:

alter table xx_pnl_weibo_usershow CHARSET=utf8mb4 ;

Test result: the insert still failed.

Modify the character set at the server level:

vi /etc/my.cnf

[client]

default-character-set=utf8mb4

[mysqld]

character-set-server = utf8mb4

service mysqld restart

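Test result: the insert succeeded. This suggests that changing the column and table definitions alone was not enough, and that the character set used by the server and by client connections also had to be utf8mb4.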

Emoji code reference:

http://punchdrunker.github.io/iOSEmoji/table_html/flower.html

Because this is a live production system, verify the change in a test environment first. Then request a downtime window outside peak business hours and apply the change to production.
