2025-04-05 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article describes how to solve the problems encountered when importing a large amount of data into MySQL. It is quite detailed and should be a useful reference; interested readers are encouraged to read on.
In projects you often need to import a large amount of data into a database so that it can be analyzed with SQL, and the import process raises problems that have to be solved along the way. Here I use the experience of importing a txt file of about 4 GB to show the problems I ran into and how I solved them, partly as a summary for my own records and partly as a reference for anyone who hits the same issues.
The data I imported is an encyclopedia txt file of more than 4 GB containing over 65 million records. Records are separated by newlines, and each record holds three fields separated by tabs. To extract the data I use a TripleData class that stores the three fields (all Strings), collect multiple records into a List, and then save each List to the MySQL database, so that all the data is written in batches.
That is the general idea; the following are the problems encountered during the actual import.
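The record holder and parsing step described above can be sketched roughly as follows. This is a minimal sketch: the class name TripleData comes from the article, but the field names, accessor-free layout, and parse helpers are my assumptions, not the author's original source.

```java
import java.util.ArrayList;
import java.util.List;

// Holds one record: three tab-separated String fields, as described above.
public class TripleData {
    public final String subject;
    public final String predicate;
    public final String object;

    public TripleData(String subject, String predicate, String object) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }

    // Parse one newline-delimited record into its three tab-separated fields.
    public static TripleData parse(String line) {
        String[] parts = line.split("\t", -1);
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected 3 tab-separated fields: " + line);
        }
        return new TripleData(parts[0], parts[1], parts[2]);
    }

    // Collect a batch of raw lines into a List, ready for one bulk INSERT.
    public static List<TripleData> parseAll(List<String> lines) {
        List<TripleData> batch = new ArrayList<>();
        for (String line : lines) {
            batch.add(parse(line));
        }
        return batch;
    }
}
```

In practice you would read the 4 GB file with a buffered reader, call parseAll on every chunk of lines, and hand each resulting List to the batch-insert method shown later.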
1 Database connection encoding (garbled characters) and compatibility issues
If the data contains Chinese characters, be sure to set the database connection URL to the following format:
URL="jdbc:mysql://"+IP+":"+PORT+"/"+DB_NAME+"?useSSL=false&useUnicode=true&characterEncoding=utf-8";
Setting the encoding to UTF-8 solves the garbled-character problem, and setting useSSL solves a compatibility problem between JDBC and MySQL. If useSSL is not set, an error similar to the following is reported:
Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
The cause is a relatively new MySQL server paired with a relatively old JDBC driver; setting the useSSL option makes them compatible.
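Assembling the URL from the article's constants might look like the sketch below. The IP, port, and database name used in the test are placeholders of my own, not values from the article.

```java
// Builds the JDBC URL in the format the article recommends:
// disable SSL for driver/server compatibility, force UTF-8 for Chinese text.
public class MysqlUrl {
    public static String build(String ip, int port, String dbName) {
        return "jdbc:mysql://" + ip + ":" + port + "/" + dbName
                + "?useSSL=false&useUnicode=true&characterEncoding=utf-8";
    }
}
```

The resulting string is what you pass as the first argument of DriverManager.getConnection.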
2 utf8mb4 encoding problem
During the import you may also encounter an error similar to
SQLException :Incorrect string value: '\xF0\xA1\x8B\xBE\xE5\xA2... ' for column 'name'
This happens because MySQL's utf8 charset uses at most 3 bytes per character. That is fine for ordinary text, but a large corpus inevitably contains emoji or other special characters that occupy 4 bytes, which utf8 cannot store, so the insert fails. The solution: MySQL introduced a true four-byte UTF-8 encoding, utf8mb4, in version 5.5.3 and later, and the MySQL encoding needs to be reset to it.
You can follow these steps. First, back up the database to be modified; although utf8mb4 is backward compatible with utf8, it is better to guard against mistakes and have a backup ready. Second, change the character set of the database to utf8mb4 (UTF-8 Unicode) and the collation to utf8mb4_general_ci. I made these changes with navicat; how to do it from the command line you can look up yourself. Third, modify the configuration file my.ini in the MySQL installation root directory, adding the following settings.
[client]
default-character-set = utf8mb4

[mysqld]
character-set-server = utf8mb4
collation-server = utf8mb4_general_ci

[mysql]
default-character-set = utf8mb4
After the modification is complete, restart MySQL for it to take effect.
Then import the data again; it should now succeed.
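For readers who prefer the command line over navicat, the second step above boils down to two ALTER statements. The sketch below just collects them as strings so they can be run through a JDBC Statement; the database name "kb" and table name "tb" in the test are placeholders of my own.

```java
import java.util.Arrays;
import java.util.List;

// SQL equivalents of the navicat steps above: switch the database default
// and convert an existing table's stored data to utf8mb4.
public class Utf8mb4Migration {
    public static List<String> statements(String db, String table) {
        return Arrays.asList(
            "ALTER DATABASE `" + db + "` CHARACTER SET = utf8mb4 COLLATE = utf8mb4_general_ci",
            "ALTER TABLE `" + table + "` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci"
        );
    }
}
```

Each string can be executed with Statement.execute() over an open connection; CONVERT TO re-encodes data already in the table, not just the default for new rows.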
3 Time efficiency of the bulk import
Because the data volume is large, we split the input. I divided the 65 million records into 500 files of roughly 110,000 records each, loaded each file's records into an ArrayList, and then imported them in batches using a single multi-row statement of the form "insert into tb (...) values (...), (...), ...;". Inserting each batch in one statement saves a great deal of time. An example method follows.
    public static void insertSQL(String sql, List<TripleData> tripleObjectList) throws SQLException {
        Connection conn = null;
        PreparedStatement psts = null;
        try {
            conn = DriverManager.getConnection(Common.URL, Common.DB_USERNAME, Common.DB_PASSWORD);
            conn.setAutoCommit(false); // set manual commit
            // build the sql suffix: one "('s','p','o')" values group per record
            StringBuffer suffix = new StringBuffer();
            int count = 0;
            String s = "";
            String p = "";
            String o = "";
            while (count < tripleObjectList.size()) {
                s = tripleObjectList.get(count).getSubject();
                p = tripleObjectList.get(count).getPredicate();
                o = tripleObjectList.get(count).getObject();
                // note: values are concatenated directly, so quotes in the data
                // must be escaped (or parameter placeholders used) in production
                suffix.append("('").append(s).append("','")
                      .append(p).append("','").append(o).append("'),");
                count++;
            }
            // prefix + suffix, dropping the trailing comma
            psts = conn.prepareStatement(sql + suffix.substring(0, suffix.length() - 1));
            psts.execute();
            conn.commit(); // commit the whole batch at once
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            if (psts != null) psts.close();
            if (conn != null) conn.close();
        }
    }
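A safer variant of the same idea is to generate the multi-row statement with parameter placeholders and fill them with setString, which avoids the escaping problem of concatenating values into the SQL text. The helper below only builds the statement text; the table name "tb" and column names s, p, o are placeholders of my own, not confirmed by the article.

```java
// Builds "insert into <table> (s, p, o) values (?, ?, ?), (?, ?, ?), ..."
// with one placeholder group per record in the batch.
public class BulkSql {
    public static String multiRowInsert(String table, int rows) {
        StringBuilder sql = new StringBuilder("insert into " + table + " (s, p, o) values ");
        for (int i = 0; i < rows; i++) {
            if (i > 0) sql.append(", ");
            sql.append("(?, ?, ?)");
        }
        return sql.toString();
    }
}
```

You would prepare this statement once per batch size, then set parameters 3*i+1 through 3*i+3 for record i before executing and committing, keeping the one-statement-per-batch speedup while letting the driver handle quoting.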