Shulou (shulou.com), SLTechnology News & Howtos, updated 2025-01-16
This article explains how to use Java to migrate unstructured data. The method is simple, fast, and practical.
Operation instructions
1. Java migration tool description
The Java version of the S3Transfer tool is currently in public beta. It supports migrating files from AWS S3, Aliyun, Tencent Cloud, Baidu Cloud, and other storage services to JD Cloud Object Storage, as well as migrating from a local file list. The general workflow is to first obtain each file's address (external link), then read the data from that link and migrate it. The tool integrates three sub-tools: listObject, transfer, and md5check:
The listObject tool lists all files under the user-configured bucket; if a prefix is configured, it lists only the files under that prefix.
The transfer tool migrates the source files to the OSS object store.
The md5check tool is used for MD5 value checking.
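The tool's internal check logic is not shown in this article, but conceptually an MD5 check compares the digest of the source object with that of its migrated copy. A minimal sketch of that idea (the file names and the cp standing in for the actual migration are illustrative assumptions):

```shell
# Sketch of an md5check-style verification: compare source and destination digests.
# The cp below merely stands in for an actual migration step.
src=$(mktemp)
dst=$(mktemp)
echo "sample object data" > "$src"
cp "$src" "$dst"
src_md5=$(md5sum "$src" | awk '{print $1}')
dst_md5=$(md5sum "$dst" | awk '{print $1}')
if [ "$src_md5" = "$dst_md5" ]; then
  echo "md5 match"
else
  echo "md5 mismatch"
fi
```

If a digest differs, the object was corrupted or truncated in transit and should be re-transferred.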
2. Tool characteristics
Support for rich data sources:
Local data: migrate locally stored data to OSS
Other object storage: currently supports migration from AWS S3, Aliyun OSS, Tencent Cloud COS, Baidu BOS, and Huawei OBS to JD Cloud OSS, with more sources to be added
URL list: download files from a specified URL list and migrate them to JD Cloud OSS
Bucket replication: JD Cloud OSS buckets can replicate data to one another; cross-account and cross-region replication is supported
Resumable transfer (breakpoint continuation) is supported
Support for flow control
Support for migrating files with a specific prefix
Support for parallel data download and upload
Migration check: verification after object migration
3. Description of practical migration scenario
To keep the exercise practical and observable, this document transfers two 10 GB files (source type s3file) from JD Cloud account ① to an object storage bucket in JD Cloud account ② over the public network, simulating object storage migration across public clouds. The task controller is a CentOS 7.4 CVM on JD Cloud.
4. Remarks
1. Large file transfer splits a single file into several slices for transfer, as shown in the figure.
2. During migration, logs are written to the ./log directory by default. All migrated files are recorded in audit-0.log, and successfully migrated files in the audit.success log (if files that transferred successfully are later deleted on the destination side, the audit.success log must be deleted before they can be re-transferred). To filter the files that failed to migrate, use the command:
grep "1 $" audit-0.log*

Environment preparation
1. New CVM

Region: North China-Beijing
Operating system: CentOS 7.4 64-bit
Configuration: 8-core 16 GB
Bandwidth: 20 Mbps
JDK version: 1.8.0_191

2. New Bucket
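To see how that filter behaves, here is a small self-contained run against a fabricated audit log. The log format used here (a trailing status field where "1 " marks a failed entry) is an assumption inferred from the grep pattern, not the tool's documented format:

```shell
# Fabricated audit log; a trailing "1 " (one followed by a space) marks a failed entry.
workdir=$(mktemp -d)
cd "$workdir"
printf 's3://src/a.dat 0 \ns3://src/b.dat 1 \ns3://src/c.dat 0 \n' > audit-0.log
# Only the failed entry survives the filter.
grep "1 $" audit-0.log*
```

Only the line for b.dat, the single failed transfer in this fabricated log, is printed.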
You need two JD Cloud accounts: create one object storage bucket in North China-Beijing and one in East China-Shanghai to simulate object storage migration across public clouds.
Account ①-North China-Beijing: beijing-to-shanghai
Account ②-East China-Shanghai: shanghai-from-beijing
3. Use s3fs to mount the Bucket on the CVM instance
1. Install dependency packages
yum install automake fuse fuse-devel gcc-c++ git libcurl-devel libxml2-devel make openssl-devel -y
2. Compile and install
git clone https://github.com/s3fs-fuse/s3fs-fuse.git
cd s3fs-fuse
./autogen.sh
./configure
make && make install
3. Create a password file
echo Access_Key_ID:Access_Key_Secret > ~/.passwd-s3fs
chmod 600 ~/.passwd-s3fs
To obtain the Access_Key_ID and Access_Key_Secret, see: https://uc.jdcloud.com/account/accessKey
4. Mount the object storage to the local directory /hcc (the directory is named after the author's initials; choose your own)
mkdir /hcc
s3fs bucketname /hcc -o passwd_file=~/.passwd-s3fs -o url="https://s3.cn-north-1.jcloudcs.com"
mkdir: creates the hcc folder as the local mount directory
s3fs: the manual mount command, where bucketname is the bucket name, /hcc is the local mount path, passwd_file is the password file location, and url is the S3-compatible endpoint of JD Cloud Object Storage (use the endpoint of the region your Bucket is in)
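A manual mount does not survive a reboot. s3fs also supports mounting via /etc/fstab; a sketch using the same bucket name, mount path, and endpoint assumptions as above (the password file is given as an absolute path because fstab entries are processed by root):

```
bucketname /hcc fuse.s3fs _netdev,passwd_file=/root/.passwd-s3fs,url=https://s3.cn-north-1.jcloudcs.com 0 0
```

The _netdev option delays the mount until networking is up.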
5. View the mount result
df -h
6. Generate files in the mounted object store through the dd command
The following command generates two 10 GB files in the mounted source object storage Bucket.
cd /hcc
for ...
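The loop body is cut off in the source. One plausible reconstruction uses dd; the file names, and the sparse-file shortcut used here so the sketch runs quickly, are assumptions (to write real 10 GB of zeros, which is what a transfer test needs, use bs=1M count=10240 instead of the count=0 seek form):

```shell
# In the walkthrough the target directory is the mounted bucket /hcc;
# a temporary directory is used here so the sketch is self-contained.
dir=$(mktemp -d)
cd "$dir"
for i in 1 2; do
  # Creates a sparse 10 GB file instantly (no data blocks are written).
  # For real payload data: dd if=/dev/zero of=bigfile_$i bs=1M count=10240
  dd if=/dev/zero of=bigfile_$i bs=1 count=0 seek=10G
done
ls -l bigfile_1 bigfile_2
```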