2025-01-19 Update From: SLTechnology News&Howtos
Data migration requires importing data from MySQL into ClickHouse. The available approaches are summarized below: three methods supported by ClickHouse itself and two that rely on third-party tools.
MySQL table engine

CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
    name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1] [TTL expr1],
    name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2] [TTL expr2],
    ...
    INDEX index_name1 expr1 TYPE type1(...) GRANULARITY value1,
    INDEX index_name2 expr2 TYPE type2(...) GRANULARITY value2
) ENGINE = MySQL('host:port', 'database', 'table', 'user', 'password'[, replace_query, 'on_duplicate_clause']);
Official documentation: https://clickhouse.yandex/docs/en/operations/table_engines/mysql/
Note that the actual data remains in the remote MySQL database; the ClickHouse table acts as an external table over it. You can verify this by inserting and deleting rows on the MySQL side and watching the changes appear in ClickHouse.
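As a concrete sketch (the table, columns, and credentials are hypothetical, not from the original post), a ClickHouse table backed by a remote MySQL table might look like this:

```sql
-- Hypothetical example: proxies the remote MySQL table db.users.
-- SELECTs and INSERTs against this table are forwarded to MySQL over the wire.
CREATE TABLE IF NOT EXISTS mysql_users
(
    id   UInt32,
    name String
)
ENGINE = MySQL('127.0.0.1:3306', 'db', 'users', 'user', 'password');

-- Rows added or deleted on the MySQL side are immediately visible here:
SELECT count() FROM mysql_users;
```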
insert into select from

-- Create the target table
CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
    name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
    name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2],
    ...
) ENGINE = engine

-- Import the data
INSERT INTO [db.]table [(c1, c2, c3)]
SELECT <columns or *> FROM mysql('host:port', 'db', 'table_name', 'user', 'password')
You can customize the column types and the number of columns, and apply ClickHouse functions to transform the data during import, for example:
SELECT toDate(xx) FROM mysql('host:port', 'db', 'table_name', 'user_name', 'password')

create table as select from

CREATE TABLE [IF NOT EXISTS] [db.]table_name
ENGINE = Log
AS SELECT *
FROM mysql('host:port', 'db', 'article_clientuser_sum', 'user', 'password')
Reference: http://jackpgao.github.io/2018/02/04/ClickHouse-Use-MySQL-Data/
Custom columns are not supported with this method, and the ENGINE = MergeTree variant shown in the referenced post failed in testing. The method effectively combines CREATE TABLE with INSERT INTO ... SELECT in a single statement.
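A minimal sketch of the one-statement form (the target table name and the toDate conversion are hypothetical additions for illustration):

```sql
-- Hypothetical example: materialize a MySQL table into ClickHouse in one shot,
-- converting a string column to Date on the way in.
CREATE TABLE IF NOT EXISTS article_sum
ENGINE = Log
AS SELECT id, toDate(created_at) AS created_day
FROM mysql('127.0.0.1:3306', 'db', 'article_clientuser_sum', 'user', 'password');
```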
Altinity/clickhouse-mysql-data-reader
Altinity's clickhouse-mysql-data-reader is an open-source Python tool for migrating data from MySQL to ClickHouse (it supports both full imports and incremental updates via the binlog). Be aware that the official README has drifted out of sync with the code, so the quick-start instructions do not run as written.
## Generate the table template
clickhouse-mysql \
    --src-host=127.0.0.1 \
    --src-user=reader \
    --src-password=Qwerty1# \
    --table-templates-with-create-database \
    --src-table=airline.ontime > create_clickhouse_table_template.sql

## Adjust the generated script
vim create_clickhouse_table_template.sql

## Create the table
clickhouse-client -mn < create_clickhouse_table_template.sql

## Import the data
clickhouse-mysql \
    --src-host=127.0.0.1 \
    --src-user=reader \
    --src-password=Qwerty1# \
    --table-migrate \
    --dst-host=127.0.0.1 \
    --dst-table=logunified \
    --csvpool

Official documentation: https://github.com/Altinity/clickhouse-mysql-data-reader#mysql-migration-case-1-migrate-existing-data

Note that all three of the methods above read from MySQL into ClickHouse; for large data volumes this puts considerable load on MySQL. The two approaches below work offline (StreamSets supports both real-time and offline).

csv

## Import a CSV file (the target table must already exist)
clickhouse-client -h host --query="INSERT INTO [db].table FORMAT CSV" < test.csv

If the source data quality is poor, however, this often fails, for example on special characters (delimiters, escape characters) or embedded newlines. I was badly bitten by this.

Custom delimiter: --format_csv_delimiter="|"
Skip errors instead of aborting: --input_format_allow_errors_num=10 allows at most 10 bad rows; --input_format_allow_errors_ratio=0.1 allows up to 10% bad rows.

Empty (null) values in the CSV trigger errors such as:
Code: 27. DB::Exception: Cannot parse input: expected , before: xxxx: (at row 69) ERROR: garbage after Nullable(Date): "8,0020205"

Replace ,, with ,\N, before importing:
sed ':a;s/,,/,\\N,/g;ta' test.csv | clickhouse-client -h host --query "INSERT INTO [db].table FORMAT CSV"

Or clean the file with a script:
python clean_csv.py --src=src.csv --dest=dest.csv --chunksize=50000 --cols --encoding=utf-8 --delimiter=,
See my other post, "032 - Fault-tolerant CSV file processing", for clean_csv.py.

streamsets

StreamSets can do full imports from MySQL or CSV, and can also subscribe to the binlog for incremental inserts; see my other post, "025 - Big-data ETL with StreamSets: installation and subscribing to the MySQL binlog". This article only demonstrates a full import from MySQL into ClickHouse and assumes you already have a StreamSets service running.

Enable and restart services.
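The ,, to ,\N, substitution above can be sketched in Python (a minimal illustration, not the clean_csv.py from the referenced post; like the sed one-liner, it is unaware of quoted fields, so it will mangle quoted values that contain consecutive commas):

```python
def fill_nulls(line: str) -> str:
    """Replace empty CSV fields (consecutive commas) with ClickHouse's \\N
    null marker. Loops until stable, because a single pass over ",,," only
    rewrites non-overlapping pairs -- mirroring sed's ':a;...;ta' retry loop."""
    prev = None
    while prev != line:
        prev = line
        line = line.replace(",,", ",\\N,")
    return line


# Interior runs of empty fields are all filled; leading/trailing empty
# fields would need separate handling, just as with the sed version.
print(fill_nulls("a,,b,,,c"))  # a,\N,b,\N,\N,c
```

Wrapping the function in a stdin/stdout loop would give a pipe-friendly filter equivalent to the sed command.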
Upload mysql and clickhouse jdbc jars and dependency packages
A convenient way is to create a pom.xml and let Maven download everything in one go:
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.anjia</groupId>
  <artifactId>demo</artifactId>
  <packaging>jar</packaging>
  <version>1.0-SNAPSHOT</version>
  <name>demo</name>
  <url>http://maven.apache.org</url>
  <dependencies>
    <dependency>
      <groupId>ru.yandex.clickhouse</groupId>
      <artifactId>clickhouse-jdbc</artifactId>
      <version>0.1.54</version>
    </dependency>
    <dependency>
      <groupId>mysql</groupId>
      <artifactId>mysql-connector-java</artifactId>
      <version>5.1.47</version>
    </dependency>
  </dependencies>
</project>
If Maven is installed locally, execute the following command:
mvn dependency:copy-dependencies -DoutputDirectory=lib -DincludeScope=compile
All required jars will be downloaded and copied into the lib directory. Then copy them into the StreamSets directory /opt/streamsets-datacollector-3.9.1/streamsets-libs-extras/streamsets-datacollector-jdbc-lib/lib/ and restart the StreamSets service.
Summary

The above are five ways to migrate data from MySQL to ClickHouse. I hope they help; if you have any questions, leave a comment and I will reply as soon as I can. Thank you for your support!