2025-02-22 Update From: SLTechnology News&Howtos
Shulou (Shulou.com), 05/31 report
This article covers the advantages of ClickHouse through a practical case: what ClickHouse is, the business problem it solved, how it was put into practice, and the pitfalls encountered along the way.
I. What is ClickHouse?
ClickHouse is a column-oriented database management system (DBMS) for online analytical processing (OLAP).
Let's first sort out some basic concepts.
OLTP (online transaction processing): the workload of a traditional relational database, dominated by inserts, deletes, updates, and queries, with an emphasis on transactional consistency, as in banking and e-commerce systems.
OLAP (online analytical processing): the workload of a data warehouse, dominated by reads and complex data analysis, aimed at decision support and at presenting intuitive, simple results.
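Next, consider the difference between a column-oriented database and a row-oriented database.
In traditional row-oriented database systems (MySQL, Postgres, and MS SQL Server), all the values that belong to one row are stored physically next to each other.
In a column-oriented database system (ClickHouse), values from the same column are stored together, and values from different columns are stored separately.
Comparing the two storage modes: a row store suits reading or writing whole records, while a column store suits scanning a few columns across many rows.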
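To make the contrast concrete, here is a minimal sketch of the two access patterns. The `orders` table and its columns are hypothetical and exist only for illustration; they are not from the original case.

```sql
-- Hypothetical table used only for illustration; all names are assumptions.
CREATE TABLE orders
(
    id         UInt64,
    user_id    UInt64,
    amount     Decimal(18, 2),
    created_at DateTime
)
ENGINE = MergeTree
ORDER BY (created_at, id);

-- OLTP-style point lookup: needs every column of a single row,
-- which is the access pattern a row store is laid out for.
SELECT * FROM orders WHERE id = 42;

-- OLAP-style aggregate: touches only created_at and amount,
-- so a column store does not need to read the other columns.
SELECT
    toStartOfMonth(created_at) AS month,
    sum(amount) AS revenue
FROM orders
GROUP BY month
ORDER BY month;
```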
That covers the basics of ClickHouse. More information can be found in the official manual.
II. Business problems
The business side has one large table with 50 million rows and two auxiliary tables, all stored in MySQL. A single join query across them took 3 minutes, so execution efficiency was very low. Index optimization, horizontal table splitting, and logic optimization brought little improvement, so I decided to solve the problem with ClickHouse.
In the end, the query time dropped to under 1 s, roughly a 200-fold improvement in query efficiency.
I hope this article helps you pick up this tool quickly and avoid detours in practice.
III. ClickHouse practice
1. Installing ClickHouse on Mac
I installed it with Docker by following a tutorial. You can also download the ClickHouse (CK) source and compile it yourself, which is more troublesome.
2. Data migration: from MySQL to ClickHouse
ClickHouse supports most MySQL syntax, so the migration cost is low. There are currently five migration options:
CREATE TABLE ... ENGINE = MySQL: a mapping scheme; the data stays in MySQL
INSERT INTO ... SELECT ... FROM: create the table first, then import
CREATE TABLE ... AS SELECT ... FROM: create the table and import in one step
CSV offline import
StreamSets
I chose the third option for data migration:
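CREATE TABLE [IF NOT EXISTS] [db.]table_name ENGINE = MergeTree ORDER BY sorting_key AS SELECT * FROM mysql('host:port', 'database', 'table', 'user', 'password')
Here sorting_key is a placeholder for the column or columns to sort by, since MergeTree needs an ORDER BY clause. The mysql() table function takes the MySQL database name followed by the table name.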
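For comparison, the first two options in the list above might look like the sketch below. The table names `mysql_posts` and `posts`, the column definitions, and the connection values are placeholders, not details from the original case.

```sql
-- Option 1: MySQL table engine. The ClickHouse table is only a mapping;
-- rows stay in MySQL and are fetched at query time.
-- Column names and connection values are placeholders.
CREATE TABLE IF NOT EXISTS mysql_posts
(
    id    UInt32,
    title String
)
ENGINE = MySQL('host:port', 'database', 'table', 'user', 'password');

-- Option 2: create the MergeTree table first, then import into it.
CREATE TABLE IF NOT EXISTS posts
(
    id    UInt32,
    title String
)
ENGINE = MergeTree
ORDER BY id;

INSERT INTO posts
SELECT id, title
FROM mysql('host:port', 'database', 'table', 'user', 'password');
```

Option 1 keeps a single copy of the data but every query still goes to MySQL, so it would not by itself remove the slow join; options 2 and 3 copy the data into ClickHouse storage.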
3. Performance test comparison
Type         Data scale        Size     Query time
MySQL        50 million rows   10 GB    205 s
ClickHouse   50 million rows   600 MB   1 s
4. Data synchronization scheme
1) Temporary table
Create a new intermediate temp table, synchronize all of the MySQL data into that temp table in ClickHouse, and then replace the original ClickHouse table with it. This suits scenarios with a moderate amount of data where rows are added and changed frequently.
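The temporary-table approach described above could be sketched as follows. The table names `posts`, `posts_temp`, and `posts_old` are hypothetical, and the sketch assumes the MySQL table has the same columns in the same order as the ClickHouse table.

```sql
-- 1. Create an empty intermediate table with the same structure and engine.
CREATE TABLE posts_temp AS posts;

-- 2. Load a full copy of the MySQL data into it.
--    SELECT * assumes the column order matches on both sides.
INSERT INTO posts_temp
SELECT *
FROM mysql('host:port', 'database', 'table', 'user', 'password');

-- 3. Swap the names so queries against posts see the fresh copy.
RENAME TABLE posts TO posts_old, posts_temp TO posts;

-- 4. Drop the previous copy once the swap has been verified.
DROP TABLE posts_old;
```

On databases that use the Atomic engine, `EXCHANGE TABLES posts_temp AND posts` is an alternative way to do step 3.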
2) synch
A recommended open-source synchronization tool is synch. It works by reading SQL statements from MySQL's binlog and then consuming them as tasks through a message queue.
5. Why is ClickHouse fast?
Only the columns needed for the calculation are read, not whole rows, which reduces I/O cost
Values in the same column share the same type, so compression improves by about ten times, reducing I/O further
ClickHouse chooses specialized algorithms to suit different storage scenarios
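One way to observe the compression effect mentioned above is to compare per-column sizes in the `system.columns` table. This is a sketch that assumes a table named `posts` in the current database; substitute your own table name.

```sql
-- Per-column on-disk size versus uncompressed size for one table.
-- 'posts' is a placeholder table name.
SELECT
    name,
    type,
    formatReadableSize(data_compressed_bytes)   AS compressed,
    formatReadableSize(data_uncompressed_bytes) AS uncompressed,
    round(data_uncompressed_bytes / data_compressed_bytes, 1) AS ratio
FROM system.columns
WHERE database = currentDatabase()
  AND table = 'posts'
  AND data_compressed_bytes > 0   -- skip columns with no stored data
ORDER BY data_compressed_bytes DESC;
```

The ratio you see depends on the data; the tenfold figure in the list is the author's observation, not a guarantee.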
IV. Pitfalls encountered
1. Differences in data types between ClickHouse and MySQL
Running the original MySQL statement in ClickHouse produced an error.
Solution: LEFT JOIN B b ON toUInt32(h.id) = toUInt32(ec.post_id), that is, convert both join keys to the same unsigned type before joining.
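A fuller version of that fix might look like the sketch below. The table names `posts` and `comments` are hypothetical stand-ins for the tables behind the aliases `h` and `ec`, which the original does not name.

```sql
-- Inspect the key types on both sides first (table names are assumptions).
SELECT toTypeName(id) FROM posts LIMIT 1;
SELECT toTypeName(post_id) FROM comments LIMIT 1;

-- Join after converting both keys to the same unsigned type.
SELECT h.id, ec.post_id
FROM posts AS h
LEFT JOIN comments AS ec
    ON toUInt32(h.id) = toUInt32(ec.post_id);
```

Checking the types first shows which side actually needs the conversion; aligning the column types at table creation time would avoid converting on every query.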
2. Deletes and updates run asynchronously, and only eventual consistency is guaranteed
According to the ClickHouse (CK) manual, even MergeTree, the engine with the best support for data consistency, guarantees only eventual consistency.
If you need strong data consistency, a full synchronization is the recommended way to solve the problem.
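The asynchronous behavior described above can be observed with a sketch like the following, which assumes a MergeTree table named `posts` with an `id` column (both hypothetical).

```sql
-- Submit a delete; by default the statement returns before the data is rewritten.
ALTER TABLE posts DELETE WHERE id = 42;

-- Check whether the mutation has finished (is_done = 1).
SELECT mutation_id, command, is_done
FROM system.mutations
WHERE database = currentDatabase()
  AND table = 'posts'
ORDER BY create_time DESC;

-- Optional: make later ALTER ... DELETE / UPDATE statements
-- in this session wait until the mutation completes.
SET mutations_sync = 1;
```

Waiting for mutations trades latency for predictability; for tables that change often, the full-synchronization approach from section III may still be simpler.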
That concludes "What are the advantages of ClickHouse". Thank you for reading.