2025-01-19 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
Source code for this article is available on GitHub and GitEE (addresses at the end of the article).
I. Introduction to ClickHouse
1. Basic introduction
ClickHouse is Yandex's open source database for data analysis. It is well suited to time series data loaded as a stream or in batches. ClickHouse should not be used as a general-purpose database; it is a distributed, real-time processing platform for querying massive data sets at very high speed. Aggregation queries (such as GROUP BY) are particularly fast.
Download repository: https://repo.yandex.ru/clickhouse
Chinese documentation: https://clickhouse.yandex/docs/zh/
2. Database features
(1) Column-oriented database
A column-oriented database stores data by column rather than by row. This architecture is mainly suited to batch data processing and real-time analytical queries.
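As a rough illustration of the difference, the sketch below holds the same three records in a row layout and in a column layout. The table contents and function names are made up for this example and say nothing about ClickHouse internals; the point is only that an aggregate over one column reads a single list in the column layout.

```python
# Hypothetical sample data: the same three records, first as rows.
rows = [
    {"id": 1, "city": "A", "amount": 10},
    {"id": 2, "city": "B", "amount": 20},
    {"id": 3, "city": "A", "amount": 30},
]

def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns

def sum_row_layout(records, field):
    # Every whole record is visited even though only one field is needed.
    return sum(record[field] for record in records)

def sum_column_layout(columns, field):
    # Only the requested column is read.
    return sum(columns[field])

columns = to_columns(rows)
print(sum_row_layout(rows, "amount"), sum_column_layout(columns, "amount"))
```

Both functions return the same total; they differ only in how much data has to be touched to compute it.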
(2) Data compression
Some column-oriented database management systems do not use data compression. However, data compression plays a key role in achieving excellent performance.
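To see why a column layout and compression go well together, here is a small sketch using Python's zlib module (not the codecs ClickHouse itself uses). The column contents are invented; values within one column tend to resemble each other, so they usually compress well.

```python
import zlib

# Hypothetical column: 10,000 values drawn from only three distinct strings.
city_column = (["Beijing", "Shanghai", "Shenzhen"] * 3334)[:10000]

raw = "\n".join(city_column).encode("utf-8")
compressed = zlib.compress(raw)

def compression_ratio(raw_bytes, compressed_bytes):
    """Return how many times smaller the compressed form is."""
    return len(raw_bytes) / len(compressed_bytes)

print(len(raw), len(compressed), round(compression_ratio(raw, compressed), 1))
```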
(3) Disk storage of data
Many column-oriented databases can only work in memory, which encourages a larger hardware budget than is actually necessary. ClickHouse is designed to work on traditional disks, which gives a lower storage cost per GB.
(4) Multi-core parallel processing
Large queries are parallelized naturally in ClickHouse, using all the resources available on the current server.
(5) Multi-server distributed processing
In ClickHouse, data can be stored on different shards. Each shard consists of a set of replicas for fault tolerance, and a query can be processed on all shards in parallel.
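The idea of spreading rows over shards and combining partial results can be sketched as follows. The shard count, the modulo routing rule and the thread pool are assumptions for illustration; ClickHouse has its own sharding-key and replication logic, which this does not model.

```python
from concurrent.futures import ThreadPoolExecutor

SHARD_COUNT = 3  # hypothetical cluster size

def route(row_id, shard_count=SHARD_COUNT):
    """Pick a shard for a row with a simple modulo rule (an assumption)."""
    return row_id % shard_count

def distribute(rows, shard_count=SHARD_COUNT):
    """Place each (id, amount) row on the shard chosen by route()."""
    shards = [[] for _ in range(shard_count)]
    for row_id, amount in rows:
        shards[route(row_id, shard_count)].append((row_id, amount))
    return shards

def partial_sum(shard):
    # Each shard aggregates only its own rows.
    return sum(amount for _, amount in shard)

def distributed_sum(shards):
    # Query every shard in parallel, then merge the partial results.
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        return sum(pool.map(partial_sum, shards))

rows = [(i, i * 10) for i in range(1, 10)]  # made-up (id, amount) pairs
shards = distribute(rows)
print(distributed_sum(shards))
```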
(6) SQL support and indexing
ClickHouse supports an SQL-based query language that is, for the most part, compatible with the SQL standard. Supported queries include GROUP BY, ORDER BY, IN, JOIN and non-correlated subqueries. Window functions and correlated subqueries are not supported in the version described here. Sorting the data by primary key helps ClickHouse find a specific value or a range of data with a low latency of tens of milliseconds.
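Why sorting by primary key helps range lookups can be shown with a binary search over a sorted key column. This is a simplified sketch using Python's bisect module; the key values are invented, and the index structure ClickHouse actually keeps over the sorted data is not reproduced here.

```python
from bisect import bisect_left, bisect_right

# Hypothetical key column, already sorted as a primary key would be.
sorted_ids = [3, 8, 15, 21, 34, 55, 89]

def range_positions(keys, low, high):
    """Return the slice bounds [start, end) of keys with low <= key <= high."""
    start = bisect_left(keys, low)
    end = bisect_right(keys, high)
    return start, end

start, end = range_positions(sorted_ids, 10, 40)
print(sorted_ids[start:end])
```

Because the keys are sorted, the matching range is found without scanning the whole column.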
(7) Vector engine
To use the CPU efficiently, data is not only stored by column but also processed by vectors (parts of columns).
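Processing a column in chunks rather than value by value can be sketched like this. The block size and function names are assumptions; real vectorized execution relies on CPU-level techniques that plain Python does not demonstrate.

```python
def iter_blocks(column, block_size):
    """Yield consecutive slices (vectors) of a column."""
    for offset in range(0, len(column), block_size):
        yield column[offset:offset + block_size]

def filtered_sum(column, threshold, block_size=4):
    """Sum the values above a threshold, one block at a time."""
    total = 0
    for block in iter_blocks(column, block_size):
        # The filter and the sum are applied to a whole block per step.
        total += sum(value for value in block if value > threshold)
    return total

amounts = list(range(1, 11))  # made-up column: 1..10
print(filtered_sum(amounts, 5))
```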
(8) Real-time data updates
ClickHouse supports tables with a primary key. So that range queries on the primary key run quickly, data is sorted incrementally using the merge tree. As a result, data can be written to the table continuously and efficiently, and no locks are taken while writing.
II. Installation process under Linux
1. Download the repository
curl -s https://packagecloud.io/install/repositories/altinity/clickhouse/script.rpm.sh | sudo os=centos dist=7 bash
2. View the installation package
sudo yum list 'clickhouse*'
3. Install the service
sudo yum install -y clickhouse-server clickhouse-client
4. View the installation list
sudo yum list installed 'clickhouse*'
Console output
Installed Packages
clickhouse-client.noarch
clickhouse-common-static.x86_64
clickhouse-server.noarch
5. View the configuration
cd /etc/clickhouse-server/
vim config.xml
Data directory: /var/lib/clickhouse/
Temporary directory: /var/lib/clickhouse/tmp/
Log directory: /var/log/clickhouse-server
HTTP port: 8123
TCP port: 9000
6. Configure access permissions
Uncomment the following setting in the config.xml file; it makes the server listen on all network interfaces.
<listen_host>::</listen_host>
7. Start the service
/etc/rc.d/init.d/clickhouse-server start
8. View the service
ps -aux | grep clickhouse
III. Basic operation
1. Create table statement
CREATE TABLE cs_user_info (
  `id` UInt64,
  `user_name` String,
  `pass_word` String,
  `phone` String,
  `email` String,
  `create_day` Date DEFAULT CAST(now(), 'Date')
) ENGINE = MergeTree(create_day, intHash32(id), 8192)
Note: MergeTree is the officially recommended engine.
The most powerful table engines in ClickHouse are MergeTree and the other engines in its family (*MergeTree). The basic idea of the MergeTree family is as follows: when you have a large amount of data to insert into a table, you write it efficiently in batches, part by part, and the parts are merged in the background according to certain rules. This is much more efficient than continually rewriting the data in storage during insertion.
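The batch-then-merge strategy can be sketched with sorted in-memory lists standing in for data parts. This is a toy model under stated assumptions: parts are Python lists of (key, value) tuples, the class name is invented, and heapq.merge stands in for the background merge. It does not reflect how ClickHouse stores parts on disk.

```python
import heapq

class ToyMergeTree:
    """Toy model: each insert batch becomes one sorted part."""

    def __init__(self):
        self.parts = []

    def insert_batch(self, rows):
        # A batch is sorted once and appended as a new part;
        # existing parts are not rewritten on insert.
        self.parts.append(sorted(rows))

    def merge_parts(self):
        # Stand-in for the background merge: combine all sorted parts into one.
        merged = list(heapq.merge(*self.parts))
        self.parts = [merged]
        return merged

table = ToyMergeTree()
table.insert_batch([(3, "c"), (1, "a")])
table.insert_batch([(4, "d"), (2, "b")])
print(len(table.parts), table.merge_parts())
```

Inserts stay cheap because they only append a new part; the cost of keeping everything in key order is paid later, in the merge step.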
2. Batch write
INSERT INTO cs_user_info (id,user_name,pass_word,phone,email) VALUES (1, ...), (2, ...), (3, ...);
3. Query statements
SELECT * FROM cs_user_info;
SELECT * FROM cs_user_info WHERE user_name='smile' AND pass_word='234';
SELECT * FROM cs_user_info WHERE id IN (1,2);
SELECT * FROM cs_user_info WHERE id=1 OR id=2 OR id=3;
Query statements are very similar to those used with a MySQL database.
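Since the configuration lists HTTP port 8123, a query like the ones above can also be sent over HTTP. The sketch below mainly builds the request URL with the standard library; the host name is a placeholder and the helper names are invented. Actually running the request assumes a reachable server that has the cs_user_info table.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

def build_query_url(sql, host="localhost", port=8123):
    """Build a URL that passes the SQL text as the 'query' parameter."""
    return f"http://{host}:{port}/?{urlencode({'query': sql})}"

def run_query(sql, host="localhost", port=8123):
    # Requires a running server; host and port here are placeholders.
    with urlopen(build_query_url(sql, host, port), timeout=5) as response:
        return response.read().decode("utf-8")

url = build_query_url("SELECT * FROM cs_user_info WHERE id IN (1,2)")
print(url)
```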
Source code address
GitHub address: https://github.com/cicadasmile/linux-system-base
GitEE address: https://gitee.com/cicadasmile/linux-system-base