This article is a summary of ClickHouse basics: what it is, when to use it, common SQL, default values, data types, and the main table engines.
ClickHouse is a columnar database management system (columnar DBMS) for online analytical processing (OLAP).
A traditional database works fine as long as the data volume is small, the indexes fit in memory, and the cache hit rate is high enough. But as the business grows this ideal situation eventually ends and queries become slower and slower. You can add more memory or buy faster disks (scale up), but that only postpones the underlying problem. If what you need is to get query results quickly over large data, ClickHouse may solve your problem.
Application scenarios:
1. The vast majority of requests are read accesses.
2. Data is updated in large batches (more than 1000 rows) rather than single rows, or is never updated at all.
3. Data is only appended to the database and does not need to be modified.
4. Reads extract a large number of rows from the database, but only a small subset of columns.
5. Tables are "wide", i.e. they contain a large number of columns.
6. Query frequency is relatively low (usually hundreds of queries per server per second or fewer).
7. For simple queries, a latency of around 50 milliseconds is acceptable.
8. Column values are fairly small: numbers and short strings (for example, 60 bytes per URL).
9. High throughput is required when processing a single query (up to billions of rows per second per server).
10. Transactions are not required.
11. Requirements on data consistency are low.
12. Each query touches one large table; all other tables in the query are small.
13. The query result is significantly smaller than the source data, i.e. the data is filtered or aggregated, and the result fits in the memory of a single server.
Accordingly, using ClickHouse has its own limitations:
1. No real delete/update support, and no transactions (hoped for in future versions).
2. No secondary indexes.
3. Limited SQL support; joins are implemented differently.
4. No window functions.
5. Metadata management requires manual intervention and maintenance.
Commonly used SQL syntax:

-- list databases
show databases;
-- list tables in a database
show tables;
-- create a database
create database test;
-- drop a table
drop table if exists test.t1;
-- create the first table
create /*temporary*/ table /*if not exists*/ test.m1 (
    id UInt16,
    name String
) ENGINE = Memory;
-- insert test data
insert into test.m1 (id, name) values (1, 'abc'), (2, 'bbbb');
-- query
select * from test.m1;
Default value
Every column in ClickHouse always has a default value. If it is not specified explicitly, it is derived from the column type:
Numeric types: 0
String: empty string
Array: empty array
Date: 0000-00-00
DateTime: 0000-00-00 00:00:00
Note: NULL is not supported.
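A minimal sketch of this behaviour, using a hypothetical test.defaults table: columns omitted from the insert simply get their type's default value.

-- hypothetical table, for illustration only
create table test.defaults (
    id UInt16,
    name String,
    tags Array(String),
    sdt Date
) ENGINE = Memory;

-- only id is supplied; the other columns fall back to their type defaults
insert into test.defaults (id) values (1);

select * from test.defaults;
-- expected along the lines of: 1, '', [], 0000-00-00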
Data types
1. Integers: UInt8, UInt16, UInt32, UInt64, Int8, Int16, Int32, Int64
Types starting with U are unsigned, with range 0 ~ 2^N - 1; the signed types range from -2^(N-1) to 2^(N-1) - 1.
2. Enumeration types: Enum8, Enum16
Enum('hello' = 1, 'test' = -1); an Enum maps to a signed integer, so negative values are allowed.
3. String types: FixedString(N), String
N is the maximum number of bytes, not the character length; a UTF-8 Chinese character occupies 3 bytes and a GBK character 2 bytes. String can replace types such as VARCHAR, BLOB and CLOB.
4. Date type: Date
5. Array type: Array(T)
T is any base type, including Array itself; multidimensional arrays are officially not recommended.
6. Tuple: Tuple
7. Nested structure: Nested(name1 Type1, name2 Type2, ...)
Similar to a nested map/struct.
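A small sketch combining several of these types in one table (the table name and values are illustrative, not from the original article):

create table test.types_demo (
    id UInt16,
    flag Enum8('hello' = 1, 'test' = -1),
    code FixedString(4),
    tags Array(String),
    sdt Date
) ENGINE = Memory;

insert into test.types_demo (id, flag, code, tags, sdt)
values (1, 'hello', 'ABCD', ['x', 'y'], '2018-06-01');

select * from test.types_demo;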
Materialized column
A column declared with a MATERIALIZED expression is a materialized column: its value cannot be supplied in an insert statement and is always computed from the expression. At the same time,
a materialized column does not appear in the result of select *:
drop table if exists test.m2;
create table test.m2 (
    a MATERIALIZED (b + 1),
    b UInt16
) ENGINE = Memory;
insert into test.m2 (b) values (1);
select * from test.m2;
select a, b from test.m2;
Expression column
An ALIAS expression column is similar to a materialized column in that its value cannot be supplied in an insert statement. The difference is that a materialized column actually stores the data (so queries do not need to recompute it),
while an ALIAS column stores nothing: the expression is evaluated every time it is queried, and the query simply returns the result of the expression.
create table test.m3 (
    a ALIAS (b + 1),
    b UInt16
) ENGINE = Memory;
insert into test.m3 (b) values (1);
select * from test.m3;
select a, b from test.m3;

Engines
Engines are the essence of ClickHouse's design.
TinyLog
The simplest engine: each column is stored as a separate file, the content is compressed, and indexes are not supported.
The engine has no concurrency control, so reading while a write is in progress returns errors, and concurrent writes corrupt the data.
Application scenarios:
a. Data that is essentially written only once
b. and afterwards only read
c. Not suitable for large amounts of data; officially, tables using this engine should hold at most about 1,000,000 rows.
drop table if exists test.tinylog;
create table test.tinylog (a UInt16, b UInt16) ENGINE = TinyLog;
insert into test.tinylog (a, b) values (7, 13);
At this point the directory /var/lib/clickhouse/data/test/tinylog that stores the data looks like:
├── a.bin
├── b.bin
└── sizes.json
a.bin and b.bin are the compressed column data, and sizes.json records the size of each *.bin file.
Log
This engine is basically the same as TinyLog.
Its improvement is an additional __marks.mrk file that records the offset of each data block.
One benefit is that read ranges can be split precisely, which makes concurrent reads possible.
However, it does not support concurrent writes, and a write operation blocks other reads and writes.
Log does not support indexes, and because of the extra redundant __marks.mrk data, the table becomes unusable if a problem occurs while writing.
Application scenarios:
Similar to TinyLog: temporary data that is written once and afterwards only read.
drop table if exists test.log;
create table test.log (a UInt16, b UInt16) ENGINE = Log;
insert into test.log (a, b) values (7, 13);
At this point the directory /var/lib/clickhouse/data/test/log that stores the data looks like:
├── __marks.mrk
├── a.bin
├── b.bin
└── sizes.json
Memory
With the Memory engine, data is stored in memory in uncompressed raw form and disappears when the server restarts.
It supports parallel reads, and the time reads and writes block each other is very short.
Indexes are not supported, but simple queries are extremely fast.
Application scenarios:
a. Testing
b. Scenarios that need very high performance with a data volume that is not too large (roughly up to 100,000,000 rows)
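A minimal sketch (illustrative table name) of the typical create/insert/select cycle with the Memory engine:

drop table if exists test.memory_demo;
create table test.memory_demo (id UInt16, name String) ENGINE = Memory;
insert into test.memory_demo (id, name) values (1, 'in-memory row');
select * from test.memory_demo;
-- note: the data is gone after the server restarts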
Merge
A utility engine that does not store data itself; it is only used to glue together multiple tables of a given database.
Reads can then run concurrently and can use the indexes of the underlying tables, but the engine does not support writes.
When specifying the engine you give the database and the tables to link: the database name may be an expression, and the table names are given as a regular expression.
create table test.tinylog1 (id UInt16, name String) ENGINE=TinyLog;
create table test.tinylog2 (id UInt16, name String) ENGINE=TinyLog;
create table test.tinylog3 (id UInt16, name String) ENGINE=TinyLog;
insert into test.tinylog1 (id, name) values (1, 'tinylog1');
insert into test.tinylog2 (id, name) values (2, 'tinylog2');
insert into test.tinylog3 (id, name) values (3, 'tinylog3');
use test;
create table test.merge (id UInt16, name String) ENGINE=Merge(currentDatabase(), '^tinylog[0-9]+');
select _table, * from test.merge order by id desc;
┌─_table───┬─id─┬─name─────┐
│ tinylog3 │  3 │ tinylog3 │
│ tinylog2 │  2 │ tinylog2 │
│ tinylog1 │  1 │ tinylog1 │
└──────────┴────┴──────────┘
Note: the _table column is a virtual column added by the Merge engine.
a. It shows which underlying table each row comes from, and it does not appear in the table definition.
b. select * does not include it.
Distributed
Similar to Merge, Distributed accesses several physical tables through one logical table. The engine is declared as:
Distributed(remote_group, database, table [, sharding_key])
where:
remote_group is the name of a server group defined under remote_servers in /etc/clickhouse-server/config.xml
database is the database name on each server
table is the table name
sharding_key is a sharding expression; it can be a column name or a function call such as rand(), and together with the weight in remote_servers it decides which shard a row is written to.
The remote_servers section of the configuration file:

<remote_servers>
    <log>
        <shard>
            <weight>1</weight>
            <internal_replication>false</internal_replication>
            <replica>
                <host>172.17.0.3</host>
                <port>9000</port>
            </replica>
        </shard>
        <shard>
            <weight>2</weight>
            <internal_replication>false</internal_replication>
            <replica>
                <host>172.17.0.4</host>
                <port>9000</port>
            </replica>
        </shard>
    </log>
</remote_servers>
log is the name of the shard group, i.e. the remote_group value used above.
shard is a fixed tag name.
weight is the shard weight, which works together with the sharding_key mentioned above.
Roughly speaking, with the configuration above, in theory:
the first shard is "chosen" with probability 1/(1+2) and the second with probability 2/(1+2), which is easy to understand. In practice sharding_key works by hit intervals over the total weight: the first shard covers the interval [0, 1) and the second covers [1, 3). For example, if sharding_key is the column id, then id=0 or id=3 always lands in the first shard; if sharding_key is rand(), the system normalizes the value accordingly and the choice becomes a matter of probability. A rough illustration follows below.
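A sketch of the interval rule under the assumed weights 1 and 2 (total weight 3): the key value modulo the total weight falls into [0, 1) for the first shard and [1, 3) for the second. This only reproduces the arithmetic by hand, it is not the engine's internal code.

-- illustrative only: interval rule for ids 0..5 with total weight 3
select
    number as id,
    if(number % 3 < 1, 1, 2) as shard
from numbers(6);
-- ids 0 and 3 map to shard 1; the rest map to shard 2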
internal_replication defines the write behaviour when a shard has multiple replicas.
If false, data is written to all replicas, but write consistency is not guaranteed, so over time the replicas may well diverge. If true, the data is written only to the first writable replica (the underlying table engine is expected to take care of the rest).
replica defines each redundant copy; its options include host, port, user, password, and so on.
Looking at a practical example: first create the physical table on both machines and insert some test data:
create table test.tinylog_d1 (id UInt16, name String) ENGINE=TinyLog;
insert into test.tinylog_d1 (id, name) values (1, 'Distributed record 1');
insert into test.tinylog_d1 (id, name) values (2, 'Distributed record 2');
Create a logical table in one of them:
create table test.tinylog_d (id UInt16, name String) ENGINE=Distributed(log, test, tinylog_d1, id);
-- insert data into the logical table and observe how it is distributed
insert into test.tinylog_d (id, name) values (0, 'main');
insert into test.tinylog_d (id, name) values (1, 'main');
insert into test.tinylog_d (id, name) values (2, 'main');
select name, sum(id), count(id) from test.tinylog_d group by name;
Note: writes through the logical table are asynchronous; data is first buffered on the local file system, and there is no strict handling of physical tables being unreachable, so a failed write can lose data.
Null
An empty engine: anything written to it is discarded, and reads always return an empty result.
Note, however, that although no data is stored, the table structure and data format constraints still apply as for a normal table, and you can still create views on such a table.
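A small sketch of this pattern (illustrative names): writes go into a Null table, while a materialized view on top of it keeps the derived data.

create table test.null_demo (id UInt16, name String) ENGINE = Null;

-- the view receives every insert even though the Null table keeps nothing
create materialized view test.null_demo_mv
ENGINE = Memory
as select id, name from test.null_demo;

insert into test.null_demo (id, name) values (1, 'kept only in the view');

select * from test.null_demo;     -- always empty
select * from test.null_demo_mv;  -- contains the inserted row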
Buffer
1. The Buffer engine behaves like a layer on top of Memory storage (it has no corresponding directory on disk).
2. It acts as a buffer: written data is first placed in the buffer, and when a threshold is reached it is automatically flushed to another, specified table.
3. Like Memory it has many restrictions, for example no indexes.
4. Buffer sits in front of another table; reads on it are also automatically applied to the underlying table. Because of the restrictions above, in general it is fine to read directly from the source table; with reasonable settings the small delay of data sitting in the buffer has little impact.
5. A Buffer table can also have no destination table, in which case the data is simply discarded once the threshold is reached.
Some characteristics:
If a single write is so large that it exceeds the max conditions, it is written directly to the source table.
When the source table is deleted or altered, it is recommended to drop and recreate the Buffer table.
On a "friendly" restart the buffered data is first flushed to the source table; on a forced restart the data in the Buffer table is lost.
Even with Buffer, many small writes are still much slower than one large write (the difference between thousands of rows per second and millions of rows per second).
-- create the source table
create table test.mergetree (sdt Date, id UInt16, name String, point UInt16) ENGINE=MergeTree(sdt, (id, name), 10);
-- create the Buffer table
-- Buffer(database, table, num_layers, min_time, max_time, min_rows, max_rows, min_bytes, max_bytes)
create table test.mergetree_buffer as test.mergetree ENGINE=Buffer(test, mergetree, 16, 3, 20, 2, 10, 1, 10000);
insert into test.mergetree (sdt, id, name, point) values ('2017-07-10', 1, 'a', 20);
insert into test.mergetree_buffer (sdt, id, name, point) values ('2017-07-10', 1, 'b', 10);
select * from test.mergetree;
select '-';
select * from test.mergetree_buffer;
database: the database name.
table: the source table; besides a string constant, an expression can be used.
num_layers: similar to the concept of "partitions"; the min/max thresholds below are tracked independently for each layer, and the officially recommended value is 16.
The min/max pairs are the flush thresholds for, respectively, time (seconds), number of rows and size (bytes).
The threshold rule is: flush when all min conditions are met, or when at least one max condition is met.
For the table above, the min conditions are: 3 seconds elapsed, 2 rows, 1 byte; a max condition is: 20 seconds, or 10 rows, or 10,000 bytes.
Set
The Set engine is a little special: it can only be used on the right-hand side of the IN operator, and you cannot select from it.
create table test.set (id UInt16, name String) ENGINE=Set;
insert into test.set (id, name) values (1, 'hello');
-- select 1 where (1, 'hello') in test.set;  -- a literal 1 defaults to UInt8, so a manual cast is needed
select 1 where (toUInt16(1), 'hello') in test.set;
Note: a Set table lives entirely in memory, but its data is also written to disk and loaded back into memory at startup; an abnormal interruption or forced restart may therefore lose data.
Join
TODO
MergeTree
This engine is the highlight of ClickHouse. It supports a two-level index consisting of a date and a set of primary key columns, updates data in real time, lets you customize the index granularity, and directly supports sampling.
MergeTree(EventDate, (CounterID, EventDate), 8192)
MergeTree(EventDate, intHash32(UserID), (CounterID, EventDate, intHash32(UserID)), 8192)
EventDate: the name of a Date column.
intHash32(UserID): the sampling expression.
(CounterID, EventDate): the primary key tuple (expressions are allowed in addition to column names), or a single expression.
8192: the granularity of the primary key index.
drop table if exists test.mergetree1;
create table test.mergetree1 (sdt Date, id UInt16, name String, cnt UInt16) ENGINE=MergeTree(sdt, (id, name), 10);
-- the date format is yyyy-mm-dd
insert into test.mergetree1 (sdt, id, name, cnt) values ('2018-06-01', 1, 'aaa', 10);
insert into test.mergetree1 (sdt, id, name, cnt) values ('2018-06-02', 4, 'bbb', 10);
insert into test.mergetree1 (sdt, id, name, cnt) values ('2018-06-03', 5, 'ccc', 11);
The directory structure of /var/lib/clickhouse/data/test/mergetree1 at this point:

├── 20180601_20180601_1_1_0
│   ├── checksums.txt
│   ├── columns.txt
│   ├── id.bin
│   ├── id.mrk
│   ├── name.bin
│   ├── name.mrk
│   ├── cnt.bin
│   ├── cnt.mrk
│   ├── primary.idx
│   ├── sdt.bin
│   └── sdt.mrk        -- the .mrk files store block offsets
├── 20180602_20180602_2_2_0
│   └── ...
├── 20180603_20180603_3_3_0
│   └── ...
├── format_version.txt
└── detached
ReplacingMergeTree
1. On top of MergeTree it adds handling of duplicate data, aimed at real-time data scenarios.
2. Compared with MergeTree, ReplacingMergeTree takes one extra trailing parameter, the "version column". Together with the primary key columns it determines which rows are duplicates and which one is the newest; the older ones are thrown away. This happens during merges, not when data is written, so until a merge runs the duplicate rows are stored and returned by queries as usual.
3. The primary key column tuple is what identifies duplicate rows.
-- the version column may be an integer of the UInt family, or Date, or DateTime
create table test.replacingmergetree (sdt Date, id UInt16, name String, cnt UInt16) ENGINE=ReplacingMergeTree(sdt, (name), 10, cnt);
insert into test.replacingmergetree (sdt, id, name, cnt) values ('2018-06-10', 1, 'a', 20);
insert into test.replacingmergetree (sdt, id, name, cnt) values ('2018-06-10', 1, 'a', 30);
insert into test.replacingmergetree (sdt, id, name, cnt) values ('2018-06-11', 1, 'a', 20);
insert into test.replacingmergetree (sdt, id, name, cnt) values ('2018-06-11', 1, 'a', 30);
insert into test.replacingmergetree (sdt, id, name, cnt) values ('2018-06-11', 1, 'a', 10);
select * from test.replacingmergetree;
-- if the merge has not run yet, trigger it manually
optimize table test.replacingmergetree;
┌────────sdt─┬─id─┬─name─┬─cnt─┐
│ 2018-06-11 │  1 │ a    │  30 │
└────────────┴────┴──────┴─────┘
SummingMergeTree
1. SummingMergeTree sums the data during the merge phase.
2. The columns to be summed can be specified explicitly; columns that are not specified and cannot be summed keep the first value that appears.
create table test.summingmergetree (sdt Date, name String, a UInt16, b UInt16) ENGINE=SummingMergeTree(sdt, (sdt, name), 8192, (a));
insert into test.summingmergetree (sdt, name, a, b) values ('2018-06-10', 'a', 1, 20);
insert into test.summingmergetree (sdt, name, a, b) values ('2018-06-10', 'b', 2, 11);
insert into test.summingmergetree (sdt, name, a, b) values ('2018-06-11', 'b', 3, 18);
insert into test.summingmergetree (sdt, name, a, b) values ('2018-06-11', 'b', 3, 82);
insert into test.summingmergetree (sdt, name, a, b) values ('2018-06-11', 'a', 3, 11);
insert into test.summingmergetree (sdt, name, a, b) values ('2018-06-12', 'c', 1, 35);
-- manually trigger the merge
optimize table test.summingmergetree;
select * from test.summingmergetree;
┌────────sdt─┬─name─┬─a─┬──b─┐
│ 2018-06-10 │ a    │ 1 │ 20 │
│ 2018-06-10 │ b    │ 2 │ 11 │
│ 2018-06-11 │ a    │ 3 │ 11 │
│ 2018-06-11 │ b    │ 6 │ 18 │
│ 2018-06-12 │ c    │ 1 │ 35 │
└────────────┴──────┴───┴────┘
Note: a summed column must not be part of the primary key, and if all summable values of a group of duplicate rows end up as zero/null, that row is deleted during the merge.
AggregatingMergeTree
AggregatingMergeTree builds on MergeTree and is designed to optimize incremental computation of aggregate function results: during merges it pre-aggregates the data by primary key.
The aggregate functions used with AggregatingMergeTree are not the ordinary sum, uniq, etc., but their -State and -Merge variants such as sumState, uniqState and sumMerge, uniqMerge.
1. Pre-calculation of aggregated data
It is a "space for time" trade-off, and the cost is a reduction in dimensions.
Suppose the original table has three dimensions and one measure on which a count is needed:

dim1   dim2   dim3   measure1
aaaa   a      1      1
aaaa   a      1      1
aaaa   b      2      1
bbbb   b      3      1
cccc   b      2      1
cccc   c      1      1
dddd   c      2      1
dddd   a      1      1

Removing the dim1 dimension and aggregating measure1 with count gives:

dim2   dim3   count(measure1)
a      1      3
b      2      2
b      3      1
c      1      1
c      2      1
2. Incremental calculation of aggregated data
For a table with the AggregatingMergeTree engine you cannot use a normal INSERT to add data. You can either:
a. use INSERT SELECT to insert data, or
b. more commonly, create a materialized view on top of an ordinary table.
drop table if exists test.aggregatingmergetree;
create table test.aggregatingmergetree (sdt Date, dim1 String, dim2 String, dim3 String, measure1 UInt64) ENGINE=MergeTree(sdt, (sdt, dim1, dim2, dim3), 8192);

-- create a materialized view using AggregatingMergeTree
drop table if exists test.aggregatingmergetree_view;
create materialized view test.aggregatingmergetree_view
ENGINE = AggregatingMergeTree(sdt, (dim2, dim3), 8192)
as select sdt, dim2, dim3, uniqState(dim1) as uv
from test.aggregatingmergetree
group by sdt, dim2, dim3;

insert into test.aggregatingmergetree (sdt, dim1, dim2, dim3, measure1) values ('2018-06-10', 'aaaa', 'a', '10', 1);
insert into test.aggregatingmergetree (sdt, dim1, dim2, dim3, measure1) values ('2018-06-10', 'aaaa', 'a', '10', 1);
insert into test.aggregatingmergetree (sdt, dim1, dim2, dim3, measure1) values ('2018-06-10', 'aaaa', 'b', '20', 1);
insert into test.aggregatingmergetree (sdt, dim1, dim2, dim3, measure1) values ('2018-06-10', 'bbbb', 'b', '30', 1);
insert into test.aggregatingmergetree (sdt, dim1, dim2, dim3, measure1) values ('2018-06-10', 'cccc', 'b', '20', 1);
insert into test.aggregatingmergetree (sdt, dim1, dim2, dim3, measure1) values ('2018-06-10', 'cccc', 'c', '10', 1);
insert into test.aggregatingmergetree (sdt, dim1, dim2, dim3, measure1) values ('2018-06-10', 'dddd', 'c', '20', 1);
insert into test.aggregatingmergetree (sdt, dim1, dim2, dim3, measure1) values ('2018-06-10', 'dddd', 'a', '10', 1);

-- aggregate count(measure1) by dim2 and dim3
select dim2, dim3, count(measure1) from test.aggregatingmergetree group by dim2, dim3;
-- aggregate uv by dim2
select dim2, uniq(dim1) from test.aggregatingmergetree group by dim2;

-- manually trigger the merge
OPTIMIZE TABLE test.aggregatingmergetree_view;
select * from test.aggregatingmergetree_view;

-- query uv from the materialized view
select dim2, uniqMerge(uv) from test.aggregatingmergetree_view group by dim2 order by dim2;
CollapsingMergeTree
It is designed for a "flexible" way of maintaining state in OLAP scenarios: given that data cannot be changed, let alone deleted, the effect of old data can be cancelled out by writing compensating ("negative") rows, so that questions about the latest state, such as "how many people are online right now?", can still be answered.
This "append instead of delete" incremental style makes aggregate computation convenient, but the cost is roughly double the storage space, and for scenarios that only care about the latest state the intermediate data is useless.
Creating a CollapsingMergeTree table is basically the same as MergeTree, except for one extra final parameter: the sign column (which must be of type Int8).
create table test.collapsingmergetree (sign Int8, sdt Date, name String, cnt UInt16) ENGINE=CollapsingMergeTree(sdt, (sdt, name), 8192, sign);
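A hedged sketch of how the sign column is typically used (the rows and values are illustrative, not from the original article): a row with sign = -1 cancels an earlier row with the same primary key and sign = 1, and the pair collapses away during a merge.

-- state at time T1: 'a' is present (sign = 1)
insert into test.collapsingmergetree (sign, sdt, name, cnt) values (1, '2018-06-10', 'a', 5);
-- later: cancel the old state with sign = -1 and write the new state
insert into test.collapsingmergetree (sign, sdt, name, cnt) values (-1, '2018-06-10', 'a', 5);
insert into test.collapsingmergetree (sign, sdt, name, cnt) values (1, '2018-06-10', 'a', 8);

-- aggregate using sign so that rows not yet collapsed still give the right answer
select name, sum(cnt * sign) from test.collapsingmergetree group by name;

-- or force the collapse and inspect what is left
optimize table test.collapsingmergetree;
select * from test.collapsingmergetree;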