
What is the table engine of ClickHouse?


This article mainly explains "what is the table engine of ClickHouse". The methods introduced here are simple, fast, and practical. Interested readers may wish to follow along and learn with the editor.

What is the role of the table engine?

The table engine determines:

- where the table's data is stored, and how
- which queries are supported, and how they execute
- whether concurrent data access is possible
- whether indexes can be used
- whether multithreaded request execution is possible
- the parameters of data replication

Table engine classification:

Engine family | Engines
MergeTree series | MergeTree, ReplacingMergeTree, SummingMergeTree, AggregatingMergeTree, CollapsingMergeTree, VersionedCollapsingMergeTree, GraphiteMergeTree
Log series | TinyLog, StripeLog, Log
Integration engines | Kafka, MySQL, ODBC, JDBC, HDFS
Special engines | Distributed, MaterializedView, Dictionary, Merge, File, Null, Set, Join, URL, View, Memory, Buffer

Application scenarios of the Log series table engines

The Log series table engines are relatively simple. They are mainly used to quickly write small tables (up to about 1 million rows) and then read them back in full. That is, data is written once and read many times.

Common characteristics of the Log series table engines:

- Data is stored on disk.
- New data is appended to the end of the file.
- Concurrent reads and writes are not supported: while data is being written to the table, queries against it are blocked until the write finishes.
- Indexes are not supported.
- Writes are not atomic: if something interrupts a write (for example, an abnormal server shutdown), the table may be left with corrupted data.
- ALTER operations (mutations that modify table data, such as DELETE and UPDATE) are not supported.

TinyLog

TinyLog is the simplest and lowest-performance engine of the Log family. Its storage structure consists of two parts: data files and metadata. The data files are stored column by column, that is, each column corresponds to its own file. In addition, TinyLog does not support concurrent data reads.

StripeLog supports concurrent reads of the data file: ClickHouse reads with multiple threads, each thread processing a separate block of data. In addition, StripeLog stores all columns in the same file, reducing the number of files used.

Log also supports concurrent reads of the data files, again with one thread per data block. The Log engine stores each column in a separate file.

Use of TinyLog table engine

This engine is suitable for write-once, read-many scenarios, for example as intermediate tables processing small batches of data. Note that storing data in a large number of small tables performs very poorly.

CREATE TABLE emp_tinylog (
    emp_id UInt16 COMMENT 'employee id',
    name String COMMENT 'employee name',
    work_place String COMMENT 'workplace',
    age UInt8 COMMENT 'employee age',
    depart String COMMENT 'department',
    salary Decimal32(2) COMMENT 'wages'
) ENGINE = TinyLog();

INSERT INTO emp_tinylog
VALUES (1, 'tom', 'Shanghai', 25, 'Technology Department', 20000), (2, 'jack', 'Shanghai', 26, 'Personnel Department', 10000);

INSERT INTO emp_tinylog
VALUES (3, 'bob', 'Beijing', 33, 'Finance Department', 50000), (4, 'tony', 'Hangzhou', 28, 'Sales Department', 50000);

Enter the default data storage directory and look at the underlying data storage form. You can see the file corresponding to each column of the TinyLog engine table.

[root@cdh04 emp_tinylog]# pwd
/var/lib/clickhouse/data/default/emp_tinylog
[root@cdh04 emp_tinylog]# ll
total 28
-rw-r----- 1 clickhouse clickhouse 56 Sep 17 14:33 age.bin
-rw-r----- 1 clickhouse clickhouse 97 Sep 17 14:33 depart.bin
-rw-r----- 1 clickhouse clickhouse 60 Sep 17 14:33 emp_id.bin
-rw-r----- 1 clickhouse clickhouse 70 Sep 17 14:33 name.bin
-rw-r----- 1 clickhouse clickhouse 68 Sep 17 14:33 salary.bin
-rw-r----- 1 clickhouse clickhouse 18 Sep 17 14:33 sizes.json
-rw-r----- 1 clickhouse clickhouse 80 Sep 17 14:33 work_place.bin

## view sizes.json
## sizes.json records, in JSON format, the data size of each .bin file

{
    "yandex": {
        "age%2Ebin": { "size": "56" },
        "depart%2Ebin": { "size": "97" },
        "emp_id%2Ebin": { "size": "60" },
        "name%2Ebin": { "size": "70" },
        "salary%2Ebin": { "size": "68" },
        "work_place%2Ebin": { "size": "80" }
    }
}

An error occurs when we perform an ALTER operation, showing that this table engine does not support ALTER:

-- the following operations will report an error:
ALTER TABLE emp_tinylog DELETE WHERE emp_id = 5;
ALTER TABLE emp_tinylog UPDATE age = 30 WHERE emp_id = 4;
-- DB::Exception: Mutations are not supported by storage TinyLog.

Use of the StripeLog table engine

StripeLog offers higher query performance than TinyLog (it has a .mrk mark file and supports parallel queries) and uses fewer file descriptors (all data is kept in a single file).

CREATE TABLE emp_stripelog (
    emp_id UInt16 COMMENT 'employee id',
    name String COMMENT 'employee name',
    work_place String COMMENT 'workplace',
    age UInt8 COMMENT 'employee age',
    depart String COMMENT 'department',
    salary Decimal32(2) COMMENT 'wages'
) ENGINE = StripeLog;

-- insert data
INSERT INTO emp_stripelog
VALUES (1, 'tom', 'Shanghai', 25, 'Technology Department', 20000), (2, 'jack', 'Shanghai', 26, 'Personnel Department', 10000);

INSERT INTO emp_stripelog
VALUES (3, 'bob', 'Beijing', 33, 'Finance Department', 50000), (4, 'tony', 'Hangzhou', 28, 'Sales Department', 50000);

-- query data

-- because the data is inserted twice, there are two blocks when querying

Cdh04:) select * from emp_stripelog

SELECT *

FROM emp_stripelog

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ─┬─── salary ─┐

│ 1 │ tom │ Shanghai │ 25 │ Technology Department │ 20000.00 │

│ 2 │ jack │ Shanghai │ 26 │ personnel Department │ 10000.00 │

└─┴─┘

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ───┬─── salary ─┐

│ 3 │ bob │ Beijing │ 33 │ Finance Department │ 50000.00 │

│ 4 │ tony │ Hangzhou │ 28 │ sales Department │ 50000.00 │

└─┴─┘

Enter the default data storage directory and view the underlying data storage form

[root@cdh04 emp_stripelog]# pwd
/var/lib/clickhouse/data/default/emp_stripelog
[root@cdh04 emp_stripelog]# ll
total 12
-rw-r----- 1 clickhouse clickhouse 673 Sep 17 15:11 data.bin
-rw-r----- 1 clickhouse clickhouse 281 Sep 17 15:11 index.mrk
-rw-r----- 1 clickhouse clickhouse  69 Sep 17 15:11 sizes.json

You can see that the storage structure corresponding to the StripeLog table engine consists of three files:

- data.bin: the data file. All columns are stored in this single file, and all writes go to data.bin.
- index.mrk: data marks, storing the offsets in data.bin of each inserted data block. Using the marks, multiple threads can read the compressed data blocks of data.bin in parallel, improving query performance.
- sizes.json: a metadata file recording the sizes of data.bin and index.mrk.

Tip:

The StripeLog engine stores all the data in a file, and for each INSERT operation, ClickHouse appends the data block to the end of the table file.

The StripeLog engine also does not support ALTER UPDATE and ALTER DELETE operations.
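It is the index.mrk mark file that makes those parallel reads possible. As a rough, hypothetical illustration (max_threads is a standard ClickHouse query-level setting, not something from the original article), you can cap the read parallelism of a single query:

SELECT * FROM emp_stripelog SETTINGS max_threads = 4;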

Use of Log table engine

Log engine tables are suitable for temporary data, write-once scenarios, and testing. The Log engine combines the strengths of TinyLog and StripeLog and is the highest-performance engine of the Log family.

CREATE TABLE emp_log (
    emp_id UInt16 COMMENT 'employee id',
    name String COMMENT 'employee name',
    work_place String COMMENT 'workplace',
    age UInt8 COMMENT 'employee age',
    depart String COMMENT 'department',
    salary Decimal32(2) COMMENT 'wages'
) ENGINE = Log;

INSERT INTO emp_log VALUES (1, 'tom', 'Shanghai', 25, 'Technology Department', 20000), (2, 'jack', 'Shanghai', 26, 'Personnel Department', 10000);

INSERT INTO emp_log VALUES (3, 'bob', 'Beijing', 33, 'Finance Department', 50000), (4, 'tony', 'Hangzhou', 28, 'Sales Department', 50000);

-- query data

-- because the data is inserted twice, there are two blocks when querying

Cdh04:) select * from emp_log

SELECT *

FROM emp_log

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ─┬─── salary ─┐

│ 1 │ tom │ Shanghai │ 25 │ Technology Department │ 20000.00 │

│ 2 │ jack │ Shanghai │ 26 │ personnel Department │ 10000.00 │

└─┴─┘

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ───┬─── salary ─┐

│ 3 │ bob │ Beijing │ 33 │ Finance Department │ 50000.00 │

│ 4 │ tony │ Hangzhou │ 28 │ sales Department │ 50000.00 │

└─┴─┘

Enter the default data storage directory and view the underlying data storage form

[root@cdh04 emp_log]# pwd
/var/lib/clickhouse/data/default/emp_log
[root@cdh04 emp_log]# ll
total 32
-rw-r----- 1 clickhouse clickhouse  56 Sep 17 15:55 age.bin
-rw-r----- 1 clickhouse clickhouse  97 Sep 17 15:55 depart.bin
-rw-r----- 1 clickhouse clickhouse  60 Sep 17 15:55 emp_id.bin
-rw-r----- 1 clickhouse clickhouse 192 Sep 17 15:55 __marks.mrk
-rw-r----- 1 clickhouse clickhouse  70 Sep 17 15:55 name.bin
-rw-r----- 1 clickhouse clickhouse  68 Sep 17 15:55 salary.bin
-rw-r----- 1 clickhouse clickhouse  16 Sep 17 15:55 sizes.json
-rw-r----- 1 clickhouse clickhouse  80 Sep 17 15:55 work_place.bin

The storage structure of the Log engine consists of three parts:

- [column].bin: data files; each column is stored in its own file.
- __marks.mrk: data marks. The offsets of the data blocks in every .bin file are stored together in this one file, so multiple threads can read the compressed blocks of the .bin files in parallel, improving query performance.
- sizes.json: records the sizes of the .bin files and of __marks.mrk.

Tip:

The Log table engine stores each column in its own file and writes each INSERT operation as a separate data block.

MergeTree series engine

Among all table engines, the MergeTree family is the core: these engines are the most powerful and the most widely used. The other, non-MergeTree engines mainly serve special purposes, with relatively limited scenarios. The MergeTree family is the official primary storage engine and supports almost all of ClickHouse's core features.

MergeTree Table engine

When MergeTree writes a batch of data, the data is always written to disk as an immutable data part. To keep the number of parts from growing too large, ClickHouse periodically merges them with background threads, combining parts that belong to the same partition into a new, larger part. This continual merging of data parts is the origin of the name "merge tree".

As the most basic table engine of the family, MergeTree has the following characteristics:

- Data is stored sorted by primary key, which allows sparse indexes to be created to speed up queries.
- Partitioning is supported; the partition key is specified with the PARTITION BY clause.
- Data replication is supported.
- Data sampling is supported.

Table syntax:

CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
    name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1] [TTL expr1],
    name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2] [TTL expr2],
    ...
    INDEX index_name1 expr1 TYPE type1(...) GRANULARITY value1,
    INDEX index_name2 expr2 TYPE type2(...) GRANULARITY value2
) ENGINE = MergeTree()
ORDER BY expr
[PARTITION BY expr]
[PRIMARY KEY expr]
[SAMPLE BY expr]
[TTL expr [DELETE | TO DISK 'xxx' | TO VOLUME 'xxx'], ...]
[SETTINGS name=value, ...]

- ENGINE: ENGINE = MergeTree(); the MergeTree engine takes no parameters.
- ORDER BY: the sort key, e.g. ORDER BY (col1, col2). Required. Note that if no primary key is specified, the sort key is used as the primary key by default. If no sorting is needed, you can use ORDER BY tuple(); the table then has no primary key and ClickHouse stores the data in insertion order.
- PARTITION BY: the partition key. Optional.
- PRIMARY KEY: the primary key. If the sort key differs from the primary key, the primary key can be specified separately; otherwise the primary key defaults to the sort key. Optional.
- SAMPLE BY: the sampling expression; if specified, it must also be contained in the primary key, e.g. SAMPLE BY intHash32(UserID) with ORDER BY (CounterID, EventDate, intHash32(UserID)). Optional.
- TTL: the time-to-live of the data. In MergeTree, a TTL can be set on a single column or on the whole table. When the time arrives, a column-level TTL deletes that column's data, while a table-level TTL deletes the whole table's data. Optional.
- SETTINGS: additional parameters. Optional.

Build an example:

CREATE TABLE emp_mergetree (
    emp_id UInt16 COMMENT 'employee id',
    name String COMMENT 'employee name',
    work_place String COMMENT 'workplace',
    age UInt8 COMMENT 'employee age',
    depart String COMMENT 'department',
    salary Decimal32(2) COMMENT 'wages'
) ENGINE = MergeTree()
ORDER BY emp_id
PARTITION BY work_place;

-- insert data
INSERT INTO emp_mergetree
VALUES (1, 'tom', 'Shanghai', 25, 'Technology Department', 20000), (2, 'jack', 'Shanghai', 26, 'Personnel Department', 10000);

INSERT INTO emp_mergetree
VALUES (3, 'bob', 'Beijing', 33, 'Finance Department', 50000), (4, 'tony', 'Hangzhou', 28, 'Sales Department', 50000);

-- query data

-- Partition by work_place

Cdh04:) select * from emp_mergetree

SELECT *

FROM emp_mergetree

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ─┬─── salary ─┐

│ 3 │ bob │ Beijing │ 33 │ Finance Department │ 50000.00 │

└─┴─┘

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ─┬─── salary ─┐

│ 1 │ tom │ Shanghai │ 25 │ Technology Department │ 20000.00 │

│ 2 │ jack │ Shanghai │ 26 │ personnel Department │ 10000.00 │

└─┴─┘

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ───┬─── salary ─┐

│ 4 │ tony │ Hangzhou │ 28 │ sales Department │ 50000.00 │

└─┴─┘

Looking at the data storage format, you can see that there are three partition folders, each of which stores data for the corresponding partition.

[root@cdh04 emp_mergetree]# pwd
/var/lib/clickhouse/data/default/emp_mergetree
[root@cdh04 emp_mergetree]# ll
total 16
drwxr-x--- 2 clickhouse clickhouse 4096 Sep 17 17:45 1c89a3ba9fe5fd53379716a776c5ac34_3_3_0
drwxr-x--- 2 clickhouse clickhouse 4096 Sep 17 17:44 40d45822dbd7fa81583d715338929da9_1_1_0
drwxr-x--- 2 clickhouse clickhouse 4096 Sep 17 17:45 a6155dcc1997eda1a348cd98b17a93e9_2_2_0
drwxr-x--- 2 clickhouse clickhouse    6 Sep 17 17:43 detached
-rw-r----- 1 clickhouse clickhouse    1 Sep 17 17:43 format_version.txt

Go to a partition directory to view

checksums.txt: a checksum file, stored in binary format. It records the sizes of the other files (primary.idx, count.txt, etc.) and hashes of those sizes, used to quickly verify file integrity and correctness.

columns.txt: a column information file, stored in plain text. It saves the column definitions of this data partition, for example:

[root@cdh04 1c89a3ba9fe5fd53379716a776c5ac34_3_3_0]# cat columns.txt
columns format version: 1
6 columns:
`emp_id` UInt16
`name` String
`work_place` String
`age` UInt8
`depart` String
`salary` Decimal(9, 2)

count.txt: a count file, stored in plain text, recording the total number of rows in the current data partition directory.

primary.idx: the primary (first-level) index file, stored in binary format, holding the sparse index. A MergeTree table can declare only one primary index, specified via ORDER BY or PRIMARY KEY. With the sparse index, data outside the range of a query's primary key conditions can be skipped, effectively reducing the amount of data scanned and speeding up the query.

[column].bin: data files, stored in compressed format (LZ4 by default), each holding the data of one column. Because MergeTree stores data by column, every column field has its own .bin file named after it.

[column].mrk2: column mark files, stored in binary format. They save the offsets of the column's data within its .bin file.

partition.dat and minmax_[column].idx: if a partition key is specified, an additional partition.dat file and minmax index files are generated, both in binary format. partition.dat saves the value the partition expression produces for this partition (the partition field value), while the minmax index records the minimum and maximum raw values of the partition field within the partition. For example, if the EventTime values range from 2020-09-17 to 2020-09-30 and the partition expression is PARTITION BY toYYYYMM(EventTime) (partitioning by month), partition.dat stores 2020-09, and the minmax index stores 2020-09-17 and 2020-09-30.
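As a minimal sketch of the monthly partitioning just described (a hypothetical events table, not from the original):

CREATE TABLE events (
    EventTime DateTime,
    UserID UInt32
) ENGINE = MergeTree()
PARTITION BY toYYYYMM(EventTime)
ORDER BY (EventTime, UserID);

Each distinct toYYYYMM(EventTime) value, e.g. 2020-09, becomes one partition directory with its own partition.dat and minmax_EventTime.idx files.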

Note that each insert creates a new data part. Let's insert two new rows:

cdh04 :) INSERT INTO emp_mergetree
VALUES (5, 'robin', 'Beijing', 35, 'Finance Department', 50000), (6, 'lilei', 'Beijing', 38, 'Sales Department', 50000);

-- query results

Cdh04:) select * from emp_mergetree

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ─┬─── salary ─┐

│ 3 │ bob │ Beijing │ 33 │ Finance Department │ 50000.00 │

└─┴─┘

┌─ emp_id ─┬─ name ──┬─ work_place ─┬─ age ─┬─ depart ───┬─── salary ─┐

│ 5 │ robin │ Beijing │ 35 │ Finance Department │ 50000.00 │

│ 6 │ lilei │ Beijing │ 38 │ sales Department │ 50000.00 │

└─┴─┘

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ─┬─── salary ─┐

│ 1 │ tom │ Shanghai │ 25 │ Technology Department │ 20000.00 │

│ 2 │ jack │ Shanghai │ 26 │ personnel Department │ 10000.00 │

└─┴─┘

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ───┬─── salary ─┐

│ 4 │ tony │ Hangzhou │ 28 │ sales Department │ 50000.00 │

└─┴─┘

As you can see, the newly inserted rows form a new data part and are not stored together with the existing data of the partition. We can run the OPTIMIZE command to force a merge.

-- perform the merge operation
cdh04 :) OPTIMIZE TABLE emp_mergetree PARTITION 'Beijing';

-- execute the query again
cdh04 :) select * from emp_mergetree;

SELECT *

FROM emp_mergetree

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ─┬─── salary ─┐

│ 1 │ tom │ Shanghai │ 25 │ Technology Department │ 20000.00 │

│ 2 │ jack │ Shanghai │ 26 │ personnel Department │ 10000.00 │

└─┴─┘

┌─ emp_id ─┬─ name ──┬─ work_place ─┬─ age ─┬─ depart ───┬─── salary ─┐

│ 3 │ bob │ Beijing │ 33 │ Finance Department │ 50000.00 │

│ 5 │ robin │ Beijing │ 35 │ Finance Department │ 50000.00 │

│ 6 │ lilei │ Beijing │ 38 │ sales Department │ 50000.00 │

└─┴─┘

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ───┬─── salary ─┐

│ 4 │ tony │ Hangzhou │ 28 │ sales Department │ 50000.00 │

└─┴─┘

After the merge operation above, a new directory is generated for the merged partition; the old partition directories remain until they are cleaned up in the background.

In MergeTree, the primary key is not used for deduplication but as an index to speed up queries.

-- insert a row with an existing primary key
INSERT INTO emp_mergetree
VALUES (1, 'sam', 'Hangzhou', 35, 'Finance Department', 50000);

You will find that the row is inserted successfully: MergeTree does not deduplicate by primary key.
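A quick way to confirm this (a hypothetical follow-up query, not in the original article):

-- both rows with emp_id = 1 are still present
SELECT emp_id, count() AS cnt
FROM emp_mergetree
GROUP BY emp_id
HAVING cnt > 1;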

ReplacingMergeTree Table engine

As mentioned above, the MergeTree engine cannot deduplicate rows with the same primary key. For this, ClickHouse provides the ReplacingMergeTree engine, which removes rows with the same sort key when partitions are merged. Note that ReplacingMergeTree only alleviates the duplication problem to some extent; it does not fully guarantee the absence of duplicates.

Table syntax:

CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
    name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
    name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2],
    ...
) ENGINE = ReplacingMergeTree([ver])
[PARTITION BY expr]
[ORDER BY expr]
[PRIMARY KEY expr]
[SAMPLE BY expr]
[SETTINGS name=value, ...]

[ver]: an optional parameter, the version column, which can be a field of type UInt, Date, or DateTime. It determines how duplicates are resolved: when [ver] is not specified, the most recently inserted row is kept; when it is specified, the row with the largest version value is kept.

Build an example:

CREATE TABLE emp_replacingmergetree (
    emp_id UInt16 COMMENT 'employee id',
    name String COMMENT 'employee name',
    work_place String COMMENT 'workplace',
    age UInt8 COMMENT 'employee age',
    depart String COMMENT 'department',
    salary Decimal32(2) COMMENT 'wages'
) ENGINE = ReplacingMergeTree()
ORDER BY emp_id
PRIMARY KEY emp_id
PARTITION BY work_place;

-- insert data
INSERT INTO emp_replacingmergetree
VALUES (1, 'tom', 'Shanghai', 25, 'Technology Department', 20000), (2, 'jack', 'Shanghai', 26, 'Personnel Department', 10000);

INSERT INTO emp_replacingmergetree
VALUES (3, 'bob', 'Beijing', 33, 'Finance Department', 50000), (4, 'tony', 'Hangzhou', 28, 'Sales Department', 50000);

Note:

When we insert a row with the same primary key into the table again, observe how the query results change.

INSERT INTO emp_replacingmergetree
VALUES (1, 'tom', 'Shanghai', 25, 'Technology Department', 50000);

-- query data. Because there is no merge, there is data with duplicate primary keys.

Cdh04:) select * from emp_replacingmergetree

SELECT *

FROM emp_replacingmergetree

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ─┬─── salary ─┐

│ 1 │ tom │ Shanghai │ 25 │ Technology Department │ 20000.00 │

│ 2 │ jack │ Shanghai │ 26 │ personnel Department │ 10000.00 │

└─┴─┘

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ─┬─── salary ─┐

│ 3 │ bob │ Beijing │ 33 │ Finance Department │ 50000.00 │

└─┴─┘

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ───┬─── salary ─┐

│ 4 │ tony │ Hangzhou │ 28 │ sales Department │ 50000.00 │

└─┴─┘

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ─┬─── salary ─┐

│ 1 │ tom │ Shanghai │ 25 │ Technology Department │ 50000.00 │

└─┴─┘

-- perform the merge operation
OPTIMIZE TABLE emp_replacingmergetree FINAL;

-- query again: for rows with the same sort key, the most recently inserted row is kept and the old rows are removed

Cdh04:) select * from emp_replacingmergetree

SELECT *

FROM emp_replacingmergetree

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ─┬─── salary ─┐

│ 1 │ tom │ Shanghai │ 25 │ Technology Department │ 50000.00 │

│ 2 │ jack │ Shanghai │ 26 │ personnel Department │ 10000.00 │

└─┴─┘

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ───┬─── salary ─┐

│ 4 │ tony │ Hangzhou │ 28 │ sales Department │ 50000.00 │

└─┴─┘

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ─┬─── salary ─┐

│ 3 │ bob │ Beijing │ 33 │ Finance Department │ 50000.00 │

└─┴─┘

The example above shows that ReplacingMergeTree supports deduplication, but on what basis does it deduplicate? The answer: ReplacingMergeTree deduplicates on the ORDER BY sort key, not the PRIMARY KEY. Let's look at another example:

CREATE TABLE emp_replacingmergetree1 (
    emp_id UInt16 COMMENT 'employee id',
    name String COMMENT 'employee name',
    work_place String COMMENT 'workplace',
    age UInt8 COMMENT 'employee age',
    depart String COMMENT 'department',
    salary Decimal32(2) COMMENT 'wages'
) ENGINE = ReplacingMergeTree()
ORDER BY (emp_id, name)  -- note: the sort key has two fields
PRIMARY KEY emp_id       -- the primary key is one field
PARTITION BY work_place;

-- insert data
INSERT INTO emp_replacingmergetree1
VALUES (1, 'tom', 'Shanghai', 25, 'Technology Department', 20000), (2, 'jack', 'Shanghai', 26, 'Personnel Department', 10000);

INSERT INTO emp_replacingmergetree1
VALUES (3, 'bob', 'Beijing', 33, 'Finance Department', 50000), (4, 'tony', 'Hangzhou', 28, 'Sales Department', 50000);

Insert the same emp_id and name data into the table again, perform the merge operation, and then observe the data

-- insert data
INSERT INTO emp_replacingmergetree1
VALUES (1, 'tom', 'Shanghai', 25, 'Technology Department', 50000), (1, 'sam', 'Shanghai', 25, 'Technology Department', 20000);

-- perform the merge operation
OPTIMIZE TABLE emp_replacingmergetree1 FINAL;

Querying again, you can see that rows with the same emp_id and name were deduplicated, but rows sharing only the primary key emp_id were not.

-- ReplacingMergeTree deduplicates on the ORDER BY sort key, not the PRIMARY KEY

Cdh04:) select * from emp_replacingmergetree1

SELECT *

FROM emp_replacingmergetree1

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ─┬─── salary ─┐

│ 3 │ bob │ Beijing │ 33 │ Finance Department │ 50000.00 │

└─┴─┘

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ─┬─── salary ─┐

│ 1 │ sam │ Shanghai │ 25 │ Technology Department │ 20000.00 │

│ 1 │ tom │ Shanghai │ 25 │ Technology Department │ 50000.00 │

│ 2 │ jack │ Shanghai │ 26 │ personnel Department │ 10000.00 │

└─┴─┘

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ───┬─── salary ─┐

│ 4 │ tony │ Hangzhou │ 28 │ sales Department │ 50000.00 │

└─┴─┘

So far we know that ReplacingMergeTree supports deduplication on the ORDER BY sort key. A careful reader will notice that all the duplicates above were in one partition. What happens when the duplicated rows are not in the same partition? Let's insert a duplicate row into emp_replacingmergetree1 that lands in a different partition:

-- insert data
INSERT INTO emp_replacingmergetree1
VALUES (1, 'tom', 'Beijing', 26, 'Technology Department', 10000);

-- perform the merge operation
OPTIMIZE TABLE emp_replacingmergetree1 FINAL;

-- query again
-- notice that 1 │ tom │ Beijing │ 26 │ Technology Department │ 10000.00
-- and         1 │ tom │ Shanghai │ 25 │ Technology Department │ 50000.00
-- remain duplicated, because the two rows are not in the same partition

This is because ReplacingMergeTree removes duplicates partition by partition: only duplicates within the same data partition are deleted, while duplicates across different data partitions still cannot be removed.

Cdh04:) select * from emp_replacingmergetree1

SELECT *

FROM emp_replacingmergetree1

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ─┬─── salary ─┐

│ 1 │ tom │ Beijing │ 26 │ Technology Department │ 10000.00 │

│ 3 │ bob │ Beijing │ 33 │ Finance Department │ 50000.00 │

└─┴─┘

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ─┬─── salary ─┐

│ 1 │ sam │ Shanghai │ 25 │ Technology Department │ 20000.00 │

│ 1 │ tom │ Shanghai │ 25 │ Technology Department │ 50000.00 │

│ 2 │ jack │ Shanghai │ 26 │ personnel Department │ 10000.00 │

└─┴─┘

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ───┬─── salary ─┐

│ 4 │ tony │ Hangzhou │ 28 │ sales Department │ 50000.00 │

└─┴─┘

Summary. On what basis are duplicates judged?

ReplacingMergeTree deduplicates on the ORDER BY sort key, not the PRIMARY KEY.

When are duplicates deleted?

Deduplication is triggered when partitions are merged. The merge runs in the background at an unpredictable time, unless OPTIMIZE is executed manually.

Duplicates in different partitions are not deduplicated.

ReplacingMergeTree removes duplicates partition by partition: only duplicates within the same data partition are deleted, while duplicates across different partitions remain.

What is the deduplication strategy?

If the [ver] version column is not set, the most recently inserted row in a group of duplicates is kept; if [ver] is set, the row with the largest ver value is kept.
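For illustration, a minimal sketch of a table that keeps the row with the largest version per sort key (hypothetical table and column names, assuming the [ver] semantics described above):

CREATE TABLE emp_replacing_ver (
    emp_id UInt16,
    name String,
    updated_at DateTime,  -- the [ver] column: the row with the max updated_at survives a merge
    salary Decimal32(2)
) ENGINE = ReplacingMergeTree(updated_at)
ORDER BY emp_id;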

On using the OPTIMIZE command:

In general, avoid this command when data volumes are large, because running OPTIMIZE over massive data takes a very long time.
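An alternative that avoids forcing merges is to deduplicate at query time with the FINAL modifier; the individual query becomes slower, but no heavyweight rewrite of the data parts is triggered:

SELECT * FROM emp_replacingmergetree FINAL;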

SummingMergeTree Table engine

This engine inherits from MergeTree. When data parts of a SummingMergeTree table are merged, ClickHouse collapses all rows with the same sort key into one row whose numeric columns contain the sums of the merged rows. In other words, duplicate rows are combined into a single row, similar to the effect of GROUP BY.

It is recommended to use this engine together with MergeTree: store the complete detail data in a MergeTree table and use SummingMergeTree for the aggregated data. This avoids losing data because of an incorrectly composed primary key.

If users only need aggregated results, do not care about the detail rows, and the aggregation dimensions are known in advance (that is, the GROUP BY fields are fixed), this table engine is a good fit.

Table syntax:

CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
    name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
    name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2],
    ...
) ENGINE = SummingMergeTree([columns])  -- specify the columns to sum on merge
[PARTITION BY expr]
[ORDER BY expr]
[SAMPLE BY expr]
[SETTINGS name=value, ...]

Build an example:

CREATE TABLE emp_summingmergetree (
    emp_id UInt16 COMMENT 'employee id',
    name String COMMENT 'employee name',
    work_place String COMMENT 'workplace',
    age UInt8 COMMENT 'employee age',
    depart String COMMENT 'department',
    salary Decimal32(2) COMMENT 'wages'
) ENGINE = SummingMergeTree(salary)
ORDER BY (emp_id, name)  -- note: the sort key has two fields
PRIMARY KEY emp_id       -- the primary key is one field
PARTITION BY work_place;

-- insert data
INSERT INTO emp_summingmergetree
VALUES (1, 'tom', 'Shanghai', 25, 'Technology Department', 20000), (2, 'jack', 'Shanghai', 26, 'Personnel Department', 10000);

INSERT INTO emp_summingmergetree
VALUES (3, 'bob', 'Beijing', 33, 'Finance Department', 50000), (4, 'tony', 'Hangzhou', 28, 'Sales Department', 50000);

When we insert rows with the same emp_id and name again, observe the result:

INSERT INTO emp_summingmergetree
VALUES (1, 'tom', 'Shanghai', 25, 'Information Department', 10000), (1, 'tom', 'Beijing', 26, 'Personnel Department', 10000);

Cdh04:) select * from emp_summingmergetree

-- query

SELECT *

FROM emp_summingmergetree

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ─┬─── salary ─┐

│ 3 │ bob │ Beijing │ 33 │ Finance Department │ 50000.00 │

└─┴─┘

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ─┬─── salary ─┐

│ 1 │ tom │ Shanghai │ 25 │ Technology Department │ 20000.00 │

│ 2 │ jack │ Shanghai │ 26 │ personnel Department │ 10000.00 │

└─┴─┘

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ───┬─── salary ─┐

│ 4 │ tony │ Hangzhou │ 28 │ sales Department │ 50000.00 │

└─┴─┘

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ─┬─── salary ─┐

│ 1 │ tom │ Beijing │ 26 │ personnel Department │ 10000.00 │

└─┴─┘

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ─┬─── salary ─┐

│ 1 │ tom │ Shanghai │ 25 │ Information Department │ 10000.00 │

└─┴─┘

-- perform the merge operation
OPTIMIZE TABLE emp_summingmergetree FINAL;

cdh04 :) select * from emp_summingmergetree;

-- query again: the newly inserted row 1 │ tom │ Shanghai │ 25 │ Information Department │ 10000.00
-- and the original row               1 │ tom │ Shanghai │ 25 │ Technology Department  │ 20000.00
-- are merged into                    1 │ tom │ Shanghai │ 25 │ Technology Department  │ 30000.00

SELECT *

FROM emp_summingmergetree

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ─┬─── salary ─┐

│ 1 │ tom │ Beijing │ 26 │ personnel Department │ 10000.00 │

│ 3 │ bob │ Beijing │ 33 │ Finance Department │ 50000.00 │

└─┴─┘

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ─┬─── salary ─┐

│ 1 │ tom │ Shanghai │ 25 │ Technology Department │ 30000.00 │

│ 2 │ jack │ Shanghai │ 26 │ personnel Department │ 10000.00 │

└─┴─┘

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ───┬─── salary ─┐

│ 4 │ tony │ Hangzhou │ 28 │ sales Department │ 50000.00 │

└─┴─┘

Note:

Make sure the primary key specified by PRIMARY KEY expr is a prefix of the fields specified by ORDER BY expr, for example:

-- allowed
ORDER BY (A, B, C)
PRIMARY KEY A

-- will report an error
-- DB::Exception: Primary key must be a prefix of the sorting key
ORDER BY (A, B, C)
PRIMARY KEY B

This mandatory constraint ensures that the primary key is always a prefix of the sort key, even when the two are defined differently, so the index and the data never get out of order.

Summary. On what basis does SummingMergeTree merge two rows?

It uses the ORDER BY sort key as the grouping key for aggregation: rows with the same sort key are merged into one row, and the specified columns are summed.

Only rows with the same sort key within a partition are merged.

Aggregation happens per data partition: when partitions are merged, rows with the same aggregation key within the same partition are combined; rows in different partitions are not aggregated.

What if no columns to sum are specified?

If no columns are specified, all numeric columns that are not part of the sort key are summed.

For columns that are not summed, which value is kept?

For non-aggregated columns other than the sort fields, the value from the original row is kept when aggregation occurs, and the value of the newly inserted row is discarded.

-- newly inserted row: 1 │ tom │ Shanghai │ 25 │ Information Department │ 10000.00
-- original row:       1 │ tom │ Shanghai │ 25 │ Technology Department  │ 20000.00
-- merged result:      1 │ tom │ Shanghai │ 25 │ Technology Department  │ 30000.00
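For completeness, a minimal sketch of the no-argument form described above (hypothetical table): with no columns listed, every numeric column outside the sort key is summed on merge, and because merge timing is unpredictable, totals are still best read with sum()/GROUP BY.

CREATE TABLE sales_sum (
    key String,
    amount Decimal32(2),  -- summed on merge
    qty UInt32            -- summed on merge
) ENGINE = SummingMergeTree()
ORDER BY key;

SELECT key, sum(amount) AS amount, sum(qty) AS qty
FROM sales_sum
GROUP BY key;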

AggregatingMergeTree table engine

This table engine also inherits from MergeTree and can be used for incremental, pre-aggregated statistics. If you want to reduce the number of rows by merging them according to a set of rules, AggregatingMergeTree is a good fit. It computes data with predefined aggregate functions and stores the intermediate aggregation states in the table in binary format.

The difference from SummingMergeTree: SummingMergeTree applies sum aggregation to non-key columns, while AggregatingMergeTree can apply any specified aggregate function.

Table syntax:

CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
    name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
    name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2],
    ...
) ENGINE = AggregatingMergeTree()
[PARTITION BY expr]
[ORDER BY expr]
[SAMPLE BY expr]
[SETTINGS name=value, ...]

Build an example:

CREATE TABLE emp_aggregatingmergeTree (
    emp_id UInt16 COMMENT 'employee id',
    name String COMMENT 'employee name',
    work_place String COMMENT 'workplace',
    age UInt8 COMMENT 'employee age',
    depart String COMMENT 'department',
    salary AggregateFunction(sum, Decimal32(2)) COMMENT 'wages'
) ENGINE = AggregatingMergeTree()
ORDER BY (emp_id, name)  -- note: the sort key has two fields
PRIMARY KEY emp_id       -- the primary key is one field
PARTITION BY work_place;

Columns of type AggregateFunction are written and queried very differently from other table engines: when writing data you must call the corresponding *-State function, and when querying you must call the corresponding *-Merge function. For the table above, data is inserted with the sumState function.

-- insert data
-- note: data must be inserted with an INSERT ... SELECT statement
INSERT INTO TABLE emp_aggregatingmergeTree
SELECT 1, 'tom', 'Shanghai', 25, 'Information Department', sumState(toDecimal32(10000, 2));

INSERT INTO TABLE emp_aggregatingmergeTree
SELECT 1, 'tom', 'Shanghai', 25, 'Information Department', sumState(toDecimal32(20000, 2));

-- query data
SELECT
    emp_id,
    name,
    sumMerge(salary)
FROM emp_aggregatingmergeTree
GROUP BY emp_id, name;

-- result output

┌─ emp_id ─┬─ name ─┬─ sumMerge (salary) ─┐

│ 1 │ tom │ 30000.00 │

└─┴─┴─┘

The usage demonstrated above is cumbersome. More often, AggregatingMergeTree is used together with materialized views: the materialized view serves as an aggregated query view on top of another data table.

AggregatingMergeTree is usually used as the table engine of a materialized view, combined with a normal MergeTree table.

Create a base table with the MergeTree engine:

-- stores the full detail data
-- serves real-time queries directly

CREATE TABLE emp_mergetree_base (
    emp_id UInt16 COMMENT 'employee id',
    name String COMMENT 'employee name',
    work_place String COMMENT 'workplace',
    age UInt8 COMMENT 'employee age',
    depart String COMMENT 'department',
    salary Decimal32(2) COMMENT 'wages'
) ENGINE = MergeTree()
ORDER BY (emp_id, name)
PARTITION BY work_place;

Create a materialized view:

-- using the AggregatingMergeTree table engine
CREATE MATERIALIZED VIEW view_emp_agg
ENGINE = AggregatingMergeTree()
PARTITION BY emp_id
ORDER BY (emp_id, name)
AS SELECT
    emp_id,
    name,
    sumState(salary) AS salary
FROM emp_mergetree_base
GROUP BY emp_id, name;

-- insert data into the underlying detail table emp_mergetree_base
INSERT INTO emp_mergetree_base
VALUES (1, 'tom', 'Shanghai', 25, 'Technology Department', 20000),
       (1, 'tom', 'Shanghai', 26, 'Personnel Department', 10000);

-- query the materialized view
SELECT
    emp_id,
    name,
    sumMerge(salary)
FROM view_emp_agg
GROUP BY emp_id, name;

-- result

┌─ emp_id ─┬─ name ─┬─ sumMerge (salary) ─┐

│ 1 │ tom │ 30000.00 │

└─┴─┴─┘

CollapsingMergeTree Table engine

CollapsingMergeTree is a table engine that supports row-level modification and deletion through an add/cancel approach. A sign flag column records the state of each row: if sign is 1, the row is a valid row; if sign is -1, the row marks a row to be deleted. When CollapsingMergeTree merges partitions, rows with sign 1 and sign -1 and the same sort key cancel each other out and are removed, within the same data partition.

To add a row, write it with sign = 1; to delete a row, write the same row again with sign = -1.

Table syntax:

CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
    name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
    name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2],
    ...
) ENGINE = CollapsingMergeTree(sign)
[PARTITION BY expr]
[ORDER BY expr]
[SAMPLE BY expr]
[SETTINGS name=value, ...]

Build an example

The statement below uses CollapsingMergeTree(sign), where sign is a column of type Int8:

CREATE TABLE emp_collapsingmergetree (
    emp_id UInt16 COMMENT 'employee id',
    name String COMMENT 'employee name',
    work_place String COMMENT 'workplace',
    age UInt8 COMMENT 'employee age',
    depart String COMMENT 'department',
    salary Decimal32(2) COMMENT 'wages',
    sign Int8
) ENGINE = CollapsingMergeTree(sign)
ORDER BY (emp_id, name)
PARTITION BY work_place;

Mode of use

CollapsingMergeTree also uses the ORDER BY sort key as the basis for judging the uniqueness of data.

-- insert new data; sign = 1 marks a normal row
INSERT INTO emp_collapsingmergetree
VALUES (1, 'tom', 'Shanghai', 25, 'Technology Department', 20000, 1);

-- update the row above:
-- first insert a row identical to the original (same ORDER BY fields) with sign = -1
INSERT INTO emp_collapsingmergetree
VALUES (1, 'tom', 'Shanghai', 25, 'Technology Department', 20000, -1);

-- then insert the updated row
INSERT INTO emp_collapsingmergetree
VALUES (1, 'tom', 'Shanghai', 25, 'Technology Department', 30000, 1);

-- check the results

Cdh04:) select * from emp_collapsingmergetree

SELECT *

FROM emp_collapsingmergetree

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ─┬─── salary ─┬─ sign ─┐

│ 1 │ tom │ Shanghai │ 25 │ Technology Department │ 30000.00 │ 1 │

└─┴─┴─┘

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ─┬─── salary ─┬─ sign ─┐

│ 1 │ tom │ Shanghai │ 25 │ Technology Department │ 20000.00 │-1 │

└─┴─┴─┘

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ─┬─── salary ─┬─ sign ─┐

│ 1 │ tom │ Shanghai │ 25 │ Technology Department │ 20000.00 │ 1 │

└─┴─┴─┘

-- perform the partition merge operation
OPTIMIZE TABLE emp_collapsingmergetree;

When queried again, the data of sign=1 and sign=-1 cancel each other out, that is, they are deleted

Cdh04:) select * from emp_collapsingmergetree

SELECT *

FROM emp_collapsingmergetree

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ─┬─── salary ─┬─ sign ─┐

│ 1 │ tom │ Shanghai │ 25 │ Technology Department │ 30000.00 │ 1 │

└─┴─┴─┘

Note on partition merging:

Data folding is not real-time; it happens during background compaction. A merge can also be triggered manually, but that is inefficient and generally not recommended in production environments.

When aggregating data, you can filter out the deleted rows by adapting the query:

SELECT
    emp_id,
    name,
    sum(salary * sign)
FROM emp_collapsingmergetree
GROUP BY
    emp_id,
    name
HAVING sum(sign) > 0;

Only rows within the same partition can be folded. In practice, when we modify or delete data, the old and new rows usually fall in the same partition, so this limitation rarely causes problems.

Data writing order

It is worth noting that CollapsingMergeTree has strict requirements on the order in which data is written, otherwise it cannot be folded properly.

-- create a table

CREATE TABLE emp_collapsingmergetree_order (
    emp_id UInt16 COMMENT 'employee id',
    name String COMMENT 'employee name',
    work_place String COMMENT 'workplace',
    age UInt8 COMMENT 'employee age',
    depart String COMMENT 'department',
    salary Decimal32(2) COMMENT 'wages',
    sign Int8
) ENGINE = CollapsingMergeTree(sign)
ORDER BY (emp_id, name)
PARTITION BY work_place;

-- first insert the deletion row, i.e. sign = -1
INSERT INTO emp_collapsingmergetree_order
VALUES (1, 'tom', 'Shanghai', 25, 'Technology Department', 20000, -1);

-- then insert the sign = 1 row
INSERT INTO emp_collapsingmergetree_order
VALUES (1, 'tom', 'Shanghai', 25, 'Technology Department', 20000, 1);

-- query table

SELECT *

FROM emp_collapsingmergetree_order

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ─┬─── salary ─┬─ sign ─┐

│ 1 │ tom │ Shanghai │ 25 │ Technology Department │ 20000.00 │ 1 │

└─┴─┴─┘

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ─┬─── salary ─┬─ sign ─┐

│ 1 │ tom │ Shanghai │ 25 │ Technology Department │ 20000.00 │-1 │

└─┴─┴─┘

-- perform the merge operation
OPTIMIZE TABLE emp_collapsingmergetree_order;

-- query the table again
-- the old rows still exist

SELECT *

FROM emp_collapsingmergetree_order

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ─┬─── salary ─┬─ sign ─┐

│ 1 │ tom │ Shanghai │ 25 │ Technology Department │ 20000.00 │-1 │

│ 1 │ tom │ Shanghai │ 25 │ Technology Department │ 20000.00 │ 1 │

└─┴─┴─┘

If data is written by a single thread, the write order can be controlled. But when the volume is large and writes happen from multiple threads, the write order cannot be guaranteed, and CollapsingMergeTree's mechanism breaks down. This problem is solved by the VersionedCollapsingMergeTree table engine.

VersionedCollapsingMergeTree Table engine

As noted above, the CollapsingMergeTree table engine cannot fold rows that were written out of order. VersionedCollapsingMergeTree serves exactly the same purpose as CollapsingMergeTree, except that it places no requirement on the write order: rows within the same partition can be folded in any order.

VersionedCollapsingMergeTree uses version columns to fold data out of order.

Table syntax:

CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
    name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
    name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2],
    ...
) ENGINE = VersionedCollapsingMergeTree(sign, version)
[PARTITION BY expr]
[ORDER BY expr]
[SAMPLE BY expr]
[SETTINGS name=value, ...]

As you can see, in addition to the sign flag, this engine requires an extra version column.

Build an example:

CREATE TABLE emp_versioned (
    emp_id UInt16 COMMENT 'employee id',
    name String COMMENT 'employee name',
    work_place String COMMENT 'workplace',
    age UInt8 COMMENT 'employee age',
    depart String COMMENT 'department',
    salary Decimal32(2) COMMENT 'wages',
    sign Int8,
    version UInt8
) ENGINE = VersionedCollapsingMergeTree(sign, version)
ORDER BY (emp_id, name)
PARTITION BY work_place;

-- first insert the deletion row, i.e. sign = -1
INSERT INTO emp_versioned
VALUES (1, 'tom', 'Shanghai', 25, 'Technology Department', 20000, -1, 1);

-- insert the sign = 1 row
INSERT INTO emp_versioned
VALUES (1, 'tom', 'Shanghai', 25, 'Technology Department', 20000, 1, 1);

-- insert a new version of the row
INSERT INTO emp_versioned
VALUES (1, 'tom', 'Shanghai', 25, 'Technology Department', 30000, 1, 2);

-- View the table data without performing the merge

Cdh04:) select * from emp_versioned

SELECT *

FROM emp_versioned

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ─┬─── salary ─┬─ sign ─┬─ version ─┐

│ 1 │ tom │ Shanghai │ 25 │ Technology Department │ 30000.00 │ 1 │ 2 │

└─┴─┘

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ─┬─── salary ─┬─ sign ─┬─ version ─┐

│ 1 │ tom │ Shanghai │ 25 │ Technology Department │ 20000.00 │ 1 │ 1 │

└─┴─┘

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ─┬─── salary ─┬─ sign ─┬─ version ─┐

│ 1 │ tom │ Shanghai │ 25 │ Technology Department │ 20000.00 │-1 │ 1 │

└─┴─┘

-- get the correct query result
SELECT
    emp_id,
    name,
    sum(salary * sign)
FROM emp_versioned
GROUP BY
    emp_id,
    name
HAVING sum(sign) > 0;

-- merge manually
OPTIMIZE TABLE emp_versioned;

-- query again

Cdh04:) select * from emp_versioned

SELECT *

FROM emp_versioned

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ─┬─── salary ─┬─ sign ─┬─ version ─┐

│ 1 │ tom │ Shanghai │ 25 │ Technology Department │ 30000.00 │ 1 │ 2 │

└─┴─┘

As you can see, although the rows were inserted out of order, they still fold correctly. This works because, once a version column is defined, VersionedCollapsingMergeTree automatically appends version to the end of the ORDER BY as an extra sort condition. In the example above, the effective sort key is ORDER BY (emp_id, name, version).
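As with ReplacingMergeTree, folding can also be applied at query time with the FINAL modifier, a convenient (if slower) alternative to the sum(salary * sign) aggregation shown above:

SELECT * FROM emp_versioned FINAL;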

GraphiteMergeTree Table engine

This engine is used to "thin out" and aggregate (roll up) Graphite data. It may be useful for developers who want to use ClickHouse to store Graphite data.

If you do not need to roll up Graphite data, any ClickHouse table engine will do; if you do, use GraphiteMergeTree. It reduces storage space and improves the query efficiency of Graphite data.
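The original gives no example for this engine; below is a hedged sketch based on the ClickHouse documentation. A Graphite table needs metric path, time, value, and version columns, and the engine parameter ('graphite_rollup' here) must name a rollup configuration section defined in the server config:

CREATE TABLE graphite_data (
    Path String,      -- metric name
    Time DateTime,    -- measurement time
    Value Float64,    -- metric value
    Version UInt32    -- row version
) ENGINE = GraphiteMergeTree('graphite_rollup')
PARTITION BY toYYYYMM(Time)
ORDER BY (Path, Time);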

External integration table engine

ClickHouse provides many ways to integrate with external systems, including some table engines. These table engines, like other types of table engines, can be used to import external data into ClickHouse or to manipulate external data sources directly in ClickHouse.

For example, they can read files on HDFS or tables in a MySQL database directly. These table engines only take care of metadata management and data querying; they are usually not responsible for writing data, since the data files are provided by the external system. Currently, ClickHouse provides the following external integration table engines:

- ODBC: reads a data source through a specified ODBC connection.
- JDBC: reads a data source through a specified JDBC connection.
- MySQL: uses MySQL as the data store and queries its data directly.
- HDFS: directly reads data files of a specific format on HDFS.
- Kafka: imports Kafka data into ClickHouse.
- RabbitMQ: similar to Kafka.

HDFS

Usage: ENGINE = HDFS(URI, format)

- URI: the HDFS file path.
- format: the file format, such as CSV, JSON, or TSV.

Usage example -- create the table:

CREATE TABLE hdfs_engine_table (
    emp_id UInt16 COMMENT 'employee id',
    name String COMMENT 'employee name',
    work_place String COMMENT 'workplace',
    age UInt8 COMMENT 'employee age',
    depart String COMMENT 'department',
    salary Decimal32(2) COMMENT 'wages'
) ENGINE = HDFS('hdfs://cdh03:8020/user/hive/hdfs_engine_table', 'CSV');

-- write data
INSERT INTO hdfs_engine_table
VALUES (1, 'tom', 'Shanghai', 25, 'Technology Department', 20000), (2, 'jack', 'Shanghai', 26, 'Personnel Department', 10000);

-- query data

Cdh04:) select * from hdfs_engine_table

SELECT *

FROM hdfs_engine_table

┌─ emp_id ─┬─ name ─┬─ work_place ─┬─ age ─┬─ depart ─┬─── salary ─┐

│ 1 │ tom │ Shanghai │ 25 │ Technology Department │ 20000.00 │

│ 2 │ jack │ Shanghai │ 26 │ personnel Department │ 10000.00 │

└─┴─┘

-- add a few rows to the corresponding file on HDFS, then query again

Cdh04:) select * from hdfs_engine_table

SELECT *

FROM hdfs_engine_table

┌─ emp_id ─┬─ name ───┬─ work_place ─┬─ age ─┬─ depart ─┬─── salary ─┐

│ 1 │ tom │ Shanghai │ 25 │ Technology Department │ 20000.00 │

│ 2 │ jack │ Shanghai │ 26 │ personnel Department │ 10000.00 │

│ 3 │ lili │ Beijing │ 28 │ Technology Department │ 20000.00 │

│ 4 │ jasper │ Hangzhou │ 27 │ personnel Department │ 8000.00 │

└─┴─┘

As you can see, this approach is similar to how Hive works: we map a file on HDFS directly to a table in ClickHouse and can then use SQL to operate on the HDFS file.

It is worth noting that ClickHouse cannot delete data on HDFS: when we drop the corresponding table in the ClickHouse client, only the table structure is removed; the files on HDFS are not deleted, much like Hive external tables.
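This is easy to verify with a hypothetical follow-up: drop the table, re-create it with the same URI, and the previously written rows are still readable.

DROP TABLE hdfs_engine_table;

-- re-create with the same schema and URI as above
CREATE TABLE hdfs_engine_table (
    emp_id UInt16,
    name String,
    work_place String,
    age UInt8,
    depart String,
    salary Decimal32(2)
) ENGINE = HDFS('hdfs://cdh03:8020/user/hive/hdfs_engine_table', 'CSV');

SELECT count() FROM hdfs_engine_table;  -- the old rows are still there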

MySQL

The previous article ([1] Getting started with ClickHouse) introduced the MySQL database engine, with which ClickHouse can operate on the tables of a whole MySQL database. In fact, ClickHouse also supports the MySQL table engine, which maps a single table from MySQL into ClickHouse.

Usage:

CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
    name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1] [TTL expr1],
    name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2] [TTL expr2],
    ...
) ENGINE = MySQL('host:port', 'database', 'table', 'user', 'password'[, replace_query, 'on_duplicate_clause'])

Usage example -- connect to the test table of the clickhouse database in MySQL:

CREATE TABLE mysql_engine_table (
    id Int32,
    name String
) ENGINE = MySQL(
    '192.168.200.241:3306',
    'clickhouse',
    'test',
    'root',
    '123qwe');

-- query data

Cdh04:) SELECT * FROM mysql_engine_table

SELECT *

FROM mysql_engine_table

┌─ id ─┬─ name ──┐

│ 1 │ tom │

│ 2 │ jack │

│ 3 │ lihua │

└────┴─┘

-- insert data; the rows are written into the corresponding MySQL table,
-- so querying MySQL afterwards shows the newly added row
INSERT INTO mysql_engine_table VALUES (4, 'robin');

-- query again

Cdh04:) select * from mysql_engine_table

SELECT *

FROM mysql_engine_table

┌─ id ─┬─ name ──┐

│ 1 │ tom │

│ 2 │ jack │

│ 3 │ lihua │

│ 4 │ robin │

└────┴─┘

Note: the MySQL table engine does not support UPDATE and DELETE mutations; for example, the following commands report an error:

-- perform updates

ALTER TABLE mysql_engine_table UPDATE name = 'hanmeimei' WHERE id = 1

-- perform deletion

ALTER TABLE mysql_engine_table DELETE WHERE id = 1

-- reporting errors

DB::Exception: Mutations are not supported by storage MySQL.

How JDBC is used

The JDBC table engine can connect not only to MySQL but also to PostgreSQL and other databases. To implement JDBC connectivity, ClickHouse uses the clickhouse-jdbc-bridge query proxy service.

First, download clickhouse-jdbc-bridge and compile it following the steps on the project's GitHub page; the build produces the jar file clickhouse-jdbc-bridge-1.0.jar. Besides this file, you also need a JDBC driver; this article uses MySQL, so download the MySQL driver package as well. Put the MySQL driver jar and clickhouse-jdbc-bridge-1.0.jar under the /opt/softwares path and execute the following command:

[root@cdh04 softwares]# java -jar clickhouse-jdbc-bridge-1.0.jar --driver-path . --listen-host cdh04

Here --driver-path is the directory containing the MySQL driver jar, and --listen-host is the host the proxy service binds to; the default port is 9019. The jar packages above can be downloaded here:

Link: https://pan.baidu.com/s/1ZcvF22GvnvAQpVTleNry7Q extraction code: la9n

Then edit /etc/clickhouse-server/config.xml, add the following configuration to the file, and restart the service:

<jdbc_bridge>
    <host>cdh04</host>
    <port>9019</port>
</jdbc_bridge>

Usage example -- query the corresponding MySQL table directly:

SELECT *
FROM
jdbc(
    'jdbc:mysql://192.168.200.241:3306/?user=root&password=123qwe',
    'clickhouse',
    'test')

Create a mapping table -- syntax:

CREATE TABLE [IF NOT EXISTS] [db.]table_name
(
    columns list...
)
ENGINE = JDBC(dbms_uri, external_database, external_table)

-- create a table in MySQL
CREATE TABLE jdbc_table_mysql (
    order_id INT NOT NULL AUTO_INCREMENT,
    amount FLOAT NOT NULL,
    PRIMARY KEY (order_id));

INSERT INTO jdbc_table_mysql VALUES (1, 200);

-- create the table in ClickHouse
CREATE TABLE jdbc_table
(
    order_id Int32,
    amount Float32
)
ENGINE = JDBC(
    'jdbc:mysql://192.168.200.241:3306/?user=root&password=123qwe',
    'clickhouse',
    'jdbc_table_mysql');

-- query data

Cdh04:) select * from jdbc_table

SELECT *

FROM jdbc_table

┌─ order_id ─┬─ amount ─┐

│ 1 │ 200 │

└─┴─┘

Kafka

Usage:

CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
    name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
    name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2],
    ...
) ENGINE = Kafka()
SETTINGS
    kafka_broker_list = 'host:port',
    kafka_topic_list = 'topic1,topic2,...',
    kafka_group_name = 'group_name',
    kafka_format = 'data_format'[,]
    [kafka_row_delimiter = 'delimiter_symbol',]
    [kafka_schema = '',]
    [kafka_num_consumers = N,]
    [kafka_max_block_size = 0,]
    [kafka_skip_broken_messages = N,]
    [kafka_commit_every_batch = 0,]
    [kafka_thread_per_consumer = 0]

- kafka_broker_list: a comma-separated list of broker addresses (e.g. localhost:9092).
- kafka_topic_list: the list of Kafka topics, comma-separated.
- kafka_group_name: the consumer group.
- kafka_format: the message format, e.g. JSONEachRow, JSON, CSV.

Create a ck_topic topic in Kafka, write data to it, and create a table that reads from it:

CREATE TABLE kafka_table (
    id UInt64,
    name String
) ENGINE = Kafka()
SETTINGS
    kafka_broker_list = 'cdh04:9092',
    kafka_topic_list = 'ck_topic',
    kafka_group_name = 'group1',
    kafka_format = 'JSONEachRow';

-- query

Cdh04:) select * from kafka_table

SELECT *

FROM kafka_table

┌─ id ─┬─ name ─┐

│ 1 │ tom │

└────┴─┘

┌─ id ─┬─ name ─┐

│ 2 │ jack │

└────┴─┘

Note:

Once a query has read rows from this table, ClickHouse considers them consumed and they will not be returned by subsequent queries. The Kafka table engine is really just a data pipeline; to keep the data accessible, read Kafka through a materialized view.

1. First create a Kafka-engine table that consumes data from Kafka.
2. Then create a table with a normal engine, such as MergeTree, for end users.
3. Finally create a materialized view that continuously syncs the Kafka-engine table into the end-user table.

-- create the Kafka engine table

CREATE TABLE kafka_table_consumer (
    id UInt64,
    name String
) ENGINE = Kafka()
SETTINGS
    kafka_broker_list = 'cdh04:9092',
    kafka_topic_list = 'ck_topic',
    kafka_group_name = 'group1',
    kafka_format = 'JSONEachRow';

-- create the table used by end users
CREATE TABLE kafka_table_mergetree (
    id UInt64,
    name String
) ENGINE = MergeTree()
ORDER BY id;

-- create the materialized view to sync data
CREATE MATERIALIZED VIEW consumer TO kafka_table_mergetree
AS SELECT id, name FROM kafka_table_consumer;

-- query; repeated queries keep returning the already-consumed data

Cdh04:) select * from kafka_table_mergetree

SELECT *

FROM kafka_table_mergetree

┌─ id ─┬─ name ─┐

│ 2 │ jack │

└────┴─┘

┌─ id ─┬─ name ─┐

│ 1 │ tom │

└────┴─┘

Other special table engines

Memory table engine

The Memory engine stores data directly in memory; the data is neither compressed nor converted. When the ClickHouse server restarts, all data in a Memory table is lost. It is usually used for testing.

CREATE TABLE table_memory (
    id UInt64,
    name String
) ENGINE = Memory();

How the Distributed table engine is used

The Distributed table engine is synonymous with distributed tables. It stores no data itself: the data is scattered across the shards, and the engine automatically routes requests to the nodes of the cluster, so the Distributed engine must work together with other table engines.

A distributed table therefore sits on top of multiple local shard tables; the data is stored by the shard tables, and the relationship between the distributed table and the shard tables is one-to-many.

The Distributed table engine is defined as follows:

Distributed(cluster_name, database_name, table_name [, sharding_key])

The parameters:

- cluster_name: the cluster name, matching the custom name in the cluster configuration.
- database_name: the database name.
- table_name: the table name.
- sharding_key: an optional sharding key. On writes, the distributed table distributes rows to each node's local table according to this key.

Tip:

Distributed tables use a check-at-read mechanism, so there is no enforced order for creating the distributed table and the local tables.

It is also worth noting that the statement below uses ON CLUSTER distributed DDL, which creates the Distributed table on every shard node of the cluster, so that reads and writes for all shards can be initiated from any node.

Usage example -- create a distributed table:

CREATE TABLE IF NOT EXISTS user_cluster ON CLUSTER cluster_3shards_1replicas
(
    id Int32,
    name String
) ENGINE = Distributed(cluster_3shards_1replicas, default, user_local, id);

After creating the distributed table above, check the tables on each machine: a newly created table exists on every one.

Next, you need to create a local table, one on each machine:

CREATE TABLE IF NOT EXISTS user_local
(
    id Int32,
    name String
) ENGINE = MergeTree()
ORDER BY id
PARTITION BY id
PRIMARY KEY id;

We first insert data into the user_local table on one machine, then query the user_cluster table.

-- insert data
cdh04 :) INSERT INTO user_local VALUES (1, 'tom'), (2, 'jack');

Query the user_cluster table; as you can see, all user_local tables can be operated on through user_cluster.

Cdh04:) select * from user_cluster

┌─ id ─┬─ name ─┐

│ 2 │ jack │

└────┴─┘

┌─ id ─┬─ name ─┐

│ 1 │ tom │

└────┴─┘

Next, we insert some data through user_cluster and observe the changes in the user_local tables: the data is distributed across the other nodes.

-- insert data into user_cluster
cdh04 :) INSERT INTO user_cluster VALUES (3, 'lilei'), (4, 'lihua');

-- View user_cluster data

Cdh04:) select * from user_cluster

┌─ id ─┬─ name ─┐

│ 2 │ jack │

└────┴─┘

┌─ id ─┬─ name ──┐

│ 3 │ lilei │

└────┴─┘

┌─ id ─┬─ name ─┐

│ 1 │ tom │

└────┴─┘

┌─ id ─┬─ name ──┐

│ 4 │ lihua │

└────┴─┘

-- View user_local on cdh04

Cdh04:) select * from user_local

┌─ id ─┬─ name ─┐

│ 2 │ jack │

└────┴─┘

┌─ id ─┬─ name ──┐

│ 3 │ lilei │

└────┴─┘

┌─ id ─┬─ name ─┐

│ 1 │ tom │

└────┴─┘

-- View user_local on cdh05

Cdh05:) select * from user_local

┌─ id ─┬─ name ──┐

│ 4 │ lihua │

└────┴───────┘

Report