
What does CockroachDB mean?

Updated 2025-01-18. From: SLTechnology News&Howtos (shulou)

This article explains what CockroachDB is and how it provides serializable distributed transactions. Many readers will not be familiar with the details, so it is shared here for reference.

Introduction

CockroachDB is a distributed database that supports SQL, ACID distributed transactions, and Serializable, the highest isolation level in ANSI SQL.

In a distributed system, Linearizability is hard to support because the clocks on different machines drift apart, so some form of global clock is needed. TiDB takes the same approach as Percolator: a single timestamp oracle serves as the clock source. Google Spanner built the hardware-based TrueTime API to provide relatively accurate clocks. CockroachDB has no atomic clocks and no single timestamp oracle; it relies on NTP to keep the clock offset between machines as small as possible. NTP error can reach 250ms or more and cannot be strictly bounded, so guaranteeing Linearizability in CockroachDB is difficult and performs poorly. CockroachDB does support Linearizability, but it is not officially recommended. By default, CockroachDB provides the Serializable isolation level without guaranteeing Linearizability.

Serializable

A real database system runs many transactions concurrently. The isolation level determines how far each transaction can behave as if it were the only one running, with no interference from the others. Serializable allows no interference at all. Weaker levels such as Repeatable Read, Read Committed, Read Uncommitted, and Snapshot Isolation each expose some interference: Repeatable Read permits phantom reads, and Snapshot Isolation permits write skew. The details are not covered here; see the paper "A Critique of ANSI SQL Isolation Levels".

A database that supports the Serializable isolation level is hard to implement, and many databases do not support it. There are several reasons; I think the most important is poor performance. Oracle 11g, for example, defaults to Read Committed (RC), and its highest isolation level is Snapshot Isolation. For the isolation levels supported by other well-known databases, see "When is "ACID" ACID? Rarely." CockroachDB, however, has put a lot of effort into implementing Serializable.

A transaction usually contains multiple read and write operations on different rows and columns. The database system schedules the transactions in the system, and their operations are interleaved rather than executed one transaction after another.

Suppose there are three transactions; the figure above shows one schedule that the database system produced for them. Is this schedule Serializable? There is a theory for answering that: the serializability graph. It defines three kinds of conflict, each between operations of different transactions on the same data:

RW: a W overwrites the value that an R read.

WR: an R reads the value that a W wrote.

WW: a W overwrites the value that an earlier W wrote.

For any transaction schedule, if two transactions conflict in one of these ways, the graph has a directed edge between them, pointing from the later transaction to the earlier one. The following figure shows the serializability graph of the schedule above.

It has been proven that a transaction schedule is Serializable if its serializability graph contains no cycle. So how does CockroachDB achieve this?
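The acyclicity test can be sketched in a few lines of Python. This is only an illustration of the theory, not CockroachDB code: the schedule format (a list of `(txn, op, key)` tuples) and the function names `conflict_edges` and `is_serializable` are assumptions made for this example. Edges follow the convention used above, from the later transaction to the earlier one.

```python
from collections import defaultdict

def conflict_edges(schedule):
    """schedule: list of (txn, op, key) in execution order, op is 'R' or 'W'.
    Returns the set of (later_txn, earlier_txn) edges."""
    edges = set()
    for i, (t1, op1, k1) in enumerate(schedule):
        for t2, op2, k2 in schedule[i + 1:]:
            # RW, WR and WW all mean: different transactions, same key,
            # and at least one of the two operations is a write.
            if t1 != t2 and k1 == k2 and "W" in (op1, op2):
                edges.add((t2, t1))
    return edges

def is_serializable(schedule):
    """True if the serializability graph of the schedule has no cycle."""
    graph = defaultdict(set)
    for later, earlier in conflict_edges(schedule):
        graph[later].add(earlier)

    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the DFS stack, finished
    color = defaultdict(int)

    def has_cycle(node):
        color[node] = GRAY
        for nxt in graph.get(node, ()):
            if color[nxt] == GRAY:
                return True  # reached a node already on the stack
            if color[nxt] == WHITE and has_cycle(nxt):
                return True
        color[node] = BLACK
        return False

    return not any(color[n] == WHITE and has_cycle(n) for n in list(graph))

# T1 runs entirely before T2: one edge T2 -> T1, no cycle.
serial = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A")]
# Write skew: T1 reads A, T2 reads B, then each overwrites what the other read.
skew = [("T1", "R", "A"), ("T2", "R", "B"), ("T1", "W", "B"), ("T2", "W", "A")]
print(is_serializable(serial), is_serializable(skew))
```

The second schedule produces both T1 -> T2 and T2 -> T1, so the graph has a cycle and the schedule is rejected.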

CockroachDB transaction processing system

Multi-version

CockroachDB transactions are lock-free: they take no read or write locks. This requires keeping multiple versions of the data, with each version identified by a timestamp.

The A and I in ACID are closely related, and both are guaranteed by the concurrency control protocol. The following sections explain first how A is guaranteed and then how I is guaranteed under concurrency.

Atomicity

A distributed transaction may read and write data on multiple nodes, so how is atomicity ensured? Distributed transactions generally rely on two-phase commit (2PC). In the first phase, Prepare, the transaction reads the data it needs (how it is guaranteed to read the latest data is covered later; assume for now that it can), performs its computation, and writes the results to each node. These writes do not yet take effect externally: other transactions in the system cannot read them for the time being. CockroachDB calls data that has been written to a node but has not yet taken effect a write intent. A write intent is stored alongside the actual data but cannot be read externally.

So where is the transaction's status stored? When a transaction begins, it writes a record called the Transaction Record to the underlying storage system. The record holds the transaction ID and the transaction status: Pending (running), Committed, or Aborted. Each write intent contains a key that points to this Transaction Record. Committing a transaction only requires changing the status in the Transaction Record to Committed; rolling back changes it to Aborted. Once the status has been changed, the result can be returned to the client, and the leftover write intents are processed asynchronously: on commit, the value in the write intent replaces the original value and the write intent is deleted; on rollback, the write intent is simply deleted.

Later, when a client reads a key and encounters a write intent (as mentioned, write intents are cleaned up asynchronously), it follows the write intent to the Transaction Record to check the transaction's status. If the status is Committed, the read returns the value in the write intent; if it is Aborted, the read returns the original value. If it is Pending, the transaction is still running and the two transactions conflict. How is such a conflict resolved? That involves the isolation level and the concurrency control protocol, described below.
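The commit protocol above can be modeled with a small in-memory sketch. Everything here is hypothetical and simplified for illustration: `Node`, `Intent`, `TxnRecord`, and `PendingConflict` are names invented for this example, each node holds at most one intent per key, and the reader cleans up the intent itself instead of leaving that to an asynchronous process.

```python
from dataclasses import dataclass, field
from enum import Enum

class Status(Enum):
    PENDING = "pending"
    COMMITTED = "committed"
    ABORTED = "aborted"

@dataclass
class TxnRecord:
    txn_id: str
    status: Status = Status.PENDING

@dataclass
class Intent:
    value: object
    record: TxnRecord  # pointer back to the owning Transaction Record

class PendingConflict(Exception):
    """Raised when a read meets an intent whose transaction is still running."""

@dataclass
class Node:
    committed: dict = field(default_factory=dict)  # key -> committed value
    intents: dict = field(default_factory=dict)    # key -> Intent

    def write(self, record, key, value):
        # Phase one: store the value as an intent, invisible to other readers.
        self.intents[key] = Intent(value, record)

    def read(self, key):
        intent = self.intents.get(key)
        if intent is None:
            return self.committed.get(key)
        if intent.record.status is Status.COMMITTED:
            # The intent's value replaces the original value.
            self.committed[key] = intent.value
            del self.intents[key]
            return intent.value
        if intent.record.status is Status.ABORTED:
            # Drop the intent and fall back to the original value.
            del self.intents[key]
            return self.committed.get(key)
        raise PendingConflict(key)

# One transaction writes to two nodes; a single status flip commits both.
rec = TxnRecord("txn-1")
n1, n2 = Node({"a": 1}), Node({"b": 2})
n1.write(rec, "a", 10)
n2.write(rec, "b", 20)
rec.status = Status.COMMITTED
print(n1.read("a"), n2.read("b"))
```

The point of the design is that atomicity reduces to a single write: changing one status field makes every intent of the transaction visible (or invisible) at once, no matter how many nodes hold them.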

Isolation

As mentioned earlier, data is multi-versioned and each version is identified by a timestamp. The timestamp is the wall time that a read-write or write-only transaction takes locally when it starts (in practice an HLC, a hybrid logical clock that builds on the physical clock and captures causality). This timestamp is only a candidate for the transaction's commit timestamp and is not necessarily the final one; the root cause is the clock offset between machines, discussed later. For now, assume the transaction has obtained its final timestamp. The larger the timestamp, the newer the version, and all data written by the transaction is tagged with this timestamp as its version. In such a system, a serializability graph looks something like this:

The graph above is acyclic. The following graph contains a cycle:

Back to Serializability: to implement it, the transaction schedule must be kept acyclic. CockroachDB prevents each of the three conflicts described earlier from occurring against timestamp order, so the graph never contains an edge that points from a smaller timestamp to a larger one, which guarantees that it is acyclic. The resulting serializability graph in CockroachDB looks like this:

CockroachDB guarantees the following constraints:

RW: the timestamp of W must be larger than that of R, which produces only a backward edge (enforced by maintaining a Read Timestamp Cache on each node).

WR: R reads only the newest version whose timestamp is smaller than its own, which produces only a backward edge.

WW: the timestamp of the second W must be larger than that of the first W, which produces only a backward edge.

In other words, as long as a transaction can conflict only with transactions that have smaller timestamps, the graph is guaranteed to be acyclic.
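The three constraints can be illustrated with a single multi-version key. This is a minimal sketch under stated assumptions, not CockroachDB's implementation: `MVCCKey` and `TimestampOrderViolation` are invented names, `max_read_ts` stands in for one entry of the Read Timestamp Cache, and a rejected write simply raises an error so the caller can retry with a larger timestamp.

```python
class TimestampOrderViolation(Exception):
    """The operation would create an edge that runs against timestamp order."""

class MVCCKey:
    def __init__(self):
        self.versions = []    # list of (timestamp, value), timestamps ascending
        self.max_read_ts = 0  # stand-in for a Read Timestamp Cache entry

    def read(self, ts):
        # Remember the newest read so later writes can be checked against it.
        self.max_read_ts = max(self.max_read_ts, ts)
        # WR: return the newest version with a timestamp smaller than ts.
        for version_ts, value in reversed(self.versions):
            if version_ts < ts:
                return value
        return None

    def write(self, ts, value):
        # RW: a write must carry a larger timestamp than any earlier read.
        if ts <= self.max_read_ts:
            raise TimestampOrderViolation(f"write@{ts} <= read@{self.max_read_ts}")
        # WW: a write must carry a larger timestamp than the previous write.
        if self.versions and ts <= self.versions[-1][0]:
            raise TimestampOrderViolation(f"write@{ts} <= write@{self.versions[-1][0]}")
        self.versions.append((ts, value))

key = MVCCKey()
key.write(5, "v5")
print(key.read(10))      # sees the version written at timestamp 5
try:
    key.write(8, "v8")   # would overwrite what the read at 10 already saw
except TimestampOrderViolation as err:
    print("rejected:", err)
key.write(12, "v12")     # larger than every read and write so far
```

Because every accepted operation only conflicts with operations at smaller timestamps, all edges point backward and no cycle can form.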

Recoverable

Unfortunately, an acyclic graph is enough for Serializability but not enough on its own: the database must also stay consistent, which is the C in ACID. Consider the following scenario:

Take two transactions T1 and T2 with timestamp(T1) < timestamp(T2). T1 updates A and has not yet committed when T2 reads A. This is a WR conflict, and because it produces a backward edge, it is allowed. To preserve the RW constraint above, T2 must read T1's update: if T2 read the older value instead, T1's write would overwrite a value that T2 had read, which requires W's timestamp to be larger than R's, yet T1's timestamp is smaller than T2's. But what goes wrong if T2 reads T1's update to A?

Suppose T2 reads T1's update and commits, and then T1 rolls back. This violates T1's atomicity: T2 read a value that T1 never successfully wrote.

CockroachDB handles this scenario with a stricter form of scheduling: every operation may act only on data that has already been committed. The next section explains how this strict scheduling is enforced, building on the atomicity mechanism described earlier.

Strict Scheduling

As the previous section and the Atomicity section show, when a transaction encounters a write intent, the transaction that wrote it may not have finished (write intents are cleaned up asynchronously, so it may also have finished already), which means the data may be uncommitted. The current transaction therefore checks the status of the transaction that owns the write intent. If that transaction has committed, the current transaction replaces the old value with the value in the write intent and clears the write intent. If it has rolled back, the current transaction simply clears the write intent. What if it is Pending, that is, still running? Then the outcome depends on transaction priority: the lower-priority transaction is aborted. A transaction's initial priority is assigned at random, and CockroachDB ensures that an aborted transaction gets a higher priority when it restarts.

That concludes how CockroachDB provides the Serializable isolation level. Note that all of this assumes each transaction is assigned an appropriate timestamp. What makes a timestamp appropriate? A distributed read-write transaction must be able to read the latest committed data.

How CockroachDB assigns timestamps to transactions

CockroachDB uses NTP for clock synchronization. NTP can generally keep the clock offset between machines below 250ms, but this is not absolute: it is affected by network latency, system load, and other factors. As shown earlier, CockroachDB's Serializability depends on the clock offset between machines in the cluster staying within a bound ε. This bound is configurable and defaults to 250ms. At any moment, if the wall time on one machine is t, the largest wall time that can exist anywhere in the cluster is t + ε.
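The decision made on encountering a write intent can be written as a small function. This is a hedged sketch: `Txn`, `resolve_conflict`, and `restart` are hypothetical names, the priority range is arbitrary, ties are broken against the intent owner, and the restart rule (one above the winner's priority) is an assumption, since the text only says that a restarted transaction gets a higher priority.

```python
import random
from dataclasses import dataclass, field

@dataclass
class Txn:
    name: str
    # Initial priority is random, as described above; the range is arbitrary.
    priority: int = field(default_factory=lambda: random.randint(1, 1000))
    status: str = "pending"  # "pending", "committed" or "aborted"

def resolve_conflict(current, owner):
    """Decide what `current` does after meeting a write intent left by `owner`."""
    if owner.status == "committed":
        return "promote_intent"   # intent value replaces the old value
    if owner.status == "aborted":
        return "remove_intent"    # intent is discarded
    # Owner is still running: the lower-priority transaction is aborted.
    if current.priority < owner.priority:
        current.status = "aborted"
        return "abort_current"
    owner.status = "aborted"
    return "abort_owner"

def restart(aborted, winner):
    """Restart an aborted transaction with a priority above the winner's
    (assumed rule for this sketch)."""
    return Txn(aborted.name, priority=winner.priority + 1)

t1 = Txn("T1", priority=10)
t2 = Txn("T2", priority=20)
print(resolve_conflict(t1, t2))   # T1 has the lower priority
t1_retry = restart(t1, t2)
print(resolve_conflict(t1_retry, t2))
```

Raising the priority on restart keeps a transaction from being aborted indefinitely by the same competitor.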

When a transaction T starts, it takes the local wall time (in practice an HLC timestamp) and records it as t. By the bound above, the largest wall time on any machine in the cluster is t + ε. If T reads a data object whose timestamp falls in [t, t + ε], it cannot tell whether that value was committed before or after T started. T therefore has to restart, with t reset to the timestamp it encountered.
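The restart rule can be sketched as a loop over an uncertainty window. The names `read_at` and `run_read` are hypothetical, timestamps are plain integers in milliseconds, and two details are assumptions of this sketch rather than statements from the text: the transaction restarts one tick after the newest uncertain version (so that version becomes visible and the loop makes progress), and the window is recomputed from the new t on each attempt.

```python
MAX_OFFSET_MS = 250  # ε, the configured clock offset bound

def read_at(versions, t, max_offset_ms=MAX_OFFSET_MS):
    """versions: list of (commit_ts, value). Returns ('ok', value) when the
    read is unambiguous, or ('restart', new_t) when a version falls inside
    the uncertainty window [t, t + ε]."""
    uncertain = [ts for ts, _ in versions if t <= ts <= t + max_offset_ms]
    if uncertain:
        # Cannot tell whether this commit happened before T started.
        return ("restart", max(uncertain) + 1)
    visible = [(ts, v) for ts, v in versions if ts < t]
    if not visible:
        return ("ok", None)
    return ("ok", max(visible, key=lambda pair: pair[0])[1])

def run_read(versions, start_ts, max_offset_ms=MAX_OFFSET_MS):
    """Retry until the read is unambiguous. Returns (value, final_t, restarts)."""
    t, restarts = start_ts, 0
    while True:
        outcome, payload = read_at(versions, t, max_offset_ms)
        if outcome == "ok":
            return payload, t, restarts
        t, restarts = payload, restarts + 1

history = [(100, "old"), (1100, "new")]
# T starts at t=1000; the version at 1100 lies inside [1000, 1250].
print(run_read(history, 1000))
```

A version more than ε ahead of t is not uncertain: no clock in the cluster could have produced it before T started, so T can safely ignore it.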

