
Principle Analysis of the SQL Server In-Memory Database


Preface

Relational databases have been refined in great detail over their long history. In their search for the next breakthrough, in-memory and distributed databases have become the hottest topics, reflecting the demand for performance and scalability in the big-data era.

As a traditional database vendor's product, SQL Server also ships a new weapon in its latest release, SQL Server 2014: SQL Server In-Memory OLTP (Hekaton), which can raise OLTP performance by tens or even hundreds of times. This article explores how.

How should data be organized and used in the big-data era? Opinions differ. What is undeniable is that relational data remains the most widely effective form in practice, and as an applied technology it will keep evolving with the needs of its applications.

The information explosion imposes far stricter requirements on information processing. Looking at traditional OLTP systems alone, performance and scalability are what applications care about most. What if an application tells you it needs 100 times the current database throughput? Pure hardware? Clearly, new techniques are being called for.

Traditional relational databases have improved continuously since their birth, and with the rapid development of hardware their performance has ridden the Moore's-law gains of processors, as reflected in steadily rising TPC-C and TPC-E benchmark results. But in recent years, as processor manufacturing approaches its physical limits, CPU clock speeds have essentially stopped increasing; computers are instead moving toward more cores, while memory prices keep falling almost linearly. Memory is no longer expensive: it currently costs less than 10 $/GB.

The wide adoption of solid-state drives (SSDs) also gives traditional databases more headroom in performance. Yet facing this new hardware environment, traditional relational databases inevitably hit performance bottlenecks that were baked in when they were first designed.

SQL Server 2014 introduces the Buffer Pool Extension feature into the traditional engine: the high IOPS of an SSD serves as an extension of the buffer pool, forming a three-tier hot/warm/cold data hierarchy that effectively relieves pressure on the disks.
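As a quick illustration, here is a minimal sketch of enabling the feature; the file path and size are illustrative assumptions, not recommendations:

-- Enable buffer pool extension on an SSD volume (path and size are placeholders).
ALTER SERVER CONFIGURATION
SET BUFFER POOL EXTENSION ON
    (FILENAME = 'S:\SSDCache\SqlBPE.BPE', SIZE = 64 GB);

-- Check whether the extension is active and how large it is.
SELECT state_description, path, current_size_in_kb
FROM sys.dm_os_buffer_pool_extension_configuration;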

We can put more data in memory and on SSD, but even so, database performance is still constrained by parts of the engine's own architecture and processing model.

Back to the earlier goal of increasing transaction throughput 100-fold. Suppose our current capacity is 100 TPS and each transaction consumes an average of 1 million CPU instructions. A 10x increase to 1,000 TPS would require cutting the instructions per transaction to 100,000, and a 100x increase to 10,000 TPS would require cutting it to 10,000, which is impossible within the existing database architecture. A new processing model is therefore needed.
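Stated as a worked equation, under the assumption that the total CPU instruction budget stays fixed:

\[
100\ \text{TPS} \times 10^{6}\ \tfrac{\text{instr}}{\text{txn}} = 10^{8}\ \tfrac{\text{instr}}{\text{s}},
\qquad
\frac{10^{8}\ \text{instr/s}}{10^{4}\ \text{TPS}} = 10^{4}\ \tfrac{\text{instr}}{\text{txn}}.
\]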

I. The Problems Faced by the Traditional Database Engine

Some friends may say that putting all the data in memory gives you an in-memory database and nothing is missing. Even then, we would still face the following main problems:

1. Hot-spot problems caused by the latches used to protect in-memory data structures.

2. Blocking caused by the locking mechanism used for concurrency control.

3. The poor CPU efficiency of interpreted execution plans.

Let's take a brief look at where each of these problems comes from.

1. Suppose a query Q1 needs to access data page No. 7, which is not currently in the buffer pool (BP). The system allocates the in-memory page structure for it and goes to disk to bring the page into the BP, a process that takes roughly 10-20 ms. Meanwhile another query Q2 also wants page No. 7; since the page structure already exists in the BP, letting Q2 read it at this point would be a dirty read.

Latches are therefore introduced. While Q1 is reading the page from disk, the corresponding structure in the BP is protected by a latch; when Q2 tries to read the page it is blocked until Q1 completes its operation and releases the latch, as shown in figure 1-1 below:

Figure 1-1

To keep shared data consistent under multi-threading, every in-memory data structure in a database system must be protected by latches. When a large number of concurrent processes access the same data page (structure) at once, this creates hot spots, consuming a great deal of CPU while limiting concurrent throughput.
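In practice this kind of hot-page latch contention shows up as PAGELATCH/PAGEIOLATCH waits. A simple diagnostic query (a generic sketch, not tied to any particular workload) looks like this:

SELECT wait_type, waiting_tasks_count, wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type LIKE 'PAGELATCH%'      -- contention on in-memory page structures
   OR wait_type LIKE 'PAGEIOLATCH%'    -- waits while pages are read from disk, as in the Q1/Q2 example
ORDER BY wait_time_ms DESC;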

2. Suppose two operations modify the same value A in the database, where A starts at 1000:

Q1: A = A + 100; Q2: A = A + 500

Inside the database each operation is carried out as:

Q1: Read A, A = A + 100, Write A

Q2: Read A, A = A + 500, Write A

Executed one after the other, there is no problem; executed at the same time, the data can become inconsistent.

If Q1 and Q2 both read the original value of A and then modify it, the result is inconsistent, as shown in figure 1-2:

Figure 1-2

To solve this problem, the late Jim Gray, industry legend and Turing Award winner, proposed two-phase locking (Two-Phase Locking), which solves the concurrency-consistency problem elegantly and has since been adopted and refined by most database systems (for example, the intention locks SQL Server introduces for compatibility across different lock granularities).

In this example, when Q1 reads A it takes an exclusive lock on A; when Q2 tries to read, it is blocked and must wait for the lock to be released after Q1's transaction completes. Figure 1-3:

Figure 1-3
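The blocking can be reproduced with a minimal sketch; the table dbo.Account with columns Id and A is hypothetical, and the two batches are run from two separate sessions under the default locking behavior.

-- Session 1 (Q1): takes an exclusive lock on the row and holds it until commit.
BEGIN TRANSACTION;
UPDATE dbo.Account SET A = A + 100 WHERE Id = 1;
-- ...Q1 has not committed yet...

-- Session 2 (Q2): blocks here until session 1 commits or rolls back.
BEGIN TRANSACTION;
UPDATE dbo.Account SET A = A + 500 WHERE Id = 1;
COMMIT;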

However, introducing locks means transactions can block each other, lock resources have to be managed, and problems such as deadlocks must be detected promptly, all of which naturally affects concurrency.

3. Anyone familiar with SQL Server knows that when a statement is executed, it goes through binding, semantic analysis, cost-based optimization, and so on, producing an interpreted execution plan. When that plan runs, the engine calls the corresponding database functions and executes each operator; if the data is on disk, it is fetched from disk.

Executing an interpreted plan this way is time-consuming and keeps interrupting the CPU pipeline, so the CPU is never fully utilized; if the data is already in memory, it could be processed far more efficiently. Yet the execution plans of most relational database systems remain interpreted.

Facing these problems, the major database vendors have delivered their own in-memory solutions, such as Oracle's TimesTen, and VoltDB, the commercial product that grew out of recent Turing Award winner Professor Michael Stonebraker's H-Store research.

Microsoft SQL Server 2014 also introduced its in-memory database engine, SQL Server In-Memory OLTP (code-named Hekaton). Next we will briefly look at how Hekaton tackles the problems above and lifts performance to a new level.

II. How SQL Server Hekaton Responds

SQL Server Hekaton is a memory-optimized, high-performance OLTP database engine whose data is nevertheless durable. It is fully integrated inside SQL Server (it can be mixed transparently with the traditional engine and the column-store engine, as shown in figure 2-1) and is designed for modern multi-core CPU architectures. Figure 2-1:

Figure 2-1
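As a concrete taste of that integration, here is a minimal sketch of creating a durable memory-optimized table that lives alongside ordinary disk-based tables; the database name SalesDB, the container path, and the table dbo.ShoppingCart are hypothetical.

-- One-time setup: a memory-optimized filegroup and its container.
ALTER DATABASE SalesDB ADD FILEGROUP imoltp_fg CONTAINS MEMORY_OPTIMIZED_DATA;
ALTER DATABASE SalesDB ADD FILE (NAME = 'imoltp_dir', FILENAME = 'D:\Data\imoltp_dir') TO FILEGROUP imoltp_fg;

-- A durable memory-optimized table; it can be queried together with traditional tables.
CREATE TABLE dbo.ShoppingCart
(
    CartId     INT       NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1000000),
    UserId     INT       NOT NULL,
    CreatedUtc DATETIME2 NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);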

To address the three performance bottlenecks above, Hekaton uses the Bw-tree data structure to achieve latch-free access, multi-version rows with timestamps under optimistic concurrency control to achieve lock-free transactions, and compilation of execution plans into machine code (DLLs), instead of interpreted execution, to improve CPU efficiency.

The following is a brief explanation of these three points.

Data pages in Hekaton are flexible in size so that they can be updated incrementally with delta records, because a traditional update in place invalidates the CPU caches and limits performance on multi-core architectures. Data pages are managed in memory through a mapping table that maps each page's logical ID to its physical address. Figure 2-2:

Figure 2-2

Compare-and-Swap (CAS) is used to make data updates latch-free.

CAS: the value at a physical address is compared with the expected value carried by the caller; if they match, the operation proceeds, otherwise it is rejected.

For example, if a process carries the value 20 for address M and the actual value at M is indeed 20, the modification is allowed; otherwise it is rejected, as shown in figure 2-3:

Figure 2-3
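The compare-then-act idea can be sketched at the SQL level as a conditional update; this is only an analogy for the memory-level CAS instruction Hekaton uses internally, and the table dbo.Slot is hypothetical.

DECLARE @expected INT = 20, @newValue INT = 50;

UPDATE dbo.Slot
SET    Value = @newValue
WHERE  Address = 'M' AND Value = @expected;   -- proceed only if the current value still matches

IF @@ROWCOUNT = 0
    PRINT 'CAS failed: the value at M no longer matches the expected value.';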

During an incremental update of a data page, each operation produces a new delta record that becomes the access entry point for the page, and CAS is used to install the delta's new physical address into the mapping table. CAS also arbitrates when two updates hit the same page at the same time: the winner's update is applied and the loser can retry. Unfortunately, at present SQL Server simply raises an error for the losing operation, so we have to catch the error and retry ourselves (a sketch of such a retry loop follows figure 2-4). For details, see the online documentation.

As shown in figure 2-4:

Figure 2-4
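A minimal retry sketch for the losing transaction might look like the following; the procedure dbo.usp_UpdateCart is hypothetical, and 41302/41305/41325/41301 are the In-Memory OLTP write-conflict and validation errors that are commonly retried.

DECLARE @retry INT = 3;
WHILE @retry > 0
BEGIN
    BEGIN TRY
        EXEC dbo.usp_UpdateCart @CartId = 1, @Qty = 2;
        SET @retry = 0;                 -- success: leave the loop
    END TRY
    BEGIN CATCH
        IF ERROR_NUMBER() IN (41302, 41305, 41325, 41301) AND @retry > 1
            SET @retry -= 1;            -- conflict or validation failure: try again
        ELSE
            THROW;                      -- anything else, or retries exhausted: surface the error
    END CATCH;
END;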

With this mode of operation, an overly long delta chain increases lookup cost and hurts performance, so SQL Server consolidates the chain when appropriate: it generates a new data page, points the physical address in the mapping table at the new page, and the old page's delta chain is released by garbage collection.

Figure 2-5:

Figure 2-5

Because page sizes are flexible, pages may become too large or too small, and Hekaton splits or merges them as it sees fit; space does not permit going into detail here. To achieve latch-free behavior, all in-memory operations are performed through one or more atomic operations. Interested readers can refer to the relevant Microsoft literature.

Some friends may object that latches are themselves lightweight locks protecting memory structures, and that different latch types can be compatible with each other, so how much can latch-free really help performance?

In practice, latching every in-memory data access is itself too expensive, so SQL Server additionally places a spinlock on the data for threads to probe whether the data can be accessed. A spinlock is implemented as a single bit (0 or 1): a thread keeps probing that bit in memory trying to acquire the lock; if it can access the data it proceeds, otherwise it spins. If after on the order of a thousand probes it still cannot get in, it backs off and "rests"; this is called a collision.

While spinning, however, the CPU stays busy, so CPU resources are wasted. In production we may see CPU utilization running high while concurrency refuses to climb and access gets slower; one reason is that large numbers of processes are hitting hot data and contending for spinlocks, capping performance.
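The spins and collisions described above can be observed through a DMV; a hedged sketch follows (the view was undocumented in older releases, so treat the output as diagnostic only):

SELECT TOP (10) name, collisions, spins, spins_per_collision, backoffs
FROM sys.dm_os_spinlock_stats
ORDER BY spins DESC;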

In Hekaton there are no latches, so this problem disappears, and from this perspective performance scales almost linearly as threads are added. Of course, beyond latch-free access, the other two aspects of Hekaton are equally impressive.

As described earlier, transactions in relational databases rely on locks for concurrency control, and I am sure every DBA has felt the pain of the resulting blocking and deadlocks. Hekaton, by contrast, uses multi-version row data plus timestamps under optimistic concurrency.

Here is a brief explanation.

In Hekaton, a transaction passes through three phases: the normal processing phase, in which DML operations create new row versions; the validation phase, which verifies (based on the version data) that the transaction can commit safely; and the commit phase, which writes the log and makes the new row versions visible to other transactions.

Figure 2-6:

Figure 2-6

Let's give a brief illustration with an example:

Transactions use timestamps from a global clock to mark both transactions and row versions. Each transaction starts with a start timestamp Begin_TS, which is used to read the correct row versions (data rows carry timestamps as well). A row version's end timestamp End_TS is normally positive infinity (+∞). When data is updated, a new row version is created: the old version's End_TS is set to the transaction ID Xb (not a timestamp at this point), and the new version's Begin_TS is also marked with the transaction ID Xb. The transaction then obtains its (unique) End_TS, validates that it can commit, commits, and replaces the transaction ID Xb in both the old and new versions with the acquired End_TS. At that point the operation is complete; no locks, latches, or blocking are involved.

Figure 2-7:

Figure 2-7

Looking at the figure above, some readers may wonder: is the row version read by Xa correct? Why can't it read Xb's new row? Let's analyze briefly.

Xa is assigned start timestamp 25 and Xb is assigned 35, which means Xb's end timestamp must be greater than 35. When Xa reads the data, the row version it should see is the one spanning Begin_TS 20 to End_TS +∞; since Xa's Begin_TS is less than Xb's Begin_TS, Xa reads the old version, which is correct, as shown in figure 2-8:

Figure 2-8

In fact, Hekaton stipulates that a row version is visible to a query only if its validity range covers the query's start timestamp. For example, if a query transaction's start timestamp is 30, row versions valid from 10 to +∞ or from 20 to 150 are visible to it, but a version valid from 40 to +∞ is not. Figure 2-9:

Figure 2-9

Some readers may worry that as access and DML accumulate, a large amount of stale version data will pile up and occupy memory. In fact, based on the transaction timestamps of active queries, old versions can be cleaned up safely: for example, once the oldest active transaction's start timestamp is greater than or equal to 50, versions that ended before that point can be cleared and their memory released. The cleanup runs in parallel across multiple threads and has little impact on performance.

As figure 2-6 shows, not every transaction can be committed safely: during the validation phase, Hekaton validates according to the isolation level the user has set.

For optimistic concurrency, Hekaton offers three isolation levels: snapshot isolation (Snapshot), repeatable read isolation (Repeatable Read), and serializable isolation (Serializable). As with traditional relational databases, Snapshot needs no validation, while Repeatable Read must verify before commit that the data read is still consistent with what was read at the start of the transaction; if so, the transaction commits, otherwise it cannot.

Under Serializable, as the name implies, the entire range of data read must still be consistent at commit time, otherwise validation fails. Some may wonder how much data serializable validation has to recheck and whether the cost is too high; remember this all happens in memory, so it is still far cheaper than traditional serializable isolation.
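At the T-SQL level these isolation levels are requested with table hints on memory-optimized tables, or promoted automatically by a database option; the table and database names continue the earlier hypothetical sketch.

-- Per-statement isolation hint (REPEATABLEREAD and SERIALIZABLE hints work the same way).
SELECT CartId, UserId
FROM dbo.ShoppingCart WITH (SNAPSHOT)
WHERE CartId = 1;

-- Or let plain READ COMMITTED access to memory-optimized tables be elevated to SNAPSHOT automatically.
ALTER DATABASE SalesDB SET MEMORY_OPTIMIZED_ELEVATE_TO_SNAPSHOT = ON;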

Anyone familiar with optimistic concurrency knows that rollbacks under a traditional optimistic scheme are very expensive, and Hekaton's validation approach largely avoids that cost. Committing means logging the changes and replacing the transaction ID in the row versions with the acquired timestamp, at which point they become visible to other transactions.

Of course, logging still matters: the disk is, after all, the bottleneck, so Hekaton has its own optimizations to ease this problem, which space does not allow describing here. For some scenarios we can even choose to persist only the schema and not the data (for example, transient game-scene data).
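Declaring such a schema-only table is a one-line difference in the DDL; this sketch reuses the hypothetical naming from above.

-- Non-durable table: the schema survives a restart, the data does not
-- (suited to transient state such as sessions or game-scene data).
CREATE TABLE dbo.SessionState
(
    SessionId UNIQUEIDENTIFIER NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 100000),
    Payload   VARBINARY(8000)  NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY);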

Finally, for CPU efficiency, execution plans are compiled from interpreted form (Interpreted) into native machine code (Native).

The optimizer is arguably the most complex part of a relational database. In a nutshell, the SQL Server optimizer works like this: a statement is bound and parsed into a parse tree, semantic analysis produces a logical execution plan, and the optimizer then generates a cost-based physical execution plan from that logical plan.

In Hekaton, if we choose native execution (by compiling the statements ahead of time inside a stored procedure), then after the logical plan is produced, candidate physical plans are generated from different algorithms and cost estimates, and the chosen physical plan is translated into C code and compiled into a DLL of machine code. Figure 2-10:

Figure 2-10

A friend on Weibo once asked why MySQL refactored its optimizer so that the parsing, optimizing, and execution modules are separated rather than intertwined. I think I may have found the answer here: that separation matters a great deal to how robust a good RDBMS can be.

Under native execution everything is effectively a "goto": data is read directly without further function calls, which greatly improves CPU efficiency. One might worry that compiling every time would be expensive; in fact Hekaton compiles the specified queries (stored procedures) into a DLL once, and only has to load it into memory on first use. Ad hoc queries are not supported.
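A minimal sketch of such a natively compiled procedure, again against the hypothetical dbo.ShoppingCart table:

-- The body is compiled to a DLL when the procedure is created.
CREATE PROCEDURE dbo.usp_AddToCart
    @CartId INT, @UserId INT
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english')
    INSERT INTO dbo.ShoppingCart (CartId, UserId, CreatedUtc)
    VALUES (@CartId, @UserId, SYSUTCDATETIME());
END;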

The execution efficiency of Hekaton in machine code has been greatly improved. Here are the test data given by Microsoft:

a. Comparison of interpreted versus native execution: CPU instructions consumed to look up a single row in a memory-optimized table. Figure 2-11:

Figure 2-11

b. Comparison of query time for random lookups over 10 million rows: regular table versus Hekaton memory-optimized table. Figure 2-12:

Figure 2-12

c. Comparison of random in-memory updates (with no log writes) between a regular table and a Hekaton memory-optimized table, as shown in figure 2-13:

Figure 2-13

III. Hekaton Application Cases

Hekaton means "a hundred" in ancient Greek, reflecting the vision of a hundred-fold improvement. That vision has not been fully achieved yet, but I believe this excellent team will get there.

With this new weapon, SQL Server handles performance problems even better. There are plenty of cases on Microsoft's official website; here are a few.

Bwin, the largest online gaming company in Europe, increased its online batch processing from 15,000 to 250,000 per second with Hekaton.

EdgeNet, a well-known data services provider, increased its online data ingestion from 7,450 to 126,665 rows per second after adopting Hekaton, an improvement of nearly 17 times. Figure 3-1:

Figure 3-1

In another case, when a Chinese car e-commerce site's traffic was simulated against Hekaton, the performance indicators remained steady throughout. Figure 3-2:

Figure 3-2

Hekaton not only solves performance problems in many scenarios; I think it also helps with some thorny problems in specific scenarios, such as the flash-sale and snap-up promotions that e-commerce sites love. I will not repeat the queuing-theory and batched-commit approaches that colleagues in the industry have studied.

In fact, computers are widely used to simulate human activity in the physical world. Think about what a snap-up really looks like: it ultimately succeeds or fails, much like being elbowed aside by a determined aunt while grabbing a hot-selling item. Isn't that exactly how the transaction mechanism in Hekaton behaves? When designing online promotional activities, shouldn't we think about what the real-world equivalent looks like? What we need here, I think, is controllability, not control.

IV. Conclusion

Finally, with so many amazing and exciting capabilities, is it perfect? Of course not. Hekaton's optimistic concurrency model makes it unsuitable for workloads with heavy update conflicts, and its space-for-speed design consumes a great deal of memory, so applications must plan and design for it carefully.

Please keep in mind that "any technology has its flaws": no technology or architecture is perfect, and choosing the right technology or architecture for the right scenario is what we set out to do.
