
The influence of NVM as main memory on database management systems

Implications of non-volatile memory as primary storage for database management systems

Abstract

Traditional database management systems use disks to store relational data. Disks are cheap, durable, and high-capacity, but reading data from them is slow. To hide this latency, DRAM is used as an intermediate layer: DRAM is much faster than disk, but it has limited capacity and is not persistent. NVM is a new storage technology that offers large capacity, byte addressability, access speed comparable to DRAM, and non-volatility.

This article summarizes the impact of using NVM as main memory on a relational database management system, that is, how a traditional relational DBMS should be modified to fully exploit the characteristics of NVM. The paper modifies PostgreSQL's storage engine to adapt it to NVM and describes the changes and their challenges in detail. Finally, the modified engines are evaluated on a full-system simulation platform. The results show that when data is stored on disk, the modified PG reduces query time by up to 40% relative to native PG; when data is stored in NVM, the reduction is up to 14.4%. The average reductions are 20.5% and 4.5%, respectively.

Introduction

Most database management systems use a memory-plus-disk architecture, with the dataset eventually persisted to disk. Disk is cheap and non-volatile, so it is well suited to storing large-scale data, but reading from disk takes a long time. To reduce data-access latency, DRAM is placed between the CPU and the disk as an intermediate storage medium; its access speed is several orders of magnitude faster than disk. Moreover, as DRAM chip density increases and memory prices fall, systems with large amounts of memory have become increasingly common.

For these reasons, main-memory relational databases have become increasingly popular. Key components of such databases, including the index structures, the recovery mechanism, and the commit protocol, are tailored to main memory as the storage medium. However, relational databases still need persistent storage media, such as large numbers of disks, when handling critical or non-redundant data.

DRAM is also a major factor in the efficiency of database services: when a database executes queries, about 59% of its power consumption goes to main memory. In addition, inherent physical issues related to leakage current and voltage scaling limit further scaling of DRAM. As the primary memory medium, DRAM therefore cannot keep up with the growth of current and future datasets.

NVM is a new class of hardware storage media that combines characteristics of both disk and DRAM. Prominent NVM technologies include PC-RAM, STT-RAM, and R-RAM. Because NVM is persistent at the device level, it does not need DRAM-style refresh cycles to retain data, so it consumes less energy per bit than DRAM. In addition, NVM has much lower latency than hard disks, with read latency comparable to DRAM; it is byte-addressable; and it offers higher density than DRAM.

A DBMS must take the characteristics of NVM fully into account to realize the benefits of the new hardware. The simplest design is to replace the disk with NVM and rely on its low latency for a performance boost, but adapting a DBMS to NVM involves far more than exploiting low latency.

This article studies how to deploy NVM when designing a DBMS. It first discusses how to place NVM in the memory hierarchy of current systems, and then modifies PostgreSQL's storage engine to maximize the benefit of NVM. The goal is to bypass the slow block interface while preserving the robustness of the DBMS.

The two modified PG storage engines were evaluated on the simulation platform with the TPC-H benchmark, alongside unmodified PG running on SSD and on NVM. The results show that the modified storage engines reduce the share of kernel execution time (where file IO occurs) from 10% to 2.6% on average. Compared with native PG, performance improves by 20.5% on the hard disk and by 4.5% on NVM. The evaluation also reveals the modified PG's bottleneck: because data is fetched directly from NVM, it is not close to the CPU when a query needs it; when the user-level caches miss, the resulting long latency erodes the benefit of the new hardware.

Background

This section describes the characteristics of NVM technology and their impact on DBMSs, and then introduces the system software for managing NVM.

1. NVM characteristics

Data access latency: the read latency of NVM is much lower than that of disk. Because NVM technologies are still under development, reported latencies vary from source to source; STT-RAM latency is around 1-20 ns, which is very close to DRAM.

PC-RAM and R-RAM have higher write latency than DRAM, but write latency matters less because it can be mitigated with buffering.

Density: NVM has a higher density than DRAM and can substitute for main memory, especially in embedded systems. PC-RAM, for example, provides 2 to 4 times the capacity of DRAM, making it easier to scale.

Endurance: the maximum number of writes per memory cell. The most competitive technologies are PC-RAM and STT-RAM, whose endurance approaches that of DRAM; more precisely, NVM endures about 10^15 writes versus roughly 10^16 for DRAM. NVM is also more durable than flash.

Energy consumption: unlike DRAM, NVM does not need periodic refresh to retain data in memory, so it consumes less energy. PC-RAM in particular consumes significantly less energy than DRAM; the other technologies are comparable.

In addition, NVM offers byte addressability and persistence. Intel and Micron have launched 3D XPoint technology, and Intel has introduced new instructions to support the use of persistent memory.

2. System software for NVM

When NVM is used as main memory, not only the application software but also the system software must be modified to fully exploit NVM's advantages. Traditional file systems access storage media through the block layer; if the disk is simply replaced with NVM without any changes, reads and writes to NVM still go through the block layer, and NVM's byte addressability cannot deliver its benefits.

As a result, progress has been made in file system support for persistent memory. PMFS is an open-source POSIX file system developed by Intel that provides two key features to facilitate the use of NVM.

First, PMFS does not maintain a separate address space for NVM; NVM and main memory are addressed together. This means data does not have to be copied from NVM to DRAM for applications to access it: processes can access data in NVM directly at byte granularity.

Second, traditional databases access blocks in two ways: file IO and memory-mapped IO. PMFS implements file IO much like a traditional file system, but implements memory-mapped IO differently: whereas a traditional file system first copies pages into DRAM, PMFS maps the pages directly into the process's address space. Figure 1 compares a traditional file system with PMFS.
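
As a rough illustration of this difference, the following minimal C sketch opens a file on a hypothetical PMFS mount and maps it into the process's address space; with PMFS the returned pointer refers to the file's pages in NVM, so the application can read individual bytes without first copying the file into DRAM. The mount path is an assumption, not something specified in the paper.

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    const char *path = "/mnt/pmfs/demo.dat";      /* hypothetical PMFS mount */
    int fd = open(path, O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) { close(fd); return 1; }

    /* With PMFS, mmap() maps the NVM-resident file pages directly into the
     * process address space; no copy into DRAM is made first. */
    char *base = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (base == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

    /* Byte-granular access: read a single byte without any block IO. */
    printf("first byte: 0x%02x\n", (unsigned char) base[0]);

    munmap(base, st.st_size);
    close(fd);
    return 0;
}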

Design choices

This section discusses possible memory-hierarchy designs for a system that includes NVM, and how a disk-oriented DBMS should be modified to make full use of NVM.

1. DBMS memory hierarchy design with NVM

There are various ways to place NVM in the current DBMS memory hierarchy. Figure 2 shows three common options. Figure 2a is the traditional approach: intermediate state such as logs, the data cache, and partial query state is kept in DRAM, while the main data resides on disk.

Given NVM's features, it could replace both DRAM and disk, as shown in Figure 2b. However, such a change would require redesigning the current operating system and applications, and as a DRAM substitute NVM is not yet mature in terms of endurance. The paper therefore argues for keeping DRAM in the platform and replacing all or part of the disk with NVM, as shown in Figure 2c (NVM-Disk).

In this scheme, the DRAM layer is retained, so temporary data structures and application code can still be read and written quickly from DRAM. Applications can also access the database's data through the PMFS file system, using NVM's byte addressability to avoid the API overhead of a traditional file system. This deployment does not require a large amount of DRAM, because the amount of temporary data is relatively small. We consider this a sensible way to integrate NVM: place NVM alongside DRAM, which holds the temporary data structures, and optionally keep traditional disks for cold data.

2. Changes required in a traditional DBMS

When a traditional disk-oriented database system is deployed directly on NVM, it cannot fully exploit the benefits of the new hardware. With NVM as the primary storage medium, several important parts of the DBMS need to be changed or removed.

Avoid block-level access: a traditional DBMS uses disks as its primary storage medium. Because sequential disk access is fast, data is read in blocks to amortize disk access latency.

Unfortunately, accessing data in blocks incurs extra data movement. For example, if a transaction updates a single byte of a record, the entire block still has to be flushed to disk. On the other hand, block-level access provides better prefetching. Because NVM is byte-addressable, data can be accessed at byte granularity, but that sacrifices the prefetching benefit. A better approach needs to balance these two aspects.
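
For illustration only, the following sketch (not from the paper) contrasts the two granularities: a block-oriented update rewrites a whole page for a one-byte change, while a byte-addressable NVM mapping lets the application touch only that byte. BLCKSZ and the helper names are assumptions, and persistence ordering (cache-line flushes) is omitted.

#include <sys/types.h>
#include <unistd.h>

#define BLCKSZ 8192

/* Disk-style update: even a one-byte change rewrites the whole block. */
static void update_byte_blockwise(int fd, char *block, off_t blk_start,
                                  size_t off_in_block, char v)
{
    block[off_in_block] = v;
    pwrite(fd, block, BLCKSZ, blk_start);    /* 8 KB written for 1 byte */
}

/* NVM-style update: with the file mapped via PMFS, store just the byte. */
static void update_byte_nvm(char *nvm_base, size_t file_off, char v)
{
    nvm_base[file_off] = v;                  /* 1 byte touched */
}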

Remove the DBMS's internal buffer cache: a DBMS usually maintains an internal buffer cache. When a record is accessed, its disk address is computed first; if the corresponding block is not in the buffer cache, it must be read from disk into the buffer cache.

This approach is unnecessary for an NVM-based database. If the NVM address space is visible to processes, there is no longer any need to copy blocks; it is more efficient to access records in NVM directly. However, this requires NVM-aware system software, such as PMFS, that exposes the NVM address space directly to the process.

Remove the redo log: to guarantee the ACID properties, a DBMS keeps two kinds of logs: undo and redo. The undo log rolls back uncommitted transactions, and the redo log replays committed changes that have not yet been written to disk. In an NVM-based DBMS without an internal buffer cache, all writes go directly to NVM, so the redo log is no longer needed; the undo log is still required.

Case study: PostgreSQL

PostgreSQL is an open-source relational database with full ACID support that runs on all major operating systems, including Linux. This section examines PostgreSQL's storage engine and the modifications made to adapt it to NVM: it first introduces PG's read/write architecture, then explains the changes.

1. PG's read/write architecture

Figure 3a shows the architecture of file read and write operations in unmodified PG. The left side shows the operations performed by PG's software layers, and the right column shows the corresponding data movement. Note that the file system used is PMFS, and in Figure 3a NVM is used in place of disk to store the data.

PG's read and write performance depends heavily on file IO. Because PMFS exposes the same file IO API as a traditional file system, running on this particular file system requires no modification to PG.

The PG server invokes the Buffer Layer to maintain the internal buffer cache, which holds the pages PG is about to access. If the buffer cache has no free slot to read a page into, a replacement policy is executed: a page is selected for eviction from the buffer cache's management list, and if that page is dirty it must first be flushed to disk.

When PG receives a request to read a data page from disk, the Buffer Layer finds a free slot in the buffer cache and obtains a pointer to it. In Figure 3a, PgBuffer is the free buffer slot and PgBufPtr is the corresponding pointer. The Buffer Layer passes this pointer to the File Layer, which finally invokes file reads and writes through the file system.

For a read, PMFS copies the block from NVM into a kernel buffer, and the kernel then copies it into the free buffer cache slot that PgBufPtr points to. A write also involves two copies, in the opposite direction.

Therefore, on a buffer cache miss, native PG's storage engine performs two copies, which is a significant overhead for large datasets. Because PMFS can map NVM addresses directly into memory, this copying overhead can be avoided by modifying the storage engine, as described below.
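
For concreteness, here is a simplified, hypothetical sketch of the native read path just described: the File Layer issues a block-sized pread(), so the data crosses from NVM into a kernel buffer and then into the PG buffer slot, i.e., two copies per miss. The function and constant names are illustrative, not the actual PostgreSQL source.

#include <sys/types.h>
#include <unistd.h>

#define BLCKSZ 8192                       /* PostgreSQL's default page size */

/* Read block 'blkno' of an open relation file into the buffer slot that
 * pgBufPtr points at. Returns 0 on success, -1 on error. */
static int native_read_block(int fd, long blkno, char *pgBufPtr)
{
    off_t offset = (off_t) blkno * BLCKSZ;

    /* pread() goes through the file-IO path: PMFS first copies the block
     * into a kernel buffer, and the kernel then copies it into pgBufPtr. */
    ssize_t n = pread(fd, pgBufPtr, BLCKSZ, offset);
    return (n == BLCKSZ) ? 0 : -1;
}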

2. SE1: memory-mapped IO

The first step toward exploiting NVM is to replace PG's File Layer with a MemMapped Layer. As shown in Figure 3b, this layer still receives the pointer to the free buffer slot from the Buffer Layer, but by using PMFS's memory-mapped IO interface it no longer issues file IO. This storage engine is called SE1.

Read operation: to read a file, open() is called first, and mmap() then maps the file into memory. Because PMFS is used, mmap() returns a pointer to the file's location in NVM, so the application can access the file on NVM directly.

Therefore, the requested data page does not need to be copied into a kernel buffer. As shown in Figure 3b, memcpy() copies the requested page directly into PG's buffer. When the request is complete and the file is no longer needed, munmap() removes the mapping and the file is closed.

Write operation: similar to the read path. The file to be modified is opened and mapped with mmap(), and memcpy() copies the dirty data directly from the PG buffer to NVM.

SE1 thus avoids the copy into the kernel buffer, eliminating one of the two data copies.
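
A minimal sketch of the SE1 idea follows (illustrative only, not the modified PostgreSQL code): the MemMapped Layer maps the relation file through PMFS and uses memcpy() between the mapping and the PG buffer slot, so the kernel-buffer copy of the file IO path disappears. BLCKSZ and the function names are assumptions.

#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define BLCKSZ 8192

/* SE1 read: copy one block from the NVM-mapped file into the PG buffer. */
static int se1_read_block(const char *path, long blkno, char *pgBufPtr)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    off_t offset = (off_t) blkno * BLCKSZ;
    /* PMFS maps the NVM-resident pages straight into our address space. */
    char *src = mmap(NULL, BLCKSZ, PROT_READ, MAP_SHARED, fd, offset);
    if (src == MAP_FAILED) { close(fd); return -1; }

    memcpy(pgBufPtr, src, BLCKSZ);       /* single copy: NVM -> PG buffer */

    munmap(src, BLCKSZ);
    close(fd);
    return 0;
}

/* SE1 write: copy a dirty PG buffer back to its block in the NVM file. */
static int se1_write_block(const char *path, long blkno, const char *pgBufPtr)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    off_t offset = (off_t) blkno * BLCKSZ;
    char *dst = mmap(NULL, BLCKSZ, PROT_READ | PROT_WRITE, MAP_SHARED, fd, offset);
    if (dst == MAP_FAILED) { close(fd); return -1; }

    memcpy(dst, pgBufPtr, BLCKSZ);       /* single copy: PG buffer -> NVM */

    munmap(dst, BLCKSZ);
    close(fd);
    return 0;
}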

3. SE2: direct access to the mapped file

The second modification replaces SE1's MemMapped Layer with the PtrRedirection Layer of Figure 3c. Unlike the MemMapped Layer, it receives a pointer to PgBufPtr (P2PgBufPtr), i.e., the address of the buffer pointer itself.

Read operation: the file is opened with open() and mapped into memory with mmap(). Originally, PgBufPtr points to a free slot in the internal buffer cache. Because mmap() maps the NVM-resident file into memory, where the process can see it, the PtrRedirection Layer redirects PgBufPtr to the file's address on NVM. This pointer redirection for reads is shown by the "Read" label in Figure 3c.

Therefore, a read no longer requires any data copy, which yields a large performance improvement for queries over large datasets.

Write operation: PMFS lets the application access files on NVM directly, but because PG is a multi-process system, modifying files on NVM in place is dangerous and could leave the database in an inconsistent state. To avoid this, SE2 performs two extra steps before a data page is modified and marked dirty: if the page is in NVM, it is first copied into the internal buffer cache (PgBuffer), and the redirected PgBufPtr is then pointed back at the buffer cache's free slot. This is shown as the "Write" path in Figure 3c. In this way, SE2 ensures that each process modifies only its local copy of the data page.
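
The following sketch illustrates SE2's pointer redirection under the same assumptions (the names are hypothetical, not the modified PostgreSQL code): a read redirects the buffer pointer to the NVM mapping, and a write first copies the page back into the process-local buffer slot and repoints the buffer pointer at it.

#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>

#define BLCKSZ 8192

/* SE2 read: no copy at all. Redirect *p2PgBufPtr to the block's location in
 * the NVM-mapped file, leaving the original buffer-cache slot unused. */
static int se2_read_block(int fd, long blkno, char **p2PgBufPtr)
{
    off_t offset = (off_t) blkno * BLCKSZ;
    char *nvmPage = mmap(NULL, BLCKSZ, PROT_READ, MAP_SHARED, fd, offset);
    if (nvmPage == MAP_FAILED)
        return -1;

    *p2PgBufPtr = nvmPage;               /* buffer pointer now targets NVM */
    return 0;
}

/* SE2 write: before dirtying a page that still lives in NVM, copy it into
 * the process-local buffer slot and point the buffer pointer back at it, so
 * each backend modifies only its own copy. */
static void se2_prepare_write(char **p2PgBufPtr, char *localSlot)
{
    char *current = *p2PgBufPtr;

    if (current != localSlot) {          /* page is still mapped from NVM */
        memcpy(localSlot, current, BLCKSZ);
        munmap(current, BLCKSZ);         /* drop the read-only NVM mapping */
        *p2PgBufPtr = localSlot;         /* redirect back to the local slot */
    }
}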

Related work

Previous work falls into two main categories: replacing the database's entire storage medium with NVM, and deploying NVM to store logs. "NVRAM-aware logging in transaction systems" and "High performance database logging using storage class memory" reduce the impact of disk IO on transaction throughput and response time by writing logs directly to NVM instead of flushing them to disk. "Scalable logging through emerging non-volatile memory" uses NVM on multi-core, multi-socket hardware to write distributed logs, reducing contention on centralized logging as system load grows. Other work studies different recovery methods on a two-tier DRAM-plus-NVM storage hierarchy.

Conclusion

This work studies the impact of deploying NVM on DBMS design. Several ways of adding NVM to the DBMS memory hierarchy are discussed; replacing all or part of the disk with NVM is the typical scenario. Under this approach the existing system needs little modification and can access the dataset on NVM directly. The paper introduces two variants of the PG storage engine: SE1 and SE2.

The experimental results show that for native PG, deploying the database on NVM yields up to 40% higher performance than on disk, with an average gain of 16%. SE1 and SE2 reduce execution time by nearly 20.5% compared with running on disk. However, the design of current database systems remains the main obstacle to maximizing performance: compared with the NVM baseline, SE2 improves read performance by up to 14.4%, with an average of 4.5%.

The limiting factor is that data accessed directly on NVM is far from the CPU; this negative side effect of direct NVM access weakens NVM's benefits. It is therefore necessary to develop libraries adapted to NVM.
