What is MongoDB WiredTiger?

2025-01-18 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 06/02

In this issue, the editor introduces MongoDB WiredTiger. The article is detailed and approaches the topic from a professional point of view; I hope you get something out of it.

Michael Cahill co-developed WiredTiger with a partner in 2011.

WiredTiger is the storage engine in MongoDB that supports transactions, and ordering those transactions correctly is a hard problem. Here I will explain how this works, focusing on WiredTiger timestamps. What is special about MongoDB is the oplog (operation log), which records the sequence of operations in the system. WiredTiger, for its part, provides consistent versioning through multi-version concurrency control (MVCC). The difficulty is recording operations in sequence while processing them in parallel: if the exact order of the records in the oplog cannot be determined, the other members of the replica set cannot replicate the data in the correct order. So timestamps are used to carry this ordering information into the WiredTiger storage layer, where it can be enforced more effectively.

Before getting to the main topic, let's review the internal data storage structure of WiredTiger. Both data and indexes are stored in tree structures. Data is stored in a tree keyed by the primary key, and the keys and values in the leaf nodes are stored as BSON. Whether you insert a new document or update an existing one, the change is recorded in what we call an update structure. Each update carries information about the transaction that made it and a link to the next update for the same key. When a read comes in, it has to walk this list of updates and work out the correct version to return.
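The update list described above can be pictured with a small sketch. This is only an illustration of the idea, not WiredTiger's actual data structure: the names `Update`, `Snapshot`, and `read` are hypothetical, and the visibility rule is a simplified snapshot check.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Update:
    """One version of a document; the chain runs from newest to oldest (hypothetical layout)."""
    txn_id: int                      # transaction that wrote this version
    value: Optional[dict]            # None stands for a delete
    next: Optional["Update"] = None  # older version, if any


@dataclass
class Snapshot:
    """What a reader may see: transactions committed before the snapshot was taken."""
    max_txn_id: int      # ids above this started after the snapshot
    in_flight: Set[int]  # ids that were still uncommitted at snapshot time

    def can_see(self, txn_id: int) -> bool:
        return txn_id <= self.max_txn_id and txn_id not in self.in_flight


def read(head: Optional[Update], snap: Snapshot) -> Optional[dict]:
    """Walk the chain from newest to oldest and return the first visible version."""
    node = head
    while node is not None:
        if snap.can_see(node.txn_id):
            return node.value
        node = node.next
    return None  # no version is visible to this reader


# Three versions of one document, written by transactions 5, 8 and 9.
v1 = Update(txn_id=5, value={"_id": 1, "qty": 10})
v2 = Update(txn_id=8, value={"_id": 1, "qty": 7}, next=v1)
v3 = Update(txn_id=9, value={"_id": 1, "qty": 3}, next=v2)

snap = Snapshot(max_txn_id=9, in_flight={9})  # transaction 9 has not committed yet
result = read(v3, snap)
print(result)
```

Here the reader skips the newest version because transaction 9 is still in flight, and gets the version written by transaction 8 instead.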

This mechanism is multi-version concurrency control, which has existed in MongoDB for a long time. What is new is that the existing update structure has been extended with a timestamp, which tells the storage engine the order in which transactions occurred. In short, timestamps settle two things: the order of transactions, and the point in time at which data is read. So even when work runs in parallel, with many different transactions interleaved in different orders, timestamps ensure the right results.
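A minimal sketch of that idea, assuming each update simply carries an integer commit timestamp: a read at timestamp T returns the newest version whose timestamp is no later than T, regardless of the order in which the versions arrived. The class and method names (`VersionedDoc`, `apply`, `read_at`) are made up for illustration and are not WiredTiger's API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TimestampedUpdate:
    commit_ts: int  # position of the write in the oplog order
    value: dict


class VersionedDoc:
    """Keeps every version of one document, in whatever order the writes arrive."""

    def __init__(self) -> None:
        self.versions: List[TimestampedUpdate] = []

    def apply(self, commit_ts: int, value: dict) -> None:
        self.versions.append(TimestampedUpdate(commit_ts, value))

    def read_at(self, read_ts: int) -> Optional[dict]:
        """Return the version with the largest commit_ts that is <= read_ts."""
        visible = [u for u in self.versions if u.commit_ts <= read_ts]
        if not visible:
            return None
        return max(visible, key=lambda u: u.commit_ts).value


doc = VersionedDoc()
# Writes arrive out of order (102 before 100), as they might from parallel threads.
doc.apply(102, {"qty": 3})
doc.apply(100, {"qty": 10})
doc.apply(101, {"qty": 7})

at_99 = doc.read_at(99)
at_100 = doc.read_at(100)
at_101 = doc.read_at(101)
print(at_99, at_100, at_101)
```

The arrival order does not affect what a reader sees; only the timestamps do.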

On a secondary, MongoDB uses a clever technique to apply the oplog in parallel with multiple threads: the entries are split into blocks, distributed to their destinations, and applied. These updates are probably not applied in order. Operations that ran in the order 100, 101, 102 on the primary may well be applied as 101, 100, 102 on a secondary.
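To see how such a reordering can arise, here is a small deterministic sketch. It assumes that entries are assigned to workers by document id, so that writes to the same document stay in order while the global order is lost; the `partition` helper, the entry shape, and the byte-sum stand-in for a hash are all illustrative, not MongoDB's implementation.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

OplogEntry = Tuple[int, str]  # (timestamp, document id); the shape is illustrative


def partition(batch: List[OplogEntry], workers: int) -> Dict[int, List[OplogEntry]]:
    """Assign each entry to a worker by document id, keeping per-document order."""
    queues: Dict[int, List[OplogEntry]] = defaultdict(list)
    for ts, doc_id in batch:
        # Deterministic stand-in for a hash of the document id.
        worker = sum(doc_id.encode()) % workers
        queues[worker].append((ts, doc_id))
    return queues


batch = [(100, "a"), (101, "b"), (102, "a")]
queues = partition(batch, workers=2)

# Suppose worker 0 happens to finish before worker 1 starts.
applied_order = [ts for w in sorted(queues) for ts, _ in queues[w]]
print(applied_order)  # [101, 100, 102]
```

Both writes to document "a" are still applied in order (100 before 102), but 101 lands first overall.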

So what problems do timestamps solve?

1 For queries: if 101 and 102 have been applied on the secondary but 100 has not, the data from 101 and 102 is not shown to queries, which ensures data consistency.

2 As mentioned above, the oplog is split into batches that are applied by multiple threads, while reads on a secondary have been coordinated with locks: MongoDB holds a global lock, and only once that lock is released on the secondary can the applied data be read there. Timestamps address this, since a read can be served as of a consistent timestamp. The same reasoning extends to documents with a unique index: when two threads work on such documents, we have to make sure that all the results are still correct.

3 Timestamps also apply to rollback in replication. Before discussing it, you should be familiar with the basic concepts of MongoDB replication. In the figure, comparing timestamps shows which data points have been applied on most of the secondaries. This matters for the election that follows a node failure, and it also plays a role in catching up on data after a failover.

After a new primary is elected, the same history is used to decide whether the old primary can still rejoin the replica set. In the figure, the old primary is not added back to the replica set.
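Points 1 and 3 can be sketched with two small functions. This is a simplified model under stated assumptions, not MongoDB's actual protocol: timestamps are plain integers, `visible_through` is a hypothetical name for "the newest timestamp with no gaps before it", and `majority_point` is a hypothetical name for "the newest timestamp that a majority of members have applied". Real elections and rollback involve considerably more than this.

```python
from typing import Iterable, List, Set


def visible_through(batch: Iterable[int], applied: Set[int]) -> int:
    """Largest timestamp T such that every batch entry <= T has been applied.

    Returns 0 when even the first entry is still missing.
    """
    last = 0
    for ts in sorted(batch):
        if ts not in applied:
            break
        last = ts
    return last


def majority_point(applied_by_member: List[int]) -> int:
    """Newest timestamp that a majority of members have reached."""
    majority = len(applied_by_member) // 2 + 1
    # The majority-th highest value is reached by at least `majority` members.
    return sorted(applied_by_member, reverse=True)[majority - 1]


# Point 1: 101 and 102 are applied but 100 is not, so nothing new is visible yet.
gap = visible_through([100, 101, 102], applied={101, 102})
no_gap = visible_through([100, 101, 102], applied={100, 101})

# Point 3: the old primary reached 105, the two secondaries reached 102 and 100.
point = majority_point([105, 102, 100])
# Entries on the old primary that a majority never saw.
beyond_majority = [ts for ts in range(100, 106) if ts > point]
print(gap, no_gap, point, beyond_majority)
```

In this model, the entries in `beyond_majority` are the ones that would have to be reconciled before the old primary's history matches the rest of the set.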

To sum up, by ordering operations with timestamps, WiredTiger provides useful support for replication, data rollback, and index maintenance. As a next step, we will optimize index maintenance, combining the advantages of the two indexing approaches, and improve the handling of long-running reads. Today's transactions are mostly short-running, but we know that some customers run transactions that query large amounts of data, and we are currently doing low-level work so that the database engine handles such workloads more effectively.

That is the editor's overview of what MongoDB WiredTiger is. If you have had similar questions, the analysis above may help.
