What are the ways in which MongoDB is highly available?

2025-04-25 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article introduces the ways in which MongoDB achieves high availability. Many people have questions about this in day-to-day operation, so the editor consulted various materials and put together a simple, easy-to-follow summary. I hope it helps answer the question of which approaches MongoDB offers for high availability. Let's get started!

In MongoDB's architecture, failover and redundancy across multiple machines are provided through asynchronous replication, and only one of those machines accepts write operations at any time. That machine plays the role of Primary; read operations can be distributed to the Slave (secondary) machines.
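The read/write split described above can be sketched as a toy model. This is a minimal simulation in plain Python, not MongoDB driver code; the class `ToyReplicaSet`, its methods, and the node names are hypothetical and exist only for illustration.

```python
import itertools


class ToyReplicaSet:
    """Toy model: one primary accepts writes, secondaries serve reads."""

    def __init__(self, primary, secondaries):
        self.primary = primary
        self.secondaries = list(secondaries)
        # Each node holds its own copy of the data.
        self.data = {name: {} for name in [primary, *self.secondaries]}
        # Writes not yet copied to the secondaries (asynchronous replication).
        self.pending = []
        self._reader = itertools.cycle(self.secondaries)

    def write(self, key, value):
        """All writes go to the primary only."""
        self.data[self.primary][key] = value
        self.pending.append((key, value))
        return self.primary

    def replicate(self):
        """Apply pending writes to every secondary as a later, separate step."""
        for key, value in self.pending:
            for node in self.secondaries:
                self.data[node][key] = value
        self.pending.clear()

    def read(self, key):
        """Reads are spread across the secondaries in round-robin order."""
        node = next(self._reader)
        return node, self.data[node].get(key)


rs = ToyReplicaSet("node-a", ["node-b", "node-c"])
rs.write("x", 1)
stale = rs.read("x")   # replication has not run yet, so the value is missing
rs.replicate()
fresh = rs.read("x")   # after replication the secondary returns the value
```

Because replication is asynchronous, a read from a secondary can briefly lag behind the primary, which is what the `stale` read illustrates.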

MongoDB High availability can be divided into two ways:

1: Master-Slave replication, which is rarely used in practice today.

2: Replica Sets.

Since version 1.6, MongoDB has offered a replication feature called Replica Set, which adds automatic failover and automatic recovery of member nodes. Every member holds the same data, which greatly reduces the cost of maintenance.
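To make the idea of automatic failover concrete, here is a heavily simplified sketch. It is not MongoDB's actual election protocol: it only assumes that a new primary needs a reachable majority and should be the member with the most recent data. The `Member` class, `elect_primary` function, and node names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Member:
    name: str
    last_applied: int   # position of the last replicated operation applied
    healthy: bool = True


def elect_primary(members):
    """Pick the healthy member with the most recent data.

    Simplification: we only check that more than half of the members
    are reachable, then choose the one that is most up to date.
    """
    alive = [m for m in members if m.healthy]
    if len(alive) * 2 <= len(members):
        return None  # no majority: the set stays without a primary
    return max(alive, key=lambda m: m.last_applied)


members = [
    Member("node-a", last_applied=120),
    Member("node-b", last_applied=118),
    Member("node-c", last_applied=120),
]
members[0].healthy = False            # the current primary fails
new_primary = elect_primary(members)  # node-c takes over in this toy model
```

With only one of three members left alive, `elect_primary` returns `None`, which mirrors why replica sets are usually deployed with an odd number of members.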

As shown in the figure:

Mathematically, it is a set of isomorphic members, that is, a cluster. A Replica Set records write operations in a log called the oplog. The log is stored in oplog.rs, a fixed-size capped collection in the local database, and is used to record Replica Set operations. On 64-bit MongoDB the default oplog is fairly large, about 5% of free disk space, and its size can be changed with mongod's --oplogSize parameter.
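The behavior of a fixed-size log can be illustrated with a small sketch. This is a toy model in plain Python, not MongoDB code: `ToyOplog` and `default_oplog_bytes` are hypothetical names, the log is sized in entries rather than bytes, and the 5% figure is applied to free disk space as an assumption.

```python
from collections import deque


def default_oplog_bytes(free_disk_bytes, percent=5):
    """Oplog size as a percentage of free disk space (5% per the text)."""
    return free_disk_bytes * percent // 100


class ToyOplog:
    """Fixed-length log: once full, the oldest entries are dropped."""

    def __init__(self, max_entries):
        self.entries = deque(maxlen=max_entries)
        self.next_ts = 1

    def append(self, op):
        self.entries.append((self.next_ts, op))
        self.next_ts += 1

    def oldest_ts(self):
        return self.entries[0][0] if self.entries else None


oplog = ToyOplog(max_entries=3)
for op in ["insert a", "insert b", "update a", "delete b"]:
    oplog.append(op)
# The first entry has been overwritten; only the last three remain.
```

A larger log keeps history for longer, which is why the size matters for members that fall behind or join late.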

Besides recovering from failures, a replica set also scales well: once capacity no longer meets demand, you add new machines as nodes so that the load is spread evenly.

A node can generally be added directly through the oplog, which is easy and requires no human intervention. However, the oplog is a capped collection that records the log cyclically, so adding a node through the oplog alone may lead to inconsistent data, because entries stored in the log may already have been overwritten. That is not a problem in practice: in general, you can add a node by combining a database snapshot (--fastsync) with the oplog. The procedure is to first take the physical files of an existing replica set member as the initial data, and then apply the remaining data through the oplog.
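The reasoning about overwritten log entries can be checked with a short sketch. It is a toy model that assumes consecutive integer timestamps; `can_catch_up` and `entries_to_replay` are hypothetical helper names, not MongoDB functions.

```python
def can_catch_up(snapshot_ts, oplog_timestamps):
    """Return True if the oplog still holds every entry after the snapshot.

    snapshot_ts: timestamp of the last operation contained in the snapshot
                 (0 for an empty node with no snapshot at all).
    oplog_timestamps: ascending timestamps currently kept in the capped oplog.
    """
    if not oplog_timestamps:
        return False
    oldest = oplog_timestamps[0]
    # The first entry the new node needs is snapshot_ts + 1; it must not
    # have been overwritten already.
    return oldest <= snapshot_ts + 1


def entries_to_replay(snapshot_ts, oplog_timestamps):
    """Entries the new node must apply on top of the snapshot."""
    return [ts for ts in oplog_timestamps if ts > snapshot_ts]


oplog_now = [105, 106, 107, 108]   # entries 1..104 were already overwritten
recent_snapshot_ok = can_catch_up(106, oplog_now)   # snapshot is recent enough
empty_node_ok = can_catch_up(0, oplog_now)          # oplog alone is not enough
```

In this model, a node seeded from a recent snapshot only needs to replay the tail of the log, while an empty node cannot rebuild the full history from the log alone.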

Then there is Sharding, a database cluster scheme that scales massive data horizontally: a collection's data is partitioned and stored across the sharding nodes. MongoDB divides data into chunks; each chunk is a contiguous range of records in a collection, usually at most 200MB in size (the default has varied across MongoDB versions). When a chunk grows beyond that, it is split and a new chunk is created. This is the same as a Region split in HBase.
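Chunk splitting can be sketched as follows. This is a toy model, not MongoDB's implementation: chunks are plain sorted lists of shard-key values, and the threshold `MAX_CHUNK_DOCS` is a hypothetical limit counted in documents rather than megabytes to keep the example short.

```python
MAX_CHUNK_DOCS = 4  # hypothetical threshold, counted in documents


def find_chunk(chunks, key):
    """Index of the chunk whose key range should contain `key`."""
    for i, chunk in enumerate(chunks):
        if chunk and key <= chunk[-1]:
            return i
    return len(chunks) - 1  # larger than every key seen so far


def insert_key(chunks, key):
    """Insert a key, splitting the chunk in two if it grows too large."""
    i = find_chunk(chunks, key)
    chunk = sorted(chunks[i] + [key])
    if len(chunk) > MAX_CHUNK_DOCS:
        mid = len(chunk) // 2
        chunks[i:i + 1] = [chunk[:mid], chunk[mid:]]  # one chunk becomes two
    else:
        chunks[i] = chunk
    return chunks


chunks = [[]]  # a collection starts as a single empty chunk
for k in [10, 20, 30, 40, 50, 25, 60, 70, 80]:
    insert_key(chunks, k)
```

Each chunk stays a contiguous, non-overlapping key range, and the number of chunks grows as data is written.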

The whole process of the split is as follows:

For MongoDB

First, at the client level: the client does not need to know whether the underlying layer is sharded, or whether replica sets are used. Mongos acts like a butler: how collections are broken up is something the client does not need to know at all. Only one thing has to be specified: what is the partition key? The concept of a partition key exists in many components, including Hadoop, Storm, and other databases. Mongos takes on the routing role, and the cluster metadata it relies on is stored in the config server.
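The routing role can be illustrated with a range lookup on the partition key. This is a minimal sketch, not how mongos is implemented: `ToyRouter`, the split points, and the shard names are hypothetical, and the split points stand in for the metadata a config server would hold.

```python
import bisect


class ToyRouter:
    """Toy stand-in for the routing role: maps a key value to a shard."""

    def __init__(self, split_points, shards):
        # split_points [100, 200] define the ranges
        # (-inf, 100), [100, 200) and [200, +inf).
        if len(shards) != len(split_points) + 1:
            raise ValueError("need exactly one more shard than split points")
        self.split_points = split_points
        self.shards = shards

    def route(self, key_value):
        """Return the shard that owns the range containing key_value."""
        idx = bisect.bisect_right(self.split_points, key_value)
        return self.shards[idx]


router = ToyRouter(split_points=[100, 200],
                   shards=["shard0", "shard1", "shard2"])
targets = [router.route(v) for v in (5, 100, 199, 200, 999)]
```

The client only supplies the key value; which shard holds the data is decided entirely by the router and its metadata.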

The same holds in other databases such as HBase, where a table also has to be split; the unit there is the Region. A more general English name for such a slice, MongoDB's included, is Segment.

It doesn't matter if you are not very familiar with HBase. You only need to know that a Region is a segment of a split table and that Regions are split by size. Each table starts with a single Region; as data is written the Region keeps growing, and once it reaches the configured threshold it splits into two. As the table gains more rows, the number of Regions increases.

HRegion is the smallest unit of distributed storage and load balancing in HBase. Here is a comparison figure:

In fact, replication in Kafka is closer to the replica mechanism in the Hadoop system; it addresses a different problem from the one sharding solves.

To put it simply, each distributed system has its own characteristics and places its own fixed demands on its storage system.

That concludes this look at the ways in which MongoDB achieves high availability. I hope it has answered your questions. Theory combined with practice helps you learn better, so go and try it!
