
MongoDB (4.0) Sharding -- A Way to Deal with Big Data

2025-01-16 Update From: SLTechnology News&Howtos


Shulou (Shulou.com) 06/01 Report --

What is sharding

Database applications with high data volume and throughput put great pressure on a single machine: heavy query loads exhaust the CPU, and large data sets strain its storage, eventually exhausting system memory and shifting the pressure to disk I/O.

MongoDB sharding is a method of storing data across multiple servers to support huge data sets and high-throughput operations. Sharding meets the needs of massive MongoDB data growth: when a single MongoDB server is not enough to store the data or to provide acceptable read and write throughput, the data can be split across multiple servers so that the database system can store and process more data.

Advantages of MongoDB sharding

Sharding provides an effective way to cope with high throughput and large data volumes.

Sharding reduces the number of requests each shard has to process, so the cluster can increase its capacity by scaling horizontally. For example, when inserting a piece of data, the application only needs to access the shard that stores it. Sharding also reduces the amount of data stored on each shard.

The advantage of sharding is that it provides an architecture that scales almost linearly, improves data availability, and improves the query performance of large database servers. Sharding can be used when storage on a single MongoDB server becomes a bottleneck, when the performance of a single server becomes a bottleneck, or when a large application needs to be deployed to make full use of memory.

The composition of a MongoDB sharded cluster:

Shard: the sharding server, which stores the actual data chunks. In a production environment, each shard role should be carried by a Replica Set of several servers, to prevent a single point of failure on one host.
Config Server: the configuration server, which stores the configuration information of the entire sharded cluster, including the chunk metadata.
Routers (mongos): the front-end router that clients connect to. It makes the whole cluster look like a single database, so front-end applications can use it transparently.

Environment preparation

System version: CentOS 7
Software version: MongoDB 4.0

Turn off the firewall and selinux:

    systemctl stop firewalld.service
    setenforce 0

Role layout across the three servers:

    172.16.10.26: mongos (27017), config (30000), shard1 primary (40001), shard2 arbiter (40002), shard3 secondary (40003)
    172.16.10.27: mongos (27017), config (30000), shard1 secondary (40001), shard2 primary (40002), shard3 arbiter (40003)
    172.16.10.29: mongos (27017), config (30000), shard1 arbiter (40001), shard2 secondary (40002), shard3 primary (40003)

Deploy the MongoDB sharded cluster

The deployment idea is to install the MongoDB database on three servers and create five instances on each server (mongos, config, shard1, shard2, shard3). Instances with the same name on the three servers form a replication set consisting of a primary node, a secondary node, and an arbiter node. Mongos does not form a replication set; the config replication set does not distinguish primary, secondary, or arbiter nodes, but it is still created as a replication set. The operation steps on the three servers are mostly identical repetitive operations, with only minor differences.

Install the MongoDB database

Install the supporting software and MongoDB:

    yum install openssl-devel -y
    tar zxf mongodb-linux-x86_64-rhel70-4.0.0.tgz -C /usr/local
    mv /usr/local/mongodb-linux-x86_64-rhel70-4.0.0 /usr/local/mongodb    # decompressing completes the installation
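As a quick sanity check that the unpacked binaries work (a minimal sketch; the environment variable is set later, so the full path is used here):

    /usr/local/mongodb/bin/mongod --version    # should report db version v4.0.0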

Create the data storage and log storage directories

The routing server does not store data, so it needs no data directory; only config, shard1, shard2, and shard3 do. Permissions must also be set after the log files are created.

    mkdir -p /data/mongodb/logs/
    mkdir /etc/mongodb/
    mkdir /data/mongodb/config/
    mkdir /data/mongodb/shard{1,2,3}

Create the log files and set permissions:

    touch /data/mongodb/logs/mongos.log
    touch /data/mongodb/logs/config.log
    touch /data/mongodb/logs/shard{1,2,3}.log
    chmod 777 /data/mongodb/logs/*.log

Create an administrative user and modify directory ownership:

    useradd -M -u 8000 -s /sbin/nologin mongo
    chown -R mongo.mongo /usr/local/mongodb
    chown -R mongo.mongo /data/mongodb

Set the environment variable:

    echo "PATH=/usr/local/mongodb/bin:$PATH" >> /etc/profile
    source /etc/profile

System memory optimization:

    ulimit -n 25000
    ulimit -u 25000
    sysctl -w vm.zone_reclaim_mode=0
    echo never > /sys/kernel/mm/transparent_hugepage/enabled
    echo never > /sys/kernel/mm/transparent_hugepage/defrag
    # Note: these optimizations are temporary and are lost after a reboot.
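Since the note above says these settings are temporary, a minimal sketch of one way to persist the kernel tweaks across reboots on CentOS 7 (an assumption added here, not part of the original procedure):

    cat >> /etc/rc.local <<'EOF'
    # re-apply MongoDB kernel tweaks on boot
    sysctl -w vm.zone_reclaim_mode=0
    echo never > /sys/kernel/mm/transparent_hugepage/enabled
    echo never > /sys/kernel/mm/transparent_hugepage/defrag
    EOF
    chmod +x /etc/rc.local    # rc.local must be executable on CentOS 7

The ulimit values would normally be made permanent in /etc/security/limits.conf instead.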

Deploy the configuration server

Create the configuration file:

    vim /etc/mongodb/config.conf
    pidfilepath = /data/mongodb/logs/config.pid    # pid file location
    dbpath = /data/mongodb/config/                 # data file location
    logpath = /data/mongodb/logs/config.log        # log file location
    logappend = true
    bind_ip = 0.0.0.0                              # listening address
    port = 30000                                   # port number
    fork = true
    replSet = configs                              # replication set name
    configsvr = true
    maxConns = 20000                               # maximum number of connections

Send the configuration file to the other servers:

    scp /etc/mongodb/config.conf root@172.16.10.27:/etc/mongodb/
    scp /etc/mongodb/config.conf root@172.16.10.29:/etc/mongodb/

Start the config instance:

    mongod -f /etc/mongodb/config.conf    # the same on all three servers

Configure the replication set (any one server will do; entering the database on all three servers is recommended, so you can watch the role changes):

    mongo --port 30000
    config = {_id: "configs", members: [
        {_id: 0, host: "172.16.10.26:30000"},
        {_id: 1, host: "172.16.10.27:30000"},
        {_id: 2, host: "172.16.10.29:30000"}
    ]}                        // define the replication set
    rs.initiate(config)       // initialize the replication set
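To confirm the members came up with the expected roles without opening an interactive shell, one option, as a sketch, is to query rs.status() non-interactively (any member's port works):

    mongo --port 30000 --quiet --eval 'rs.status().members.forEach(function (m) { print(m.name + " " + m.stateStr); })'
    # expected: one PRIMARY and two SECONDARY entries for the configs set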

Deploy the shard1 sharding server

Create the configuration file:

    vim /etc/mongodb/shard1.conf
    pidfilepath = /data/mongodb/logs/shard1.pid
    dbpath = /data/mongodb/shard1/
    logpath = /data/mongodb/logs/shard1.log
    logappend = true
    journal = true
    quiet = true
    bind_ip = 0.0.0.0
    port = 40001
    fork = true
    replSet = shard1
    shardsvr = true
    maxConns = 20000

Send the configuration file to the other servers:

    scp /etc/mongodb/shard1.conf root@172.16.10.27:/etc/mongodb/
    scp /etc/mongodb/shard1.conf root@172.16.10.29:/etc/mongodb/

Start the shard1 instance:

    mongod -f /etc/mongodb/shard1.conf    # the same on all three servers
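Since the shard2 and shard3 configuration files below differ from shard1's only in the instance name and port, a minimal convenience sketch (an addition here, not one of the original steps) can generate them with sed:

    for n in 2 3; do
        sed "s/shard1/shard${n}/g; s/40001/4000${n}/g" /etc/mongodb/shard1.conf > /etc/mongodb/shard${n}.conf
    done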

Configure the shard1 replication set

When creating a shard replication set, note that it cannot be initiated from just any server: initiating the replication set on a server that is pre-designated as the arbiter node will fail. Taking the shard1 sharding server as an example, you can create the replication set on the 172.16.10.26 and 172.16.10.27 servers, but it will fail on 172.16.10.29, because 172.16.10.29 is designated as the arbiter node of the replication set about to be created.

    mongo --port 40001    # entering the database on all three servers is recommended, so you can watch the role changes
    use admin
    config = {_id: "shard1", members: [
        {_id: 0, host: "172.16.10.26:40001", priority: 2},
        {_id: 1, host: "172.16.10.27:40001", priority: 1},
        {_id: 2, host: "172.16.10.29:40001", arbiterOnly: true}
    ]}
    rs.initiate(config)

Deploy the shard2 sharding server

Create the configuration file:

    vim /etc/mongodb/shard2.conf
    pidfilepath = /data/mongodb/logs/shard2.pid
    dbpath = /data/mongodb/shard2/
    logpath = /data/mongodb/logs/shard2.log
    logappend = true
    journal = true
    quiet = true
    bind_ip = 0.0.0.0
    port = 40002
    fork = true
    replSet = shard2
    shardsvr = true
    maxConns = 20000

Send the configuration file to the other servers:

    scp /etc/mongodb/shard2.conf root@172.16.10.27:/etc/mongodb/
    scp /etc/mongodb/shard2.conf root@172.16.10.29:/etc/mongodb/

Start the shard2 instance:

    mongod -f /etc/mongodb/shard2.conf    # the same on all three servers

Configure the shard2 replication set (on a non-arbiter server):

    mongo --port 40002    # entering the database on all three servers is recommended, so you can watch the role changes
    use admin
    config = {_id: "shard2", members: [
        {_id: 0, host: "172.16.10.26:40002", arbiterOnly: true},
        {_id: 1, host: "172.16.10.27:40002", priority: 2},
        {_id: 2, host: "172.16.10.29:40002", priority: 1}
    ]}
    rs.initiate(config)

Deploy the shard3 sharding server

Create the configuration file:

    vim /etc/mongodb/shard3.conf
    pidfilepath = /data/mongodb/logs/shard3.pid
    dbpath = /data/mongodb/shard3/
    logpath = /data/mongodb/logs/shard3.log
    logappend = true
    journal = true
    quiet = true
    bind_ip = 0.0.0.0
    port = 40003
    fork = true
    replSet = shard3
    shardsvr = true
    maxConns = 20000

Send the configuration file to the other servers:

    scp /etc/mongodb/shard3.conf root@172.16.10.27:/etc/mongodb/
    scp /etc/mongodb/shard3.conf root@172.16.10.29:/etc/mongodb/

Start the shard3 instance:

    mongod -f /etc/mongodb/shard3.conf    # the same on all three servers

Configure the shard3 replication set (on a non-arbiter server):

    mongo --port 40003    # entering the database on all three servers is recommended, so you can watch the role changes
    use admin
    config = {_id: "shard3", members: [
        {_id: 0, host: "172.16.10.26:40003", priority: 1},
        {_id: 1, host: "172.16.10.27:40003", arbiterOnly: true},
        {_id: 2, host: "172.16.10.29:40003", priority: 2}
    ]}
    rs.initiate(config)

Deploy the routing server

Create the configuration file:

    vim /etc/mongodb/mongos.conf
    pidfilepath = /data/mongodb/logs/mongos.pid
    logpath = /data/mongodb/logs/mongos.log
    logappend = true
    bind_ip = 0.0.0.0
    port = 27017
    fork = true
    configdb = configs/172.16.10.26:30000,172.16.10.27:30000,172.16.10.29:30000
    maxConns = 20000

Send the configuration file to the other servers:

    scp /etc/mongodb/mongos.conf root@172.16.10.27:/etc/mongodb/
    scp /etc/mongodb/mongos.conf root@172.16.10.29:/etc/mongodb/

Start the mongos instance:

    mongos -f /etc/mongodb/mongos.conf    # the same on all three servers; note that this is "mongos", not "mongod"

Enable the sharding function:

    mongo    # the default port is 27017, so no port number needs to be specified
    mongos> use admin
    mongos> sh.addShard("shard1/172.16.10.26:40001,172.16.10.27:40001,172.16.10.29:40001")
    mongos> sh.addShard("shard2/172.16.10.26:40002,172.16.10.27:40002,172.16.10.29:40002")
    mongos> sh.status()    // check the cluster status
    // only two sharding servers are added here; a third will be added later
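Before moving on, one way to confirm from the command line that the two shards registered correctly, as a sketch using the standard listShards admin command:

    mongo --quiet --eval 'printjson(db.adminCommand({ listShards: 1 }))' admin
    # the output should list shard1 and shard2 with their replica-set host strings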

Test the server sharding function

Set the sharding chunk size:

    mongos> use config
    switched to db config
    mongos> db.settings.save({"_id": "chunksize", "value": 1})    // set the chunk size to 1 MB to make the experiment easy; otherwise massive amounts of data would have to be inserted
    WriteResult({ "nMatched" : 0, "nUpserted" : 1, "nModified" : 0, "_id" : "chunksize" })

Simulate writing data:

    mongos> use python
    switched to db python
    mongos> show collections
    mongos> for (i = 1; i <= 50000; i++) db.user.insert({"id": i})
    mongos> sh.enableSharding("python")    // sharding is targeted: you can choose which databases or collections need sharding; after all, not all data needs to be sharded
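One way to confirm that both settings took effect, as a sketch reading the cluster metadata in the standard config database collections:

    mongos> use config
    mongos> db.settings.find()     // should show { "_id" : "chunksize", "value" : 1 }
    mongos> db.databases.find()    // the python entry should show "partitioned" : true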

Create an index for the collection

The rule for creating the index is that the chosen field should be neither strictly sequential and unique, such as a serial number, nor too repetitive, such as gender; values at either extreme are not suitable.

    mongos> db.user.createIndex({"id": 1})    // index on "id"
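A quick check that the index exists, using the standard shell helper:

    mongos> db.user.getIndexes()    // should show the default _id index plus the new {"id": 1} index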

Enable table sharding:

    mongos> sh.shardCollection("python.user", {"id": 1})
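An ascending id is exactly the kind of sequential key warned about above, so a hashed shard key is a common alternative that spreads inserts across shards. A minimal sketch for a hypothetical new collection (db.events is an illustrative name, not part of the original procedure):

    mongos> db.events.createIndex({"id": "hashed"})
    mongos> sh.shardCollection("python.events", {"id": "hashed"})

A ranged key like {"id": 1} keeps range queries efficient, while a hashed key trades that away for more even write distribution.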

Check:

    mongos> sh.status()
    --- Sharding Status ---
    ... (content omitted) ...
    shards:
        { "_id" : "shard1", "host" : "shard1/172.16.10.26:40001,172.16.10.27:40001", "state" : 1 }
        { "_id" : "shard2", "host" : "shard2/172.16.10.27:40002,172.16.10.29:40002", "state" : 1 }
    ... (content omitted) ...
    chunks:
        shard1  3
        shard2  3
        { "id" : { "$minKey" : 1 } } -->> { "id" : 9893 } on : shard1 Timestamp(2, 0)
        { "id" : 9893 } -->> { "id" : 19786 } on : shard1 Timestamp(3, 0)
        { "id" : 19786 } -->> { "id" : 29679 } on : shard1 Timestamp(4, 0)
        { "id" : 29679 } -->> { "id" : 39572 } on : shard2 Timestamp(4, 1)
        { "id" : 39572 } -->> { "id" : 49465 } on : shard2 Timestamp(1, 4)
        { "id" : 49465 } -->> { "id" : { "$maxKey" : 1 } } on : shard2 Timestamp(1, 5)

Manually add the third sharding server and check whether the chunk distribution changes:

    mongos> use admin
    switched to db admin
    mongos> sh.addShard("shard3/172.16.10.26:40003,172.16.10.27:40003,172.16.10.29:40003")
    mongos> sh.status()
    --- Sharding Status ---
    ... (content omitted) ...
    shards:
        { "_id" : "shard1", "host" : "shard1/172.16.10.26:40001,172.16.10.27:40001", "state" : 1 }
        { "_id" : "shard2", "host" : "shard2/172.16.10.27:40002,172.16.10.29:40002", "state" : 1 }
        { "_id" : "shard3", "host" : "shard3/172.16.10.26:40003,172.16.10.29:40003", "state" : 1 }
    ... (content omitted) ...
    chunks:
        shard1  2
        shard2  2
        shard3  2
        { "id" : { "$minKey" : 1 } } -->> { "id" : 9893 } on : shard3 Timestamp(6, 0)
        { "id" : 9893 } -->> { "id" : 19786 } on : shard1 Timestamp(6, 1)
        { "id" : 19786 } -->> { "id" : 29679 } on : shard1 Timestamp(4, 0)
        { "id" : 29679 } -->> { "id" : 39572 } on : shard3 Timestamp(5, 0)
        { "id" : 39572 } -->> { "id" : 49465 } on : shard2 Timestamp(5, 1)
        { "id" : 49465 } -->> { "id" : { "$maxKey" : 1 } } on : shard2 Timestamp(1, 5)

After the new shard is added, the cluster redistributes the chunks automatically. Likewise, when a sharding server is removed, the data is rebalanced across the remaining shards. MongoDB handles data redistribution very flexibly.
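As a sketch of the removal case just described, the standard removeShard admin command drains the chunks off a shard; it is run repeatedly until its state reaches "completed":

    mongos> use admin
    mongos> db.adminCommand({removeShard: "shard3"})    // the first call starts draining
    mongos> db.adminCommand({removeShard: "shard3"})    // repeat to poll; it finishes with "state" : "completed"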
