Personal blog address: https://www.aolens.cn/?p=683
1. A brief introduction to MongoDB:
MongoDB is a high-performance, open-source, schema-less document database and one of the most popular NoSQL databases. In many scenarios it can replace a traditional relational database or a key/value store, and it works naturally with JSON-style data. It does not support transactions, but it does support automatic sharding, which makes it well suited to distributed storage of large data sets.
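As a quick illustration of the document model (a minimal sketch; the students collection and its fields are invented here and are not part of the lab below), data is stored and queried as JSON-like documents in the mongo shell:
> db.students.insert({ name: "tom", age: 23, hobbies: ["music", "basketball"] })   // store a JSON-like document
> db.students.find({ age: { $gt: 20 } })                                           // query on any field of the document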
2. MongoDB index types:
Single-field index: indexes a single key
Compound index (multi-field index): indexes multiple keys together
Multikey index: indexes the individual elements of array values
Geospatial index: indexes location data
Text index: supports full-text search
Hashed index: supports only exact-match lookups
Sparse index: indexes only the documents that contain the indexed field, instead of creating an entry for every document
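For reference, here is a sketch of how each index type can be created in the mongo shell (the collections students, places, and articles and their fields are assumed purely for illustration):
db.students.ensureIndex({ name: 1 })                       // single-field index
db.students.ensureIndex({ name: 1, age: -1 })              // compound (multi-field) index
db.students.ensureIndex({ hobbies: 1 })                    // multikey index: hobbies holds an array
db.places.ensureIndex({ location: "2dsphere" })            // geospatial index
db.articles.ensureIndex({ body: "text" })                  // text index for full-text search
db.students.ensureIndex({ name: "hashed" })                // hashed index: exact-match lookups only
db.students.ensureIndex({ email: 1 }, { sparse: true })    // sparse index: skips documents without the field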
3. MongoDB replication:
MongoDB offers two replication mechanisms: master/slave replication and replica sets. Because of how MongoDB has evolved, the master/slave architecture has essentially been abandoned, and replica sets are by far the more common choice.
How a replica set works:
1. A replica set detects failures through heartbeats (heartbeat timeouts) and performs automatic failover by holding an election.
2. A replica set should have at least 3 members and an odd number of voting members; an arbiter can be added purely to take part in elections.
Special member types in a replica set:
Priority-0 member: a cold standby; it can never be elected primary, but it can still vote in elections.
Hidden member: a priority-0 secondary that has voting rights but is never accessed directly by clients.
Delayed member: a priority-0 secondary that cannot become primary and whose data lags behind the primary by a fixed amount of time.
Arbiter: holds no data and can never become primary; it only takes part in elections.
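A sketch of how these member types could be declared from the mongo shell (the member indexes and the arbiter hostname are hypothetical; they are not part of the lab below):
cfg = rs.conf()
cfg.members[1].priority = 0                    // priority-0 member: can vote but never becomes primary
cfg.members[2].priority = 0
cfg.members[2].hidden = true                   // hidden member: invisible to client applications
cfg.members[3].priority = 0
cfg.members[3].slaveDelay = 3600               // delayed member: applies writes one hour behind the primary
rs.reconfig(cfg)
rs.addArb("arbiter.example.com:27017")         // arbiter: votes in elections but stores no data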
Content of the experiment:
First, implement MongoDB data replication.
Experimental model:
Experimental environment:
Node1: 172.16.18.1  MongoDB  CentOS 6.5
Node2: 172.16.18.2  MongoDB  CentOS 6.5
Node3: 172.16.18.3  MongoDB  CentOS 6.5
Content of the experiment:
First of all, make sure the clocks on all the nodes are synchronized.
1.1 Install MongoDB: install the following three packages on node1, node2, and node3:
mongodb-org-shell-2.6.4-1.x86_64.rpm
mongodb-org-tools-2.6.4-1.x86_64.rpm
mongodb-org-server-2.6.4-1.x86_64.rpm
Edit the server's configuration file, /etc/mongod.conf:
logpath=/var/log/mongodb/mongod.log        # log path
logappend=true                             # append to the log file
fork=true                                  # run as a daemon
#port=27017                                # default listening port
#dbpath=/var/lib/mongo                     # default data path
dbpath=/mongodb/data                       # custom data path
pidfilepath=/var/run/mongodb/mongod.pid
#bind_ip=127.0.0.1                         # IPs to listen on; left commented out so connections from any address are accepted
httpinterface=true                         # enable the HTTP web interface
rest=true
replSet=testset                            # name of the replica set
replIndexPrefetch=_id_only
After editing the configuration, copy it to the other two nodes, then create the data directory and fix its ownership:
mkdir -pv /mongodb/data/
chown -R mongod.mongod /mongodb/
Start mongod on all three nodes; the first start may be slow because the data files have to be initialized.
1.2 Connect to the database
[root@node2 ~]# mongo
MongoDB shell version: 2.6.4
connecting to: test
> rs.status()
{
    "startupStatus" : 3,
    "info" : "run rs.initiate(...) if not yet done for the set",
    "ok" : 0,
    "errmsg" : "can't get local.system.replset config from self or any seed (EMPTYCONFIG)"
}
Checking the status with rs.status() shows that none of the three nodes has been initialized; we need to run rs.initiate().
1.3 Run rs.initiate()
> rs.initiate()
{
    "info2" : "no configuration explicitly specified -- making one",
    "me" : "node2.aolens.com:27017",
    "info" : "Config now saved locally.  Should come online in about a minute.",
    "ok" : 1
}
testset:OTHER> rs.status()
{
    "set" : "testset",
    "date" : ISODate("2014-10-12T07:51:58Z"),
    "myState" : 1,
    "members" : [
        {
            "_id" : 0,
            "name" : "node2.aolens.com:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 920,
            "optime" : Timestamp(1413100302, 1),
            "optimeDate" : ISODate("2014-10-12T07:51:42Z"),
            "electionTime" : Timestamp(1413100302, 2),
            "electionDate" : ISODate("2014-10-12T07:51:42Z"),
            "self" : true
        }
    ],
    "ok" : 1
}
testset:PRIMARY>
You can see that one member, node2 (the node itself), has been added and is now the PRIMARY, and its full status is displayed.
1.4 Add the other two nodes
testset:PRIMARY> rs.add("node1.aolens.com")
{ "ok" : 1 }
testset:PRIMARY> rs.add("node3.aolens.com")
{ "ok" : 1 }
Check the status:
testset:PRIMARY> rs.status()
{
    "set" : "testset",
    "date" : ISODate("2014-10-12T08:03:48Z"),
    "myState" : 1,
    "members" : [
        {
            "_id" : 0,
            "name" : "node2.aolens.com:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 1630,
            "optime" : Timestamp(1413101019, 1),
            "optimeDate" : ISODate("2014-10-12T08:03:39Z"),
            "electionTime" : Timestamp(1413100302, 2),
            "electionDate" : ISODate("2014-10-12T07:51:42Z"),
            "self" : true
        },
        {
            "_id" : 1,
            "name" : "node1.aolens.com:27017",
            "health" : 1,
            "state" : 5,
            "stateStr" : "STARTUP2",
            "uptime" : 17,
            "optime" : Timestamp(0, 0),
            "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
            "lastHeartbeat" : ISODate("2014-10-12T08:03:46Z"),
            "lastHeartbeatRecv" : ISODate("2014-10-12T08:03:47Z"),
            "pingMs" : 224
        },
        {
            "_id" : 2,
            "name" : "node3.aolens.com:27017",
            "health" : 1,
            "state" : 6,
            "stateStr" : "UNKNOWN",
            "uptime" : 8,
            "optime" : Timestamp(0, 0),
            "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
            "lastHeartbeat" : ISODate("2014-10-12T08:03:47Z"),
            "lastHeartbeatRecv" : ISODate("1970-01-01T00:00:00Z"),
            "pingMs" : 905,
            "lastHeartbeatMessage" : "still initializing"
        }
    ],
    "ok" : 1
}
Notice that the two newly added nodes are not yet in the SECONDARY state. That is fine; they probably have not finished synchronizing yet. Check again a little later:
"name": "node1.aolens.com:27017"
"stateStr": "SECONDARY"
"name": "node3.aolens.com:27017"
"stateStr": "SECONDARY"
After refreshing, both newly added nodes have become SECONDARY.
1.5 Create some data on the primary node
testset:PRIMARY> use test
switched to db test
testset:PRIMARY> for (...) db.students.insert(...)        # a loop that inserts a batch of test documents into the students collection
Then check on one of the secondaries (node1):
testset:SECONDARY> use test
switched to db test
testset:SECONDARY> show collections
2014-10-11T14:52:57.103+0800 error: { "$err" : "not master and slaveOk=false", "code" : 13435 } at src/mongo/shell/query.js:131
The shell complains that this is not the master and slaveOk is not set, so enable slaveOk on the current node:
testset:SECONDARY> rs.slaveOk()
testset:SECONDARY> show collections
students
system.indexes
rs.isMaster() can be used to see which node is the primary:
testset:SECONDARY> rs.isMaster()
{
    "setName" : "testset",
    "setVersion" : 3,
    "ismaster" : false,
    "secondary" : true,
    "hosts" : [
        "node1.aolens.com:27017",
        "node3.aolens.com:27017",
        "node2.aolens.com:27017"
    ],
    "primary" : "node2.aolens.com:27017",        # the current primary
    "me" : "node1.aolens.com:27017",             # the node you are connected to
    "maxBsonObjectSize" : 16777216,
    "maxMessageSizeBytes" : 48000000,
    "maxWriteBatchSize" : 1000,
    "localTime" : ISODate("2014-10-11T07:00:03.533Z"),
    "maxWireVersion" : 2,
    "minWireVersion" : 0,
    "ok" : 1
}
1.6 If the primary goes offline, the remaining members automatically elect a new primary
testset:PRIMARY> rs.stepDown()
2014-10-12T17:00:12.869+0800 DBClientCursor::init call() failed
2014-10-12T17:00:12.896+0800 Error: error doing query: failed at src/mongo/shell/query.js:81
2014-10-12T17:00:12.914+0800 trying reconnect to 127.0.0.1:27017 (127.0.0.1) failed
2014-10-12T17:00:12.945+0800 reconnect 127.0.0.1:27017 (127.0.0.1) ok
testset:SECONDARY>
You can see that the old primary's prompt changes from PRIMARY to SECONDARY, and one of the other two nodes becomes the new primary. This is automatic failover.
The factors considered in a replica set election include heartbeat information, member priority, optime, network connectivity, and so on.
1.7 You can also change member priorities to trigger a primary switch
testset:PRIMARY> rs.conf()                      # view the current configuration
{
    "_id" : "testset",
    "version" : 3,
    "members" : [
        { "_id" : 0, "host" : "node2.aolens.com:27017" },
        { "_id" : 1, "host" : "node1.aolens.com:27017" },
        { "_id" : 2, "host" : "node3.aolens.com:27017" }
    ]
}
testset:PRIMARY> cfg = rs.conf()                # save the configuration in a variable
{
    "_id" : "testset",
    "version" : 3,
    "members" : [
        { "_id" : 0, "host" : "node2.aolens.com:27017" },
        { "_id" : 1, "host" : "node1.aolens.com:27017" },
        { "_id" : 2, "host" : "node3.aolens.com:27017" }
    ]
}
testset:PRIMARY> cfg.members[1].priority = 2    # raise the priority of the member with _id 1 to 2
2
testset:PRIMARY> rs.reconfig(cfg)               # apply the modified configuration
2014-10-12T05:15:59.916-0400 DBClientCursor::init call() failed
2014-10-12T05:15:59.988-0400 trying reconnect to 127.0.0.1:27017 (127.0.0.1) failed
2014-10-12T05:16:00.015-0400 reconnect 127.0.0.1:27017 (127.0.0.1) ok
reconnected to server after rs command (which is normal)
testset:SECONDARY>        # this node steps down to SECONDARY, while node1.aolens.com (the member with _id 1) becomes the new primary:
testset:PRIMARY>          # prompt on node1.aolens.com
1.8 How to add an arbiter node
rs.addArb(hostportstr) adds a node to the set as an arbiter.
First, remove the current node3 member:
testset:PRIMARY> rs.remove("node3.aolens.com")
On node3, delete the data under /mongodb/data and reinitialize it, then add it back as an arbiter:
testset:PRIMARY> rs.addArb("node3.aolens.com")
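To confirm the change, a check like the following (illustrative; this output was not captured in the lab) should show node3 reporting the ARBITER state:
testset:PRIMARY> rs.status().members.forEach(function(m) { print(m.name + " : " + m.stateStr) })
// expect node3.aolens.com:27017 to report the state "ARBITER"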
Second, implement MongoDB data sharding.
MongoDB sharding:
Why shard: a single node's CPU, memory, I/O, and so on can no longer keep up with the load.
Horizontal scaling: the data has to be split across different nodes.
To keep the shards balanced in size, the data is split, in shard-key order, into chunks of roughly equal size, which are then stored on different shard nodes.
Roles in a sharded cluster:
mongos: the router server
config server: the metadata server
shard: a data node (an ordinary mongod instance)
The shard key should follow the principle of discrete writes and centralized reads: writes are spread across the shards, while reads can be routed to a single shard. A sketch of this idea follows.
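To make the principle concrete, here is a hypothetical sketch (the appdb database, events collection, and userid field are invented and not part of the lab below): a shard key such as userid spreads inserts from many users across chunks, while a query for a single user is routed to one shard.
mongos> sh.enableSharding("appdb")
mongos> sh.shardCollection("appdb.events", { userid: 1 })
mongos> db.getSiblingDB("appdb").events.insert({ userid: 1001, action: "login" })   // writes scatter across shards as userid varies
mongos> db.getSiblingDB("appdb").events.find({ userid: 1001 })                      // a targeted read is routed to a single shard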
Experimental environment:
Node1: 172.16.18.1   mongos (router node)
Node2: 172.16.18.2   mongod (shard node)
Node3: 172.16.18.3   mongod (shard node)
Node4: 172.16.17.12  config server
Experimental model:
Content of the experiment:
2.1 First, configure the config server
The config server is in fact just a mongod instance that is explicitly designated as a config server.
logpath=/var/log/mongodb/mongod.log
logappend=true
fork=true
#port=27017
dbpath=/mongodb/data
pidfilepath=/var/run/mongodb/mongod.pid
configsvr=true        # designates this mongod as a config server, which then listens on port 27019
Start the config server node; you will find it listening on port 27019:
[root@node3 mongodb-2.6.4]# service mongod start
Starting mongod:                                           [  OK  ]
[root@node3 mongodb-2.6.4]# ss -tnl
State      Recv-Q Send-Q      Local Address:Port        Peer Address:Port
LISTEN     0      128                     *:27019                  *:*
2.2 Configure the router node
Install the mongodb-org-mongos.x86_64 package
Just start it directly:
[root@node1 ~]# mongos --configdb=172.16.17.12:27019 --fork --logpath=/var/log/mongodb/mongos.log
2014-10-11T20:29:43.617+0800 warning: running with 1 config server should be done only for testing purposes and is not recommended for production
about to fork child process, waiting until server is ready for connections.
forked process: 8766
child process started successfully, parent exiting
The mongos started successfully. --configdb specifies the address of the config server, --fork runs it in the background, and --logpath sets the log file location.
2.3 Configure the shard nodes. A shard node is just an ordinary mongod instance and needs little extra configuration:
logpath=/var/log/mongodb/mongod.log
logappend=true
fork=true
#port=27017
dbpath=/mongodb/data
pidfilepath=/var/run/mongodb/mongod.pid
Start mongod on node2 and node3.
2.4 Connect through mongos and add the shards
[root@node1 ~]# mongo --host 172.16.18.1
MongoDB shell version: 2.6.4
connecting to: 172.16.18.1:27017/test
Add the two shard nodes and check the status:
mongos> sh.addShard("172.16.18.2")
{ "shardAdded" : "shard0000", "ok" : 1 }
mongos> sh.addShard("172.16.18.3")
{ "shardAdded" : "shard0001", "ok" : 1 }
mongos> sh.status()
--- Sharding Status ---
  sharding version: {
      "_id" : 1,
      "version" : 4,
      "minCompatibleVersion" : 4,
      "currentVersion" : 5,
      "clusterId" : ObjectId("543922b81aaf92ac0f9334f8")
  }
  shards:
      {  "_id" : "shard0000",  "host" : "172.16.18.2:27017" }
      {  "_id" : "shard0001",  "host" : "172.16.18.3:27017" }
  databases:
      {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
Generally, only large data sets need to be sharded. If a collection holds only a little data there is no need to shard it, and unsharded data is stored on the database's primary shard.
Let's enable sharding for the database first (the command is sketched below), then check the status:
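The command itself did not survive in the captured output, but enabling sharding for the testdb database from the mongos shell is done with sh.enableSharding:
mongos> sh.enableSharding("testdb")        // mark testdb as a sharded (partitioned) database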
mongos> sh.status()
--- Sharding Status ---
  sharding version: {
      "_id" : 1,
      "version" : 4,
      "minCompatibleVersion" : 4,
      "currentVersion" : 5,
      "clusterId" : ObjectId("543922b81aaf92ac0f9334f8")
  }
  shards:
      {  "_id" : "shard0000",  "host" : "172.16.18.2:27017" }
      {  "_id" : "shard0001",  "host" : "172.16.18.3:27017" }
  databases:
      {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
      {  "_id" : "test",  "partitioned" : false,  "primary" : "shard0000" }
      {  "_id" : "testdb",  "partitioned" : true,  "primary" : "shard0000" }
Next, let's shard a collection manually.
mongos> sh.shardCollection("testdb.students", { "age" : 1 })        # shard the collection on the age field
{ "collectionsharded" : "testdb.students", "ok" : 1 }
mongos> sh.status()
--- Sharding Status ---
  sharding version: {
      "_id" : 1,
      "version" : 4,
      "minCompatibleVersion" : 4,
      "currentVersion" : 5,
      "clusterId" : ObjectId("543922b81aaf92ac0f9334f8")
  }
  shards:
      {  "_id" : "shard0000",  "host" : "172.16.18.2:27017" }
      {  "_id" : "shard0001",  "host" : "172.16.18.3:27017" }
  databases:
      {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
      {  "_id" : "test",  "partitioned" : false,  "primary" : "shard0000" }
      {  "_id" : "testdb",  "partitioned" : true,  "primary" : "shard0000" }
          testdb.students
              shard key: { "age" : 1 }
              chunks:
                  shard0000   1
              { "age" : { "$minKey" : 1 } } -->> { "age" : { "$maxKey" : 1 } } on : shard0000 Timestamp(1, 0)
There is no data in the collection yet, so let's generate some and check again:
mongos> for (...) db.students.insert(...)        # a loop that inserts a batch of test documents with varying ages
mongos> sh.status()
--- Sharding Status ---
  sharding version: {
      "_id" : 1,
      "version" : 4,
      "minCompatibleVersion" : 4,
      "currentVersion" : 5,
      "clusterId" : ObjectId("543922b81aaf92ac0f9334f8")
  }
  shards:
      {  "_id" : "shard0000",  "host" : "172.16.18.2:27017" }
      {  "_id" : "shard0001",  "host" : "172.16.18.3:27017" }
  databases:
      {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
      {  "_id" : "test",  "partitioned" : false,  "primary" : "shard0000" }
      {  "_id" : "testdb",  "partitioned" : true,  "primary" : "shard0000" }
          testdb.students
              shard key: { "age" : 1 }
              chunks:
                  shard0001   1
                  shard0000   2
              { "age" : { "$minKey" : 1 } } -->> { "age" : 1 } on : shard0001 Timestamp(2, 0)
              { "age" : 1 } -->> { "age" : 99 } on : shard0000 Timestamp(2, 2)
              { "age" : 99 } -->> { "age" : { "$maxKey" : 1 } } on : shard0000 Timestamp(2, 3)
You will find that the data has been split into different chunks distributed across the shards.
Data sharding is now in place; of course, manual sharding like this is not generally recommended.
Conclusion: for handling large data sets and for replication, MongoDB performs better and is simpler to operate than MySQL. However, MongoDB is not yet fully mature, and quite a few issues still surface in real-world use; it takes time and hands-on experience to master.