Example Analysis of Parameter Limits and Thresholds in MongoDB
This article walks through the parameter limits and thresholds in MongoDB with examples. It is quite detailed and should make a useful reference; interested readers are encouraged to read it through!
I. BSON documents
BSON document size: the maximum size of a document is 16MB; documents larger than 16MB must be stored in GridFS.
Document nesting depth: the maximum nesting depth of a BSON document is 100 levels.
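As a quick sketch, in the legacy mongo shell you can check how close a document is to the 16MB limit with Object.bsonsize (the collection name "books" here is hypothetical):
// returns the BSON size of one document, in bytes
Object.bsonsize(db.books.findOne())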
II. Namespaces
Collection namespace: has the form <database>.<collection>, with a maximum length of 120 bytes. This also means database and collection names cannot be too long.
Number of namespaces: for the MMAPv1 engine, the maximum is about 24000; each collection and each index is a namespace. There is no such limit for the WiredTiger engine.
Namespace file size: for the MMAPv1 engine, the default size is 16MB, which can be changed in the configuration file. WiredTiger is not subject to this restriction.
III. Indexes
Index key: each index key must not exceed 1024 bytes. If the length of an index key exceeds this value, the write operation will fail.
The number of indexes in each collection must not exceed 64.
Index name: we can set a name for an index; the full name has the form <database>.<collection>.$<indexName> and must not exceed 128 bytes. By default the name is a combination of the field names and index types, but we can explicitly specify a name when creating the index with the createIndex() method; a sketch follows at the end of this section.
A compound index can contain at most 31 fields.
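A minimal sketch of explicitly naming an index to stay under the 128-byte limit (the collection and field names are hypothetical):
// the default name would be "zipcode_1_name_1"; we override it with a shorter one
db.books.createIndex({ zipcode: 1, name: 1 }, { name: "idx_zip_name" })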
IV. Data
Capped Collection: if you specify a maximum number of documents when creating a capped collection, that number cannot exceed 2^32; if you do not specify a maximum, there is no limit.
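A minimal sketch of creating a capped collection with a document-count cap (the name and sizes are illustrative):
// "size" (in bytes) is mandatory for capped collections; "max" caps the document count
db.createCollection("events", { capped: true, size: 104857600, max: 1000000 })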
Database size: for the MMAPv1 engine, each database can hold at most 16000 data files, i.e., a single database can store at most 32TB of data; this can be limited to 8TB by setting the "smallFiles" option.
Data size: for the MMAPv1 engine, a single mongod cannot manage a dataset that exceeds the maximum virtual memory address space. For example, on 64-bit Linux each mongod instance can maintain at most 64TB of data. The WiredTiger engine does not have this limitation.
Number of collections per database: for the MMAPv1 engine, the number of collections a database can hold depends on the size of the namespace file (which stores the namespaces) and the number of indexes in each collection; the total size of all namespaces cannot exceed the namespace file size (16MB by default). The WiredTiger engine is not subject to this restriction.
V. Replica Sets
A maximum of 50 members are supported per replica set.
There can be at most 7 voting members in a replica set.
If the size of the oplog is not explicitly specified, its maximum will not exceed 50GB.
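As a sketch, the oplog size is normally set at startup (e.g., via the oplogSizeMB configuration option); on newer WiredTiger deployments that support the replSetResizeOplog command, it can also be resized online (the size is in MB):
// resize this member's oplog to 16000 MB
db.adminCommand({ replSetResizeOplog: 1, size: 16000 })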
VI. Sharded Clusters
The group aggregation function is not available in sharded mode; use the mapReduce or aggregate method instead.
Covered queries: the fields in the query condition must be part of an index, and the returned result may contain only fields in that index. For a sharded cluster, if the query does not contain the shard key, the query cannot be covered by an index. Although _id is not the shard key, if the query condition contains only _id and only the value of the _id field is needed in the result, a covered query can be used; however, such a query rarely makes sense (unless you are checking whether a document with that _id exists).
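A minimal covered-query sketch (collection and field names are hypothetical); note that _id must be excluded from the projection unless it is part of the index:
// the index covers both the filter and the projection, so no documents need to be fetched
db.users.createIndex({ zipcode: 1 })
db.users.find({ zipcode: "10001" }, { zipcode: 1, _id: 0 })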
If sharding is enabled on an existing (previously unsharded) collection that already contains data, the collection must not exceed 256GB. Once a collection is sharded, there is no limit on how much data it can store.
For a sharded collection, update or remove operations on a single document (with the option multi:false or justOne) must specify the shard key or the _id field; otherwise an error is thrown.
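For example, assuming zipcode is the shard key (all names here are hypothetical), a single-document update must include it in the filter:
// multi:false targets a single document, so the filter must contain the shard key (or _id)
db.users.update({ zipcode: "10001", name: "Alice" }, { $set: { active: true } }, { multi: false })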
Unique indexes: unique indexes are not supported across shards unless the shard key is the leftmost prefix of the unique index. For example, if the shard key of a collection is {"zipcode": 1, "name": 1}, then any unique index on that collection must use zipcode and name as its leftmost prefix, such as: collection.createIndex({"zipcode": 1, "name": 1, "company": 1}, {unique: true}).
Maximum number of documents allowed during chunk migration: if the number of documents in a chunk exceeds 250000 (with the default chunk size of 64MB), or the number of documents is greater than 1.3 × (maximum chunk size (determined by configuration) / average document size), the chunk cannot be moved (whether by the balancer or by manual intervention) and must wait to be split before it can be moved.
VII. Shard Key
The length of the shard key must not exceed 512 bytes.
The shard key index can be an ascending index on the shard key, or a compound index that begins with the shard key. A shard key index cannot be a multikey index (an index on an array), a text index, or a geo index.
The shard key is immutable, and the shard key value in a document cannot be modified at any time. If you need to change the shard key, you must migrate the data manually: dump the full raw data, then modify it and save it into a new collection.
A monotonically increasing (or decreasing) shard key limits insert throughput. If _id is the shard key, be aware that _id values generated by ObjectId() are also monotonically increasing. With a monotonically increasing shard key, all insert operations on the collection land on a single shard node, so that shard hosts all inserts for the cluster; since a single shard node's resources are limited, the insert throughput of the whole cluster is limited as well. If the cluster is dominated by reads and updates, this is not a concern. To avoid this problem, consider using a hashed shard key or choosing a key that is not monotonically increasing. (Ranged and hashed shard keys each have their own advantages and disadvantages, depending on the query patterns.)
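A sketch of declaring a hashed shard key to spread monotonically increasing _id values across shards (the database and collection names are hypothetical):
sh.enableSharding("mydb")
// hashing _id distributes inserts across shards instead of concentrating them on one
sh.shardCollection("mydb.events", { _id: "hashed" })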
VIII. Operations
If MongoDB cannot use an index to obtain the sort order, the data participating in the in-memory sort must be less than 32MB.
Aggregation pipeline operations: each pipeline stage is limited to 100MB of memory, and an error occurs if a stage exceeds this limit. To handle larger datasets, turn on the "allowDiskUse" option, which lets pipeline stages write excess data to temporary files.
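A sketch of enabling allowDiskUse on a large aggregation (the collection and field names are hypothetical):
// stages exceeding the 100MB memory limit may spill to temporary files on disk
db.orders.aggregate([{ $group: { _id: "$customer_id", total: { $sum: "$amount" } } }], { allowDiskUse: true })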
IX. Naming rules
Database names are case-sensitive.
Database names must not contain the characters / \ . " $ * < > : | ?
The length of the database name cannot exceed 64 characters.
Collection names can start with an underscore or a letter, but cannot contain the "$" symbol, cannot be empty or null, and cannot begin with "system.", which is a reserved prefix.
Document field names cannot contain "." or null, and cannot start with "$", because "$" is a reference symbol.
Finally, a note on how to query a nested list in a JSON document. Sample data:
{"_ id": ObjectId ("5c6cc376a589c200018f7312"), "id": "9472", "data": {"name": "Test", "publish_date": "2009-05-15", "authors": [{"author_id": 3053, "author_name": "Test data"],}}
To query by author_id inside authors, the query can be written like this:
db.getCollection().find({'data.authors.0.author_id': 3053})
Here 0 denotes the first array index, and dots denote the nested structure. However, the Spark MongoDB connector cannot query this way, and other methods are needed.
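In the mongo shell, one alternative for matching an array element without hard-coding its position is $elemMatch (the collection name "books" is hypothetical):
// matches documents where any element of data.authors has author_id 3053
db.getCollection('books').find({ 'data.authors': { $elemMatch: { author_id: 3053 } } })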
The above is the full content of "Example Analysis of Parameter Limits and Thresholds in MongoDB". Thank you for reading, and I hope it is helpful to you!