Basic characteristics of MongoDB and explanation of its internal structure

2025-01-19 Update From: SLTechnology News&Howtos


MongoDB sits between relational and non-relational databases: among non-relational databases it is the most feature-rich and the closest to a relational database. The data structures it supports are very loose, based on BSON, a binary format similar to JSON, so it can store fairly complex data types. Mongo's most notable feature is its powerful query language, whose syntax somewhat resembles an object-oriented query language. It can do most of what a single-table query in a relational database can do, and it also supports indexes on the data.

For most MongoDB users, MongoDB is like a big black box. But if you can understand some of the internal structure of MongoDB, it will help you to better understand and use MongoDB.

BSON

In MongoDB, a document is an abstraction of data that is used in the interaction between client and server. Every client (that is, the driver for each language) represents this abstraction in the form commonly called BSON (Binary JSON).

BSON is a lightweight binary data format. MongoDB works with BSON directly and also stores documents on disk as BSON.

When a client wants to write a document, run a query, or perform another operation, it encodes the document as BSON and sends it to the server. Likewise, the server encodes its results as BSON before returning them to the client.

The BSON format was chosen for the following three reasons:

Efficiency. BSON is designed to represent data efficiently and needs little extra space. In the worst case BSON is slightly less efficient than JSON; in the best case, such as when storing binary data or large numbers, it is much more efficient.

Traversability. In some cases BSON sacrifices space to make the format easier to traverse. For example, a string value is prefixed with its length rather than relying only on a terminator at the end of the string. This makes it easier for MongoDB to inspect and modify documents.

Performance. Finally, BSON is very fast to encode and decode. It uses a C-style data representation, so it can be handled efficiently in a variety of languages.
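The length-prefix idea can be seen in a minimal encoder sketch. It assumes the layout from the public BSON specification (a little-endian int32 total length, typed elements, and a trailing NUL byte) and handles only strings and 32-bit integers; the function names are made up for this illustration and are not part of any driver.

```python
import struct


def encode_cstring(name: str) -> bytes:
    """Encode a key as a NUL-terminated UTF-8 string (a BSON 'cstring')."""
    return name.encode("utf-8") + b"\x00"


def encode_element(key: str, value) -> bytes:
    """Encode one key/value pair; only str and 32-bit int are handled here."""
    if isinstance(value, str):
        data = value.encode("utf-8") + b"\x00"
        # Type 0x02: the string is prefixed with its byte length (NUL included).
        return b"\x02" + encode_cstring(key) + struct.pack("<i", len(data)) + data
    if isinstance(value, int) and not isinstance(value, bool):
        # Type 0x10: little-endian signed 32-bit integer.
        return b"\x10" + encode_cstring(key) + struct.pack("<i", value)
    raise TypeError(f"unsupported value type: {type(value).__name__}")


def encode_document(doc: dict) -> bytes:
    """Encode a flat dict: int32 total length, the elements, then a NUL byte."""
    body = b"".join(encode_element(k, v) for k, v in doc.items())
    total = 4 + len(body) + 1  # length field + elements + terminator
    return struct.pack("<i", total) + body + b"\x00"


encoded = encode_document({"hello": "world"})
print(encoded)
```

Because the total length and each string length are stored up front, a reader can skip over a value without scanning for its end.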

Wire protocol

Clients access the server through a lightweight TCP/IP wire protocol. The protocol is described in detail on the MongoDB Wiki and is essentially a thin wrapper around BSON data. For example, an insert message consists of a 20-byte header (which includes the message length and the code identifying the insert operation), the name of the collection to insert into, and the data to insert.
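As a rough sketch of that layout, the code below packs a legacy insert message. It assumes the classic OP_INSERT format: a 16-byte standard header (message length, request ID, response-to, opcode 2002) followed by a 4-byte flags field, which together make up the 20 bytes mentioned above. Modern servers use OP_MSG instead, so treat this as an illustration of the old framing only; `build_insert_message` is a hypothetical helper, not a driver API.

```python
import struct

OP_INSERT = 2002  # opcode of the legacy insert message (assumed format)


def build_insert_message(request_id: int, collection: str, bson_docs: bytes) -> bytes:
    """Frame already-encoded BSON bytes as a legacy OP_INSERT message."""
    name = collection.encode("utf-8") + b"\x00"  # full name, e.g. "foo.bar"
    flags = 0
    message_length = 16 + 4 + len(name) + len(bson_docs)
    # Standard header: messageLength, requestID, responseTo, opCode (int32 LE).
    header = struct.pack("<iiii", message_length, request_id, 0, OP_INSERT)
    return header + struct.pack("<i", flags) + name + bson_docs


# BSON bytes for {"hello": "world"}, hard-coded to keep the sketch standalone.
doc = b"\x16\x00\x00\x00\x02hello\x00\x06\x00\x00\x00world\x00\x00"
message = build_insert_message(1, "foo.bar", doc)
print(len(message))
```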

Data file

The MongoDB data folder (the default path is /data/db) holds all the files that make up the databases. Each database consists of a .ns file and some data files, and the number of data files grows as the amount of data grows. So a database named foo is made up of the files foo.ns, foo.0, foo.1, foo.2, and so on.

Each new data file is twice the size of the previous one, up to a maximum of 2 GB per data file. This design keeps small databases from wasting too much space while ensuring that large databases have enough space available.
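The doubling rule can be sketched in a few lines. The 64 MB starting size is an assumption not stated in the text above; the 2 GB cap comes from the text.

```python
FIRST_FILE_MB = 64   # assumption: size of the first data file (foo.0)
MAX_FILE_MB = 2048   # cap from the text: 2 GB per data file


def data_file_sizes(db_name: str, count: int) -> list[tuple[str, int]]:
    """Return (file name, size in MB) for the first `count` data files."""
    sizes = []
    size = FIRST_FILE_MB
    for i in range(count):
        sizes.append((f"{db_name}.{i}", size))
        size = min(size * 2, MAX_FILE_MB)  # double, but never exceed the cap
    return sizes


for name, size_mb in data_file_sizes("foo", 7):
    print(name, size_mb, "MB")
```

Under these assumptions the sizes run 64, 128, 256, 512, 1024, 2048, and then stay at 2048 MB.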

MongoDB pre-allocates data files to keep write performance stable (this can be turned off with --noprealloc). Pre-allocation happens in the background, and each pre-allocated file is filled with zeros. As a result MongoDB always keeps spare space and an empty data file on hand, which avoids blocking on disk allocation when the data grows quickly.

Namespaces and extents

Each database consists of multiple namespaces, and each namespace stores data of a particular type. Each collection in the database has its own namespace, and each index has a namespace as well. Metadata for all namespaces is stored in the .ns file.

The data in a namespace is divided into multiple regions on disk, called extents. As an example, consider a foo database with three data files, where the third is an empty pre-allocated file. The first two data files are divided into extents belonging to different namespaces.

This example illustrates how namespaces and extents relate. Each namespace can contain several extents, and they need not be contiguous. As with data files, the extent size for a namespace grows with each allocation. The aim is to balance the space a namespace wastes against keeping the data in a namespace contiguous. One special namespace deserves a mention: $freelist, which records extents that are no longer in use (from dropped collections or indexes). Whenever a namespace needs a new extent, MongoDB first checks whether $freelist holds one of a suitable size.
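The freelist check can be modeled with a toy allocator. This is a simplified sketch, not MongoDB's actual code: extents are plain (offset, size) tuples, and the "suitable size" test is assumed to be first-fit, since the policy is not described here.

```python
class ExtentAllocator:
    """Toy model of extent allocation with a freelist of released extents."""

    def __init__(self):
        self.freelist = []     # extents released by dropped collections/indexes
        self.next_offset = 0   # where the next brand-new extent would start

    def release(self, extent):
        """Record an extent as no longer in use."""
        self.freelist.append(extent)

    def allocate(self, size):
        """Reuse a large-enough free extent if one exists, else carve a new one."""
        for i, (offset, free_size) in enumerate(self.freelist):
            if free_size >= size:  # assumption: first fit, whole extent reused
                return self.freelist.pop(i)
        extent = (self.next_offset, size)
        self.next_offset += size
        return extent


allocator = ExtentAllocator()
first = allocator.allocate(4096)
second = allocator.allocate(8192)
allocator.release(first)             # e.g. the owning collection was dropped
reused = allocator.allocate(1024)    # served from the freelist
fresh = allocator.allocate(16384)    # freelist is empty again, so a new extent
print(first, second, reused, fresh)
```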

Memory-mapped storage engine

At the time this material was written, the storage engine MongoDB supported was the memory-mapped engine. When MongoDB starts, it maps all data files into memory and then leaves all disk operations to the operating system. This storage engine has the following characteristics:

The memory-management code in MongoDB is very concise, since the operating system already does that work.

The virtual memory used by the MongoDB server will be huge and will exceed the size of all the data files. This is expected, and the operating system takes care of it. Note that MongoDB itself does not manage memory and cannot be given a memory limit. Memory is managed entirely by the operating system, so usage can sometimes be hard to control, and in production it must be monitored at the OS level.

MongoDB cannot control the order in which data is written to disk, so it cannot implement write-ahead logging on this engine. To provide durability, another storage engine would have to be implemented.

On a 32-bit system, the MongoDB server can use only about 2 GB of data files per mongod instance, because address pointers are only 32 bits wide.
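The basic mechanism can be tried with Python's standard mmap module. This is only a sketch of memory-mapped file access in general, not of MongoDB's engine; the file name foo.0 and the 4096-byte size are placeholders.

```python
import mmap
import os
import tempfile

FILE_SIZE = 4096  # placeholder size; real data files are far larger

with tempfile.TemporaryDirectory() as data_dir:
    path = os.path.join(data_dir, "foo.0")

    # Pre-allocate the file with zeros, as described for data files earlier.
    with open(path, "wb") as f:
        f.write(b"\x00" * FILE_SIZE)

    # Map the file into memory and write through the mapping.
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), FILE_SIZE) as view:
            view[0:5] = b"hello"  # looks like a memory write to the program
            view.flush()          # ask the OS to write dirty pages to disk

    # Read the file back normally to confirm the bytes reached it.
    with open(path, "rb") as f:
        on_disk = f.read(5)

print(on_disk)
```

The program only writes to memory; when and in what order pages reach the disk is decided by the operating system unless a flush is requested, which is the lack of control described above.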

Characteristics

MongoDB offers high performance, is easy to deploy and use, and makes storing data very convenient. Its main functional features are:

Collection-oriented storage, which makes it easy to store object-type data.

Schema-free.

Dynamic queries are supported.

Full indexing is supported, including on inner objects.

Queries are supported.

Replication and failover are supported.

Efficient binary data storage, including large objects such as video.

Automatic sharding, to support scalability at the cloud-computing level.

Multiple languages are supported, such as Ruby, Python, Java, C++, and PHP.

Files are stored in BSON format (an extension of JSON).

It can be accessed over the network.

"Collection-oriented" means that data is grouped and stored in datasets, each of which is called a collection. Each collection has a unique name within the database and can contain an unlimited number of documents. A collection is similar to a table in a relational database (RDBMS), except that it does not need any schema to be defined.

Schema-free means that we do not need to know any structural definition of the documents stored in a MongoDB database. If necessary, documents with different structures can be stored in the same database.

Documents in a collection are stored as key-value pairs. The key, which is a string, uniquely identifies a document, while the value can be a complex document type. This storage form is called BSON (Binary JSON), a binary serialized document format.
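To make the schema-free idea concrete, here is a plain-Python stand-in: a collection is modeled as a list of dictionaries, and `find` is a hypothetical helper written for this sketch, not the MongoDB driver API. The field names and values are invented.

```python
# A plain-Python stand-in for a collection: a list of documents (dicts).
users = [
    {"_id": 1, "name": "alice", "email": "alice@example.com"},
    {"_id": 2, "name": "bob", "tags": ["admin", "ops"]},
    {"_id": 3, "name": "carol", "address": {"city": "Paris"}},
]


def find(collection, **criteria):
    """Return documents whose top-level fields equal all given criteria."""
    return [
        doc for doc in collection
        if all(key in doc and doc[key] == value for key, value in criteria.items())
    ]


matches = find(users, name="bob")
print(matches)
```

The three documents share no declared schema, yet they sit in the same collection and can be queried on whichever fields they happen to have.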

Other

"MongoDB: The Definitive Guide" covers only this much of MongoDB's internals. A thorough treatment, covering topics such as internal JavaScript parsing, query optimization, and index construction, would probably need a book of its own.

Summary

That is all for this article. I hope it is a useful reference for your study or work. Thank you for your support.
