2025-04-05 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
What is the storage structure of MongoDB, and how does it affect space utilization? Many newcomers find this confusing, so this article summarizes the causes of the problem and how to deal with it.
Anyone who has used MongoDB for a while will notice that it often takes up far more disk space than the actual data size. Running db.stats() shows several different size values, such as dataSize, storageSize, and fileSize. What exactly do these sizes mean? Understanding MongoDB's storage mechanism explains them. (The file layout described here is that of the MMAPv1 storage engine.)
Database file types
There are three main types of database files for MongoDB:
Journal: journal (log) files
Namespace: collection and index name file
Data: data and index files
Journal files
Unlike in some traditional databases, MongoDB's journal files are used only to recover in-memory data that had not yet been synchronized to disk when the system went down. The journal files are stored in a separate directory. At startup, MongoDB automatically pre-creates 3 journal files of 1 GB each (initially empty). Unless you have a sustained, heavily concurrent write load, 3 GB is generally enough.
Namespace file dbname.ns
This file stores the names of all collections and indexes in the database. It is small, 16 MB by default, and can hold about 24,000 collection or index names together with the location of each collection and index in the data files. Through this file, MongoDB knows where to start looking for or inserting collection data or index data. The file size can be raised to as much as 2 GB with a configuration parameter.
Data files dbname.0, dbname.1, … dbname.n
MongoDB data and indexes are stored in one or more data files. The first data file is named "<database name>.0", for example my-db.0. Its default size is 64 MB, and MongoDB creates the next data file, such as my-db.1, ahead of time, before the current one is nearly full. Each new data file is twice the size of the previous one: the second file is 128 MB and the third is 256 MB. The doubling stops at 2 GB, after which every new file is 2 GB.
MongoDB also generates a few other files, such as _tmp and mongod.lock, but they are not relevant to this discussion.
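The growth rule for data files described above can be written down as a short sketch. It only reproduces the arithmetic from the text (64 MB first file, doubling, capped at 2 GB). The function names are made up for illustration, and the sketch ignores the extra file that MongoDB preallocates ahead of time.

```python
from itertools import islice

MB = 1024 * 1024
FIRST_FILE_SIZE = 64 * MB    # default size of dbname.0, per the text
MAX_FILE_SIZE = 2048 * MB    # doubling stops once files reach 2 GB


def data_file_sizes():
    """Yield the sizes in bytes of dbname.0, dbname.1, dbname.2, ..."""
    size = FIRST_FILE_SIZE
    while True:
        yield size
        size = min(size * 2, MAX_FILE_SIZE)


def files_needed(data_bytes):
    """Return (file count, total bytes on disk) needed to hold data_bytes.

    Simplification: assumes files are filled completely and ignores the
    file that is preallocated in advance.
    """
    total = 0
    for count, size in enumerate(data_file_sizes(), start=1):
        total += size
        if total >= data_bytes:
            return count, total


if __name__ == "__main__":
    # Sizes of the first eight data files, in MB.
    print([s // MB for s in islice(data_file_sizes(), 8)])
    # Files and disk space needed for 300 MB of extents.
    count, total = files_needed(300 * MB)
    print(count, total // MB)
```

Under these assumptions, 300 MB of data already occupies three files totalling 448 MB, which is one reason the space on disk runs ahead of the data size.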
Data file structure
Extent
Within each data file, MongoDB organizes the stored BSON documents and B-tree indexes into logical containers called "Extents". As shown in the following figure (my-db.1 and my-db.2 are two data files of the database):
A file can contain multiple Extents.
Each Extent holds the data or the indexes of exactly one collection.
The data or indexes of one collection can be spread across multiple Extents, and those Extents can be spread across multiple files.
A single Extent never holds both data and indexes.
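The rules above can be modelled with two small data classes. This is only an illustrative sketch: the class names, fields, and the example namespace "my-db.users" are assumptions for the example and do not reflect MongoDB's actual on-disk format.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Extent:
    namespace: str  # the single collection this Extent belongs to
    kind: str       # "data" or "index", never both
    size: int       # size in bytes

    def __post_init__(self):
        if self.kind not in ("data", "index"):
            raise ValueError("an Extent holds either data or index, not both")


@dataclass
class DataFile:
    name: str
    extents: list = field(default_factory=list)  # a file can hold many Extents


def extents_of(files, namespace, kind):
    """Collect (file name, Extent) pairs for one collection across all files."""
    return [
        (f.name, e)
        for f in files
        for e in f.extents
        if e.namespace == namespace and e.kind == kind
    ]


if __name__ == "__main__":
    files = [
        DataFile("my-db.1", [Extent("my-db.users", "data", 8192),
                             Extent("my-db.users", "index", 4096)]),
        DataFile("my-db.2", [Extent("my-db.users", "data", 16384)]),
    ]
    # The data of one collection is spread over Extents in two files.
    for file_name, extent in extents_of(files, "my-db.users", "data"):
        print(file_name, extent.size)
```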
Record
Each Extent contains multiple "Records". Each Record consists of a record header, the BSON document itself, and some extra padding. Padding is unused space that MongoDB allocates when inserting a record so that, if the document grows later, it does not have to be moved elsewhere. The record header begins with the size of the entire record, followed by the location of the record itself and the locations of the previous and next records. Think of it as a doubly linked list.
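To make the role of padding concrete, here is a minimal sketch of a Record as a node in a doubly linked list. The header size of 16 bytes and all names are hypothetical values chosen for the example. The point is only that a document can grow in place as long as it fits inside the space originally reserved.

```python
from dataclasses import dataclass, field
from typing import Optional

HEADER_SIZE = 16  # hypothetical header size in bytes, for illustration only


@dataclass
class Record:
    document: bytes  # stands in for the BSON document
    padding: int     # unused bytes reserved for future growth
    prev: Optional["Record"] = field(default=None, repr=False, compare=False)
    next: Optional["Record"] = field(default=None, repr=False, compare=False)

    @property
    def size(self) -> int:
        """Total record size: header + document + padding."""
        return HEADER_SIZE + len(self.document) + self.padding

    def update_in_place(self, new_document: bytes) -> bool:
        """Replace the document if it fits in document + padding.

        Returns False when the new document is too large, in which case
        the record would have to be moved elsewhere.
        """
        capacity = len(self.document) + self.padding
        if len(new_document) > capacity:
            return False
        self.padding = capacity - len(new_document)
        self.document = new_document
        return True


def link(records):
    """Chain records into a doubly linked list and return the head."""
    for a, b in zip(records, records[1:]):
        a.next, b.prev = b, a
    return records[0] if records else None


if __name__ == "__main__":
    head = link([Record(b"x" * 100, padding=20), Record(b"y" * 50, padding=10)])
    before = head.size
    print(head.update_in_place(b"x" * 110), head.size == before)  # grows into padding
    print(head.update_in_place(b"x" * 200))                       # too large to fit
```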
Database size parameter
With this background, we can understand what the size values in db.stats() mean.
dataSize
dataSize is the value closest to the real data size, so you can use it to check how much data you have. It is the sum of the sizes of all records in the database (or collection). Note that each record carries the extra overhead of its header and padding in addition to the BSON document, so dataSize is slightly larger than the raw data.
When you delete a document, this value shrinks, because it is the sum of the sizes of the existing records. If the document is not deleted but fields inside it are removed or shortened, dataSize does not change. The reason is that the record holding the document is still there and the space occupied by the whole record is unchanged; there is simply more unused space inside the record.
storageSize
This value equals the total size of all data Extents used by the database or collection. Note that it is larger than dataSize because deleting documents leaves fragments (the space of the deleted records) behind in the Extents. Even if storageSize is much larger than dataSize, that is not necessarily a bad sign. If a newly inserted document is smaller than or equal to the size of a fragment, MongoDB reuses the fragment to store it. Until then, the fragment stays where it is and takes up space. For this reason, storageSize does not shrink when you delete documents.
Fragmentation can become serious the longer the database runs. You can clean up the fragments with the compact command, or bring up a new slave that copies all the data from scratch and then promote it to master.
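The difference between dataSize and storageSize can be illustrated with a toy model of a single data Extent. This is a sketch under simplifying assumptions, not MongoDB's allocator: fragments are kept in a plain list, the first fragment that is large enough is reused whole, and all names are invented for the example.

```python
class DataExtent:
    """Toy model of one data Extent with a list of reusable fragments."""

    def __init__(self, size):
        self.size = size    # fixed size; this is what storageSize counts
        self.used = 0       # bytes handed out from fresh space so far
        self.records = {}   # doc_id -> record size in bytes
        self.free = []      # sizes of fragments left by deleted records

    def insert(self, doc_id, record_size):
        """Store a record, reusing a fragment when one is large enough."""
        for i, fragment in enumerate(self.free):
            if record_size <= fragment:
                self.free.pop(i)
                # Simplification: the record takes over the whole fragment.
                self.records[doc_id] = fragment
                return True
        if self.used + record_size <= self.size:
            self.used += record_size
            self.records[doc_id] = record_size
            return True
        return False  # a real system would allocate a new Extent here

    def delete(self, doc_id):
        """Remove a record; its space becomes a fragment, not free disk."""
        self.free.append(self.records.pop(doc_id))

    @property
    def data_size(self):
        return sum(self.records.values())

    @property
    def storage_size(self):
        return self.size


if __name__ == "__main__":
    extent = DataExtent(1000)
    extent.insert("a", 300)
    extent.insert("b", 300)
    extent.delete("a")
    print(extent.data_size, extent.storage_size)  # dataSize drops, storageSize stays
    extent.insert("c", 250)                       # fits into the fragment left by "a"
    print(extent.used, extent.free)
```

In this model, deleting a record lowers data_size immediately while storage_size never changes, and the later insert consumes the fragment instead of fresh space.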
fileSize
This value is reported only at the database level and is the size of the files actually used in the file system. It includes the total size of all data Extents, the total size of all index Extents, and some unallocated space. As mentioned earlier, MongoDB preallocates database files when it creates them, for example 64 MB, even if you only have a few hundred KB of data. So this value may be much larger than the actual data size. The extra unused space ensures that MongoDB can quickly allocate new Extents when new data is written, avoiding delays caused by disk space allocation.
Note that when you delete documents, or even collections and indexes, this value does not shrink. In other words, the disk space used by the database only grows (or stays the same) and is not reduced by deleting data. This does not mean the space is wasted; it just means that a lot of space is held in reserve.
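Putting the three sizes together, the sketch below splits a db.stats() result into the categories discussed in this article. It works on a plain dictionary, so the numbers shown are made-up sample values rather than real output. It also uses indexSize, another field of db.stats() that this article does not discuss, to stand in for the index Extents, and it ignores any other contributions to fileSize.

```python
def space_breakdown(stats):
    """Split db.stats()-style sizes (in bytes) into rough categories.

    Assumes fileSize is approximately data Extents + index Extents +
    unallocated space, as described in the text.
    """
    data = stats["dataSize"]
    storage = stats["storageSize"]
    index = stats["indexSize"]
    file_size = stats["fileSize"]
    return {
        # Fragments and not-yet-used space inside the data Extents.
        "free_in_data_extents": storage - data,
        # Preallocated file space not yet assigned to any Extent.
        "unallocated_in_files": file_size - storage - index,
        # Fraction of the files on disk that holds actual records.
        "data_to_file_ratio": data / file_size if file_size else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical sample values, not taken from a real server.
    sample = {
        "dataSize": 300_000,
        "storageSize": 500_000,
        "indexSize": 100_000,
        "fileSize": 1_000_000,
    }
    for key, value in space_breakdown(sample).items():
        print(key, value)
```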