Shulou (Shulou.com), SLTechnology News&Howtos, Internet Technology. Updated 2025-02-24.
This article looks at the difference between storing files in MongoDB GridFS and storing them in a file system. The comparison is a practical one, so it is worth walking through.
Most databases discourage storing large files in the database itself, but MongoDB provides a way to do so: a specification called GridFS.
That raises a question: when is GridFS actually useful?
First, GridFS stores and retrieves files differently from ordinary MongoDB documents. According to the documentation, GridFS is intended for files that exceed the BSON document size limit of 16 MB.
Instead of storing a file in a single document, GridFS divides the file into parts, called chunks, and stores each chunk as a separate document. By default the chunk size is 255 kB; that is, GridFS divides a file into 255 kB chunks, except for the last chunk, which is only as large as necessary. Similarly, a file no larger than the chunk size has only a final chunk, which uses only as much space as needed plus some additional metadata.
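The chunking rule can be checked with a little arithmetic. The sketch below is plain Python with no database involved; the function name chunk_layout is made up for illustration, and treating an empty file as having no chunks is an assumption.

```python
DEFAULT_CHUNK_SIZE = 255 * 1024  # the 255 kB default, expressed in bytes


def chunk_layout(file_size, chunk_size=DEFAULT_CHUNK_SIZE):
    """Return (number_of_chunks, size_of_last_chunk) for a file of file_size bytes."""
    if file_size < 0 or chunk_size <= 0:
        raise ValueError("file_size must be >= 0 and chunk_size must be > 0")
    if file_size == 0:
        return 0, 0  # assumption: an empty file needs no chunk data
    count = (file_size + chunk_size - 1) // chunk_size  # ceiling division
    last = file_size - (count - 1) * chunk_size
    return count, last


# A 1 MiB file: four full 255 kB chunks plus a small final chunk.
print(chunk_layout(1024 * 1024))  # (5, 4096)
# A 100 kB file is smaller than one chunk, so it has only a final chunk.
print(chunk_layout(100 * 1024))   # (1, 102400)
```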
GridFS itself stores data in two collections rather than in one document. The biggest difference from ordinary document storage is memory consumption: a file stored as a single document has to be loaded into memory in full to be returned, whereas a file stored in GridFS is reassembled chunk by chunk by the driver, so the whole file never has to sit in memory at once.
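The memory point can be illustrated by reading a stored file piece by piece instead of all at once. This is a minimal sketch: copy_in_pieces is a hypothetical helper, and an in-memory io.BytesIO stands in for the stored file. With PyMongo the source would be the file-like object returned by gridfs.GridFS(db).get(file_id), which also offers read(size).

```python
import io


def copy_in_pieces(source, destination, piece_size=255 * 1024):
    """Copy source to destination, holding at most piece_size bytes at a time."""
    total = 0
    while True:
        piece = source.read(piece_size)
        if not piece:  # an empty read means the end of the file
            break
        destination.write(piece)
        total += len(piece)
    return total


# Stand-in for a stored file (assumption: no MongoDB server is needed here).
stored = io.BytesIO(b"x" * 600_000)
target = io.BytesIO()
print(copy_in_pieces(stored, target))  # 600000
```

Only one piece is held in memory per iteration, which is the practical benefit over fetching one very large document.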
At this point some readers will object: is a file system not good enough for storing files? Why use a database for this, how does it perform, and where is the advantage? This is probably one of the harder questions to answer.
In a file system, the number of files that can be stored (for example, in one directory) is limited by the file system's design, and both Windows and Linux have limits of this kind. When files are stored in MongoDB, this is not a particular concern.
That is the first point. The second is synchronization: if files must be available in several places, relying on the operating system's file system becomes awkward. For example, I write a file at location A and want to read it at location B, or I want the file to be backed up and protected by keeping copies.
By now the objections are probably getting quieter. These requirements really are not easy to meet with a file system once you compare the available solutions and what they cost.
Handing the job to MongoDB makes it much easier. If the network is stable, a file written in Beijing can be read in Shanghai, and Guangzhou can fetch it at the same time. This is not a remote document distribution system out of the box, but if you are able to build on top of it, I think there is demand for such a product.
Another point is data security. It is very hard to assign fine-grained permissions to files in a file system, whereas access control is a built-in function of a database. Where the security requirements for many kinds of data are high, MongoDB can replace the traditional file storage model.
Enough talk; let's see how to operate GridFS. The example inserts the files in a directory into MongoDB.
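The original script is not reproduced in this copy of the article, so what follows is only a sketch of how such an upload could look with PyMongo's gridfs module. The function name upload_directory, the server address, the database name test, and the directory ./files are all assumptions made for illustration.

```python
from pathlib import Path


def upload_directory(fs, directory):
    """Store every regular file directly inside directory; return {filename: file id}."""
    stored = {}
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue  # subdirectories are skipped in this sketch
        with path.open("rb") as handle:
            # GridFS.put() accepts a file-like object and returns the new file's _id.
            stored[path.name] = fs.put(handle, filename=path.name)
    return stored


def main():
    # Imported here so upload_directory can be tried out without a server.
    from pymongo import MongoClient
    import gridfs

    client = MongoClient("mongodb://localhost:27017")  # assumed address
    fs = gridfs.GridFS(client["test"])                 # assumed database name
    for name, file_id in upload_directory(fs, "./files").items():  # assumed path
        print(name, file_id)
```

Calling main() against a running MongoDB instance uploads the directory. Passing fs in as a parameter keeps the directory walk separate from the connection details.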
The documentation states that if you need to update the content of an entire file atomically, you should not use GridFS. As an alternative, you can store multiple versions of each file and record the current version in the metadata. After uploading a new version, update the metadata field that marks the "latest" version in a single atomic update, and then delete previous versions as needed.
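One way to approximate this pattern is sketched below. Note that it is a variation rather than the documented recipe: instead of a field inside each file's metadata, it keeps a small pointer document per filename in a separate collection, so that switching versions is a single update on a single document. The function names and the idea of a pointers collection are assumptions; fs is expected to behave like gridfs.GridFS and pointers like a PyMongo collection.

```python
def publish_version(fs, pointers, filename, data):
    """Upload data as a new version of filename, then mark it as current."""
    new_id = fs.put(data, filename=filename)
    # One update on one document: readers see either the old id or the new id.
    pointers.update_one(
        {"_id": filename},
        {"$set": {"current": new_id}},
        upsert=True,
    )
    return new_id


def read_current(fs, pointers, filename):
    """Return the bytes of the version currently marked as current."""
    pointer = pointers.find_one({"_id": filename})
    if pointer is None:
        raise FileNotFoundError(filename)
    return fs.get(pointer["current"]).read()
```

Older versions remain in GridFS until they are removed explicitly, for example with fs.delete(old_id) once no reader needs them.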
After the Python upload script has run, the database contains two collections:
fs.chunks
fs.files
fs.files stores the metadata of each file and can be thought of as a directory, while fs.chunks stores the actual binary content of the files as chunks.
So damage to either of the two collections is bad news: without fs.files the chunks cannot be matched to a file, and without fs.chunks there is no content to return.
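The relationship between the two collections can be modelled in plain Python. The sketch below builds dictionaries shaped like fs.files and fs.chunks documents (real chunk documents also carry their own _id, and drivers may add further fields) and then reassembles the content. The function names are made up for illustration and no database is involved.

```python
from datetime import datetime, timezone


def split_into_documents(file_id, filename, data, chunk_size=255 * 1024):
    """Build one fs.files-style dict and a list of fs.chunks-style dicts."""
    files_doc = {
        "_id": file_id,
        "filename": filename,
        "length": len(data),
        "chunkSize": chunk_size,
        "uploadDate": datetime.now(timezone.utc),
    }
    chunk_docs = [
        {"files_id": file_id, "n": n, "data": data[start:start + chunk_size]}
        for n, start in enumerate(range(0, len(data), chunk_size))
    ]
    return files_doc, chunk_docs


def reassemble(files_doc, chunk_docs):
    """Join the chunks that belong to files_doc, ordered by their sequence number n."""
    own = [c for c in chunk_docs if c["files_id"] == files_doc["_id"]]
    content = b"".join(c["data"] for c in sorted(own, key=lambda c: c["n"]))
    if len(content) != files_doc["length"]:
        raise ValueError("chunks are missing or damaged")
    return content
```

Dropping a single chunk dictionary from the list makes reassemble fail, which is the in-miniature version of a damaged fs.chunks collection.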
To improve efficiency, GridFS uses an index on each of the chunks and files collections. For convenience, drivers that conform to the GridFS specification create these indexes automatically. You can also create any additional indexes your application needs.
On the files collection, GridFS uses an index on the filename and uploadDate fields.
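Drivers normally create these indexes for you, so the sketch below only shows what they look like when written out with PyMongo's create_index. The index on the chunks collection (files_id and n, unique) comes from the MongoDB GridFS documentation rather than from this article, and the function name ensure_gridfs_indexes is made up for illustration.

```python
ASCENDING = 1  # same value as pymongo.ASCENDING


def ensure_gridfs_indexes(files, chunks):
    """Create the two standard GridFS indexes on the given collections."""
    # Supports lookups by filename, newest or oldest upload first.
    files.create_index([("filename", ASCENDING), ("uploadDate", ASCENDING)])
    # Lets the driver fetch a file's chunks in order; unique per (file, position).
    chunks.create_index([("files_id", ASCENDING), ("n", ASCENDING)], unique=True)
```

With a real connection this would be called as ensure_gridfs_indexes(db["fs.files"], db["fs.chunks"]), where db is a PyMongo database object.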
In practice, if you do not use Python, you can use the mongofiles command-line tool to query the files stored in MongoDB and to work with them from outside the database.
For small files, and under the requirements described above, MongoDB can fully meet the need, so in some cases replacing file storage with the database is an advantage.
Of course, some readers will point out that a file in a file system can be opened and modified directly, whereas with GridFS you have to download the file, upload the modified version, and delete the original. Is that not too much trouble? This brings up another important point.
MongoDB GridFS is used for file distribution and permission control, as well as for high availability, file reuse, and multi-version file distribution, none of which a file system gives you.
A file system is certainly more convenient than GridFS, but if you need the features above, you have to compromise.
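The download, modify, upload, delete cycle mentioned above can be sketched in a few lines. This assumes fs behaves like PyMongo's gridfs.GridFS (get_last_version, put, delete); the helper name replace_file and the modify callback are made up for illustration.

```python
def replace_file(fs, filename, modify):
    """Download the newest version of filename, apply modify, store the result."""
    old = fs.get_last_version(filename)
    new_content = modify(old.read())  # modify takes bytes and returns bytes
    # Upload first and delete afterwards, so a failure never loses the file.
    new_id = fs.put(new_content, filename=filename)
    fs.delete(old._id)
    return new_id
```

For example, replace_file(fs, "notes.txt", lambda b: b + b"\nappended line") would append a line to a stored text file. It is more work than editing a file in place, which is exactly the compromise described above.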
I often hear doubts from people who stand at point A and say point B is better, even though point B cannot meet their requirements. You cannot have both the fish and the bear's paw. Technically it would be best to have both, but you need to be clear about priorities: what you must have and what you can give up. Fantasy and ideal are only one word apart.
That is the difference between storing files in MongoDB GridFS and storing them in a file system. These are points you may well see or use in daily work, and I hope this article has been useful.