2025-01-15 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article explains how to use SSDs to accelerate Ceph file system metadata. The content is fairly detailed and should be useful to interested readers.
Object Storage Daemons (OSDs) are a core component of the distributed file system Ceph. Compared with other distributed file systems, Ceph offers better scalability and stability.
In Ceph, an object is first written to a primary OSD and then replicated to the backup OSDs. Replication is synchronous: only after all copies are written is the upper-layer application told that the write succeeded, which guarantees the availability of the data.
Each client write operation sent to an OSD generates two to three disk seeks:
Record the write operation in the OSD's Journal file (the journal metadata ensures the atomicity of the write).
Apply the write operation to the file backing the Object.
Record the write operation in the PG Log file.
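The three steps above can be sketched in a few lines of Python. This is a simplified, illustrative model of a Filestore-style OSD write path, not real Ceph code; the class and field names are invented for the sketch:

```python
class FileStoreOSD:
    """Toy model of a Filestore OSD write path (illustrative only)."""

    def __init__(self):
        self.journal = []   # write-ahead journal entries (written first)
        self.objects = {}   # object name -> data, standing in for backing files
        self.pg_log = []    # per-placement-group operation log

    def write(self, obj_name, data):
        # 1. Append to the journal first: the write is durable once journaled.
        self.journal.append((obj_name, data))
        # 2. Apply the write to the file backing the object.
        self.objects[obj_name] = data
        # 3. Record the operation in the PG log for peering and recovery.
        self.pg_log.append(("write", obj_name))
        # Only now may the client be acknowledged.
        return True
```

The ordering is the point: the client is acknowledged only after the journal entry exists, which is why journal latency bounds write latency.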
In other words, for an OSD, the metadata must be committed to its Journal before a write is complete. Every write goes to the Journal first and the Object second, so to improve cluster performance, journal writes must be fast.
Therefore, to make a Ceph cluster both faster and more cost-effective, two design ideas apply:
Put the data files on slow, inexpensive storage devices, such as SATA HDDs.
Put the Journals on fast devices, such as SSDs or flash cards.
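As a sketch of this layout, assuming a Filestore-era Ceph where `osd journal` and `osd journal size` are the relevant options, and with purely illustrative device paths, a ceph.conf fragment might look like:

```ini
[osd]
; Upper bound on the journal size, in MB (6 GB here).
osd journal size = 6144

[osd.0]
; Data lives on an inexpensive SATA HDD...
osd data = /var/lib/ceph/osd/ceph-0
; ...while the journal lives on a fast SSD partition
; (the device path is an assumption for this example).
osd journal = /dev/sdb1
```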
Another common design maps each HDD to one OSD. Many systems today ship with two SSDs and many HDDs. If the SSDs store only the Journals, their capacity is more than sufficient, because one OSD's Journal generally does not exceed 6 GB. Even with 16 HDDs, the Journals total only about 96 GB, well within the capacity of most SSDs.
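The capacity arithmetic behind that claim, using the article's own figures:

```python
# Journal capacity planning with the figures from the article.
journal_size_gb = 6    # a per-OSD journal rarely exceeds ~6 GB
num_hdds = 16          # one OSD per HDD
total_journal_gb = journal_size_gb * num_hdds
print(total_journal_gb)  # 96 GB of journal, shared across two SSDs
```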
Many administrators worry that an SSD will fail, so they build the two SSDs into a RAID-1 mirror, which halves the capacity, and put the Journals on that RAID group. An alternative is to take one partition from each SSD and mirror just those partitions as the system disk, while the remaining partitions hold the Ceph Journals without RAID.
This, however, can lead to a bad situation. When an SSD holds ten or more OSD Journals and also shares the device with the operating system, any period of heavy concurrent reads and writes will hurt Ceph's performance. For example, when a host dies and the surviving machines begin scanning data for recovery, the other OSDs perform very poorly because each is allocated only a small share of the SSD's bandwidth.
So is it better to protect the Journal with RAID-1? Ceph currently has to scan an OSD's entire file store to recover from a lost Journal: once the Journal is gone, the OSD is effectively gone, and the whole disk must be scanned for a slow recovery. The drawback of RAID-1, though, is that every write must be performed twice. A better approach is to split the OSD Journals into two sets and put them on two separate SSDs; if one SSD fails, half of the Journals survive.
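Under the same assumptions as before (Filestore-era options, invented device names), splitting the Journals across two SSDs instead of mirroring them could look like:

```ini
; First SSD (/dev/sdb) carries half of the journals...
[osd.0]
osd journal = /dev/sdb1
[osd.1]
osd journal = /dev/sdb2

; ...the second SSD (/dev/sdc) carries the other half.
[osd.2]
osd journal = /dev/sdc1
[osd.3]
osd journal = /dev/sdc2

; If /dev/sdb fails, osd.0 and osd.1 must be rebuilt,
; but osd.2 and osd.3 keep their journals intact.
```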
Ceph also has Monitors (MONs), which maintain the master copy of the cluster map and serve the latest map version during synchronization operations. A MON uses a key/value store for snapshots and iterators and coordinates OSD synchronization. If a MON and OSDs share the same SSD and that SSD slows down, the MON can die; with a backup MON, operations continue unaffected.
If you plan to deploy Ceph with SSDs and HDDs, the conclusions are:
Keep the number of OSDs per node modest; fewer than 8 is a good target. With that few, putting the Journals on SSD works well; with more Journals sharing one SSD, performance suffers.
If there are too many OSDs, do not put the Journals on SSD; HDDs may actually be better. Alternatively, install the OS on an HDD and keep the OSD Journals on SSDs without RAID.
Use dedicated devices for the MONs.
So much for SSD acceleration of Ceph file system metadata. I hope the content above has been helpful. If you found the article useful, feel free to share it.
© 2024 shulou.com SLNews company. All rights reserved.