In this article, the editor explains the states a Ceph placement group can be in. Many readers are not very familiar with them, so this overview is shared for reference; I hope you learn a lot from reading it.
I. Placement group states
1. Creating
When you create a storage pool, Ceph creates the specified number of placement groups for it. While Ceph is creating one or more placement groups, it reports their state as creating; once they are created, the OSDs in each placement group's Acting Set peer with one another. When peering completes, the placement group state changes to active+clean, which means Ceph clients can begin writing data to the placement group.
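For example, creating a pool immediately puts its new placement groups into the creating state. A minimal sketch on a test cluster (the pool name and PG count below are illustrative):

# Create a pool with 128 placement groups; they briefly show "creating".
ceph osd pool create testpool 128 128

# Watch the placement group summary until everything is active+clean.
ceph pg stat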
2. Peering
While Ceph is peering a placement group, it brings the OSDs that store the placement group's replicas into agreement about the state of the objects and metadata in it. When Ceph has finished peering, the OSDs that store the placement group agree on its current state. However, completion of peering does not mean that every replica holds the latest version of the data.
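To inspect the result of peering for a single placement group, you can query it directly. A minimal sketch (the placement group id 1.6c is just an example):

# Query one placement group; the output includes its state, the up and
# acting sets, and peering details.
ceph pg 1.6c query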
3. Active
After Ceph completes the peering process, a placement group can become active. The active state usually means that the data in the primary placement group and its replicas is available for read and write operations.
4. Clean
When a placement group is in the clean state, the primary OSD and the replica OSDs have peered successfully, and there are no stray replicas of the placement group. Ceph has replicated all objects in the placement group the prescribed number of times.
5. Degraded
When a client writes data to the primary OSD, the primary OSD is responsible for writing the replicas to the replica OSDs. After the primary OSD writes an object to its store, the placement group remains in the degraded state until the primary OSD receives acknowledgements from the replica OSDs that the replicas were created successfully.
A placement group can be in the active+degraded state because an OSD can be active even though it does not yet hold all of the objects. If an OSD goes down, Ceph marks all placement groups assigned to it as degraded; after the OSD comes back up, the OSDs must peer again. However, as long as a degraded placement group is active, clients can still write new objects to it.
If an OSD is down and the degraded condition persists, Ceph marks the down OSD as out of the cluster and remaps the data from the down OSD to other OSDs. The interval between being marked down and being marked out is controlled by mon osd down out interval, which defaults to 300 seconds.
A placement group can also be degraded because Ceph cannot find one or more objects that should be in it. You cannot read or write missing objects, but you can still access all other objects in a degraded placement group.
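To see how many placement groups and objects are currently degraded, the standard status commands can be used. A minimal sketch:

# Cluster summary, including degraded placement group and object counts.
ceph -s

# Per-problem detail: which placement groups are degraded and why.
ceph health detail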
6. Recovering
Ceph was designed for fault tolerance and can withstand software and hardware problems of a certain scale. When an OSD goes down, its contents fall behind the other replicas in its placement groups; when it comes back up, the contents of those placement groups must be updated to reflect the current state. During this period, the OSD is in the recovering state.
Recovery is not always trivial, because a single hardware failure can affect many OSDs. For example, if the network switch of a rack fails, multiple hosts fall behind the current state of the cluster, and every affected OSD must recover once the problem is resolved.
Ceph provides several options to balance resource contention between new service requests and the work of recovering data objects and bringing placement groups back to the current state. The osd recovery delay start option lets an OSD restart, re-peer, and even replay some requests before starting recovery. The osd recovery threads option limits the number of recovery threads (1 by default). The osd recovery thread timeout option sets the thread timeout, because multiple OSDs may fail, restart, and re-peer in turn. The osd recovery max active option limits the number of recovery requests an OSD works on at the same time, so that it is not too busy to serve clients. The osd recovery max chunk option limits the size of recovered data chunks to prevent network congestion.
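A minimal sketch of adjusting two of these recovery tunables at runtime, assuming a release that supports the ceph config subcommand (option names and defaults vary between Ceph versions, and the values below are illustrative):

# Throttle recovery so client I/O is less affected.
ceph config set osd osd_recovery_max_active 1

# Limit the size of each recovered chunk (in bytes) to reduce network load.
ceph config set osd osd_recovery_max_chunk 8388608

On older releases the same options can be set in the [osd] section of ceph.conf instead.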
7. Backfilling
When a new OSD joins the cluster, CRUSH reassigns placement groups from OSDs already in the cluster to the new OSD. Forcing the new OSD to accept the reassigned placement groups immediately would overload it, so the placement groups are backfilled instead, which lets the process run in the background. Once backfilling completes, the new OSD is ready to serve requests.
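Backfilling can be paused or throttled while the cluster is busy. A sketch using standard cluster flags and options (the value below is illustrative):

# Temporarily prevent new backfills from starting.
ceph osd set nobackfill

# Allow backfilling again.
ceph osd unset nobackfill

# Limit concurrent backfill operations per OSD.
ceph config set osd osd_max_backfills 1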
8. Remapped
When the Acting Set of a placement group changes, data must be migrated from the old acting set to the new one. It may take some time before the new primary OSD can serve requests, so the old primary OSD is kept in service until the migration of the placement group completes. Once the data migration finishes, the mapping uses the new acting set's primary OSD.
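To see which OSDs a placement group is currently mapped to, you can print its up and acting sets (the placement group id below is illustrative):

# Show the OSD map epoch plus the up and acting sets for one placement group.
ceph pg map 1.6c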
9. Stale
Although Ceph uses heartbeats to confirm that hosts and daemons are running, ceph-osd daemons can still get stuck and fail to report their status in time (for example, during a temporary network outage). By default, an OSD daemon reports its placement group, up thru, boot, and failure statistics every half second (0.5), which is more frequent than the heartbeat threshold. If the primary OSD of a placement group's acting set fails to report to the monitors, or if other OSDs have reported the primary OSD as down, the monitors mark the placement group as stale. When you start a cluster, you will often see the stale state until peering completes. If, after the cluster has been running for a while, some placement groups are still stale, it means the primary OSD of those placement groups is down or is not reporting placement group statistics to the monitors.
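Stale placement groups show up in the health output, and recent releases can also list them directly. A hedged sketch (the pg ls subcommand may not be available on very old versions):

# Stale placement groups, if any, appear in the health detail.
ceph health detail

# On recent releases, list placement groups currently in the stale state.
ceph pg ls stale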
II. Finding stuck placement groups
Generally speaking, when placement groups get stuck, Ceph's self-healing often cannot resolve the problem on its own. The stuck states are subdivided as follows:
1. Unclean
Unclean: some objects in the placement group do not have the desired number of replicas; the placement group should be recovering.
2. Inactive
Inactive: the placement group cannot handle reads or writes because it is waiting for an OSD that holds the latest data to come back up.
3. Stale
Stale: the placement group is in an unknown state because the OSDs that store it have not reported to the monitors for a while (the interval is configured by mon osd report timeout).
To find out which placement groups are stuck, execute:
ceph pg dump_stuck [unclean | inactive | stale]
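For example, to list only the placement groups stuck in the unclean state:

# Dump placement groups that have been stuck unclean.
ceph pg dump_stuck unclean

Any placement groups listed this way are candidates for manual investigation, since Ceph's self-healing alone usually cannot clear a stuck state.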